Spaces:

pribadihcr
/

quGPT

Running

File size: 4,972 Bytes

cfd3735

{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# SceneXplain\n",
    "\n",
    "\n",
    "[SceneXplain](https://scenex.jina.ai/) is an ImageCaptioning service accessible through the SceneXplain Tool.\n",
    "\n",
    "To use this tool, you'll need to make an account and fetch your API Token [from the website](https://scenex.jina.ai/api). Then you can instantiate the tool."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "os.environ[\"SCENEX_API_KEY\"] = \"<YOUR_API_KEY>\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.agents import load_tools\n",
    "\n",
    "tools = load_tools([\"sceneXplain\"])"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Or directly instantiate the tool."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.tools import SceneXplainTool\n",
    "\n",
    "\n",
    "tool = SceneXplainTool()\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Usage in an Agent\n",
    "\n",
    "The tool can be used in any LangChain agent as follows:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
      "\u001b[32;1m\u001b[1;3m\n",
      "Thought: Do I need to use a tool? Yes\n",
      "Action: Image Explainer\n",
      "Action Input: https://storage.googleapis.com/causal-diffusion.appspot.com/imagePrompts%2F0rw369i5h9t%2Foriginal.png\u001b[0m\n",
      "Observation: \u001b[36;1m\u001b[1;3mIn a charmingly whimsical scene, a young girl is seen braving the rain alongside her furry companion, the lovable Totoro. The two are depicted standing on a bustling street corner, where they are sheltered from the rain by a bright yellow umbrella. The girl, dressed in a cheerful yellow frock, holds onto the umbrella with both hands while gazing up at Totoro with an expression of wonder and delight.\n",
      "\n",
      "Totoro, meanwhile, stands tall and proud beside his young friend, holding his own umbrella aloft to protect them both from the downpour. His furry body is rendered in rich shades of grey and white, while his large ears and wide eyes lend him an endearing charm.\n",
      "\n",
      "In the background of the scene, a street sign can be seen jutting out from the pavement amidst a flurry of raindrops. A sign with Chinese characters adorns its surface, adding to the sense of cultural diversity and intrigue. Despite the dreary weather, there is an undeniable sense of joy and camaraderie in this heartwarming image.\u001b[0m\n",
      "Thought:\u001b[32;1m\u001b[1;3m Do I need to use a tool? No\n",
      "AI: This image appears to be a still from the 1988 Japanese animated fantasy film My Neighbor Totoro. The film follows two young girls, Satsuki and Mei, as they explore the countryside and befriend the magical forest spirits, including the titular character Totoro.\u001b[0m\n",
      "\n",
      "\u001b[1m> Finished chain.\u001b[0m\n",
      "This image appears to be a still from the 1988 Japanese animated fantasy film My Neighbor Totoro. The film follows two young girls, Satsuki and Mei, as they explore the countryside and befriend the magical forest spirits, including the titular character Totoro.\n"
     ]
    }
   ],
   "source": [
    "from langchain.llms import OpenAI\n",
    "from langchain.agents import initialize_agent\n",
    "from langchain.memory import ConversationBufferMemory\n",
    "\n",
    "llm = OpenAI(temperature=0)\n",
    "memory = ConversationBufferMemory(memory_key=\"chat_history\")\n",
    "agent = initialize_agent(\n",
    "    tools, llm, memory=memory, agent=\"conversational-react-description\", verbose=True\n",
    ")\n",
    "output = agent.run(\n",
    "    input=(\n",
    "        \"What is in this image https://storage.googleapis.com/causal-diffusion.appspot.com/imagePrompts%2F0rw369i5h9t%2Foriginal.png. \"\n",
    "        \"Is it movie or a game? If it is a movie, what is the name of the movie?\"\n",
    "    )\n",
    ")\n",
    "\n",
    "print(output)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": ".venv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.2"
  },
  "orig_nbformat": 4
 },
 "nbformat": 4,
 "nbformat_minor": 2
}