HarshaBattula committed f401ee6 (parent: a544212)

replaced gpt-3.5-turbo with LLaMA-2.0-chat

Files changed:
- README.md +4 -4
- chain.py +31 -29
- credentials.json +1 -0
- requirements.txt +1 -1
README.md CHANGED

@@ -10,17 +10,17 @@ pinned: false
 license: unknown
 ---
 
-# Document Retrieval Augmented Language Model with LangChain and GPT-3.5-turbo
+# Document Retrieval Augmented Language Model version 2.0 with LangChain and Meta's LLaMA-2.0 Chat
 
 ## Description
 
-This project involves the creation of a vector database using OpenAI embeddings and Chroma DB, followed by the retrieval of document snippets through a similarity search with LangChain's retrieval system. Upon retrieval of relevant snippets, the system uses GPT-3.5-turbo to generate responses to input questions using the retrieved snippets as context. The system also incorporates a ConversationBufferMemory to store the history of the chat, enhancing the quality of the conversational context and the relevance of generated responses.
+This project involves the creation of a vector database using OpenAI embeddings and Chroma DB, followed by the retrieval of document snippets through a similarity search with LangChain's retrieval system. Upon retrieval of relevant snippets, the system uses LLaMA-2.0 to generate responses to input questions using the retrieved snippets as context. The system also incorporates a ConversationBufferMemory to store the history of the chat, enhancing the quality of the conversational context and the relevance of generated responses.
 
 ## Contents
 
 1. **OpenAI Embeddings and Chroma DB**: Utilizes the rich semantic information in OpenAI embeddings and the efficient storage and retrieval capabilities of Chroma DB to create a performant and effective vector database.
 2. **Document Retrieval**: Uses LangChain's retrieval system to perform similarity search and retrieve relevant snippets from documents based on input queries.
-3. **Response Generation with GPT-3.5-turbo**: Leverages the advanced language understanding and generation capabilities of GPT-3.5-turbo to generate responses to input questions using LangChain's `RetrievalQA`.
+3. **Response Generation with LLaMA-2.0**: Leverages the advanced language understanding and generation capabilities of LLaMA-2.0 to generate responses to input questions using LangChain's `RetrievalQA`.
 4. **ConversationBufferMemory**: Stores the history of the conversation to ensure context continuity and enhance the relevance of the responses generated.
 
 ## Getting Started
@@ -28,7 +28,7 @@ This project involves the creation of a vector database using OpenAI embeddings
 ### Prerequisites
 Before you begin, ensure you have met the following requirements:
 - You have installed Python 3.x.
-- You have access to OpenAI's GPT-3.5-turbo and relevant API credentials.
+- You have access to Meta's LLaMA-2.0 and relevant API credentials.
 - You have set up Chroma DB on your server/machine, and the documents in the database.
 - You have access to LangChain's retrieval system.
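The README describes a pipeline whose first stage, building the vector database, is untouched by this commit, so none of its code appears in the diff. For reference, below is a minimal sketch of that stage using the LangChain APIs of this period; the document path and chunking parameters are illustrative assumptions, not values from this repo.

```python
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma

# Hypothetical source document; the repo's actual corpus is not shown in this commit.
documents = PyPDFLoader("docs/juniper_manual.pdf").load()

# Split the documents into overlapping chunks so each embedding covers a focused span.
chunks = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=100
).split_documents(documents)

# Embed the chunks with OpenAI embeddings and persist them in a Chroma collection.
vectordb = Chroma.from_documents(chunks, OpenAIEmbeddings(), persist_directory="chroma_db")

# This retriever is the object that create_question_answering_chain() in chain.py expects.
retriever = vectordb.as_retriever(search_kwargs={"k": 4})
```

The similarity search in step 2 of the Contents list is what `as_retriever` exposes: each query is embedded with the same OpenAI model and matched against the stored chunk vectors.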
chain.py CHANGED

@@ -1,17 +1,23 @@
-
-
-from langchain.chat_models import ChatOpenAI
-from langchain.chains import RetrievalQA
 import openai
-from langchain import HuggingFacePipeline
-from transformers import AutoTokenizer
 import transformers
-import …
+import json
+import torch
+from transformers import AutoTokenizer
+from langchain.chains import RetrievalQA
 from huggingface_hub import login
-
+from langchain import HuggingFacePipeline
+from langchain.memory import ConversationBufferMemory
+from langchain import PromptTemplate
+
+
+with open("credentials.json", "r") as file:
+    credentials = json.load(file)
+
+access_token_read = credentials["access_token_read"]
+openai.api_key = credentials["openai_api_key"]
+
 login(token = access_token_read)
 
-openai.api_key = "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"
 
 def create_juniper_prompt_template():
     template = """You are a network engineer from Juniper Networks, not a Language Model; use your knowledge and the pieces of context (delimited by <ctx></ctx>) to answer the user's question. \n Try to answer as if you were a member of Juniper Networks. \nIf you don't know the answer, just say that you don't know; don't try to make up an answer.
@@ -41,7 +47,7 @@ def create_question_answering_chain(retriever):
     Create a retrieval question answering (QA) chain.
 
     This function initializes a QA chain that can be used to answer questions based on retrieved documents.
-    It uses the OpenAI 'gpt-3.5-turbo' model as the language model (LLM), and a document retriever for finding
+    It uses Meta's 'LLaMA-2-chat' model as the language model (LLM), and a document retriever for finding
     relevant documents.
 
     Args:
@@ -50,32 +56,29 @@ def create_question_answering_chain(retriever):
     Returns:
         qa_chain (obj): The initialized retrieval QA chain.
     """
-    # Initialize the …
-    …
-    …
-    access_token = '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'
-    tokenizer = AutoTokenizer.from_pretrained(model, token=access_token)
+    # Initialize the tokenizer and the language model.
+    tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf", token=access_token_read)
 
     pipeline = transformers.pipeline(
-        "text-generation",
-        model=…,
-        tokenizer=tokenizer,
-        torch_dtype=torch.bfloat16,
-        trust_remote_code=True,
-        device_map="auto",
-        max_length=1000,
-        do_sample=True,
-        top_k=10,
-        num_return_sequences=1,
-        eos_token_id=tokenizer.eos_token_id,
+        "text-generation",
+        model="meta-llama/Llama-2-7b-chat-hf",
+        tokenizer=tokenizer,
+        torch_dtype=torch.bfloat16,
+        trust_remote_code=True,
+        device_map="auto",
+        max_length=1000,
+        do_sample=True,
+        top_k=10,
+        num_return_sequences=1,
+        eos_token_id=tokenizer.eos_token_id,
     )
 
-    …
+    hf_llm = HuggingFacePipeline(pipeline=pipeline, model_kwargs={"temperature": 0})
 
     # Initialize the retrieval QA chain with the language model, chain type, document retriever,
     # and a flag indicating whether to return source documents.
     qa_chain = RetrievalQA.from_chain_type(
-        llm=…,
+        llm=hf_llm,
         chain_type='stuff',
         retriever=retriever,
         verbose=False,
@@ -88,5 +91,4 @@ def create_question_answering_chain(retriever):
         }
     )
 
-
     return qa_chain
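The hunks above skip the middle of chain.py, so the body of `create_juniper_prompt_template` and the `chain_type_kwargs` passed to `RetrievalQA.from_chain_type` are not visible; only their closing `}` and `)` survive in the last hunk. The sketch below shows the conventional way the imported `PromptTemplate` and `ConversationBufferMemory` are wired into a `stuff` chain. It is a reconstruction under stated assumptions: the template text and variable names are guesses, and `build_qa_chain` is a hypothetical helper mirroring the tail of `create_question_answering_chain`, taking the `hf_llm` and `retriever` built in the diff as parameters.

```python
from langchain import PromptTemplate
from langchain.chains import RetrievalQA
from langchain.memory import ConversationBufferMemory


def build_qa_chain(hf_llm, retriever):
    """Hypothetical helper: wire a prompt and conversation memory into RetrievalQA."""
    # Assumed shape of the prompt; the repo's actual template text is longer.
    template = """You are a network engineer from Juniper Networks, not a Language Model.
Use your knowledge and the context below (delimited by <ctx></ctx>) to answer the question.
If you don't know the answer, say that you don't know.
<ctx>{context}</ctx>
{history}
Question: {question}
Answer:"""

    prompt = PromptTemplate(
        input_variables=["history", "context", "question"],
        template=template,
    )

    # With chain_type='stuff', a custom prompt and memory go through chain_type_kwargs;
    # memory_key and input_key must match the placeholders used in the template.
    return RetrievalQA.from_chain_type(
        llm=hf_llm,
        chain_type="stuff",
        retriever=retriever,
        verbose=False,
        chain_type_kwargs={
            "prompt": prompt,
            "memory": ConversationBufferMemory(memory_key="history", input_key="question"),
        },
    )


# Usage: RetrievalQA expects the question under the "query" key and returns the
# generated answer under "result".
qa_chain = build_qa_chain(hf_llm, retriever)
response = qa_chain({"query": "What is the Junos OS upgrade procedure?"})
print(response["result"])
```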
credentials.json ADDED

@@ -0,0 +1 @@
+{"access_token_read": "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", "openai_api_key": "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"}
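credentials.json checks real-looking tokens into the repository, where any viewer of the Space can read them. A common alternative, sketched below as a hypothetical `load_credentials` helper (not code from this commit), prefers environment variables and falls back to the JSON file:

```python
import json
import os


def load_credentials(path="credentials.json"):
    """Hypothetical helper: prefer environment variables over a checked-in file."""
    creds = {}
    if os.path.exists(path):
        with open(path, "r") as f:
            # Load the file once; a second json.load() on the same handle would
            # fail because the first read consumes the stream.
            creds = json.load(f)
    return {
        "access_token_read": os.environ.get("HF_TOKEN", creds.get("access_token_read")),
        "openai_api_key": os.environ.get("OPENAI_API_KEY", creds.get("openai_api_key")),
    }
```

On Hugging Face Spaces, such values would normally live in the Space's secrets rather than in a tracked file.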
requirements.txt CHANGED

@@ -10,4 +10,4 @@ langchain
 pypdf
 gradio
 einops
-bitsandbytes
\ No newline at end of file
+bitsandbytes