license: apache-2.0
---

# Chat with Lithuanian Law Documents

A Streamlit application that lets users chat with a virtual assistant grounded in Lithuanian law documents, using local processing and a compact language model.

## Features

- Choose the information retrieval type: similarity search or maximum marginal relevance (MMR) search.
- Specify the number of documents to retrieve.
- Ask questions about the provided documents.
- The virtual assistant answers based on the retrieved documents and a compact, energy-efficient large language model (LLM).
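The difference between the two retrieval types can be sketched with a small, dependency-free example. This is an illustrative re-implementation of the MMR idea, not the app's actual Chroma retriever; the vectors and the `lambda_mult` trade-off parameter are made up for the demo:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def mmr(query_vec, doc_vecs, k=2, lambda_mult=0.5):
    """Pick k documents, trading query relevance against redundancy
    with the documents already picked (higher lambda_mult = more relevance)."""
    selected, candidates = [], list(range(len(doc_vecs)))
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(query_vec, doc_vecs[i])
            redundancy = max((cosine(doc_vecs[i], doc_vecs[j]) for j in selected),
                             default=0.0)
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

query = [1.0, 0.0]
docs = [[0.9, 0.1], [0.95, 0.05], [0.1, 0.9]]  # docs 0 and 1 are near-duplicates

# Plain similarity search returns the two near-duplicates: [1, 0]
by_similarity = sorted(range(len(docs)),
                       key=lambda i: cosine(query, docs[i]), reverse=True)[:2]
# MMR with a diversity-leaning lambda_mult skips the duplicate: [1, 2]
by_mmr = mmr(query, docs, k=2, lambda_mult=0.3)
print(by_similarity, by_mmr)
```

Similarity search maximises relevance alone; MMR additionally penalises documents that repeat what was already retrieved, which often surfaces more varied passages of a legal corpus.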

## Technical Details

- **Sentence similarity:** The application uses the Alibaba-NLP/gte-base-en-v1.5 model for sentence embedding, enabling semantic similarity comparisons between user queries and the legal documents.
- **Local vector store:** Chroma acts as a local vector store, storing and managing the document embeddings for fast retrieval.
- **RAG chain with quantized LLM:** A Retrieval-Augmented Generation (RAG) chain processes user queries. It integrates two key components:
  - **Lightweight LLM:** To keep inference local, the application uses a compact LLM, JCHAVEROT_Qwen2-0.5B-Chat_SFT_DPO.Q8_gguf, with only 0.5 billion parameters, geared toward question answering.
  - **Quantization:** The Qwen2 model is quantized, which shrinks the model without sacrificing significant accuracy and makes it efficient enough to run on local hardware.
- **CPU-based processing:** The application currently runs entirely on the CPU. A GPU would be faster, but the CPU-based approach lets the application run on a wider range of devices.
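As a rough intuition for what the Q8 quantization buys: the toy sketch below is not the GGUF format, just the core idea of symmetric 8-bit quantization with a single per-tensor scale:

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] with one shared scale factor."""
    scale = max(abs(w) for w in weights) / 127.0
    return [round(w / scale) for w in weights], scale

def dequantize_int8(quantized, scale):
    return [q * scale for q in quantized]

weights = [0.12, -0.5, 0.33, 0.07, -0.91]
quantized, scale = quantize_int8(weights)
restored = dequantize_int8(quantized, scale)

# Each weight now fits in one byte instead of four, and the round-trip
# error is bounded by half the scale step.
max_err = max(abs(w - r) for w, r in zip(weights, restored))
print(quantized, round(max_err, 5))
```

Real GGUF quantization works block-wise with finer-grained scales, but the trade-off is the same: roughly a 4x size reduction versus float32 for a small, bounded loss of precision, which is what makes CPU-only inference practical here.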

## Benefits of Compact Design

- **Local processing:** The compact LLM and application run entirely on your device, reducing reliance on cloud resources and the associated environmental impact.
- **Mobile potential:** Thanks to its small footprint, the application could be adapted for mobile devices, bringing legal information access to a wider audience.

## Adaptability of Qwen2 0.5B

- **Fine-tuning:** While the Qwen2 0.5B model is capable for its size, it could be further improved by fine-tuning on specific legal datasets or domains, sharpening its grasp of Lithuanian legal terminology and nuances.
- **Conversation style:** Depending on user needs and the desired conversation style, alternative pre-trained models could be explored, trading model size against specific capabilities.

## Requirements

- streamlit
- langchain
- langchain-community
- utills
- transformers

## Running the application

1. Install the required libraries.
2. Set the environment variable `lang_api_key` to your LangChain API key (if applicable).
3. Run `streamlit run main.py`.
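Concretely, assuming the requirement names above are the PyPI package names (and that `utills` is a project-local module shipped with the repo), the steps look like:

```shell
# 1. Install the required libraries
pip install streamlit langchain langchain-community transformers

# 2. Optionally provide a LangChain API key (e.g. for tracing)
export lang_api_key="<your-key>"

# 3. Launch the Streamlit app
streamlit run main.py
```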

## Code Structure

- `create_retriever_from_chroma`: Creates a document retriever backed by Chroma and the Alibaba-NLP/gte-base-en-v1.5 sentence-similarity model.
- `main`: Defines the Streamlit application layout and functionality.
- `handle_userinput`: Processes user input, retrieves relevant documents, and generates a response through the RAG chain.
- `create_conversational_rag_chain`: Builds the conversational RAG chain that pairs the retriever with the quantized LLM.
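The retrieve-then-generate flow these functions implement can be sketched without any dependencies. Here the bag-of-words `embed` and the returned prompt string are stand-ins for gte-base-en-v1.5 and the call to the quantized Qwen2 model, and the document snippets are invented for the demo:

```python
import math
from collections import Counter

DOCS = [
    "Civil Code: contracts must be performed in good faith.",
    "Labour Code: an employment contract may be terminated by agreement.",
    "Road Traffic Rules: pedestrians must use marked crossings.",
]

def embed(text):
    # Toy bag-of-words vector; the real app embeds with gte-base-en-v1.5.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question, k=1):
    # Rank stored documents by similarity to the question, keep the top k.
    qv = embed(question)
    return sorted(DOCS, key=lambda d: cosine(qv, embed(d)), reverse=True)[:k]

def handle_userinput(question, k=1):
    # Stuff the retrieved context into a prompt; the real chain would
    # send this prompt to the quantized Qwen2 LLM instead of returning it.
    context = "\n".join(retrieve(question, k))
    return f"Answer using only this context:\n{context}\nQuestion: {question}"

print(handle_userinput("How can an employment contract be terminated?"))
```

The production chain swaps each stand-in for the real component but keeps the same shape: embed the question, fetch the nearest documents from Chroma, and generate an answer conditioned on them.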

## Additional Notes

- The application ships with pre-loaded document files; modify the data path to use your own documents.
- The Lithuanian law documents might not be the latest versions.

  Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference