pinned: false
license: apache-2.0
---

# Chat with Lithuanian Law Documents

A Streamlit application that lets you chat with a virtual assistant grounded in Lithuanian law documents, using local processing power and a compact language model.

## Features

- Choose the retrieval type: similarity search or maximum marginal relevance (MMR) search.
- Specify the number of documents to retrieve.
- Ask questions about the provided documents.
- The virtual assistant answers based on the retrieved documents and a capable yet resource-efficient large language model (LLM).
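The difference between the two retrieval modes can be sketched in plain Python: similarity search ranks documents purely by closeness to the query, while MMR trades query relevance off against redundancy with documents already selected. This is a simplified illustration on toy vectors, not the app's actual retrieval code (which delegates to its vector store).

```python
# Toy illustration of similarity search vs. maximum marginal relevance (MMR).

def cosine(a, b):
    # Cosine similarity between two vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb)

def similarity_search(query, docs, k):
    # Rank documents purely by similarity to the query.
    ranked = sorted(range(len(docs)), key=lambda i: cosine(query, docs[i]), reverse=True)
    return ranked[:k]

def mmr_search(query, docs, k, lam=0.5):
    # Maximum marginal relevance: balance relevance to the query against
    # redundancy with the documents already selected.
    selected, candidates = [], list(range(len(docs)))
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(query, docs[i])
            redundancy = max((cosine(docs[i], docs[j]) for j in selected), default=0.0)
            return lam * relevance - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With two near-duplicate documents close to the query and one dissimilar document, similarity search returns both near-duplicates, while MMR (with a diversity-leaning `lam`) swaps the second duplicate for the dissimilar document.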

## Technical Details

- **Sentence similarity**: The application uses the Alibaba-NLP/gte-base-en-v1.5 model for sentence embedding, enabling semantic similarity comparisons between user queries and the legal documents.
- **Local vector store**: Chroma acts as a local vector store, storing and managing the document embeddings for fast retrieval.
- **RAG chain with a quantized LLM**: A Retrieval-Augmented Generation (RAG) chain processes user queries. This chain integrates two key components:
  - **Lightweight LLM**: To keep everything local, the application uses a compact LLM, JCHAVEROT_Qwen2-0.5B-Chat_SFT_DPO.Q8_gguf, with only 0.5 billion parameters, tuned for question answering.
  - **Quantization**: The Qwen2 model is quantized, which reduces the model size without sacrificing significant accuracy and makes it efficient enough to run on local hardware.
- **CPU-based processing**: The entire application runs on the CPU. A GPU would be significantly faster, but the CPU-only approach lets the application run effectively on a much wider range of devices.
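The idea behind the 8-bit (Q8) quantization mentioned above can be illustrated with a toy symmetric scheme: store each weight as an int8 plus a single float scale, and dequantize on the fly. Real GGUF quantization is block-wise and more sophisticated; this sketch only shows the size/accuracy trade-off.

```python
# Toy symmetric 8-bit quantization: int8 values plus one float scale.
# Real GGUF Q8 formats quantize block-wise, but the principle is the same.

def quantize(weights):
    # Map the largest-magnitude weight to +/-127.
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [x * scale for x in q]

weights = [0.12, -0.53, 0.07, 0.91, -0.33]
q, scale = quantize(weights)
restored = dequantize(q, scale)
# Rounding error is bounded by half a quantization step (scale / 2).
max_err = max(abs(w - r) for w, r in zip(weights, restored))
```

Each weight shrinks from 4 bytes (float32) to 1 byte, while the reconstruction error stays below half a quantization step.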

## Benefits of Compact Design

- **Local processing**: The compact size of the LLM and the application enables processing entirely on your own device, reducing reliance on cloud resources and the associated environmental impact.
- **Mobile potential**: Due to its small footprint, the application could be adapted for mobile devices, bringing legal information access to a wider audience.

## Adaptability of Qwen2 0.5B

- **Fine-tuning**: While Qwen2 0.5B is capable for its size, it could be further enhanced through fine-tuning on specific legal datasets or domains, potentially improving its grasp of Lithuanian legal terminology and nuances.
- **Conversation style**: Depending on user needs and the desired conversation style, alternative pre-trained models could be explored, trading off model size against specific capabilities.

## Requirements

- Streamlit
- langchain
- langchain-community
- utills
- transformers

## Running the application

1. Install the required libraries.
2. Set the environment variable `lang_api_key` to your LangChain API key (if applicable).
3. Run `streamlit run main.py`.
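Assuming the dependencies are listed in a `requirements.txt` file (not shown in this README), the steps above look like:

```shell
pip install -r requirements.txt                 # or install the packages listed above individually
export lang_api_key="<your LangChain API key>"  # optional
streamlit run main.py
```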

## Code Structure

- `create_retriever_from_chroma`: Builds a document retriever backed by Chroma, using the Alibaba-NLP/gte-base-en-v1.5 model for sentence similarity.
- `main`: Defines the Streamlit application layout and functionality.
- `handle_userinput`: Processes user input, retrieves relevant documents, and generates a response through the RAG chain.
- `create_conversational_rag_chain`: Builds the RAG chain that combines the retriever with the quantized LLM to answer user questions.
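The overall retrieve-then-generate flow behind these functions can be sketched with stand-in components. The function names, the word-overlap "retriever", and the fake LLM below are illustrative only; the real app uses a Chroma retriever and the quantized Qwen2 0.5B model.

```python
# Minimal sketch of a RAG flow: retrieve documents, stuff them into a
# prompt, and hand the prompt to an LLM. All components are stand-ins.

def retrieve(question, corpus, k=2):
    # Stand-in retriever: rank documents by word overlap with the question.
    def overlap(doc):
        return len(set(question.lower().split()) & set(doc.lower().split()))
    return sorted(corpus, key=overlap, reverse=True)[:k]

def build_prompt(question, context_docs):
    # Stuff the retrieved documents into the prompt as context.
    context = "\n".join(f"- {d}" for d in context_docs)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
        "Answer:"
    )

def answer(question, corpus, llm):
    docs = retrieve(question, corpus)
    return llm(build_prompt(question, docs))

corpus = [
    "Contracts require mutual consent of the parties.",
    "Traffic fines are set by the road traffic rules.",
]

def fake_llm(prompt):
    # Stand-in "LLM" that reports how many context lines it received.
    return f"(answer based on {prompt.count(chr(10) + '- ')} documents)"

result = answer("What do contracts require?", corpus, fake_llm)
```

In the actual application, `retrieve` corresponds to the Chroma-backed retriever and `fake_llm` to the quantized Qwen2 model inside the conversational RAG chain.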

## Additional Notes

- The application ships with a prepared set of document files; you can modify the data path to use your own documents.
- The Lithuanian law documents may not be the latest versions.
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference