# Chat with Lithuanian Law Documents

This Streamlit application lets users chat with a virtual assistant grounded in Lithuanian law documents, leveraging local processing power and a compact language model.

## Important Disclaimer

This application utilizes a lightweight large language model (LLM) called Qwen2-0.5B-Chat_SFT_DPO.Q8_gguf to ensure smooth local processing on your device. While this model offers efficiency benefits, it comes with some limitations:

#### Potential for Hallucination: Due to its size and training data, the model might occasionally generate responses that are not entirely consistent with the provided documents or factual accuracy.

#### Character Misinterpretations: In rare instances, the model may introduce nonsensical characters, including Chinese characters, during response generation.

We recommend keeping these limitations in mind when using the application and interpreting the provided responses.

## Features

Users can choose the information retrieval type (similarity or maximum marginal relevance search).
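
For illustration, a retriever honoring this choice could be built roughly like this (a minimal sketch assuming a LangChain Chroma vector store as described under Technical Details; the function name, widget label, and `k` value are hypothetical, not taken from the application's code):

```python
import streamlit as st
from langchain_community.vectorstores import Chroma


def build_retriever(vectorstore: Chroma):
    """Let the user pick a retrieval strategy and return a matching retriever."""
    search_type = st.radio(
        "Information retrieval type",
        options=["similarity", "mmr"],  # mmr = maximum marginal relevance search
    )
    # The vector store exposes both strategies through the same interface.
    return vectorstore.as_retriever(
        search_type=search_type,
        search_kwargs={"k": 4},  # illustrative number of chunks to retrieve
    )
```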

## Technical Details

#### Sentence Similarity:
The application utilizes the Alibaba-NLP/gte-base-en-v1.5 model for efficient sentence embedding, allowing for semantic similarity comparisons between user queries and the legal documents.
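
As a rough illustration of this step (not the application's actual code), the embedding model can be loaded through the sentence-transformers library and used to compare two texts; the example sentences below are invented:

```python
from sentence_transformers import SentenceTransformer, util

# gte-base-en-v1.5 ships custom modeling code, so trust_remote_code is required.
model = SentenceTransformer("Alibaba-NLP/gte-base-en-v1.5", trust_remote_code=True)

query = "What are the rules for processing personal data?"
passage = "Personal data may be processed only on a lawful basis defined by law."

# Normalized embeddings make the dot product equal to cosine similarity.
query_emb, passage_emb = model.encode([query, passage], normalize_embeddings=True)
print(f"similarity: {util.cos_sim(query_emb, passage_emb).item():.3f}")
```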

#### Local Vector Store:
Chroma acts as a local vector store, efficiently storing and managing the document embeddings for fast retrieval.
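
A hedged sketch of this step using LangChain's Chroma integration (the document contents and the persist directory below are illustrative assumptions, not the application's values):

```python
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_core.documents import Document

embeddings = HuggingFaceEmbeddings(
    model_name="Alibaba-NLP/gte-base-en-v1.5",
    model_kwargs={"trust_remote_code": True},
)

# In the real application these would be chunks of the Lithuanian law documents.
docs = [Document(page_content="Illustrative excerpt of a legal provision.")]

vectorstore = Chroma.from_documents(
    documents=docs,
    embedding=embeddings,
    persist_directory="chroma_db",  # keeps the embeddings on local disk
)
```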

#### RAG Chain with Quantized LLM:
A Retrieval-Augmented Generation (RAG) chain is implemented to process user queries. This chain integrates two key components:

#### Lightweight LLM:
To ensure local operation, the application employs a compact LLM, JCHAVEROT_Qwen2-0.5B-Chat_SFT_DPO.Q8_gguf, with only 0.5 billion parameters. This LLM is designed specifically for question answering tasks.
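
One plausible way to load such a GGUF model locally is llama-cpp-python, for example through LangChain's LlamaCpp wrapper (a sketch; the path and generation parameters are assumptions, not values taken from the application):

```python
from langchain_community.llms import LlamaCpp

# The Q8 suffix indicates an 8-bit quantized GGUF file (see Quantization below).
llm = LlamaCpp(
    model_path="JCHAVEROT_Qwen2-0.5B-Chat_SFT_DPO.Q8_gguf",  # illustrative local path
    n_ctx=2048,       # context window for the prompt plus retrieved passages
    temperature=0.1,  # keep answers close to the retrieved context
)
```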

#### Quantization:
This Qwen2 model leverages quantization, a technique that reduces model size without sacrificing significant accuracy. Quantization makes the model more efficient to run on local hardware, contributing to a more environmentally friendly solution.

#### CPU-based Processing:
The application currently runs entirely on the CPU. While a GPU could significantly improve processing speed, the CPU-based approach allows the application to run effectively on a wider range of devices.
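
Putting the pieces together, the retriever and the CPU-bound LLM could be wired into a RAG chain roughly as follows (a sketch using LangChain's RetrievalQA helper; the application's actual chain construction may differ):

```python
from langchain.chains import RetrievalQA
from langchain_community.llms import LlamaCpp
from langchain_core.retrievers import BaseRetriever


def build_qa_chain(retriever: BaseRetriever) -> RetrievalQA:
    """Combine the quantized, CPU-only LLM with the document retriever."""
    llm = LlamaCpp(
        model_path="JCHAVEROT_Qwen2-0.5B-Chat_SFT_DPO.Q8_gguf",  # illustrative local path
        n_ctx=2048,
        n_threads=4,   # CPU threads; no GPU offloading is configured
        temperature=0.1,
    )
    return RetrievalQA.from_chain_type(
        llm=llm,
        retriever=retriever,           # e.g. the retriever from the Features sketch
        return_source_documents=True,  # expose the passages behind each answer
    )
```

The resulting chain can then be called with `chain.invoke({"query": ...})` from the Streamlit chat loop, and the returned source documents can be displayed alongside the answer.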

## Benefits of Compact Design

#### Local Processing: