ArturG9 committed · Commit 9672200 · verified · 1 Parent(s): cf7a565

Update README.md

Files changed (1): README.md +20 -5
13
# Chat with Lithuanian Law Documents

This is a README for a Streamlit application that lets users chat with a virtual assistant grounded in Lithuanian law documents, leveraging local processing power and a compact language model.

## Important Disclaimer

This application utilizes a lightweight large language model (LLM), JCHAVEROT_Qwen2-0.5B-Chat_SFT_DPO.Q8_gguf, to ensure smooth local processing on your device. While this model offers efficiency benefits, it comes with some limitations:

#### Potential for Hallucination:
Due to its size and training data, the model may occasionally generate responses that are not entirely consistent with the provided documents or with factual accuracy.

#### Character Misinterpretations:
In rare instances, the model may introduce nonsensical characters, including Chinese characters, into its responses.

We recommend keeping these limitations in mind when using the application and interpreting its responses.
## Features

  Users can choose the information retrieval type (similarity or maximum marginal relevance search).
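The difference between the two retrieval types can be illustrated with a small sketch of maximum marginal relevance (MMR). The scores below are made up for illustration; the application's actual retriever works on real embedding similarities:

```python
def mmr(query_sim, doc_sims, k=2, lambda_mult=0.5):
    """Select k documents balancing relevance to the query against
    redundancy with already-selected documents (maximum marginal relevance).

    query_sim: list of query-document similarity scores.
    doc_sims:  matrix of document-document similarity scores.
    """
    selected = []
    candidates = list(range(len(query_sim)))
    while candidates and len(selected) < k:
        def score(i):
            redundancy = max((doc_sims[i][j] for j in selected), default=0.0)
            return lambda_mult * query_sim[i] - (1 - lambda_mult) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

# Three documents: 0 and 1 are near-duplicates, 2 is distinct but relevant.
query_sim = [0.90, 0.88, 0.70]
doc_sims = [[1.0, 0.95, 0.2],
            [0.95, 1.0, 0.2],
            [0.2, 0.2, 1.0]]

print(mmr(query_sim, doc_sims, k=2))  # [0, 2]; plain similarity would pick [0, 1]
```

Plain similarity search returns the top-scoring documents even when they repeat each other; MMR trades a little relevance for diversity, which often surfaces more useful context.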
 
## Technical Details

#### Sentence Similarity:
The application utilizes the Alibaba-NLP/gte-base-en-v1.5 model for efficient sentence embedding, allowing semantic similarity comparisons between user queries and the legal documents.
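At its core, comparing embeddings comes down to cosine similarity. The sketch below uses toy 3-dimensional vectors standing in for real gte-base-en-v1.5 embeddings (which have many more dimensions and come from a model library, not hand-written lists):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors: close to 1.0
    means similar direction, close to 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings": the query is about contract law, not traffic rules.
query = [0.9, 0.1, 0.0]
doc_contract_law = [0.8, 0.2, 0.1]
doc_traffic_code = [0.1, 0.2, 0.9]

print(cosine_similarity(query, doc_contract_law) >
      cosine_similarity(query, doc_traffic_code))  # True
```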
#### Local Vector Store:
Chroma acts as a local vector store, efficiently storing and managing the document embeddings for fast retrieval.
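What a vector store provides can be pictured as a tiny in-memory index keyed by embedding. This is a deliberately simplified sketch, not Chroma's actual API; the class and document ids are invented for illustration:

```python
import math

class ToyVectorStore:
    """Minimal stand-in for a vector store like Chroma: holds document
    embeddings and returns the ids closest to a query embedding."""

    def __init__(self):
        self.vectors = {}  # doc id -> embedding

    def add(self, doc_id, embedding):
        self.vectors[doc_id] = embedding

    def query(self, embedding, n_results=1):
        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            return dot / (math.sqrt(sum(x * x for x in a)) *
                          math.sqrt(sum(x * x for x in b)))
        ranked = sorted(self.vectors,
                        key=lambda d: cosine(embedding, self.vectors[d]),
                        reverse=True)
        return ranked[:n_results]

store = ToyVectorStore()
store.add("civil-code-art-6", [0.9, 0.1])
store.add("road-traffic-rules", [0.1, 0.9])
print(store.query([0.8, 0.2]))  # ['civil-code-art-6']
```

A real store like Chroma adds persistence, metadata filtering, and approximate-nearest-neighbor indexing so lookups stay fast at scale.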
#### RAG Chain with Quantized LLM:
A Retrieval-Augmented Generation (RAG) chain is implemented to process user queries. This chain integrates two key components:
#### Lightweight LLM:
To ensure local operation, the application employs a compact LLM, JCHAVEROT_Qwen2-0.5B-Chat_SFT_DPO.Q8_gguf, with only 0.5 billion parameters, designed for question-answering tasks.
#### Quantization:
This Qwen2 model is quantized, a technique that reduces model size without sacrificing significant accuracy, making it more efficient to run on local hardware and contributing to a more environmentally friendly solution.
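The idea behind Q8 (8-bit) quantization can be demonstrated with a toy scheme: store one float scale plus small integers instead of full floats. This is a simplification of the block-wise schemes GGUF files actually use:

```python
def quantize_int8(weights):
    """Map float weights to signed 8-bit integers plus one float scale,
    shrinking storage roughly 4x versus 32-bit floats."""
    scale = max(abs(w) for w in weights) / 127.0
    return [round(w / scale) for w in weights], scale

def dequantize_int8(q_weights, scale):
    """Recover approximate float weights from the integers and the scale."""
    return [q * scale for q in q_weights]

weights = [0.512, -1.27, 0.003, 0.9]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

print(q)  # every value fits in a signed byte, [-127, 127]
print(max(abs(w - r) for w, r in zip(weights, restored)))  # small round-off error
```

The round-trip error is bounded by half the scale, which is why well-chosen 8-bit quantization loses little accuracy while cutting memory use substantially.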
#### CPU-based Processing:
The entire application runs on the CPU. While a GPU could significantly improve processing speed, the CPU-based approach lets the application run effectively on a wider range of devices.
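Putting the pieces above together, the retrieve-then-generate flow can be sketched end to end. The retriever and "LLM" below are stubs invented for illustration; the real chain uses gte-base-en-v1.5 embeddings, Chroma, and the quantized Qwen2 model:

```python
def retrieve(question, documents, k=1):
    """Stub retriever: rank documents by crude word overlap with the
    question (the real chain ranks by embedding similarity)."""
    words = set(question.lower().split())
    return sorted(documents,
                  key=lambda d: len(words & set(d.lower().split())),
                  reverse=True)[:k]

def generate(question, context):
    """Stub generator: the real chain prompts the quantized LLM with the
    question and the retrieved context."""
    return f"Based on: {context[0]}"

def rag_chain(question, documents):
    """Retrieval-Augmented Generation: fetch context, then answer from it."""
    context = retrieve(question, documents)
    return generate(question, context)

docs = ["Drivers must carry a licence at all times.",
        "Contracts require mutual consent of the parties."]
print(rag_chain("When must drivers carry a licence?", docs))
```

Grounding the generation step in retrieved documents is what keeps a small model's answers tied to the actual legal texts rather than to its own training data.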
## Benefits of Compact Design

#### Local Processing: