Users can ask questions about the provided documents.
The virtual assistant provides answers based on the retrieved documents and a powerful, yet environmentally friendly, large language model (LLM).

### Technical Details

#### Sentence Similarity:
The application uses the Alibaba-NLP/gte-base-en-v1.5 model for efficient sentence embedding, enabling semantic similarity comparisons between user queries and the legal documents.
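In practice the embeddings would come from the gte-base-en-v1.5 model (for example via the sentence-transformers library), and the comparison itself is typically cosine similarity. A minimal sketch of that comparison, with toy low-dimensional vectors standing in for real model output:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 4-dimensional vectors standing in for real sentence embeddings
# (the actual model produces vectors with hundreds of dimensions).
query_embedding = [0.1, 0.3, 0.5, 0.1]
doc_embedding = [0.1, 0.2, 0.6, 0.1]

score = cosine_similarity(query_embedding, doc_embedding)  # close to 1.0 for similar texts
```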
- **Local Vector Store:** Chroma acts as a local vector store, efficiently storing and managing the document embeddings for fast retrieval.
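In the application this role is played by Chroma (the chromadb package). To illustrate what a vector store does, here is a hypothetical in-memory stand-in that supports the same add-then-query pattern; `TinyVectorStore` and its method names are invented for this sketch and are not Chroma's actual API:

```python
import math

class TinyVectorStore:
    """Minimal in-memory stand-in for a vector store such as Chroma:
    stores (id, embedding, text) records and returns the documents
    nearest to a query embedding by cosine similarity."""

    def __init__(self):
        self._records = []  # list of (doc_id, embedding, text)

    def add(self, doc_id, embedding, text):
        self._records.append((doc_id, embedding, text))

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.sqrt(sum(x * x for x in a)) *
                      math.sqrt(sum(x * x for x in b)))

    def query(self, embedding, n_results=2):
        ranked = sorted(self._records,
                        key=lambda r: self._cosine(embedding, r[1]),
                        reverse=True)
        return [(doc_id, text) for doc_id, _, text in ranked[:n_results]]

store = TinyVectorStore()
store.add("law-1", [0.9, 0.1], "Article on contract formation.")
store.add("law-2", [0.1, 0.9], "Article on data protection.")
hits = store.query([0.85, 0.15], n_results=1)  # nearest document first
```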
- **RAG Chain with Quantized LLM:** A Retrieval-Augmented Generation (RAG) chain is implemented to process user queries. This chain integrates two key components:
- **Lightweight LLM:** To ensure local operation, the application employs a compact LLM, JCHAVEROT_Qwen2-0.5B-Chat_SFT_DPO.Q8_gguf, with only 0.5 billion parameters, designed specifically for question-answering tasks.
- **Quantization:** This Qwen2 model leverages a technique called quantization, which …
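The Q8 suffix in the model filename indicates 8-bit quantization. As a rough illustration of the idea only (GGUF's real Q8 formats quantize block-wise with extra machinery), a symmetric int8 scheme maps each weight to an integer in [-127, 127] plus one shared scale factor:

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] with one shared scale.
    Illustrative only: real GGUF Q8 formats work on blocks of weights."""
    scale = (max(abs(w) for w in weights) or 1.0) / 127.0
    return [round(w / scale) for w in weights], scale

def dequantize_int8(quantized, scale):
    return [q * scale for q in quantized]

weights = [0.52, -1.27, 0.08, 0.99]
quantized, scale = quantize_int8(weights)
restored = dequantize_int8(quantized, scale)
# Each restored weight lies within one quantization step of the original,
# while storage drops from 32-bit floats to 8-bit integers.
```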
- **CPU-based Processing:** The entire application currently runs on the CPU. While a GPU could significantly improve processing speed, the CPU-based approach allows the application to run effectively on a wider range of devices.
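Taken together, the chain retrieves the most relevant documents and feeds them to the LLM alongside the question. A schematic sketch of that flow with hypothetical stand-ins (a keyword-overlap retriever in place of embedding search, and `stub_llm` in place of the quantized Qwen2 model):

```python
def retrieve(query, documents, top_k=2):
    """Hypothetical stand-in for embedding-based retrieval:
    ranks documents by word overlap with the query."""
    q_words = set(query.lower().split())
    return sorted(documents,
                  key=lambda d: len(q_words & set(d.lower().split())),
                  reverse=True)[:top_k]

def build_prompt(query, context_docs):
    """Assemble the retrieved context and the question into one prompt."""
    context = "\n".join(context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}\nAnswer:"

def stub_llm(prompt):
    """Stand-in for the local quantized Qwen2 0.5B model."""
    return "Based on the context: " + prompt.splitlines()[1]

documents = [
    "Contracts require mutual consent of the parties.",
    "Data protection rules apply to personal data.",
]
question = "What do contracts require?"
prompt = build_prompt(question, retrieve(question, documents, top_k=1))
answer = stub_llm(prompt)
```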

### Benefits of Compact Design

#### Local Processing:
The compact size of the LLM and the application itself enables local processing on your device, reducing reliance on cloud-based resources and the associated environmental impact.
- **Mobile Potential:** Due to its small footprint, the application could be adapted for mobile devices, bringing legal information access to a wider audience.

### Adaptability of Qwen2 0.5B

#### Fine-tuning:
While the Qwen2 0.5B model is powerful for its size, it can be further enhanced through fine-tuning on specific legal datasets or domains, potentially improving its understanding of Lithuanian legal terminology and nuances.
- **Conversation Style:** Depending on user needs and the desired conversation style, alternative pre-trained models could be explored, potentially trading model size against specific capabilities.

#### Requirements