---
title: RAG BITS Tutor
emoji: π
colorFrom: blue
colorTo: green
sdk: streamlit
sdk_version: 1.45.1
app_file: app.py
pinned: false
---
# RAG Study Tutor for Business IT Strategy

**Author:** Laurel Mayer · **Module:** AI Applications (w.3KIA) - Project 3
## 1. Project Description
This project implements a Retrieval Augmented Generation (RAG) application designed to act as a "Study Tutor" for the subject "Business IT Strategy." The primary goal is to enable users to ask questions about specific course content and receive well-founded, context-based answers derived from the provided lecture materials and case studies. The application integrates a retrieval component for searching relevant text passages with a Large Language Model (LLM) for generating the final answers.
### Name & URL

| Name | URL |
|---|---|
| Code | [GitHub Repository](https://github.com/patronlaurel/RAG-BITS-Tutor) |
| Embedding Model Page | [Sahajtomar/German-semantic](https://huggingface.co/Sahajtomar/German-semantic) |
| LLM Provider (Groq) | [Groq](https://groq.com) |
| Jupyter Notebook | `main_project.ipynb` |
| FAISS Index & Chunks | `/faiss_index_bits/` |
## 2. Data Sources

The knowledge base for the RAG Tutor consists of:

| Data Source | Description |
|---|---|
| Own course materials (lecture PDFs) | 13 PDF documents comprising lecture notes and case studies (including solutions) for the "Business IT Strategy" course. Total extracted text volume: approx. 221,049 characters. |

The files are located in the `data/` folder of this repository.
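For orientation, here is a minimal sketch of how such a corpus can be loaded with PyPDF2 (the extraction library listed in Section 9). The `load_corpus` helper is illustrative, not the notebook's verbatim code:

```python
from pathlib import Path

from PyPDF2 import PdfReader

def load_corpus(data_dir: str = "data") -> dict[str, str]:
    """Extract the raw text of every PDF in the data folder."""
    corpus = {}
    for pdf_path in sorted(Path(data_dir).glob("*.pdf")):
        reader = PdfReader(pdf_path)
        # extract_text() may return None for pages without a text layer.
        corpus[pdf_path.name] = "\n".join(
            page.extract_text() or "" for page in reader.pages
        )
    return corpus

corpus = load_corpus()
print(f"{len(corpus)} PDFs, {sum(map(len, corpus.values())):,} characters extracted")
```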
## 3. RAG Improvements

To enhance the RAG system's performance, the following adaptation was implemented:

| Improvement | Description |
|---|---|
| Query Expansion (using LLM) | The original user query is sent to an LLM (`llama3-8b-8192` via Groq) to generate 2-3 alternative formulations or relevant keywords. These expanded queries are then additionally used for retrieval to create a broader contextual base for final answer generation. The implementation and evaluation of this method are detailed in Section 5 of the Jupyter Notebook (`main_project.ipynb`) and sketched below. |
| Other Potential Improvements | For this project, the focus was on implementing and evaluating query expansion. Further potential improvements and adaptation mechanisms (e.g., re-ranking of search results, hybrid search) are discussed in the "Conclusion and Outlook" section of this document and in the notebook. |
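A minimal sketch of such an LLM-based expansion step using the official `groq` Python client follows. The prompt wording, the `expand_query` helper, and the temperature value are illustrative assumptions, not the notebook's exact implementation:

```python
import os

from dotenv import load_dotenv
from groq import Groq

load_dotenv()  # reads GROQ_API_KEY from the .env file (see Section 8)
client = Groq(api_key=os.environ["GROQ_API_KEY"])

def expand_query(query: str, n: int = 3) -> list[str]:
    """Ask the small Llama 3 model for alternative formulations of a query."""
    prompt = (
        f"Generate {n} alternative formulations or relevant keywords for the "
        f"following study question, one per line, without numbering.\n\n"
        f"Question: {query}"
    )
    response = client.chat.completions.create(
        model="llama3-8b-8192",  # small model keeps this intermediate step fast
        messages=[{"role": "user", "content": prompt}],
        temperature=0.7,
    )
    lines = response.choices[0].message.content.splitlines()
    expansions = [line.strip() for line in lines if line.strip()]
    return [query] + expansions  # keep the original query for retrieval as well
```

Retrieval then runs once per returned query, and the hit lists are merged and de-duplicated (see Section 6).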
## 4. Chunking

### Data Chunking Method

The choice of chunking strategy is crucial for retrieval quality, as it determines how context is divided and fed to the embedding model. For this project, the text extracted from the PDFs was chunked as follows:

| Type of Chunking | Configuration | Result (Number of Chunks) |
|---|---|---|
| `RecursiveCharacterTextSplitter` (Langchain), chosen method | Chunk size: 1500 characters, overlap: 200 characters | 203 |
**Reasoning for the chosen method:**
The `RecursiveCharacterTextSplitter` was selected because it attempts to maintain semantically coherent blocks by recursively splitting at various separators (paragraphs, sentences, etc.). A `chunk_size` of 1500 characters with an overlap of 200 characters was chosen as a good starting point. The goal was to obtain chunks that contain sufficient context for understanding but are not so large as to exceed the maximum input length of the embedding model or introduce too much noise for specific queries. The resulting 203 chunks were a manageable quantity for further processing.
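A condensed sketch of the chosen configuration (depending on the installed Langchain version, the import may instead come from `langchain_text_splitters`); `corpus` is the file-to-text mapping from the Section 2 sketch:

```python
from langchain.text_splitter import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(
    chunk_size=1500,    # characters per chunk, as chosen above
    chunk_overlap=200,  # overlap preserves context across chunk boundaries
)

# `corpus` maps file names to extracted text (see the Section 2 sketch).
chunks = []
for source, text in corpus.items():
    for piece in splitter.split_text(text):
        chunks.append({"source": source, "text": piece})

print(f"{len(chunks)} chunks created")  # the project arrived at 203 chunks
```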
**Alternatively Considered Chunking Approaches:**

| Type of Chunking | Hypothetical Configuration/Consideration | Potential Advantages/Disadvantages |
|---|---|---|
| `CharacterTextSplitter` (Langchain) | Fixed chunk size (e.g., 1000), overlap (e.g., 150) | Simpler, but less regard for semantic boundaries; could split sentences or thoughts. |
| `SentenceTransformersTokenTextSplitter` | Based on token limits of the embedding model (e.g., 256 tokens) | More precise adaptation to the embedding model, but requires knowledge of tokenizer specifics. Could have led to a different number and granularity of chunks. |
| Smaller `chunk_size` with `RecursiveCharacterTextSplitter` | e.g., 500 characters, overlap 50 | More, but more specific chunks. Could help with very detailed questions, but would also fragment context more and require more chunks per answer. |
**Decision Process:** Although other methods and configurations exist, the initial configuration of the `RecursiveCharacterTextSplitter` was retained for this project, as it offered a good compromise between implementation effort, context preservation, and the resulting number of chunks for the chosen dataset. Deeper optimization of the chunking strategy would be a natural next step to further enhance retrieval accuracy. The documentation of this project focuses on the overall process and the implementation of a core RAG pipeline with one form of adaptation.
## 5. Choice of LLM

LLMs accessed via the Groq API were used for this RAG application:

| LLM Name (Groq) | Used for | Link/Reference |
|---|---|---|
| `llama3-70b-8192` | Final answer generation | Groq Models |
| `llama3-8b-8192` | Query expansion (adaptation mechanism) | Groq Models |
**Reasoning:** `llama3-70b-8192` was chosen for answer generation due to its strong performance in synthesizing information and generating coherent text. For query expansion, the smaller `llama3-8b-8192` model was used to reduce the latency of this intermediate step while still expecting good-quality expansions.
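A minimal sketch of the corresponding generation step, reusing the `client` from the query-expansion sketch in Section 3; the system prompt and temperature are illustrative choices, not the notebook's exact values:

```python
def generate_answer(query: str, context_chunks: list[str]) -> str:
    """Generate the final answer with the larger model, grounded in retrieved chunks."""
    context = "\n\n---\n\n".join(context_chunks)
    response = client.chat.completions.create(
        model="llama3-70b-8192",  # larger model for the final synthesis
        messages=[
            {
                "role": "system",
                "content": (
                    "Answer the question using only the provided course excerpts. "
                    "If the excerpts do not contain the answer, say so."
                ),
            },
            {"role": "user", "content": f"Excerpts:\n{context}\n\nQuestion: {query}"},
        ],
        temperature=0.2,  # low temperature favors grounded, factual answers
    )
    return response.choices[0].message.content
```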
## 6. Test Method
The evaluation of the RAG application and the query expansion mechanism was conducted qualitatively. Specific test questions regarding the content of the course materials were formulated. The procedure was as follows:
- Generate an answer based on the original user query and the chunks retrieved directly for it.
- Generate expanded search queries from the original user query using an LLM.
- Retrieve chunks based on these expanded queries, then collect and de-duplicate them to form an expanded context.
- Generate an answer based on the expanded context and the original user query.
- Conduct a qualitative comparison of the two generated answers in terms of depth of detail, correctness, and relevance to the context. The hypothesis was that query expansion could lead to more comprehensive and precise answers.
Detailed test cases and results are documented in the Jupyter Notebook (`main_project.ipynb`) in Sections 4.2 and 5.
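For illustration, a compact sketch of this comparison, combining the `expand_query` and `generate_answer` helpers sketched above with a `retrieve(query, k)` function that stands in for the FAISS search (one possible implementation is sketched at the end of Section 8):

```python
def answer_with_expansion(query: str, k: int = 4) -> str:
    """Retrieve for the original and expanded queries, de-duplicate, then answer."""
    seen: set[str] = set()
    expanded_context: list[str] = []
    for q in expand_query(query):       # includes the original query
        for chunk in retrieve(q, k=k):  # `retrieve` is a hypothetical helper
            if chunk not in seen:       # de-duplicate chunks across queries
                seen.add(chunk)
                expanded_context.append(chunk)
    return generate_answer(query, expanded_context)

# Qualitative A/B comparison for a single test question:
question = "Welche Rolle spielt IT-Governance?"
baseline = generate_answer(question, retrieve(question, k=4))
expanded = answer_with_expansion(question)
# The two answers are then compared manually for detail, correctness, and relevance.
```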
## 7. Results

As the evaluation was primarily qualitative, the main observations are summarized here. Detailed examples can be found in the notebook.

| Model/Method | Observation |
|---|---|
| Base RAG (original query) | Provides precise and good answers for direct questions (e.g., "Was ist eine IT-Strategie?" / "What is an IT strategy?"). |
| RAG with Query Expansion | For the question "Was ist eine IT-Strategie?", there was hardly any difference compared to the base RAG. For the question "Welche Rolle spielt IT-Governance?" ("What role does IT governance play?"), query expansion led to a visibly more detailed and comprehensive answer that included additional relevant aspects. |
**Conclusion of Results:** Query expansion can improve answer quality by providing a broader and more relevant context for the LLM. However, the added value is highly dependent on the initial question and the quality of the generated expansions.
## 8. Setup and Execution

To run this project locally:

1. **Prerequisites:**
   - Python 3.10 or higher (Python 3.12 was used).
   - Git.
2. **Clone the repository** (replace with your actual repository URL if different):
   ```bash
   git clone https://github.com/patronlaurel/RAG-BITS-Tutor.git
   cd RAG-BITS-Tutor
   ```
3. **Create and activate a virtual environment:**
   ```bash
   python -m venv .venv
   # Windows: .\.venv\Scripts\activate
   # macOS/Linux: source .venv/bin/activate
   ```
4. **Install dependencies:**
   ```bash
   pip install -r requirements.txt
   ```
   (The `requirements.txt` file was generated with `pip freeze > requirements.txt` in the project and is included in the repository.)
5. **Set up the API key:**
   - Create a file named `.env` in the project's root directory.
   - Add your Groq API key: `GROQ_API_KEY=your_groq_api_key`
6. **Start Jupyter Lab:**
   ```bash
   jupyter lab
   ```
   Then open the notebook `main_project.ipynb`. The PDF data must be placed in the `data/` folder. The FAISS index (`faiss_index_bits/bits_tutor.index`) and chunks (`faiss_index_bits/bits_chunks.pkl`) are created and saved during the first run of Section 3.4 in the notebook (sketched below).
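For orientation, a condensed sketch of what this index-building step in Section 3.4 might look like, based on the libraries listed in Section 9; the `IndexFlatL2` index type and the `retrieve` helper are assumptions, not the notebook's verbatim code (`chunks` is the list produced in the Section 4 sketch):

```python
import pickle
from pathlib import Path

import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Sahajtomar/German-semantic")

texts = [c["text"] for c in chunks]  # `chunks` from the Section 4 sketch
embeddings = np.asarray(model.encode(texts, show_progress_bar=True), dtype="float32")

index = faiss.IndexFlatL2(embeddings.shape[1])  # exact L2 nearest-neighbor search
index.add(embeddings)

# Persist the index and chunks so later runs can skip re-embedding.
Path("faiss_index_bits").mkdir(exist_ok=True)
faiss.write_index(index, "faiss_index_bits/bits_tutor.index")
with open("faiss_index_bits/bits_chunks.pkl", "wb") as f:
    pickle.dump(chunks, f)

def retrieve(query: str, k: int = 4) -> list[str]:
    """Return the k chunk texts whose embeddings are closest to the query."""
    q = np.asarray(model.encode([query]), dtype="float32")
    _, idx = index.search(q, k)
    return [texts[i] for i in idx[0]]
```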
## 9. Technologies and Libraries Used

- Python 3.12
- Jupyter Lab
- Langchain
- Sentence Transformers (`Sahajtomar/German-semantic`)
- FAISS (Facebook AI Similarity Search)
- Groq API (`llama3-70b-8192`, `llama3-8b-8192`)
- PyPDF2
- NumPy
- Dotenv
- Tqdm
## 10. Conclusion and Outlook

This project successfully demonstrated the construction of a RAG application as a "Study Tutor." By implementing LLM-based query expansion, it was shown how the depth of detail and informational content of answers can be improved for certain queries. Key insights relate to the importance of data quality, appropriate model selection, and the potential of adaptation mechanisms. Future work could focus on extended evaluation methods, exploring further adaptation techniques such as re-ranking, or developing an interactive user interface.