Document-Based RAG AI
This project implements a Retrieval-Augmented Generation (RAG) architecture to extract and retrieve information from uploaded documents and answer user queries using a chat interface. The application uses a Flask-based web interface and a Chroma vector database for document indexing and retrieval.
Problem Statement
Organizations often struggle to manage and query unstructured textual data spread across various documents. This application provides an efficient solution by creating a searchable vector database of document contents, enabling precise query-based retrieval and response generation.
Setup
1. Virtual Environment
Set up a Python virtual environment to manage dependencies:
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
2. Install Dependencies
Install the required packages from the requirements.txt file:
pip install -r requirements.txt
3. Running the Application
Start the Flask application:
python app.py
Components Overview
1. LangChain
LangChain enables seamless integration of LLMs with retrieval systems like vector databases.
2. Flask
Flask provides the web framework to build the user interface and RESTful APIs.
3. Chroma Vector Database
Chroma is used to store and retrieve document embeddings for similarity-based querying.
4. RAG Architecture
Combines retrieval of relevant document chunks with LLMs to generate precise responses based on context.
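The "retrieval plus generation" idea can be sketched as assembling retrieved chunks and the user's question into a single prompt for the chat model. This is a minimal illustration, not the application's actual prompt; the wording of the template is an assumption.

```python
# Sketch: combine retrieved document chunks with the user query into one
# prompt for the chat model. The prompt template is illustrative only.

def build_rag_prompt(query: str, retrieved_chunks: list[str]) -> str:
    """Join retrieved chunks into a context block and append the question."""
    context = "\n\n".join(retrieved_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )

prompt = build_rag_prompt(
    "What is the refund policy?",
    ["Refunds are issued within 30 days.", "Contact support to start a refund."],
)
print(prompt)
```

The chat model then answers grounded in the supplied context rather than from its training data alone.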
5. Models Used
- Embedding Model: all-MiniLM-L6-v2 (via HuggingFace)
- Chat Model: Mistral-7B-Instruct-v0.3
Application Workflow
Overview
A typical RAG application has two components:
- Indexing: Processes and indexes documents for searchability.
- Retrieval and Generation: Retrieves relevant document chunks and generates a context-based response.
Indexing
- Load: Upload documents using the web interface.
- Split: Break documents into smaller chunks for indexing.
- Store: Save embeddings in a Chroma vector database.
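The split step above can be sketched with a simple overlapping character chunker. This is an illustrative stand-in, assuming a chunk size and overlap that are not the application's actual settings:

```python
# Sketch of the "Split" step: break a document into overlapping character
# chunks for indexing. chunk_size and overlap are illustrative assumptions.

def split_into_chunks(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into chunks of chunk_size characters, overlapping by `overlap`."""
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break
    return chunks

document = "RAG systems index documents by splitting them into chunks. " * 10
chunks = split_into_chunks(document)
print(len(chunks), len(chunks[0]))
```

Overlap keeps sentences that straddle a chunk boundary retrievable from either side; each chunk would then be embedded and stored in Chroma.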
Retrieval and Generation
- Retrieve: Search for relevant chunks based on user queries.
- Generate: Produce context-aware answers using the Chat Model.
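The retrieve step above amounts to ranking stored chunk embeddings by similarity to the query embedding. A toy sketch with made-up placeholder vectors (real embeddings would come from all-MiniLM-L6-v2, and Chroma performs this search internally):

```python
import math

# Toy sketch of the "Retrieve" step: rank stored chunks by cosine
# similarity to the query embedding. The 3-dimensional vectors are
# made-up placeholders, not real model embeddings.

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Placeholder "embeddings" for three stored chunks.
index = {
    "chunk about invoices":  [0.9, 0.1, 0.0],
    "chunk about contracts": [0.1, 0.8, 0.2],
    "chunk about reports":   [0.0, 0.2, 0.9],
}

query_embedding = [0.85, 0.15, 0.05]  # would come from embedding the user query

ranked = sorted(index.items(),
                key=lambda kv: cosine_similarity(query_embedding, kv[1]),
                reverse=True)
print(ranked[0][0])  # the best-matching chunk is passed to the chat model
```

The top-ranked chunks become the context the chat model uses to generate its answer.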
Application Features
Create Database
- Upload documents and generate a searchable vector database.
Update Database
- Update the vector database by adding new documents.
Remove Database
- Remove the vector database.
Delete Documents in Database
- Delete any specific document in the vector database.
List Databases
- View all available vector databases.
Chat Interface
- Select a vector database and interact via queries.
App Tree
.
├── app.py              # Flask application
├── retrival.py         # Data retrieval and vector database management
├── templates/
│   ├── home.html       # Home page template
│   ├── chat.html       # Chat interface template
│   ├── create_db.html  # Upload documents for database creation
│   └── list_dbs.html   # List available vector databases
├── uploads/            # Uploaded document storage
├── VectorDB/           # Vector database storage
├── TableDB/            # Table database storage
├── ImageDB/            # Image database storage
├── requirements.txt    # Python dependencies
├── .env                # Environment variables (e.g., HuggingFace API key)
└── README.md           # Documentation
Example Use (via Flask)
- Navigate to /create-db to upload documents and generate a vector database.
- Navigate to /list-db to view all available databases.
- Select a database using /select-db/<db_name>.
- Query a database using /chat to retrieve relevant information and generate a response.
- Update a database using /update-dbs/<db_name> to add new files, or an entire folder of files, to the database.
- Remove a database using /remove-dbs/<db_name> to delete the entire database.
- Delete a specific document in a database using /delete-doc/<db_name>.
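The route layout described above can be sketched as a stripped-down Flask app. The handler bodies here are stubs (the real app.py wires them to the Chroma vector store), so only the routing itself is shown:

```python
from flask import Flask, jsonify

# Sketch of the Flask routes described above. Handler bodies are stubs;
# the real application connects them to the vector database layer.

app = Flask(__name__)

@app.route("/list-db")
def list_dbs():
    # The real handler lists directories under VectorDB/.
    return jsonify(databases=["example_db"])

@app.route("/select-db/<db_name>")
def select_db(db_name):
    # The real handler marks db_name as the active database for /chat.
    return jsonify(selected=db_name)

@app.route("/remove-dbs/<db_name>", methods=["POST"])
def remove_db(db_name):
    # The real handler deletes the named vector database from disk.
    return jsonify(removed=db_name)
```

With the server running, these routes can be exercised from a browser or any HTTP client; during development, Flask's built-in test client works without starting a server.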
Happy experimenting!