RAG_AI_V2

WebashalarForML committed on Feb 5

Commit 533a593 (verified) · 1 Parent(s): 05a3944

Update README2.md

Files changed (1)

README2.md +130 -0

	@@ -0,0 +1,130 @@

+# Document-Based RAG AI
+This project implements a Retrieval-Augmented Generation (RAG) architecture to extract and retrieve information from uploaded documents and answer user queries using a chat interface. The application uses a Flask-based web interface and a Chroma vector database for document indexing and retrieval.
+---
+## Problem Statement
+Organizations often struggle to manage and query unstructured textual data spread across various documents. This application provides an efficient solution by creating a searchable vector database of document contents, enabling precise query-based retrieval and response generation.
+---
+## Setup
+### 1. Virtual Environment
+Set up a Python virtual environment to manage dependencies:
+```bash
+python -m venv venv
+source venv/bin/activate  # On Windows: venv\Scripts\activate
+```
+### 2. Install Dependencies
+Install required packages from the `requirements.txt` file:
+```bash
+pip install -r requirements.txt
+```
+### 3. Running the Application
+Start the Flask application:
+```bash
+python app.py
+```
+---
+## Components Overview
+### 1. [LangChain](https://github.com/hwchase17/langchain)
+LangChain connects LLMs to retrieval systems such as vector databases.
+### 2. [Flask](https://flask.palletsprojects.com/)
+Flask provides the web framework to build the user interface and RESTful APIs.
+### 3. [Chroma Vector Database](https://docs.trychroma.com/)
+Chroma is used to store and retrieve document embeddings for similarity-based querying.
+### 4. RAG Architecture
+RAG combines retrieval of relevant document chunks with an LLM, which generates responses grounded in the retrieved context.
+### 5. Models Used
+  - **Embedding Model:** `all-MiniLM-L6-v2` (via HuggingFace)
+  - **Chat Model:** `Mistral-7B-Instruct-v0.3`
+---
+## Application Workflow
+### Overview
+A typical RAG application has two components:
+1. **Indexing**: Processes and indexes documents for searchability.
+2. **Retrieval and Generation**: Retrieves relevant document chunks and generates a context-based response.
+#### Indexing
+1. **Load**: Upload documents using the web interface.
+2. **Split**: Break documents into smaller chunks for indexing.
+3. **Store**: Save embeddings in a Chroma vector database.
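The Split step above can be sketched as a fixed-size character window with overlap. This is only an illustration: the README does not say which splitter or chunk sizes `retrival.py` uses, so `split_text`, `chunk_size`, and `overlap` are hypothetical names and values.

```python
def split_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into windows of at most chunk_size characters.

    Consecutive windows share `overlap` characters so that a sentence cut
    at a boundary still appears whole in at least one chunk.
    (Hypothetical helper; not taken from the project's code.)
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

Overlap trades a little extra storage for better recall at chunk boundaries; a real deployment would likely tune both values to the embedding model's input limit.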
+#### Retrieval and Generation
+1. **Retrieve**: Search for relevant chunks based on user queries.
+2. **Generate**: Produce context-aware answers using the Chat Model.
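The Retrieve and Generate steps can be illustrated with a toy, standard-library-only sketch. It is not the project's implementation: word counts stand in for the `all-MiniLM-L6-v2` embeddings, a plain list stands in for Chroma, and the prompt wording and every function name (`embed`, `cosine`, `retrieve`, `build_prompt`) are assumptions made for this example.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context_chunks: list[str]) -> str:
    """Assemble the text that would be sent to the chat model (hypothetical wording)."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )
```

In the real application, the prompt would be passed to the chat model; here the sketch stops at prompt assembly so it stays runnable without any model download.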
+---
+## Application Features
+1. **Create Database**
+   - Upload documents and generate a searchable vector database.
+2. **Update Database**
+   - Update the vector database by adding new documents.
+3. **Remove Database**
+   - Remove the vector database.
+4. **Delete Documents in Database**
+   - Delete a specific document from the vector database.
+5. **List Databases**
+   - View all available vector databases.
+6. **Chat Interface**
+   - Select a vector database and interact via queries.
+---
+## App Tree
+```
+.
+├── app.py                    # Flask application
+├── retrival.py               # Data retrieval and vector database management
+├── templates/
+│   ├── home.html             # Home page template
+│   ├── chat.html             # Chat interface template
+│   ├── create_db.html        # Upload documents for database creation
+│   └── list_dbs.html         # List available vector databases
+├── uploads/                  # Uploaded document storage
+├── VectorDB/                 # Vector database storage
+├── TableDB/                  # Table database storage
+├── ImageDB/                  # Image database storage
+├── requirements.txt          # Python dependencies
+├── .env                      # Environment variables (e.g., HuggingFace API key)
+└── README.md                 # Documentation
+```
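Given the layout above, listing and removing databases could be handled on the filesystem. The sketch below assumes each vector database is stored as its own subdirectory of `VectorDB/`; the README does not confirm this, and `list_databases` and `remove_database` are hypothetical helpers rather than functions from `retrival.py`.

```python
import shutil
from pathlib import Path


def list_databases(root: Path) -> list[str]:
    """Return the names of all subdirectories of root, sorted alphabetically."""
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.iterdir() if p.is_dir())


def remove_database(root: Path, name: str) -> bool:
    """Delete the database directory root/name; return True if it was removed.

    The resolved target must be a direct child of root, which rejects
    names such as "../other" that would escape the storage directory.
    """
    target = (root / name).resolve()
    if target.parent != root.resolve() or not target.is_dir():
        return False
    shutil.rmtree(target)
    return True
```

The direct-child check matters because the database name arrives from a URL segment, so it should be treated as untrusted input.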
+---
+## Example Usage with Flask
+1. **Navigate to `/create-db`** to upload documents and generate a vector database.
+2. **Navigate to `/list-db`** to view all available databases.
+3. **Select a database** using `/select-db/<db_name>`.
+4. **Query a database** using `/chat` to retrieve relevant information and generate a response.
+5. **Update a database** using `/update-dbs/<db_name>` to add files, or a whole folder of files, to the database.
+6. **Remove a database** using `/remove-dbs/<db_name>` to delete the entire database.
+7. **Delete a document in a database** using `/delete-doc/<db_name>` to delete a specific document from the database.
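When scripting against these routes, database names containing spaces or slashes need percent-encoding. The helper below is a small sketch of that: `db_route` is a hypothetical name, and the base URL assumes Flask's default development address, which `app.py` may override. The README does not state which HTTP methods each route accepts, so the sketch only builds URLs.

```python
from urllib.parse import quote, urljoin

# Assumption: Flask's default development host and port.
BASE_URL = "http://127.0.0.1:5000/"


def db_route(action: str, db_name: str, base_url: str = BASE_URL) -> str:
    """Build a URL such as <base>/select-db/<db_name>.

    The database name is percent-encoded (including "/") so it always
    occupies exactly one path segment.
    """
    return urljoin(base_url, f"{action}/{quote(db_name, safe='')}")
```

For example, `db_route("select-db", "my docs")` yields a URL ending in `/select-db/my%20docs`.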
+---
+Happy experimenting! 🚀