---
license: apache-2.0
title: Multi-Model-Rag
sdk: streamlit
emoji: 📚
colorFrom: gray
colorTo: indigo
---

## 📄 Multi-Modal RAG PDF Chatbot

A Streamlit application that lets you **upload a PDF**, ask questions about its content, and get answers grounded in the document using a **Multi-Modal Retrieval-Augmented Generation (RAG)** pipeline powered by **Google's Gemma 2 9B model served through Groq**.

---

### 🚀 Features

- 📁 Upload any PDF
- 🔍 Intelligent chunking and embedding
- 🧠 Ask natural language questions about your PDF
- ⚡ Powered by FAISS + HuggingFace + Groq LLM
- 🧠 Caches the processed PDF in the session so it isn't reprocessed on every query

---

### 🛠️ Installation (with `venv`)

1. **Clone the repo:**

   ```bash
   git clone https://github.com/Warishayat/Multimodel-Rag-Application01.git
   cd Multimodel-Rag-Application01
   ```

2. **Create and activate a virtual environment:**

   ```bash
   python -m venv venv

   # Activate:
   # On Windows
   venv\Scripts\activate
   # On macOS/Linux
   source venv/bin/activate
   ```

3. **Install dependencies:**

   ```bash
   pip install -r requirements.txt
   ```

4. **Set up your `.env` file:**

   Create a `.env` file in the root directory:

   ```
   GROQ_API_KEY=your_groq_api_key_here
   ```

---

### 📦 Project Structure

```
📁 Multimodel-Rag-Application01
├── main.py                 # Streamlit frontend
├── pdfparsing.py           # PDF parser using pymupdf4llm
├── Datapreprocessing.py    # Chunking & text cleaning
├── vectorstore.py          # Embedding & FAISS logic
├── .env                    # API keys
├── requirements.txt        # Python dependencies
└── README.md               # You're here!
```

---

### ▶️ Run the App

```bash
streamlit run main.py
```

Then open `http://localhost:8501` in your browser.

---

### 🧪 Example Queries

After uploading a PDF, try asking:

- "What is the summary of section 3?"
- "List all benchmarks mentioned."
- "How is this model different from others?"

---

### 💡 Tips

- The PDF is processed only once per session using `st.session_state`.
- Uses `RecursiveCharacterTextSplitter` for effective chunking.
- Generates embeddings with `HuggingFaceEmbeddings`.

---

### 📋 Requirements

Make sure your `requirements.txt` includes at least:

```txt
streamlit
python-dotenv
langchain
langchain-community
langchain-groq
faiss-cpu
pymupdf4llm
```

---

### 📬 Credits

Built with ❤️ by Waris Hayat Abbasi.

---
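
The "processed only once per session" tip can be illustrated with a minimal sketch. This is not the code from `main.py`; the helper name `get_or_build_index`, the `build_index` callback, and the `pdf_digest` / `index` keys are hypothetical names chosen for illustration. A plain `dict` stands in for `st.session_state`, which offers the same dictionary-style access, so the snippet runs without Streamlit installed.

```python
import hashlib
from typing import Any, Callable, MutableMapping


def get_or_build_index(
    state: MutableMapping[str, Any],
    pdf_bytes: bytes,
    build_index: Callable[[bytes], Any],
) -> Any:
    """Return the cached index, rebuilding it only when the uploaded PDF changes."""
    # Fingerprint the upload so a different PDF invalidates the cached index.
    digest = hashlib.sha256(pdf_bytes).hexdigest()
    if state.get("pdf_digest") != digest:
        state["index"] = build_index(pdf_bytes)  # expensive: parse, chunk, embed
        state["pdf_digest"] = digest
    return state["index"]


# Demo: a plain dict plays the role of st.session_state (assumption for illustration).
calls = []


def fake_build_index(data: bytes) -> str:
    """Stand-in for the real parse/chunk/embed pipeline; records each invocation."""
    calls.append(len(data))
    return f"index-for-{len(data)}-bytes"


session: dict = {}
first = get_or_build_index(session, b"%PDF-1.7 sample", fake_build_index)
second = get_or_build_index(session, b"%PDF-1.7 sample", fake_build_index)  # cache hit
third = get_or_build_index(session, b"%PDF-1.7 another file", fake_build_index)  # rebuild
print(f"pipeline ran {len(calls)} times for 3 requests")
```

In a Streamlit app, the same helper could be called with `st.session_state` as `state` and `uploaded_file.getvalue()` as `pdf_bytes`, where `uploaded_file` is the object returned by `st.file_uploader`. Keying the cache on a content hash rather than a simple flag means that uploading a different PDF triggers reprocessing, while repeated questions about the same file do not.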