description = """
## 🕉️ **Project Title: RamayanaGPT & GitaGPT – RAG-based Chatbots for Ancient Indian Epics**
---
### 🔍 **Project Overview**
**RamayanaGPT** and **GitaGPT** are knowledge-based conversational AI tools designed to answer questions from the *Valmiki Ramayana* and the *Bhagavad Gita*, respectively. Both chatbots use a **Retrieval-Augmented Generation (RAG)** architecture: relevant verses are first retrieved through **vector search**, and a **large language model (LLM)** then composes a scripture-based, context-rich answer from them.
These tools leverage:
* **MongoDB Atlas Vector Search** for embedding-based document retrieval
* **Hugging Face** embeddings (`intfloat/multilingual-e5-base`)
* **Groq LLaMA 3.1 8B** via API
* **LlamaIndex** for orchestration
* **Gradio** for user interface
---
### 🏗️ **Key Components**
#### 1. **Vector Store: MongoDB Atlas**
* Two collections are created in the `RAG` database:
  * `ramayana` for **Valmiki Ramayana**
  * `bhagavad_gita` for **Bhagavad Gita**
* Each collection has its own vector index:
  * `ramayana_vector_index`
  * `gita_vector_index`
* Each document includes:
  * For Ramayana: `kanda`, `sarga`, `shloka`, `shloka_text`, and `explanation`
  * For Gita: `Title`, `Chapter`, `Verse`, and `explanation`
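For context, an Atlas Vector Search index over one of these collections might be defined roughly as follows. This is a sketch rather than the project's actual definition: the field path `embedding` and the `cosine` similarity are assumptions, while the 768 dimensions match the output size of `intfloat/multilingual-e5-base`.

```python
# Sketch of an Atlas Vector Search index definition (assumed, not taken from the project).
VECTOR_INDEX_DEFINITION = {
    "fields": [
        {
            "type": "vector",
            "path": "embedding",      # assumed field that stores the vector
            "numDimensions": 768,     # output size of multilingual-e5-base
            "similarity": "cosine",   # assumed similarity metric
        }
    ]
}

# Collection name -> index name, as listed in this section.
INDEX_NAMES = {
    "ramayana": "ramayana_vector_index",
    "bhagavad_gita": "gita_vector_index",
}
```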
#### 2. **Vector Embedding: Hugging Face**
* Model: `intfloat/multilingual-e5-base`
* Used to convert `shloka_text + explanation` or `verse + explanation` into vector representations
* These embeddings are indexed into MongoDB for semantic similarity search
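A minimal sketch of how the text for each embedding could be assembled. The helper names are hypothetical, and whether the project adds the `passage: ` and `query: ` prefixes recommended for E5 models is an assumption; the sample explanation is shortened for illustration.

```python
def build_passage_text(record, text_field):
    # Join the verse text and its explanation into one string for embedding.
    verse = str(record.get(text_field, "")).strip()
    explanation = str(record.get("explanation", "")).strip()
    combined = " ".join(part for part in (verse, explanation) if part)
    # E5 models are trained with role prefixes; "passage: " marks indexed text.
    return "passage: " + combined


def build_query_text(question):
    # "query: " marks the user question at search time.
    return "query: " + question.strip()


ramayana_row = {
    "shloka_text": "तपस्स्वाध्यायनिरतं तपस्वी वाग्विदां वरम् ।",
    "explanation": "Ascetic Valmiki enquired of Narada.",  # shortened sample
}
print(build_passage_text(ramayana_row, "shloka_text"))
print(build_query_text("  Who is Narada? "))
```

The resulting strings would then be passed to the embedding model; only the string assembly is shown here.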
#### 3. **Language Model: Groq API**
* LLM used: `llama-3.1-8b-instant` via **Groq API**
* Users input their Groq API key securely
* LLM is instantiated per query using `llama_index.llms.groq.Groq`
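The per-query instantiation could look roughly like this. The function name `answer_with_groq` is hypothetical, and `index` is assumed to be a LlamaIndex `VectorStoreIndex`; the project's own query code is not shown in this file.

```python
def answer_with_groq(index, question, api_key):
    # Reject an empty key before doing any network work.
    if not api_key or not api_key.strip():
        raise ValueError("A Groq API key is required")

    # Imported lazily so the sketch can be read without the package installed.
    from llama_index.llms.groq import Groq

    # A fresh LLM client per query, configured with the caller's own key.
    llm = Groq(model="llama-3.1-8b-instant", api_key=api_key.strip())
    query_engine = index.as_query_engine(llm=llm)
    return str(query_engine.query(question))
```

Creating the client per query keeps each user's key scoped to that user's requests instead of sharing one client across sessions.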
#### 4. **Prompt Engineering**
* Custom **PromptTemplates** guide the response structure for each chatbot
* **RamayanaGPT Prompt**:
  * Introduction to the query
  * Related shlokas with explanations
  * Summary/overview
* **GitaGPT Prompt**:
  * Context or spiritual background
  * Relevant verse(s) with meaning
  * Reflective conclusion
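As an illustration of the structure above, a template for RamayanaGPT might be sketched like this. The wording is hypothetical (the project's actual prompt text is not shown here); the `{context_str}` and `{query_str}` placeholders follow the convention LlamaIndex uses for question-answering templates, and plain `str.format` stands in for `PromptTemplate`.

```python
# Hypothetical template text; the project's actual prompt wording is not shown here.
RAMAYANA_TEMPLATE = '''You are RamayanaGPT. Answer using only the context below.

Context:
{context_str}

Question: {query_str}

Structure the answer as:
1. Introduction to the query
2. Related shlokas with explanations
3. Summary/overview
'''


def render_prompt(template, context_str, query_str):
    # Plain str.format stands in for LlamaIndex's PromptTemplate here.
    return template.format(context_str=context_str, query_str=query_str)


prompt = render_prompt(RAMAYANA_TEMPLATE, "Bala Kanda 1.1: ...", "Who is Narada?")
print(prompt)
```

With LlamaIndex, the same string could be wrapped in `PromptTemplate(...)` and passed as `text_qa_template` when building the query engine.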
#### 5. **Index Initialization**
* Vector indexes are loaded **once** at application startup:
```python
ramayana_index = get_vector_index("RAG", "ramayana", "ramayana_vector_index")
gita_index = get_vector_index("RAG", "bhagavad_gita", "gita_vector_index")
```
* Shared across all user queries for speed and efficiency
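The body of `get_vector_index` is not included in this description. One possible shape is sketched below, assuming `pymongo` and the LlamaIndex MongoDB and Hugging Face integration packages; the `MONGODB_URI` environment variable is an assumed name, and constructor parameter names can differ between LlamaIndex releases.

```python
import os


def get_vector_index(db_name, collection_name, index_name):
    # Hypothetical helper body; the project's own implementation is not shown.
    # Third-party imports are kept inside the function (assumed package layout).
    from pymongo import MongoClient
    from llama_index.core import VectorStoreIndex
    from llama_index.embeddings.huggingface import HuggingFaceEmbedding
    from llama_index.vector_stores.mongodb import MongoDBAtlasVectorSearch

    # MONGODB_URI is an assumed environment variable name.
    client = MongoClient(os.environ["MONGODB_URI"])
    vector_store = MongoDBAtlasVectorSearch(
        mongodb_client=client,
        db_name=db_name,
        collection_name=collection_name,
        vector_index_name=index_name,  # parameter name may differ by version
    )
    embed_model = HuggingFaceEmbedding(model_name="intfloat/multilingual-e5-base")
    # Wrap the existing Atlas collection; nothing is re-embedded here.
    return VectorStoreIndex.from_vector_store(vector_store, embed_model=embed_model)
```

Calling this once per collection at import time, as in the snippet above, means the embedding model and database connection are set up a single time and reused by every request.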
#### 6. **User Interface: Gradio**
* Built with `gr.Blocks` using the `Soft` theme and `Roboto Mono` font
* Two tabs:
  * 🕉️ **RamayanaGPT**
  * 🕉️ **GitaGPT**
* Users enter their Groq API key once; it's stored in `gr.State`
* Upon authentication:
  * API key input and help accordion are hidden
  * Full chat interface is revealed (`gr.ChatInterface`)
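The show/hide behaviour can be sketched as follows. This is a simplified illustration, not the project's `app.py`: component names are hypothetical, the theme and tabs are omitted, and a placeholder stands in for the chat interface.

```python
def normalize_api_key(raw_key):
    # Treat missing or whitespace-only input as "no key".
    key = (raw_key or "").strip()
    return key or None


def build_demo():
    # Imported lazily so the helper above works without Gradio installed.
    import gradio as gr

    def on_submit(raw_key):
        key = normalize_api_key(raw_key)
        if key is None:
            # Keep the key form visible and the chat hidden.
            return None, gr.update(visible=True), gr.update(visible=False)
        # Store the key in session state, hide the form, reveal the chat.
        return key, gr.update(visible=False), gr.update(visible=True)

    with gr.Blocks() as demo:
        api_key_state = gr.State(None)
        with gr.Column(visible=True) as key_panel:
            key_box = gr.Textbox(label="Groq API key", type="password")
            submit = gr.Button("Submit")
        with gr.Column(visible=False) as chat_panel:
            gr.Markdown("Chat interface goes here")  # placeholder
        submit.click(
            on_submit,
            inputs=key_box,
            outputs=[api_key_state, key_panel, chat_panel],
        )
    return demo
```

Because the key lives in `gr.State`, it is kept per browser session rather than in a module-level variable shared by all users.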
---
### ⚙️ **Technical Stack**
| Component | Technology |
| --------------- | ------------------------------------- |
| Backend LLM | Groq (LLaMA 3.1 8B via API) |
| Embedding Model | Hugging Face (`multilingual-e5-base`) |
| Vector Store | MongoDB Atlas Vector Search |
| Vector Engine | LlamaIndex VectorStoreIndex |
| Prompt Engine | LlamaIndex PromptTemplate |
| Query Engine | LlamaIndex Query Engine |
| UI Framework | Gradio (Blocks + ChatInterface) |
| Deployment | Python app using `app.py` |
---
### ✅ **Features Implemented**
* [x] Vector search using MongoDB Atlas
  * `ramayana_vector_index` for Valmiki Ramayana
  * `gita_vector_index` for Bhagavad Gita
* [x] Hugging Face embedding (`e5-base`) integration
* [x] API key input and session handling with `gr.State`
* [x] LLM integration via Groq API
* [x] Prompt templates customized for each scripture
* [x] Tabbed interface for seamless switching between RamayanaGPT and GitaGPT
* [x] Clean UX with collapsible Groq API key instructions
* [x] Logging of each query with timestamp (for debugging/monitoring)
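
A minimal sketch of timestamped query logging with the standard library. The logger name and the line format are assumptions; the project's actual logging code is not shown here.

```python
import logging
from datetime import datetime, timezone

logger = logging.getLogger("scripture_chat")  # hypothetical logger name


def format_query_log(bot_name, question, when=None):
    # Build one log line with a UTC timestamp, the bot name, and the question.
    when = when or datetime.now(timezone.utc)
    return "{} | {} | {}".format(when.isoformat(timespec="seconds"), bot_name, question)


def log_query(bot_name, question):
    # The API key is deliberately never included in the log line.
    logger.info(format_query_log(bot_name, question))


fixed = datetime(2025, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
print(format_query_log("GitaGPT", "What is karma yoga?", fixed))
```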
"""
groq_api_key = """
### 🔑 How to Get a Groq API Key
1. **Go to** [https://console.groq.com/keys](https://console.groq.com/keys)
2. **Log in or Sign Up** for a Groq account.
3. Click **"API Keys"** from the dashboard.
4. Click **"Create Key"**, name it, and generate.
5. **Copy the API key** and store it securely.
6. **Paste** the key into the app's API key field to start chatting.
---
⚠️ **Don't share** your API key. If it is ever exposed, revoke it and generate a new one.
"""
RamayanaGPT='''
## 🕉️ **RamayanaGPT – Overview and Dataset Summary**
### 📖 **Introduction**
**RamayanaGPT** is a RAG-based chatbot that draws upon the **Valmiki Ramayana**, the original Sanskrit epic, to answer user queries with reference to shlokas and their commentaries. It aims to offer precise, contextual, and respectful responses using advanced retrieval and generation technologies.
### 🗂️ **Dataset Structure**
The uploaded Ramayana dataset includes the following columns:
| Column | Description |
| ------------- | ------------------------------------------------------------------------------ |
| `kanda` | One of the 7 books (kandas) of the Ramayana (e.g., Bala Kanda, Ayodhya Kanda). |
| `sarga` | The chapter number within each kanda. |
| `shloka` | The shloka (verse) number within the sarga. |
| `shloka_text` | Original Sanskrit verse. |
| `explanation` | English explanation or interpretation of the shloka. |
### 🔍 **Example**
```text
kanda: Bala Kanda
sarga: 1
shloka: 1
shloka_text: तपस्स्वाध्यायनिरतं तपस्वी वाग्विदां वरम् ।
explanation: Ascetic Valmiki enquired of Narada, preeminent among sages, who was engaged in penance and study of the Vedas.
```
### 💡 **Insights**
* The dataset is well structured, with **1,400+** records.
* Each record reflects a deep philosophical or narrative moment from the epic.
* Metadata (`kanda`, `sarga`, `shloka`) allows precise retrieval and organization.
* Used for vector indexing and semantic retrieval.
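
A small sketch of how one row could be split into text for embedding and metadata for citation. The helper names are hypothetical, and the sample row is the example shown above.

```python
def to_document(row):
    # Split a dataset row into the text to embed and the metadata used for citing.
    metadata = {
        "kanda": row["kanda"],
        "sarga": int(row["sarga"]),
        "shloka": int(row["shloka"]),
    }
    text = row["shloka_text"] + " " + row["explanation"]
    return text, metadata


def citation(metadata):
    # Human-readable reference such as "Bala Kanda 1.1".
    return "{} {}.{}".format(metadata["kanda"], metadata["sarga"], metadata["shloka"])


row = {
    "kanda": "Bala Kanda",
    "sarga": "1",
    "shloka": "1",
    "shloka_text": "तपस्स्वाध्यायनिरतं तपस्वी वाग्विदां वरम् ।",
    "explanation": "Ascetic Valmiki enquired of Narada, preeminent among sages, who was engaged in penance and study of the Vedas.",
}
text, metadata = to_document(row)
print(citation(metadata))  # Bala Kanda 1.1
```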
'''
GitaGPT='''
## 🕉️ **GitaGPT – Overview and Dataset Summary**
### 📖 **Introduction**
**GitaGPT** is a chatbot built to answer spiritual and philosophical questions using the **Bhagavad Gita** as its primary source. It references verses (slokas) directly from the Gita, delivering insights supported by Sanskrit, Hindi, and English explanations.
### 🗂️ **Dataset Structure**
The uploaded Gita dataset contains the following fields:
| Column | Description |
| --------------------- | --------------------------------------------------- |
| `S.No.` | Serial number of the verse. |
| `Title` | Title of the chapter (e.g., Arjuna's Vishada Yoga). |
| `Chapter` | Gita chapter number (e.g., Chapter 1). |
| `Verse` | Verse ID (e.g., Verse 1.1). |
| `Sanskrit Anuvad` | Original verse in Devanagari Sanskrit. |
| `Hindi Anuvad` | Hindi translation/interpretation. |
| `Enlgish Translation` | English translation/interpretation. |
### 🔍 **Example**
```text
Chapter: Chapter 1
Verse: Verse 1.1
Sanskrit: धृतराष्ट्र उवाच । धर्मक्षेत्रे कुरुक्षेत्रे समवेता युयुत्सवः...
Hindi: धृतराष्ट्र बोले- हे संजय! धर्मभूमि कुरुक्षेत्र में एकत्र हुए युद्ध की इच्छा रखने वाले...
English: Dhrtarashtra asked of Sanjaya: O SANJAYA, what did my sons and the sons of Pandu do?
```
### 💡 **Insights**
* The dataset contains **700+ verses** from all 18 chapters.
* Multilingual representation (Sanskrit, Hindi, English) enhances usability for diverse users.
* The verse structure (`Chapter`, `Verse`) aids in precise referencing and response generation.
* Perfectly suited for semantic search via vector embeddings.
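
A short sketch of how the `Verse` field and the language columns could be handled. The helper names are hypothetical, and the parser assumes the simple `Verse <chapter>.<verse>` form shown in the example; column names are copied as they appear in the table above.

```python
# Column names as they appear in the dataset header above.
COLUMN_BY_LANGUAGE = {
    "Sanskrit": "Sanskrit Anuvad",
    "Hindi": "Hindi Anuvad",
    "English": "Enlgish Translation",  # spelled as in the dataset header
}


def parse_verse_id(verse_field):
    # "Verse 1.1" -> (1, 1): drop the label, then split chapter and verse.
    number_part = verse_field.split()[-1]
    chapter, verse = number_part.split(".")
    return int(chapter), int(verse)


def pick_translation(row, language="English"):
    # Return the text for the requested language from one dataset row.
    return row[COLUMN_BY_LANGUAGE[language]]


print(parse_verse_id("Verse 1.1"))  # (1, 1)
```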
'''
footer = """
<div style="background-color: #1d2938; color: white; padding: 10px; width: 100%; bottom: 0; left: 0; display: flex; justify-content: space-between; align-items: center; padding: .2rem 35px; box-sizing: border-box; font-size: 16px;">
<div style="text-align: left;">
<p style="margin: 0;">© 2025 </p>
</div>
<div style="text-align: center; flex-grow: 1;">
<p style="margin: 0;"> This website is made with ❤ by SARATH CHANDRA</p>
</div>
<div class="social-links" style="display: flex; gap: 20px; justify-content: flex-end; align-items: center;">
<a href="https://github.com/21bq1a4210" target="_blank" style="text-align: center;">
<img src="data:image/png;base64,{}" alt="GitHub" width="40" height="40" style="display: block; margin: 0 auto;">
<span style="font-size: 14px;">GitHub</span>
</a>
<a href="https://www.linkedin.com/in/sarath-chandra-bandreddi-07393b1aa/" target="_blank" style="text-align: center;">
<img src="data:image/png;base64,{}" alt="LinkedIn" width="40" height="40" style="display: block; margin: 0 auto;">
<span style="font-size: 14px;">LinkedIn</span>
</a>
<a href="https://21bq1a4210.github.io/MyPortfolio-/" target="_blank" style="text-align: center;">
<img src="data:image/png;base64,{}" alt="Portfolio" width="40" height="40" style="display: block; margin-right: 40px;">
<span style="font-size: 14px;">Portfolio</span>
</a>
</div>
</div>
"""