description = """
## 🕉️ **Project Title: RamayanaGPT & GitaGPT – RAG-based Chatbots for Ancient Indian Epics**
---
### 🔍 **Project Overview**
**RamayanaGPT** and **GitaGPT** are conversational AI tools that answer questions about the *Valmiki Ramayana* and the *Bhagavad Gita*, respectively. Both chatbots use a **Retrieval-Augmented Generation (RAG)** architecture to ground their responses in the scripture text. They combine **vector search** over the verses with **large language models (LLMs)**, so each answer is generated from retrieved passages and their explanations.
These tools leverage:
* **MongoDB Atlas Vector Search** for embedding-based document retrieval
* **Hugging Face** embeddings (`intfloat/multilingual-e5-base`)
* **Groq LLaMA 3.1 8B** via API
* **LlamaIndex** for orchestration
* **Gradio** for user interface
---
### 🏗️ **Key Components**
#### 1. **Vector Store: MongoDB Atlas**
* Two collections are created in the `RAG` database:
    * `ramayana` for **Valmiki Ramayana**
    * `bhagavad_gita` for **Bhagavad Gita**
* Each collection contains vector indexes:
    * `ramayana_vector_index`
    * `gita_vector_index`
* Each document includes:
    * For Ramayana: `kanda`, `sarga`, `shloka`, `shloka_text`, and `explanation`
    * For Gita: `Title`, `Chapter`, `Verse`, and `explanation`
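
As a rough sketch, an Atlas Vector Search index such as `ramayana_vector_index` could be described by a definition like the one below. The `embedding` field path and the `cosine` similarity are assumptions, not values taken from the app; 768 is the output dimension of `intfloat/multilingual-e5-base`. The helper name `vector_index_definition` is hypothetical.

```python
def vector_index_definition(path="embedding", dimensions=768, similarity="cosine"):
    # Shape of an Atlas Vector Search index definition with one vector field.
    # "path" and "similarity" are assumed values; adjust them to the real schema.
    vector_field = dict(
        type="vector",
        path=path,
        numDimensions=dimensions,
        similarity=similarity,
    )
    return dict(fields=[vector_field])


definition = vector_index_definition()
```

The same definition would apply to `gita_vector_index`, since both collections are embedded with the same model.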
#### 2. **Vector Embedding: Hugging Face**
* Model: `intfloat/multilingual-e5-base`
* Used to convert `shloka_text + explanation` or `verse + explanation` into vector representations
* These embeddings are indexed into MongoDB for semantic similarity search
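
The text passed to the embedding model can be sketched as follows. E5 models are trained with `passage: ` and `query: ` prefixes on their inputs; whether this app applies them is not stated here, so treat the prefixes and both helper names as assumptions.

```python
def passage_text(verse_text, explanation):
    # Combine the verse and its explanation into one string for indexing.
    # The "passage: " prefix follows the E5 input convention (assumed here).
    return "passage: " + verse_text.strip() + " " + explanation.strip()


def query_text(question):
    # User questions take the matching "query: " prefix.
    return "query: " + question.strip()
```

Using the same convention at indexing time and at query time matters more than the exact wording, because similarity scores are only comparable when both sides are encoded the same way.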
#### 3. **Language Model: Groq API**
* LLM used: `llama-3.1-8b-instant` via **Groq API**
* Users supply their own Groq API key in the UI; it is held in session state (`gr.State`)
* LLM is instantiated per query using `llama_index.llms.groq.Groq`
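
Per-query instantiation could look roughly like this. The `make_llm` helper and its validation step are illustrative, not the app's actual code; `Groq` is imported inside the function so the key check runs even when the package is not installed.

```python
def make_llm(api_key, model="llama-3.1-8b-instant"):
    # Reject empty keys before any network client is created.
    if not api_key or not api_key.strip():
        raise ValueError("A Groq API key is required")
    # Lazy import: only needed once a usable key has been supplied.
    from llama_index.llms.groq import Groq
    return Groq(model=model, api_key=api_key.strip())
```

Creating the client per query keeps each user's key out of shared module state, at the cost of a small amount of setup work on every request.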
#### 4. **Prompt Engineering**
* Custom **PromptTemplates** guide the response structure for each chatbot
* **RamayanaGPT Prompt**:
    * Introduction to the query
    * Related shlokas with explanations
    * Summary/overview
* **GitaGPT Prompt**:
    * Context or spiritual background
    * Relevant verse(s) with meaning
    * Reflective conclusion
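
The response structure above can be expressed as a plain prompt builder. This is a minimal sketch independent of LlamaIndex: the instruction wording, `build_prompt`, and the constant names are assumptions, and only the section outlines come from the lists above.

```python
import os

RAMAYANA_SECTIONS = (
    "1. Introduction to the query",
    "2. Related shlokas with explanations",
    "3. Summary/overview",
)

GITA_SECTIONS = (
    "1. Context or spiritual background",
    "2. Relevant verse(s) with meaning",
    "3. Reflective conclusion",
)


def build_prompt(sections, context, question):
    # Assemble the retrieved context, the question, and the answer outline.
    lines = [
        "Answer using only the context below.",
        "Context:",
        context,
        "Question: " + question,
        "Structure the answer as:",
    ]
    lines.extend(sections)
    return os.linesep.join(lines)


prompt = build_prompt(RAMAYANA_SECTIONS, "<retrieved shlokas>", "Who is Hanuman?")
```

Keeping the outline as data makes it easy to give each chatbot its own structure while sharing one builder.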
#### 5. **Index Initialization**
* Vector indexes are loaded **once** at application startup:
```python
ramayana_index = get_vector_index("RAG", "ramayana", "ramayana_vector_index")
gita_index = get_vector_index("RAG", "bhagavad_gita", "gita_vector_index")
```
* Shared across all user queries for speed and efficiency
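
The load-once behaviour can also be obtained lazily with a cache. The sketch below uses `functools.lru_cache` around a hypothetical `load_index` stand-in; the real `get_vector_index` is not shown in this file, so nothing here reflects its internals.

```python
from functools import lru_cache

LOAD_CALLS = []  # records how often the expensive loader actually runs


def load_index(db_name, collection_name, index_name):
    # Hypothetical stand-in for the app's loader; returns a plain dict.
    LOAD_CALLS.append((db_name, collection_name, index_name))
    return dict(db=db_name, collection=collection_name, index=index_name)


@lru_cache(maxsize=None)
def get_cached_index(db_name, collection_name, index_name):
    # Identical arguments return the same object without reloading.
    return load_index(db_name, collection_name, index_name)


first = get_cached_index("RAG", "ramayana", "ramayana_vector_index")
second = get_cached_index("RAG", "ramayana", "ramayana_vector_index")
```

Both calls return the same object, and the loader runs only once for that combination of arguments.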
#### 6. **User Interface: Gradio**
* Built with `gr.Blocks` using the `Soft` theme and `Roboto Mono` font
* Two tabs:
    * 🕉️ **RamayanaGPT**
    * 🕉️ **GitaGPT**
* Users enter their Groq API key once; it's stored in `gr.State`
* Upon authentication:
    * API key input and help accordion are hidden
    * Full chat interface is revealed (`gr.ChatInterface`)
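
The hide/reveal step can be reduced to a small pure function. This is only a sketch of the decision logic: `submit_key` is a hypothetical name, and in a Gradio callback the two booleans would typically be turned into visibility updates such as `gr.update(visible=...)`.

```python
def submit_key(api_key):
    # Returns (value to store in state, show key form, show chat interface).
    key = (api_key or "").strip()
    if not key:
        # Nothing usable was entered: keep the key form, keep the chat hidden.
        return None, True, False
    # A key was entered: store it, hide the form, reveal the chat.
    return key, False, True
```

Keeping this logic free of UI objects makes it straightforward to test without launching the app.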
---
### ⚙️ **Technical Stack**
| Component | Technology |
| --------------- | ------------------------------------- |
| Backend LLM | Groq (LLaMA 3.1 8B via API) |
| Embedding Model | Hugging Face (`multilingual-e5-base`) |
| Vector Store | MongoDB Atlas Vector Search |
| Vector Engine | LlamaIndex VectorStoreIndex |
| Prompt Engine | LlamaIndex PromptTemplate |
| Query Engine | LlamaIndex Query Engine |
| UI Framework | Gradio (Blocks + ChatInterface) |
| Deployment | Python app using `app.py` |
---
### ✅ **Features Implemented**
* [x] Vector search using MongoDB Atlas
    * `ramayana_vector_index` for Valmiki Ramayana
    * `gita_vector_index` for Bhagavad Gita
* [x] Hugging Face embedding (`multilingual-e5-base`) integration
* [x] API key input and session handling with `gr.State`
* [x] LLM integration via Groq API
* [x] Prompt templates customized for each scripture
* [x] Tabbed interface for seamless switching between RamayanaGPT and GitaGPT
* [x] Clean UX with collapsible Groq API key instructions
* [x] Logging of each query with timestamp (for debugging/monitoring)
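
Timestamped query logging can be sketched with the standard `logging` module. The logger name, the message format, and both function names are assumptions chosen for illustration.

```python
import logging

logger = logging.getLogger("epic_minds.queries")  # hypothetical logger name


def configure_logging(stream=None):
    # %(asctime)s adds the timestamp; stream=None falls back to stderr.
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(asctime)s | %(levelname)s | %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return handler


def log_query(bot_name, question):
    # One line per query, tagged with the chatbot that received it.
    logger.info("%s query: %s", bot_name, question)
```

`configure_logging` should be called once at startup, since each call attaches another handler.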
"""
groq_api_key = """
### 🔑 How to Get a Groq API Key
1. **Go to** [https://console.groq.com/keys](https://console.groq.com/keys)
2. **Log in or Sign Up** for a Groq account.
3. Open **"API Keys"** in the dashboard.
4. Click **"Create Key"**, name it, and generate.
5. **Copy the API key** and store it securely.
6. **Paste** the key into the app to start chatting with RamayanaGPT or GitaGPT.
---
⚠️ **Don't share** your API key. Revoke and regenerate if needed.
"""
RamayanaGPT='''
## 🕉️ **RamayanaGPT – Overview and Dataset Summary**
### 📖 **Introduction**
**RamayanaGPT** is a RAG-based chatbot that draws upon the **Valmiki Ramayana**, the original Sanskrit epic, to answer user queries with reference to shlokas and their commentaries. It aims to offer precise, contextual, and respectful responses using advanced retrieval and generation technologies.
### 🗂️ **Dataset Structure**
The uploaded Ramayana dataset includes the following columns:
| Column | Description |
| ------------- | ------------------------------------------------------------------------------ |
| `kanda` | One of the 7 books (kandas) of the Ramayana (e.g., Bala Kanda, Ayodhya Kanda). |
| `sarga` | The chapter number within each kanda. |
| `shloka` | The shloka (verse) number within the sarga. |
| `shloka_text` | Original Sanskrit verse. |
| `explanation` | English explanation or interpretation of the shloka. |
### 🔍 **Example**
```text
kanda: Bala Kanda
sarga: 1
shloka: 1
shloka_text: तपस्स्वाध्यायनिरतं तपस्वी वाग्विदां वरम् ।
explanation: Ascetic Valmiki enquired of Narada, preeminent among sages, who was engaged in penance and study of the Vedas.
```
### 💡 **Insights**
* The data is well-structured, with **1,400+** records.
* Each record reflects a deep philosophical or narrative moment from the epic.
* Metadata (`kanda`, `sarga`, `shloka`) allows precise retrieval and organization.
* Used for vector indexing and semantic retrieval.
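
One way to use this metadata is to attach a short citation to every retrieved shloka. The sketch below models a row with the columns listed above; the class name and the `Bala Kanda 1.1` label format are assumptions, not the app's output format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RamayanaRecord:
    kanda: str
    sarga: int
    shloka: int
    shloka_text: str
    explanation: str

    def reference(self):
        # Hypothetical citation format: "<kanda> <sarga>.<shloka>".
        return self.kanda + " " + str(self.sarga) + "." + str(self.shloka)


record = RamayanaRecord(
    kanda="Bala Kanda",
    sarga=1,
    shloka=1,
    shloka_text="तपस्स्वाध्यायनिरतं तपस्वी वाग्विदां वरम् ।",
    explanation="Ascetic Valmiki enquired of Narada, preeminent among sages.",
)
label = record.reference()
```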
'''
GitaGPT='''
## 🕉️ **GitaGPT – Overview and Dataset Summary**
### 📖 **Introduction**
**GitaGPT** is a chatbot built to answer spiritual and philosophical questions using the **Bhagavad Gita** as its primary source. It references verses (slokas) directly from the Gita and supports its answers with the Sanskrit text and its Hindi and English translations.
### 🗂️ **Dataset Structure**
The uploaded Gita dataset contains the following fields:
| Column | Description |
| --------------------- | --------------------------------------------------- |
| `S.No.` | Serial number of the verse. |
| `Title` | Title of the chapter (e.g., Arjuna's Vishada Yoga). |
| `Chapter` | Gita chapter number (e.g., Chapter 1). |
| `Verse` | Verse ID (e.g., Verse 1.1). |
| `Sanskrit Anuvad` | Original verse in Devanagari Sanskrit. |
| `Hindi Anuvad` | Hindi translation/interpretation. |
| `Enlgish Translation` | English translation/interpretation. |
### 🔍 **Example**
```text
Chapter: Chapter 1
Verse: Verse 1.1
Sanskrit: धृतराष्ट्र उवाच । धर्मक्षेत्रे कुरुक्षेत्रे समवेता युयुत्सवः...
Hindi: धृतराष्ट्र बोले- हे संजय! धर्मभूमि कुरुक्षेत्र में एकत्र हुए युद्ध की इच्छा रखने वाले...
English: Dhrtarashtra asked of Sanjaya: O SANJAYA, what did my sons and the sons of Pandu do?
```
### 💡 **Insights**
* The dataset contains **700+ verses** from all 18 chapters.
* Multilingual representation (Sanskrit, Hindi, English) enhances usability for diverse users.
* The verse structure (`Chapter`, `Verse`) aids in precise referencing and response generation.
* Well suited to semantic search via vector embeddings.
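
For precise referencing, a `Verse` value such as `Verse 1.1` can be split into chapter and verse numbers. This is a minimal sketch that assumes every ID ends in `<chapter>.<verse>`; `parse_verse_id` is a hypothetical helper.

```python
def parse_verse_id(verse):
    # Take the last space-separated token, e.g. "1.1" from "Verse 1.1".
    _, _, numbers = verse.strip().rpartition(" ")
    chapter, dot, number = numbers.partition(".")
    if not dot or not chapter.isdigit() or not number.isdigit():
        raise ValueError("Unexpected verse id: " + repr(verse))
    return int(chapter), int(number)
```

The resulting pair can be stored as numeric metadata, which allows filtering or sorting by chapter and verse.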
'''
footer = """
<div style="background-color: #1d2938; color: white; width: 100%; bottom: 0; left: 0; display: flex; justify-content: space-between; align-items: center; padding: .2rem 35px; box-sizing: border-box; font-size: 16px;">
<div style="text-align: left;">
<p style="margin: 0;">&copy; 2025 </p>
</div>
<div style="text-align: center; flex-grow: 1;">
<p style="margin: 0;"> This website is made with ❤ by SARATH CHANDRA</p>
</div>
<div class="social-links" style="display: flex; gap: 20px; justify-content: flex-end; align-items: center;">
<a href="https://github.com/21bq1a4210" target="_blank" style="text-align: center;">
<img src="data:image/png;base64,{}" alt="GitHub" width="40" height="40" style="display: block; margin: 0 auto;">
<span style="font-size: 14px;">GitHub</span>
</a>
<a href="https://www.linkedin.com/in/sarath-chandra-bandreddi-07393b1aa/" target="_blank" style="text-align: center;">
<img src="data:image/png;base64,{}" alt="LinkedIn" width="40" height="40" style="display: block; margin: 0 auto;">
<span style="font-size: 14px;">LinkedIn</span>
</a>
<a href="https://21bq1a4210.github.io/MyPortfolio-/" target="_blank" style="text-align: center;">
<img src="data:image/png;base64,{}" alt="Portfolio" width="40" height="40" style="display: block; margin-right: 40px;">
<span style="font-size: 14px;">Portfolio</span>
</a>
</div>
</div>
"""