Epic-Minds / markdown.py

	description = """
	## 🕉️ Project Title: RamayanaGPT & GitaGPT – RAG-based Chatbots for Ancient Indian Epics

	---

	### 🔍 Project Overview

	RamayanaGPT and GitaGPT are conversational AI tools that answer questions about the Valmiki Ramayana and the Bhagavad Gita, respectively. Both use a Retrieval-Augmented Generation (RAG) architecture: vector search retrieves the relevant verses, and a large language model (LLM) generates a response grounded in that scripture-based context, which keeps answers context-rich and tied to the source text.

	These tools leverage:

	* MongoDB Atlas Vector Search for embedding-based document retrieval
	* Hugging Face embeddings (`intfloat/multilingual-e5-base`)
	* Groq LLaMA 3.1 8B via API
	* LlamaIndex for orchestration
	* Gradio for user interface

	---

	### 🏗️ Key Components

	#### 1. Vector Store: MongoDB Atlas

	* Two collections are created in the `RAG` database:

	  * `ramayana` for Valmiki Ramayana
	  * `bhagavad_gita` for Bhagavad Gita
	* Each collection contains vector indexes:

	  * `ramayana_vector_index`
	  * `gita_vector_index`
	* Each document includes:

	  * For Ramayana: `kanda`, `sarga`, `shloka`, `shloka_text`, and `explanation`
	  * For Gita: `Title`, `Chapter`, `Verse`, and `explanation`
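
An Atlas vector index such as `ramayana_vector_index` is defined by a small JSON document. The sketch below builds one in Python. The `embedding` field path and cosine similarity are assumptions that are not stated in this description, and `vector_index_definition` is a hypothetical helper name; the 768 dimensions match the output size of `intfloat/multilingual-e5-base`.

```python
import json

# Assumptions: "embedding" as the vector field and "cosine" similarity are
# illustrative choices. 768 is the output size of multilingual-e5-base.
def vector_index_definition(path="embedding", dimensions=768, similarity="cosine"):
    return {
        "fields": [
            {
                "type": "vector",
                "path": path,
                "numDimensions": dimensions,
                "similarity": similarity,
            }
        ]
    }

# Print the definition as JSON, ready to paste into the Atlas index editor.
print(json.dumps(vector_index_definition(), indent=2))
```

The same definition could be reused for `gita_vector_index`, since both collections are embedded with the same model.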

	#### 2. Vector Embedding: Hugging Face

	* Model: `intfloat/multilingual-e5-base`
	* Used to convert `shloka_text + explanation` or `verse + explanation` into vector representations
	* These embeddings are indexed into MongoDB for semantic similarity search
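
The E5 model family is trained with `passage: ` and `query: ` input prefixes. The sketch below shows how the text for one record might be assembled before embedding. Whether this project applies the prefixes is an assumption, and both helper names are hypothetical; the embedding call itself is left out.

```python
# Hypothetical helpers: these names are illustrative, not from the project.

def build_passage_text(record):
    # Join the verse and its explanation, mirroring `shloka_text + explanation`.
    parts = [record.get("shloka_text", ""), record.get("explanation", "")]
    body = " ".join(part.strip() for part in parts if part and part.strip())
    # Documents stored in the index use the "passage: " prefix.
    return "passage: " + body

def build_query_text(question):
    # User questions use the "query: " prefix.
    return "query: " + question.strip()

record = {
    "shloka_text": "(Sanskrit verse)",
    "explanation": "Ascetic Valmiki enquired of Narada.",
}
print(build_passage_text(record))
print(build_query_text("Who did Valmiki question?"))
```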

	#### 3. Language Model: Groq API

	* LLM used: `llama-3.1-8b-instant` via Groq API
	* Users input their Groq API key securely
	* LLM is instantiated per query using `llama_index.llms.groq.Groq`
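
A minimal sketch of per-query instantiation follows. `make_llm` is a hypothetical helper name and the blank-key check is an added assumption; the `Groq` class and model name are the ones listed above. The import is deferred so the check can run even when the `llama-index-llms-groq` package is not installed.

```python
# Sketch only: make_llm is a hypothetical helper name, not from the project.
DEFAULT_MODEL = "llama-3.1-8b-instant"

def make_llm(api_key, model=DEFAULT_MODEL):
    # Reject blank keys before touching the library or the network.
    cleaned = (api_key or "").strip()
    if not cleaned:
        raise ValueError("A Groq API key is required")
    # Imported lazily; requires the llama-index-llms-groq package.
    from llama_index.llms.groq import Groq
    return Groq(model=model, api_key=cleaned)
```

Calling `make_llm(user_key)` inside the chat handler would give each query its own client built from the key the user supplied.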

	#### 4. Prompt Engineering

	* Custom PromptTemplates guide the response structure for each chatbot
	* RamayanaGPT Prompt:

	  * Introduction to the query
	  * Related shlokas with explanations
	  * Summary/overview
	* GitaGPT Prompt:

	  * Context or spiritual background
	  * Relevant verse(s) with meaning
	  * Reflective conclusion
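
The RamayanaGPT structure above can be sketched as a template string. The wording is illustrative only, since the project's actual prompt text is not shown here. `context_str` and `query_str` follow the variable names LlamaIndex conventionally uses in question-answering templates, and plain `str.format` stands in for the template's formatting step.

```python
# Illustrative wording only: the project's actual prompt text is not shown.
RAMAYANA_TEMPLATE = '''You are RamayanaGPT, answering from the Valmiki Ramayana.

Context:
{context_str}

Question: {query_str}

Structure the answer as:
1. Introduction to the query
2. Related shlokas with explanations
3. Summary/overview
'''

def render_prompt(template, context_str, query_str):
    # Fill the retrieved context and the user question into the template.
    return template.format(context_str=context_str, query_str=query_str)

prompt = render_prompt(
    RAMAYANA_TEMPLATE,
    context_str="Bala Kanda 1.1: Ascetic Valmiki enquired of Narada ...",
    query_str="Who did Valmiki question?",
)
print(prompt)
```

A GitaGPT template would follow the same pattern with its own three-part structure.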

	#### 5. Index Initialization

	* Vector indexes are loaded once at application startup:

	```python
	ramayana_index = get_vector_index("RAG", "ramayana", "ramayana_vector_index")
	gita_index = get_vector_index("RAG", "bhagavad_gita", "gita_vector_index")
	```
	* Shared across all user queries for speed and efficiency
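
The load-once behavior can be illustrated with a cached loader. This is a stand-in under stated assumptions: `load_index` is hypothetical and returns a plain dictionary, whereas the project's `get_vector_index` (whose body is not shown here) returns a real index object.

```python
from functools import lru_cache

# Records each time the expensive load actually runs.
load_calls = []

@lru_cache(maxsize=None)
def load_index(db_name, collection_name, index_name):
    # Stand-in for an expensive connection to the vector store.
    load_calls.append((db_name, collection_name, index_name))
    return {"db": db_name, "collection": collection_name, "index": index_name}

# Two requests for the same index trigger only one load.
first = load_index("RAG", "ramayana", "ramayana_vector_index")
second = load_index("RAG", "ramayana", "ramayana_vector_index")
print(first is second, len(load_calls))
```

Module-level assignment, as in the snippet above, achieves the same sharing; caching only makes the single load explicit.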

	#### 6. User Interface: Gradio

	* Built with `gr.Blocks` using the `Soft` theme and `Roboto Mono` font
	* Two tabs:

	  * 🕉️ RamayanaGPT
	  * 🕉️ GitaGPT
	* Users enter their Groq API key once; it's stored in `gr.State`
	* Upon authentication:

	  * API key input and help accordion are hidden
	  * Full chat interface is revealed (`gr.ChatInterface`)
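
The show/hide logic can be sketched as a pure function, independent of Gradio. `on_key_submit` is a hypothetical name, and treating any non-blank key as authenticated is an assumption; in a Gradio callback the two booleans would typically be wrapped in `gr.update(visible=...)`.

```python
def on_key_submit(key):
    # Returns (stored_key, show_key_section, show_chat).
    cleaned = (key or "").strip()
    authenticated = bool(cleaned)
    stored_key = cleaned if authenticated else None
    # The key input and help accordion are shown only until a key is accepted.
    return stored_key, not authenticated, authenticated

print(on_key_submit("my-key"))
print(on_key_submit(""))
```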

	---

	### ⚙️ Technical Stack

	| Component | Technology |
	| --------------- | ------------------------------------- |
	| Backend LLM | Groq (LLaMA 3.1 8B via API) |
	| Embedding Model | Hugging Face (`multilingual-e5-base`) |
	| Vector Store | MongoDB Atlas Vector Search |
	| Vector Engine | LlamaIndex VectorStoreIndex |
	| Prompt Engine | LlamaIndex PromptTemplate |
	| Query Engine | LlamaIndex Query Engine |
	| UI Framework | Gradio (Blocks + ChatInterface) |
	| Deployment | Python app using `app.py` |

	---

	### ✅ Features Implemented

	* [x] Vector search using MongoDB Atlas

	  * `ramayana_vector_index` for Valmiki Ramayana
	  * `gita_vector_index` for Bhagavad Gita
	* [x] Hugging Face embedding (`e5-base`) integration
	* [x] API key input and session handling with `gr.State`
	* [x] LLM integration via Groq API
	* [x] Prompt templates customized for each scripture
	* [x] Tabbed interface for seamless switching between RamayanaGPT and GitaGPT
	* [x] Clean UX with collapsible Groq API key instructions
	* [x] Logging of each query with timestamp (for debugging/monitoring)
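
Query logging with a timestamp might look like the sketch below. The logger name, the line format, and the `log_query` helper are all assumptions; only the idea of one timestamped entry per query comes from the feature list.

```python
import logging
from datetime import datetime, timezone

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
logger = logging.getLogger("epic_minds")  # hypothetical logger name

def log_query(bot_name, question, now=None):
    # Build one line per query: ISO timestamp, bot name, then the question.
    moment = now or datetime.now(timezone.utc)
    entry = moment.isoformat(timespec="seconds") + " | " + bot_name + " | " + question
    logger.info(entry)
    return entry

log_query("RamayanaGPT", "Who is Hanuman?")
```

The optional `now` argument exists only to make the output reproducible in tests.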

	"""

	groq_api_key = """
	### 🔑 How to Get a Groq API Key

	1. Go to [https://console.groq.com/keys](https://console.groq.com/keys)
	2. Log in or sign up for a Groq account.
	3. Click "API Keys" from the dashboard.
	4. Click "Create Key", name it, and generate.
	5. Copy the API key and store it securely.
	6. Paste the key into the app to start chatting; the same key is used for both RamayanaGPT and GitaGPT.

	---

	⚠️ Don't share your API key. Revoke and regenerate if needed.
	"""

	RamayanaGPT='''
	## 🕉️ RamayanaGPT – Overview and Dataset Summary

	### 📖 Introduction

	RamayanaGPT is a RAG-based chatbot that draws upon the Valmiki Ramayana, the original Sanskrit epic, to answer user queries with reference to shlokas and their commentaries. It aims to offer precise, contextual, and respectful responses using advanced retrieval and generation technologies.

	### 🗂️ Dataset Structure

	The uploaded Ramayana dataset includes the following columns:

	| Column | Description |
	| ------------- | ------------------------------------------------------------------------------ |
	| `kanda` | One of the 7 books (kandas) of the Ramayana (e.g., Bala Kanda, Ayodhya Kanda). |
	| `sarga` | The chapter number within each kanda. |
	| `shloka` | The shloka (verse) number within the sarga. |
	| `shloka_text` | Original Sanskrit verse. |
	| `explanation` | English explanation or interpretation of the shloka. |

	### 🔍 Example

	```text
	kanda: Bala Kanda
	sarga: 1
	shloka: 1
	shloka_text: तपस्स्वाध्यायनिरतं तपस्वी वाग्विदां वरम् ।
	explanation: Ascetic Valmiki enquired of Narada, preeminent among sages, who was engaged in penance and study of the Vedas.
	```

	### 💡 Insights

	* The data is well-structured, with 1,400+ records.
	* Each record reflects a deep philosophical or narrative moment from the epic.
	* Metadata (`kanda`, `sarga`, `shloka`) allows precise retrieval and organization.
	* Used for vector indexing and semantic retrieval.
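
As an illustration of retrieval by metadata, the sketch below looks up a record by `kanda`, `sarga`, and `shloka` and formats a citation. The field names come from the table above and the single record is the example shown earlier; the helper names and the `Kanda sarga.shloka` citation style are assumptions.

```python
# The helper names below are hypothetical, not from the project.
records = [
    {
        "kanda": "Bala Kanda",
        "sarga": 1,
        "shloka": 1,
        "explanation": (
            "Ascetic Valmiki enquired of Narada, preeminent among sages, "
            "who was engaged in penance and study of the Vedas."
        ),
    },
]

def find_shloka(rows, kanda, sarga, shloka):
    # Exact-match lookup on the three metadata fields; None if absent.
    for row in rows:
        if (row["kanda"], row["sarga"], row["shloka"]) == (kanda, sarga, shloka):
            return row
    return None

def format_reference(row):
    # Render a short citation such as "Bala Kanda 1.1".
    return "{} {}.{}".format(row["kanda"], row["sarga"], row["shloka"])

hit = find_shloka(records, "Bala Kanda", 1, 1)
print(format_reference(hit))  # prints "Bala Kanda 1.1"
```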
	'''

	GitaGPT='''
	## 🕉️ GitaGPT – Overview and Dataset Summary

	### 📖 Introduction

	GitaGPT is a chatbot built to answer spiritual and philosophical questions using the Bhagavad Gita as its primary source. It references verses (slokas) directly from the Gita and supports its answers with the Sanskrit text and its Hindi and English translations.

	### 🗂️ Dataset Structure

	The uploaded Gita dataset contains the following fields:

	| Column | Description |
	| --------------------- | --------------------------------------------------- |
	| `S.No.` | Serial number of the verse. |
	| `Title` | Title of the chapter (e.g., Arjuna's Vishada Yoga). |
	| `Chapter` | Gita chapter number (e.g., Chapter 1). |
	| `Verse` | Verse ID (e.g., Verse 1.1). |
	| `Sanskrit Anuvad` | Original verse in Devanagari Sanskrit. |
	| `Hindi Anuvad` | Hindi translation/interpretation. |
	| `Enlgish Translation` | English translation/interpretation. |

	### 🔍 Example

	```text
	Chapter: Chapter 1
	Verse: Verse 1.1
	Sanskrit: धृतराष्ट्र उवाच । धर्मक्षेत्रे कुरुक्षेत्रे समवेता युयुत्सवः...
	Hindi: धृतराष्ट्र बोले- हे संजय! धर्मभूमि कुरुक्षेत्र में एकत्र हुए युद्ध की इच्छा रखने वाले...
	English: Dhrtarashtra asked of Sanjaya: O SANJAYA, what did my sons and the sons of Pandu do?
	```

	### 💡 Insights

	* The dataset contains 700+ verses from all 18 chapters.
	* Multilingual representation (Sanskrit, Hindi, English) enhances usability for diverse users.
	* The verse structure (`Chapter`, `Verse`) aids in precise referencing and response generation.
	* Well suited to semantic search via vector embeddings.
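
For precise referencing, a `Verse` value can be split into numeric chapter and verse parts. This is a minimal sketch that assumes every ID follows the `Verse C.V` form shown in the example; `parse_verse_id` is a hypothetical helper, and other forms (such as verse ranges) would need extra handling.

```python
import re

# Matches IDs such as "Verse 1.1"; character classes avoid backslash escapes.
VERSE_PATTERN = re.compile("^Verse ([0-9]+)[.]([0-9]+)$")

def parse_verse_id(verse_id):
    # Return (chapter, verse) as integers, or raise on an unexpected format.
    match = VERSE_PATTERN.match(verse_id.strip())
    if match is None:
        raise ValueError("Unrecognized verse id: " + repr(verse_id))
    return int(match.group(1)), int(match.group(2))

chapter, verse = parse_verse_id("Verse 1.1")
print(chapter, verse)  # prints "1 1"
```

Numeric pairs sort and compare correctly, which plain strings such as `Verse 10.2` and `Verse 2.1` do not.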
	'''

	footer = """
	<div style="background-color: #1d2938; color: white; padding: 10px; width: 100%; bottom: 0; left: 0; display: flex; justify-content: space-between; align-items: center; padding: .2rem 35px; box-sizing: border-box; font-size: 16px;">
	<div style="text-align: left;">
	<p style="margin: 0;">© 2025 </p>
	</div>
	<div style="text-align: center; flex-grow: 1;">
	<p style="margin: 0;"> This website is made with ❤ by SARATH CHANDRA</p>
	</div>
	<div class="social-links" style="display: flex; gap: 20px; justify-content: flex-end; align-items: center;">
	<a href="https://github.com/21bq1a4210" target="_blank" style="text-align: center;">
	<img src="data:image/png;base64,{}" alt="GitHub" width="40" height="40" style="display: block; margin: 0 auto;">
	<span style="font-size: 14px;">GitHub</span>
	</a>
	<a href="https://www.linkedin.com/in/sarath-chandra-bandreddi-07393b1aa/" target="_blank" style="text-align: center;">
	<img src="data:image/png;base64,{}" alt="LinkedIn" width="40" height="40" style="display: block; margin: 0 auto;">
	<span style="font-size: 14px;">LinkedIn</span>
	</a>
	<a href="https://21bq1a4210.github.io/MyPortfolio-/" target="_blank" style="text-align: center;">
	<img src="data:image/png;base64,{}" alt="Portfolio" width="40" height="40" style="display: block; margin-right: 40px;">
	<span style="font-size: 14px;">Portfolio</span>
	</a>
	</div>
	</div>
	"""