---
title: DishDecode
emoji: ⚡
colorFrom: pink
colorTo: pink
sdk: docker
pinned: false
license: mit
short_description: It transforms unstructured recipe videos into structured
---
# 📝 **Flask Audio and YouTube Video Processing API**
A Flask-based API that processes audio files and YouTube video transcripts to generate structured recipe information. It uses Whisper, Deepgram, and the Gemini API for transcription and data extraction.
## 🚀 **Overview**
This API offers:
1. **Audio Processing**: Download and transcribe audio files.
2. **YouTube Transcription**: Extract transcripts from YouTube videos.
3. **Recipe Data Generation**: Generate detailed recipe data using the Gemini API.
---
## 🧩 **Features**
- **Audio URL Processing**: Download and transcribe audio via Deepgram.
- **YouTube Video Processing**: Extract video transcripts and process them.
- **Structured Output**: Recipe name, ingredients, steps, techniques, and more.
- **Logging and Error Handling**: Logging for debugging and detailed error responses.
---
## ⚙️ **Installation**
### 1. Clone the Repository
```bash
git clone https://github.com/your-repo-name.git
cd your-repo-name
```
### 2. Create a Virtual Environment
```bash
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
```
### 3. Install Dependencies
```bash
pip install -r requirements.txt
```
### 4. Configure Environment Variables
Create a `.env` file with your API keys:
```plaintext
FIRST_API_KEY=your_gemini_api_key
SECOND_API_KEY=your_deepgram_api_key
```
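As a rough sketch of how an application might read these keys at startup, the snippet below fails fast when a key is missing. The variable names come from the `.env` example above; the helpers `require_env` and `load_api_keys` are hypothetical and not taken from the project's code. It assumes the `.env` file has already been loaded into the process environment (for example with python-dotenv's `load_dotenv()`).

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


def load_api_keys() -> dict:
    """Collect both keys; the key-to-service mapping follows the .env placeholders."""
    return {
        "gemini": require_env("FIRST_API_KEY"),
        "deepgram": require_env("SECOND_API_KEY"),
    }
```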
---
## 📡 **API Endpoints**
### ✅ Health Check
**GET /**
```json
{
  "status": "success",
  "message": "API is running successfully!"
}
```
### 🎧 Process Audio URL
**POST /process-audio**
- **Request**:
  ```json
  {
    "audioUrl": "https://example.com/audio.wav"
  }
  ```
- **Response**:
  ```json
  {
    "structured_data": {
      "Recipe Name": "Pasta Alfredo",
      "Ingredients List": ["Pasta", "Cream", "Garlic"],
      ...
    }
  }
  ```
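A minimal client sketch for this endpoint, using only the Python standard library, might look like the following. The `audioUrl` field and the `/process-audio` path come from the request example above; the function names, the `Content-Type` header, the timeout, and the `BASE_URL` value (the local address used when running the app directly) are assumptions.

```python
import json
import urllib.request

# Assumed local address; adjust for a deployed instance.
BASE_URL = "http://localhost:5000"


def build_process_audio_request(audio_url: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for /process-audio."""
    body = json.dumps({"audioUrl": audio_url}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/process-audio",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def process_audio(audio_url: str, base_url: str = BASE_URL, timeout: float = 300.0) -> dict:
    """Send the request and return the decoded JSON response body."""
    request = build_process_audio_request(audio_url, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `process_audio("https://example.com/audio.wav")` against a running instance would be expected to return a dictionary with a `structured_data` key; the generous timeout is a guess, since transcription can take a while.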
### 📹 Process YouTube Video
**POST /process-youtube**
- **Request**:
  ```json
  {
    "youtube_url": "https://www.youtube.com/watch?v=example"
  }
  ```
- **Response**:
  ```json
  {
    "structured_data": {
      "Recipe Name": "Grilled Cheese Sandwich",
      "Ingredients List": ["Bread", "Cheese", "Butter"],
      ...
    }
  }
  ```
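Both endpoints return the same response shape, so a caller can read the fields defensively. The sketch below uses only the two keys shown in the examples (`Recipe Name` and `Ingredients List`); the helper name `summarize_recipe` and the fallback text are hypothetical, and the remaining fields are not assumed.

```python
def summarize_recipe(response_body: dict) -> str:
    """Build a one-line summary from a /process-audio or /process-youtube response."""
    recipe = response_body.get("structured_data") or {}
    name = recipe.get("Recipe Name", "Unknown recipe")
    ingredients = recipe.get("Ingredients List") or []
    if not ingredients:
        return name
    # Coerce items to str in case the model returns non-string entries.
    return f"{name}: {', '.join(str(item) for item in ingredients)}"
```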
---
## 🛠️ **How It Works**
1. **Audio Processing**:
   - Downloads the audio file.
   - Transcribes using Deepgram.
   - Sends the transcription to Gemini for structured data.
2. **YouTube Processing**:
   - Extracts the video ID.
   - Retrieves the transcript.
   - Sends the transcript to Gemini for structured data.
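
The video ID step can be illustrated with the standard library alone. This is one common approach, not necessarily the logic the project uses; the function name `extract_video_id` and the set of URL shapes it accepts are assumptions.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Hostnames treated as YouTube in this sketch (an assumption, not exhaustive).
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def extract_video_id(youtube_url: str) -> Optional[str]:
    """Return the video ID from a YouTube URL, or None if it cannot be found."""
    parsed = urlparse(youtube_url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        return parsed.path.lstrip("/").split("/")[0] or None
    if host in YOUTUBE_HOSTS:
        if parsed.path == "/watch":
            # Standard watch links carry the ID in the `v` query parameter.
            values = parse_qs(parsed.query).get("v")
            return values[0] if values else None
        for prefix in ("/embed/", "/shorts/"):
            if parsed.path.startswith(prefix):
                return parsed.path[len(prefix):].split("/")[0] or None
    return None
```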
---
## 📦 **Dependencies**
- **Flask**
- **Whisper**
- **Deepgram**
- **Google Gemini API**
- **YouTube Transcript API**
- **Requests**
- **Dotenv**

Install dependencies via:
```bash
pip install -r requirements.txt
```
---
## ▶️ **Run the Application**
1. **Activate Virtual Environment**:
   ```bash
   source venv/bin/activate  # Windows: venv\Scripts\activate
   ```
2. **Run the Flask App**:
   ```bash
   python app.py
   ```
3. **Access**:
   ```
   http://localhost:5000
   ```
---
## 🐞 **Error Handling**
- **API Key Errors**: Ensure `.env` contains valid API keys.
- **Invalid Input**: Returns HTTP 400 when the request is missing the required URL.
- **Transcription Errors**: Returns detailed error messages.
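
The invalid-input case could be handled by a small check run before any download or transcription starts. The sketch below is illustrative only: the helper name `validate_url_field` and the shape of the error body are assumptions, and the returned `(body, status)` pair simply mirrors what a Flask view is able to return.

```python
from typing import Any, Optional, Tuple


def validate_url_field(payload: Any, field: str) -> Optional[Tuple[dict, int]]:
    """Return an (error_body, 400) pair if `field` is missing or empty, else None."""
    if not isinstance(payload, dict):
        return {"error": "Request body must be a JSON object"}, 400
    value = payload.get(field)
    if not isinstance(value, str) or not value.strip():
        return {"error": f"Missing required field: {field}"}, 400
    return None
```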
--- | |
## 📝 **License** | |
MIT License. | |
--- | |
## 👤 **Contributors** | |
- **Aniket** | |
--- | |
💬 **Feedback or contributions?** Open an issue or submit a pull request! | |
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference |