---
title: DishDecode
emoji:
colorFrom: pink
colorTo: pink
sdk: docker
pinned: false
license: mit
short_description: It transforms unstructured recipe videos into structured
---
---
# 📝 **Flask Audio and YouTube Video Processing API**
A Flask-based API that processes audio files and YouTube video transcripts to generate structured recipe information. It uses Whisper and Deepgram for transcription and the Gemini API for data extraction.
## 🚀 **Overview**
This API offers:
1. **Audio Processing**: Download and transcribe audio files.
2. **YouTube Transcription**: Extract transcripts from YouTube videos.
3. **Recipe Data Generation**: Generate detailed recipe data using the Gemini API.
---
## 🧩 **Features**
- **Audio URL Processing**: Download and transcribe audio via Deepgram.
- **YouTube Video Processing**: Extract video transcripts and process them.
- **Structured Output**: Recipe name, ingredients, steps, techniques, and more.
- **Logging and Error Handling**: Debug logging and descriptive error responses.
---
## ⚙️ **Installation**
### 1. Clone the Repository
```bash
git clone https://github.com/your-repo-name.git
cd your-repo-name
```
### 2. Create a Virtual Environment
```bash
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
```
### 3. Install Dependencies
```bash
pip install -r requirements.txt
```
### 4. Configure Environment Variables
Create a `.env` file with your API keys:
```plaintext
FIRST_API_KEY=your_gemini_api_key
SECOND_API_KEY=your_deepgram_api_key
```
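As a quick sanity check before starting the app, something like the following could confirm that both keys are present. This is a minimal sketch, not the project's actual startup code: the variable names come from the `.env` example above, while the helper name `missing_api_keys` is hypothetical. It also assumes the `.env` values have already been loaded into the process environment (for example by the Dotenv dependency).

```python
import os

# Variable names taken from the .env example; the helper itself is hypothetical.
REQUIRED_KEYS = ("FIRST_API_KEY", "SECOND_API_KEY")


def missing_api_keys(env=None):
    """Return the required key names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not (env.get(name) or "").strip()]


# Usage: call missing_api_keys() at startup and abort if the list is non-empty.
```

Passing a plain dictionary instead of relying on `os.environ` keeps the check easy to test in isolation.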
---
## 📡 **API Endpoints**
### ✅ Health Check
**GET /**
```json
{
  "status": "success",
  "message": "API is running successfully!"
}
```
### 🎧 Process Audio URL
**POST /process-audio**
- **Request**:
```json
{
  "audioUrl": "https://example.com/audio.wav"
}
```
- **Response**:
```json
{
  "structured_data": {
    "Recipe Name": "Pasta Alfredo",
    "Ingredients List": ["Pasta", "Cream", "Garlic"],
    ...
  }
}
```
### 📹 Process YouTube Video
**POST /process-youtube**
- **Request**:
```json
{
  "youtube_url": "https://www.youtube.com/watch?v=example"
}
```
- **Response**:
```json
{
  "structured_data": {
    "Recipe Name": "Grilled Cheese Sandwich",
    "Ingredients List": ["Bread", "Cheese", "Butter"],
    ...
  }
}
```
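For reference, a client could build the two POST requests roughly as follows. This is a sketch using only the Python standard library. The endpoint paths and field names (`audioUrl`, `youtube_url`) come from the examples above. The base URL assumes the local server described under "Run the Application", and the helper names `build_request` and `send` are hypothetical.

```python
import json
import urllib.request

# Assumption: the app is running locally on the default port.
BASE_URL = "http://localhost:5000"


def build_request(path, payload):
    """Build (but do not send) a JSON POST request for the given endpoint."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(req):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


audio_req = build_request(
    "/process-audio", {"audioUrl": "https://example.com/audio.wav"}
)
youtube_req = build_request(
    "/process-youtube", {"youtube_url": "https://www.youtube.com/watch?v=example"}
)
# With the server running: result = send(audio_req)["structured_data"]
```

Note that the two endpoints use different naming styles for the URL field, so the payload key must match the endpoint exactly.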
---
## 🛠️ **How It Works**
1. **Audio Processing**:
   - Downloads the audio file.
   - Transcribes using Deepgram.
   - Sends the transcription to Gemini for structured data.
2. **YouTube Processing**:
   - Extracts the video ID.
   - Retrieves the transcript.
   - Sends the transcript to Gemini for structured data.
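The "extracts the video ID" step could look something like the sketch below. The README does not show how the app does this, so the function name `extract_video_id` and the set of supported URL shapes are assumptions for illustration only.

```python
from urllib.parse import parse_qs, urlparse


def extract_video_id(url):
    """Return the video ID from common YouTube URL shapes, or None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]

    # Short links: https://youtu.be/<id>
    if host == "youtu.be":
        return parsed.path.lstrip("/").split("/")[0] or None

    if host in ("youtube.com", "m.youtube.com"):
        # Standard links: https://www.youtube.com/watch?v=<id>
        if parsed.path == "/watch":
            return parse_qs(parsed.query).get("v", [None])[0]
        # Embed and Shorts links: /embed/<id>, /shorts/<id>
        for prefix in ("/embed/", "/shorts/"):
            if parsed.path.startswith(prefix):
                return parsed.path[len(prefix):].split("/")[0] or None

    return None
```

Returning `None` for unrecognized URLs lets the caller decide how to report the failure, for example with a 400 response.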
---
## 📦 **Dependencies**
- **Flask**
- **Whisper**
- **Deepgram**
- **Google Gemini API**
- **YouTube Transcript API**
- **Requests**
- **Dotenv**
Install dependencies via:
```bash
pip install -r requirements.txt
```
---
## ▶️ **Run the Application**
1. **Activate Virtual Environment**:
```bash
source venv/bin/activate # Windows: venv\Scripts\activate
```
2. **Run the Flask App**:
```bash
python app.py
```
3. **Access**:
```
http://localhost:5000
```
---
## 🐞 **Error Handling**
- **API Key Errors**: Ensure `.env` contains valid API keys.
- **Invalid Input**: Returns HTTP 400 when the required URL is missing from the request.
- **Transcription Errors**: Returns detailed error messages.
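
The invalid-input rule above can be illustrated with a small framework-independent check. This is only a sketch: the helper name `require_url` and the shape of the error body are hypothetical, and the real app may format its responses differently.

```python
def require_url(payload, field):
    """Return (body, status): the cleaned URL on success, or a 400 error body."""
    value = payload.get(field) if isinstance(payload, dict) else None
    if not isinstance(value, str) or not value.strip():
        # Hypothetical error shape; the actual response format is not documented.
        return {"error": f"Missing required field: {field}"}, 400
    return {"url": value.strip()}, 200
```

A route handler could call `require_url(request_json, "audioUrl")` or `require_url(request_json, "youtube_url")` and return early whenever the status is 400.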
---
## 📝 **License**
MIT License.
---
## 👤 **Contributors**
- **Aniket**
---
💬 **Feedback or contributions?** Open an issue or submit a pull request!
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference