	---
	title: DishDecode
	emoji: ⚡
	colorFrom: pink
	colorTo: pink
	sdk: docker
	pinned: false
	license: mit
	short_description: It transforms unstructured recipe videos into structured
	---

	---

	# 📝 Flask Audio and YouTube Video Processing API

	A Flask-based API that turns audio files and YouTube video transcripts into structured recipe information. It uses Whisper and Deepgram for transcription and the Gemini API for data extraction.

	## 🚀 Overview

	This API offers:

	1. Audio Processing: Download and transcribe audio files.
	2. YouTube Transcription: Extract transcripts from YouTube videos.
	3. Recipe Data Generation: Generate detailed recipe data using the Gemini API.

	---

	## 🧩 Features

	- Audio URL Processing: Download and transcribe audio via Deepgram.
	- YouTube Video Processing: Extract video transcripts and process them.
	- Structured Output: Recipe name, ingredients, steps, techniques, and more.
	- Logging and Error Handling: Log output for debugging and descriptive error responses.

	---

	## ⚙️ Installation

	### 1. Clone the Repository

	```bash
	git clone https://github.com/your-repo-name.git
	cd your-repo-name
	```

	### 2. Create a Virtual Environment

	```bash
	python -m venv venv
	source venv/bin/activate # Windows: venv\Scripts\activate
	```

	### 3. Install Dependencies

	```bash
	pip install -r requirements.txt
	```

	### 4. Configure Environment Variables

	Create a `.env` file with your API keys:

	```plaintext
	FIRST_API_KEY=your_gemini_api_key
	SECOND_API_KEY=your_deepgram_api_key
	```
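A minimal sketch of how the two keys might be checked at startup, assuming they have already been loaded into the process environment (for example by the Dotenv dependency listed in this README). The helper name `load_api_keys` is hypothetical and not taken from the project's code.

```python
import os

# Key names come from the .env example; the helper itself is hypothetical.
REQUIRED_KEYS = ("FIRST_API_KEY", "SECOND_API_KEY")


def load_api_keys(environ=None):
    """Return the required API keys, raising early if any is missing or empty."""
    env = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_KEYS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_KEYS}
```

Failing at startup with the names of the missing variables is usually easier to debug than a later authentication error from Gemini or Deepgram.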

	---

	## 📡 API Endpoints

	### ✅ Health Check

	`GET /`

	```json
	{
	  "status": "success",
	  "message": "API is running successfully!"
	}
	```

	### 🎧 Process Audio URL

	`POST /process-audio`

	- Request:

	```json
	{
	  "audioUrl": "https://example.com/audio.wav"
	}
	```

	- Response:

	```json
	{
	  "structured_data": {
	    "Recipe Name": "Pasta Alfredo",
	    "Ingredients List": ["Pasta", "Cream", "Garlic"],
	    ...
	  }
	}
	```

	### 📹 Process YouTube Video

	`POST /process-youtube`

	- Request:

	```json
	{
	  "youtube_url": "https://www.youtube.com/watch?v=example"
	}
	```

	- Response:

	```json
	{
	  "structured_data": {
	    "Recipe Name": "Grilled Cheese Sandwich",
	    "Ingredients List": ["Bread", "Cheese", "Butter"],
	    ...
	  }
	}
	```
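The two POST endpoints can be called from any HTTP client. The following is a rough client sketch using only the standard library. It assumes the server runs at `http://localhost:5000` and accepts a JSON body with a `Content-Type: application/json` header; the helper names `build_request` and `call_api` are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000"  # assumption: local address from the run instructions


def build_request(path, payload, base_url=BASE_URL):
    """Build (but do not send) a JSON POST request for one of the endpoints."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        base_url + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_api(path, payload, base_url=BASE_URL, timeout=120):
    """Send the request and return the decoded JSON response."""
    request = build_request(path, payload, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Note the different key styles used by the two endpoints.
audio_req = build_request("/process-audio", {"audioUrl": "https://example.com/audio.wav"})
youtube_req = build_request(
    "/process-youtube", {"youtube_url": "https://www.youtube.com/watch?v=example"}
)
```

With the server running, `call_api("/process-youtube", {"youtube_url": ...})` should return a dictionary containing the `structured_data` object shown in the responses.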

	---

	## 🛠️ How It Works

	1. Audio Processing:
	   - Downloads the audio file.
	   - Transcribes using Deepgram.
	   - Sends the transcription to Gemini for structured data.

	2. YouTube Processing:
	   - Extracts the video ID.
	   - Retrieves the transcript.
	   - Sends the transcript to Gemini for structured data.
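As an illustration of the "extracts the video ID" step, here is one possible approach using only the standard library. This is a sketch, not the project's actual implementation: the function name `extract_video_id` and the set of URL shapes it accepts are assumptions.

```python
from urllib.parse import urlparse, parse_qs

# Assumption: these are the host names worth supporting.
YOUTUBE_HOSTS = ("youtube.com", "www.youtube.com", "m.youtube.com")


def extract_video_id(url):
    """Return the video ID from common YouTube URL shapes, or None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        return parsed.path.lstrip("/").split("/")[0] or None
    if host in YOUTUBE_HOSTS:
        if parsed.path == "/watch":
            # Standard links carry the ID in the "v" query parameter.
            return parse_qs(parsed.query).get("v", [None])[0]
        parts = parsed.path.strip("/").split("/")
        if len(parts) >= 2 and parts[0] in ("embed", "shorts"):
            return parts[1] or None
    return None
```

Returning `None` for unrecognized URLs lets the caller decide how to report the problem, for example with a 400 response.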

	---

	## 📦 Dependencies

	- Flask
	- Whisper
	- Deepgram
	- Google Gemini API
	- YouTube Transcript API
	- Requests
	- Dotenv

	Install dependencies via:

	```bash
	pip install -r requirements.txt
	```

	---

	## ▶️ Run the Application

	1. Activate Virtual Environment:

	```bash
	source venv/bin/activate # Windows: venv\Scripts\activate
	```

	2. Run the Flask App:

	```bash
	python app.py
	```

	3. Access the API at:

	```
	http://localhost:5000
	```

	---

	## 🐞 Error Handling

	- API Key Errors: Ensure `.env` contains valid API keys.
	- Invalid Input: Returns 400 for missing URLs.
	- Transcription Errors: Returns detailed error messages.
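The 400 response for missing URLs could be produced by a small validation helper along these lines. This is a hedged sketch: the function name `validate_url_field` and the shape of the error body are assumptions, not the project's actual response format.

```python
def validate_url_field(payload, field):
    """Return (error_body, 400) for an unusable payload, or (None, 200) otherwise."""
    if not isinstance(payload, dict):
        return {"error": "Request body must be a JSON object"}, 400
    value = payload.get(field)
    if not isinstance(value, str) or not value.strip():
        # Covers a missing key, a non-string value, and an empty string.
        return {"error": f"Missing required field: {field}"}, 400
    return None, 200
```

A route handler would call it with `"audioUrl"` for `/process-audio` or `"youtube_url"` for `/process-youtube` and return the error body with the status code when one is present.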

	---

	## 📝 License

	MIT License.

	---

	## 👤 Contributors

	- Aniket

	---

	💬 Feedback or contributions? Open an issue or submit a pull request!



	Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference