title: AudioTranscriptSmolagentTool
emoji: 💬
colorFrom: yellow
colorTo: purple
sdk: gradio
sdk_version: 5.12.0
app_file: app.py
pinned: false
license: apache-2.0
short_description: smolagent tool to transcribe audio & video files
# TranscriptTool: A SmolAgent Tool for Audio/Video Transcription
## Overview
`TranscriptTool` is a SmolAgent tool that transcribes audio and video files into text. It uses OpenAI's Whisper for speech recognition and `ffmpeg` for media handling: input files are converted to WAV before transcription, and the tool selects the CPU or GPU at runtime. It can be used within smolagents via the Hugging Face API.
The repository contains three main components:
- **`transcription_tool.py`**: The core smolagent tool for transcription.
- **`app.py`**: A Gradio-powered web app to test and use the tool interactively.
- **`example_smolagent.py`**: A toy demonstration of how the tool operates within the smolagents framework.
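
The WAV conversion and device selection mentioned in the overview can be sketched roughly as follows. This is an illustrative sketch, not the code in `transcription_tool.py`: the function names `build_ffmpeg_command` and `pick_device`, the `_converted.wav` suffix, and the choice of 16 kHz mono output are assumptions made for the example.

```python
from pathlib import Path
from typing import List, Optional


def build_ffmpeg_command(src: str, dst: Optional[str] = None) -> List[str]:
    """Return an ffmpeg argument list that converts `src` to a mono 16 kHz WAV.

    Hypothetical helper: the output naming scheme is an assumption.
    """
    src_path = Path(src)
    # Write to a new file name so a .wav input is never overwritten in place.
    dst_path = Path(dst) if dst else src_path.with_name(src_path.stem + "_converted.wav")
    return [
        "ffmpeg",
        "-y",            # overwrite the output file if it already exists
        "-i", str(src_path),
        "-vn",           # ignore any video stream, keep audio only
        "-ac", "1",      # downmix to a single (mono) channel
        "-ar", "16000",  # resample to 16 kHz
        str(dst_path),
    ]


def pick_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch  # optional dependency; fall back to CPU if it is missing
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"
```

With `ffmpeg` on the `PATH`, the returned list can be passed to `subprocess.run(build_ffmpeg_command("talk.mp4"), check=True)`, where `talk.mp4` stands in for any input file.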
---
## Installation
1. Clone this repository:
```bash | |
git clone https://huggingface.co/spaces/maguid28/TranscriptTool | |
cd TranscriptTool | |
``` | |
2. Install dependencies:
```bash | |
pip install -r requirements.txt | |
``` | |
---
## Usage
### Testing with Gradio (app.py)
To quickly test and use the transcription tool, run the provided Gradio app:
```bash | |
python app.py | |
``` | |
This launches a local Gradio interface. Upload an audio or video file to transcribe it directly.
### Running example SmolAgent (example_smolagent.py)
To see how TranscriptTool operates within a SmolAgent framework:
```bash | |
python example_smolagent.py | |
``` | |
### Access via Hugging Face API
`TranscriptTool` can also be loaded as a tool through the Hugging Face API.
#### How to Use the Tool via Hugging Face API
1. **Install SmolAgents**
Ensure you have the `smolagents` library installed:
```bash | |
pip install smolagents | |
``` | |
2. **Load the Tool from the Hugging Face Hub**
You can load the tool directly using the Hugging Face API.
```python | |
from smolagents import load_tool | |
transcription_tool = load_tool("maguid28/TranscriptTool", trust_remote_code=True) | |
``` | |
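
Once loaded, the tool can be handed to an agent. The sketch below is one possible way to do that, assuming a recent `smolagents` release and a configured Hugging Face token; `build_task`, `run_transcription_agent`, and the task wording are hypothetical names for this example, and the default model class is called `InferenceClientModel` in newer releases and `HfApiModel` in older ones.

```python
def build_task(media_path: str) -> str:
    """Compose a plain-language task asking the agent to use the tool."""
    return f"Transcribe the file at '{media_path}' and summarize it in two sentences."


def run_transcription_agent(media_path: str):
    """Load TranscriptTool from the Hub and run it through a CodeAgent."""
    # Imported lazily so build_task() works without smolagents installed.
    from smolagents import CodeAgent, load_tool

    try:
        from smolagents import InferenceClientModel as ModelClass  # newer releases
    except ImportError:
        from smolagents import HfApiModel as ModelClass  # older releases

    tool = load_tool("maguid28/TranscriptTool", trust_remote_code=True)
    # Assumption: the default hosted model is acceptable for this demo.
    agent = CodeAgent(tools=[tool], model=ModelClass())
    return agent.run(build_task(media_path))
```

Calling `run_transcription_agent("interview.mp3")` (a placeholder file name) needs network access and downloads the tool code from the Hub, which is why `trust_remote_code=True` is required.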
---
## License
This project is licensed under the Apache-2.0 License. See the LICENSE file for more details.
---
## Contributing
Contributions are welcome! Please open an issue or submit a pull request for any improvements or bug fixes.