---
title: Amanu
emoji: 👁
colorFrom: yellow
colorTo: purple
sdk: gradio
sdk_version: 3.44.4
app_file: app.py
pinned: false
---

# This repo's goal is to support the transcription and annotation of audio recordings.

## Parts

- `audio.py`: Everything related to audio preprocessing and analysis. 
- `transcription.py`: All code for transcribing audio with faster-whisper.
- `diarization.py`: Everything related to speaker diarization with pyannote.
- `textformatting.py`: Everything related to formatting the text into specific output formats.
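
As an illustration of the kind of output formatting `textformatting.py` is meant to cover, here is a minimal sketch that renders transcript segments as SRT subtitles. The function names and the `(start, end, text)` tuple layout are assumptions made for this example, not the repo's actual API.

```python
def format_timestamp(seconds: float) -> str:
    """Convert a time in seconds to the SRT form HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments):
    """Render (start, end, text) tuples as one SRT document.

    The tuple layout is hypothetical; adapt it to whatever the
    transcription step actually returns.
    """
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    # SRT cues are separated by a blank line.
    return "\n".join(blocks)
```

Calling `segments_to_srt([(0.0, 1.25, "Hello"), (1.25, 2.0, "world")])` returns a string that can be written directly to a `.srt` file.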

## UI parts

1. Transcription. 
2. Diarization.
3. Revision.
4. Output formatting. 

## How to access the service?

Users log in with a username and password that I assign. I manage these credentials manually.
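
A minimal sketch of how manually managed credentials could be checked. The `USERS` table and the `check_login` name are hypothetical, and the hard-coded values are placeholders; real credentials would more sensibly come from environment variables or Space secrets.

```python
import hmac

# Hypothetical credential table, for illustration only.
USERS = {"demo_user": "demo_password"}


def check_login(username: str, password: str) -> bool:
    """Return True only if the username exists and the password matches."""
    expected = USERS.get(username)
    if expected is None:
        return False
    # Constant-time comparison avoids leaking information through timing.
    return hmac.compare_digest(expected.encode(), password.encode())
```

Gradio's `launch()` accepts an `auth` argument that can be a function taking a username and a password and returning a boolean, so a function like this could be passed as `demo.launch(auth=check_login)`, where `demo` stands for the app's interface object.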

## Pricing

1. Calculate the fixed cost of a server running for a long period of time. 
2. Check if I can use the hibernation period to save some money. 
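
The two steps above can be sketched as a small calculation. The hourly rate and usage hours below are placeholders, not real prices, and the sketch assumes the server is not billed while it hibernates, which should be checked against the provider's billing rules.

```python
def monthly_cost(hourly_rate: float, hours_per_day: float = 24.0, days: int = 30) -> float:
    """Cost of a server billed per hour over a period of `days` days."""
    return hourly_rate * hours_per_day * days


# Placeholder values for illustration only.
HOURLY_RATE = 0.5          # hypothetical price per hour
ACTIVE_HOURS_PER_DAY = 6   # hypothetical hours the server is awake per day

always_on = monthly_cost(HOURLY_RATE)
with_hibernation = monthly_cost(HOURLY_RATE, hours_per_day=ACTIVE_HOURS_PER_DAY)
savings = always_on - with_hibernation

print(f"Always on:        {always_on:.2f}")
print(f"With hibernation: {with_hibernation:.2f}")
print(f"Savings:          {savings:.2f}")
```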

## Development

- [x] Add word-level timestamps
- [x] Add accuracy at word level
- [ ] Add mel spectrogram?
- [ ] Add Whisper parameters to the interface
- [x] Add WhisperX
- [x] Introduce SRT as output
- [x] Obtain txt with Diarization.
- [x] Obtain plain txt with segments. 
- [ ] Introduce POS.
- [x] Optional Preprocessing
- [ ] Transcription box that updates as the text is being written.


Introduce a tab for analysis, including POS (part-of-speech) tagging. It might be worth building a visualizer in Streamlit that shows the timestamps and other features. Maybe corrections as well.

## Dev

I used Git LFS with Hugging Face to store large files.

```
git lfs install
```

```
huggingface-cli lfs-enable-largefiles .
```