---
title: Amanu
emoji: 👁
colorFrom: yellow
colorTo: purple
sdk: gradio
sdk_version: 3.44.4
app_file: app.py
pinned: false
---

# This repo's goal is to support the transcription and annotation of audio.

## Parts

- `audio.py`: Everything related to audio preprocessing and analysis.
- `transcription.py`: All code for transcribing audio using faster-whisper.
- `diarization.py`: Everything related to speaker diarization with pyannote.
- `textformatting.py`: Everything related to formatting the text into specific outputs.

## UI parts

1. Transcription.
2. Diarization.
3. Revision.
4. Output formatting.

## How to access the service?

Users log in with a username and password that I assign. I manage those credentials manually.

## Pricing

1. Calculate the fixed cost of a server running for a long period of time.
2. Check whether I can use the hibernation period to save some money.

## Development

- [x] Add word-level timestamps
- [x] Add accuracy at word level
- [ ] Add mel spectrogram?
- [ ] Add Whisper parameters to the interface
- [x] Add WhisperX
- [x] Introduce SRT as output
- [x] Obtain txt with diarization
- [x] Obtain plain txt with segments
- [ ] Introduce part-of-speech (POS) tagging
- [x] Optional preprocessing
- [ ] Transcription box that shows the text as it is being written

Introduce a tab for analysis, including POS. It might be great to have a visualizer with the timestamps and other features in Streamlit. Perhaps corrections as well.

## Dev

I used Hugging Face LFS:

```
git lfs install
```

```
huggingface-cli lfs-enable-largefiles .
```
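
The SRT output listed under Development could be produced along these lines. This is a minimal sketch, not the code in `textformatting.py`: the function names `format_timestamp` and `segments_to_srt` are hypothetical, and it assumes each segment arrives as a `(start_seconds, end_seconds, text)` tuple.

```python
from typing import Iterable, Tuple

# Assumed segment shape: (start in seconds, end in seconds, text).
Segment = Tuple[float, float, str]


def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: Iterable[Segment]) -> str:
    """Build SRT text: numbered cues separated by a blank line."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    return "\n".join(blocks)


if __name__ == "__main__":
    print(segments_to_srt([(0.0, 1.25, " Hello "), (1.25, 2.0, "world")]))
```

Timestamps are converted to integer milliseconds first so that rounding happens once, rather than separately for the seconds and millisecond fields.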