---
library_name: transformers
license: openrail
datasets:
- alexandrainst/coral
language:
- da
metrics:
- wer
- cer
base_model:
- openai/whisper-large-v3
pipeline_tag: automatic-speech-recognition
model-index:
- name: coral-1-whisper-large
  results:
  - task:
      type: automatic-speech-recognition
      name: Automatic Speech Recognition
    dataset:
      name: CoRal read-aloud
      type: alexandrainst/coral
      split: test
      args: read_aloud
    metrics:
    - type: cer
      value: 4.3% ± 0.2%
      name: CER
    - type: wer
      value: 10.4% ± 0.3%
      name: WER
---
# Whisper-Large v.3 trained on CoRal release 1
This is a state-of-the-art Danish speech recognition model, trained by [Alvenir](https://www.alvenir.ai/).
## Evaluation Results
| Model | Number of parameters | [CoRal](https://huggingface.co/datasets/alexandrainst/coral/viewer/read_aloud/test) CER | [CoRal](https://huggingface.co/datasets/alexandrainst/coral/viewer/read_aloud/test) WER |
|:---|---:|---:|---:|
| [Alvenir/coral-1-whisper-large](https://huggingface.co/Alvenir/coral-1-whisper-large) | 1540M | **4.3% ± 0.2%** | **10.4% ± 0.3%** |
| [alexandrainst/roest-315m](https://huggingface.co/alexandrainst/roest-315m) | 315M | 6.6% ± 0.2% | 17.0% ± 0.4% |
| [mhenrichsen/hviske-v2](https://huggingface.co/syvai/hviske-v2) | 1540M | 4.7% ± 0.07% | 11.8% ± 0.3% |
| [openai/whisper-large-v3](https://hf.co/openai/whisper-large-v3) | 1540M | 11.4% ± 0.3% | 28.3% ± 0.6% |
Results for more models and datasets are available in the [model card for Røst-315m](https://huggingface.co/alexandrainst/roest-315m).
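The CER and WER columns are character and word error rates: the edit distance between the reference transcript and the model output, divided by the reference length. The following is a minimal sketch of that calculation in plain Python. It is not the CoRal evaluation code, it applies no text normalisation, and the function names and example sentences are illustrative assumptions.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: minimum substitutions, insertions and deletions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance over reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character-level edit distance over reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Illustrative sentences, not taken from the CoRal test set.
reference = "jeg bor i københavn"
hypothesis = "jeg bor i køben havn"
print(f"WER: {wer(reference, hypothesis):.3f}")
print(f"CER: {cer(reference, hypothesis):.3f}")
```

In this example the wrongly split word counts as two word errors out of four reference words, but only one character error out of 19 reference characters, which is why CER is typically much lower than WER for the same output.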
## Model details
This is the [Whisper Large v.3 model](https://hf.co/openai/whisper-large-v3) fine-tuned on the first release of the [CoRal data](https://huggingface.co/datasets/alexandrainst/coral).
The model was trained for 30K steps using the configuration from the [CoRal repository](https://github.com/alexandrainst/coral) by running:
```bash
python src/scripts/finetune_asr_model.py model=whisper-large max_steps=30000 model.learning_rate=1e-5
```
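For inference, the metadata of this card (`library_name: transformers`, `pipeline_tag: automatic-speech-recognition`) suggests that the model can be loaded through the standard `transformers` ASR pipeline. The sketch below assumes that `transformers` and a backend such as PyTorch are installed and that `ffmpeg` is available for decoding audio files. The `transcribe` helper and any file name passed to it are hypothetical.

```python
MODEL_ID = "Alvenir/coral-1-whisper-large"


def transcribe(audio_path: str) -> str:
    """Transcribe a single audio file and return the recognised text."""
    # Imported lazily so that merely defining the helper does not load the library.
    from transformers import pipeline

    # Downloads the model weights from the Hugging Face Hub on first use.
    asr = pipeline("automatic-speech-recognition", model=MODEL_ID)
    result = asr(audio_path)
    return result["text"]
```

Calling `transcribe("sample.wav")` with a path to a local recording would then return the transcript as a string. When transcribing many files, building the pipeline once and reusing it avoids reloading the weights on every call.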
## License
The training dataset is released under a custom license adapted from OpenRAIL-M. It allows
commercial use with a few restrictions, which concern speech synthesis and biometric identification.
See the [license](https://huggingface.co/Alvenir/coral-1-whisper-large/blob/main/LICENSE) for details.
## Creators and Funders
The CoRal project is funded by the [Danish Innovation Fund](https://innovationsfonden.dk/) and consists of the following partners:
- [Alexandra Institute](https://alexandra.dk/)
- [University of Copenhagen](https://www.ku.dk/)
- [Agency for Digital Government](https://digst.dk/)
- [Alvenir](https://www.alvenir.ai/)
- [Corti](https://www.corti.ai/)
We would specifically like to thank Dan Saattrup Nielsen (Alexandra Institute) for, among other things, the repository work, and Simon Leminen Madsen (Alexandra Institute) for the modelling work.