Replace small with tiny
#2 by PierreMesure - opened

README.md CHANGED

````diff
@@ -9,7 +9,7 @@ datasets:
   - NbAiLab/ncc_speech
   - NbAiLab/NST
   - NbAiLab/NPSC
-base_model: openai/whisper-small
+base_model: openai/whisper-tiny
 tags:
   - audio
   - asr
@@ -28,9 +28,9 @@ widget:
 ---
 
 
-# NB-Whisper Small
+# NB-Whisper Tiny
 
-Introducing the **_Norwegian NB-Whisper Small model_**, proudly developed by the National Library of Norway. NB-Whisper is a cutting-edge series of models designed for automatic speech recognition (ASR) and speech translation. These models are based on the work of [OpenAI's Whisper](https://arxiv.org/abs/2212.04356). Each model in the series has been trained for 250,000 steps, utilizing a diverse dataset of 8 million samples. These samples consist of aligned audio clips, each 30 seconds long, culminating in a staggering 66,000 hours of speech. For an in-depth understanding of our training methodology and dataset composition, keep an eye out for our upcoming article.
+Introducing the **_Norwegian NB-Whisper Tiny model_**, proudly developed by the National Library of Norway. NB-Whisper is a cutting-edge series of models designed for automatic speech recognition (ASR) and speech translation. These models are based on the work of [OpenAI's Whisper](https://arxiv.org/abs/2212.04356). Each model in the series has been trained for 250,000 steps, utilizing a diverse dataset of 8 million samples. These samples consist of aligned audio clips, each 30 seconds long, culminating in a staggering 66,000 hours of speech. For an in-depth understanding of our training methodology and dataset composition, keep an eye out for our upcoming article.
 
 | Model Size | Parameters | Model |
 |------------|------------|------------|
@@ -63,7 +63,7 @@ While the main models are suitable for most transcription task, we demonstrate h
 - **Model type:** `whisper`
 - **Language(s) (NLP):** Norwegian, Norwegian Bokmål, Norwegian Nynorsk, English
 - **License:** [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)
-- **Trained from model:** [openai/whisper-small](https://huggingface.co/openai/whisper-small)
+- **Trained from model:** [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny)
 - **Code Repository:** https://github.com/NbAiLab/nb-whisper/
 - **Paper:** _Coming soon_
 - **Demo:** _See Spaces on this page_
@@ -91,7 +91,7 @@ After this is done, you should be able to run this in Python:
 from transformers import pipeline
 
 # Load the model
-asr = pipeline("automatic-speech-recognition", "NbAiLabBeta/nb-whisper-small")
+asr = pipeline("automatic-speech-recognition", "NbAiLabBeta/nb-whisper-tiny")
 
 #transcribe
 asr("king.mp3", generate_kwargs={'task': 'transcribe', 'language': 'no'})
@@ -220,14 +220,14 @@ $ wget -N https://github.com/NbAiLab/nb-whisper/raw/main/audio/king.mp3
 $ ffmpeg -i king.mp3 -ar 16000 -ac 1 -c:a pcm_s16le king.wav
 
 # Lets download the two ggml-files from this site
-wget -N https://huggingface.co/NbAiLab/nb-whisper-small/resolve/main/ggml-model.bin -O models/nb-small-ggml-model.bin
+wget -N https://huggingface.co/NbAiLab/nb-whisper-tiny/resolve/main/ggml-model.bin -O models/nb-tiny-ggml-model.bin
-wget -N https://huggingface.co/NbAiLab/nb-whisper-small/resolve/main/ggml-model-q5_0.bin -O models/nb-small-ggml-model-q5_0.bin
+wget -N https://huggingface.co/NbAiLab/nb-whisper-tiny/resolve/main/ggml-model-q5_0.bin -O models/nb-tiny-ggml-model-q5_0.bin
 
 # And run it with the f16 default model
-$ ./main -l no -m models/nb-small-ggml-model.bin king.wav
+$ ./main -l no -m models/nb-tiny-ggml-model.bin king.wav
 
 # Or the quantized version
-$ ./main -l no -m models/nb-small-ggml-model-q5_0.bin king.wav
+$ ./main -l no -m models/nb-tiny-ggml-model-q5_0.bin king.wav
 ```
 
 ### WhisperX and Speaker Diarization
@@ -247,7 +247,7 @@ wget -N https://github.com/NbAiLab/nb-whisper/raw/main/audio/knuthamsun.mp3
 pip uninstall whisperx && pip install git+https://github.com/m-bain/whisperx.git@8540ff5985fceee764acbed94f656063d7f56540
 
 # Transcribe the test file. All transcripts will end up in the directory of the mp3-file
-whisperx knuthamsun.mp3 --model NbAiLabBeta/nb-whisper-small --language no --diarize
+whisperx knuthamsun.mp3 --model NbAiLabBeta/nb-whisper-tiny --language no --diarize
 
 ```
````
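The change set is a mechanical rename: every size-suffixed identifier in README.md swaps `small` for `tiny` while the surrounding path, repo name, or filename stays intact. A minimal sketch of that substitution (the `rename_size` helper is hypothetical, not part of the repository):

```python
# Hypothetical sketch of the substitution this PR applies to README.md:
# the size token "small" becomes "tiny"; the rest of each reference is unchanged.

def rename_size(ref: str, old: str = "small", new: str = "tiny") -> str:
    """Swap the model-size token in a single identifier or path."""
    return ref.replace(old, new)

# Identifiers taken from the diff above.
refs = [
    "openai/whisper-small",
    "NbAiLabBeta/nb-whisper-small",
    "https://huggingface.co/NbAiLab/nb-whisper-small/resolve/main/ggml-model.bin",
    "models/nb-small-ggml-model-q5_0.bin",
]

for ref in refs:
    print(rename_size(ref))
```

Note that the rename also touches derived artifacts (the local ggml filenames), not just the Hub repo IDs, which is why the diff has four hunks in the usage sections.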