---
language:
  - 'no'
license: apache-2.0
tags:
  - audio
  - asr
  - automatic-speech-recognition
  - hf-asr-leaderboard
model-index:
  - name: scream_small_beta
    results: []
---

# scream_small_beta

This model is a fine-tuned version of [openai/whisper-small](https://huggingface.co/openai/whisper-small) on the NbAiLab/ncc_speech dataset.
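
Since this is a fine-tuned Whisper checkpoint, it can be loaded through the standard Transformers ASR pipeline. A minimal sketch follows; the hub id `NbAiLab/scream_small_beta` is an assumption inferred from the model name and the dataset owner, so adjust it to the checkpoint's actual path.

```python
# Minimal sketch of transcribing Norwegian audio with this model.
# MODEL_ID is an assumed hub path (not stated in this card).
from transformers import pipeline

MODEL_ID = "NbAiLab/scream_small_beta"  # assumption: adjust to the real repo id


def transcribe(audio_path: str) -> str:
    """Run the fine-tuned Whisper checkpoint on a local audio file."""
    asr = pipeline("automatic-speech-recognition", model=MODEL_ID)
    return asr(audio_path)["text"]
```

The pipeline handles feature extraction and decoding internally; for long-form audio, Whisper pipelines also accept a `chunk_length_s` argument to process the file in windows.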

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 5e-05
- lr_scheduler_type: linear
- per_device_train_batch_size: 32
- total_train_batch_size_per_node: 128
- total_train_batch_size: 1024
- total_optimization_steps: 25,000
- starting_optimization_step: None
- finishing_optimization_step: 25,000
- num_train_dataset_workers: 32
- num_hosts: 8
- total_num_training_examples: 25,600,000
- steps_per_epoch: 6259
- num_beams: None
- dropout: True
- bpe_dropout_probability: 0.1

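The batch-size figures above are internally consistent, which is worth sanity-checking when reproducing the run. The sketch below derives the global batch size and total example count from the listed values; `devices_per_host` is not stated in the card and is inferred here from the per-node and per-device batch sizes.

```python
# Sanity check of the batch-size bookkeeping in the hyperparameter list.
per_device_train_batch_size = 32
total_train_batch_size_per_node = 128
num_hosts = 8

# Inferred, not stated in the card: devices per host implied by the per-node batch.
devices_per_host = total_train_batch_size_per_node // per_device_train_batch_size
print(devices_per_host)  # 4

# Global batch = per-node batch x number of hosts.
total_train_batch_size = total_train_batch_size_per_node * num_hosts
print(total_train_batch_size)  # 1024

# 25,000 optimization steps at a global batch of 1024:
total_examples = 25_000 * total_train_batch_size
print(total_examples)  # 25600000 (matches total_num_training_examples)
```
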
### Training results

| step | validation_fleurs_loss | train_loss | validation_fleurs_wer | validation_fleurs_cer | validation_fleurs_exact_wer | validation_fleurs_exact_cer | validation_stortinget_loss | validation_stortinget_wer | validation_stortinget_cer | validation_stortinget_exact_wer | validation_stortinget_exact_cer | validation_nrk_tv_loss | validation_nrk_tv_wer | validation_nrk_tv_cer | validation_nrk_tv_exact_wer | validation_nrk_tv_exact_cer |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.2013 | 2.7117 | 32.3914 | 9.8343 | 35.7228 | 10.9398 | 1.4988 | 44.0673 | 22.9444 | 48.2612 | 24.2595 | 1.8165 | 79.9390 | 54.6020 | 89.7612 | 56.8482 |
| 1000 | 0.5796 | 1.0147 | 16.1214 | 5.2624 | 19.9821 | 6.2962 | 0.4822 | 22.0502 | 13.3652 | 25.7586 | 14.0827 | 1.0170 | 51.9187 | 37.4011 | 59.7853 | 39.0187 |
| 2000 | 0.4483 | 0.8851 | 12.4628 | 4.6064 | 16.2485 | 5.6101 | 0.3988 | 18.2903 | 11.9625 | 21.9050 | 12.6098 | 0.9032 | 46.8241 | 34.8122 | 55.1298 | 36.2314 |
| 3000 | 0.4130 | 0.8246 | 11.6002 | 4.7445 | 15.4122 | 5.7357 | 0.3602 | 16.9068 | 11.3683 | 20.4599 | 11.9897 | 0.8434 | 46.9972 | 35.4892 | 54.8885 | 36.8431 |
| 4000 | 0.3946 | 0.7897 | 10.2617 | 4.2365 | 14.4564 | 5.1703 | 0.3359 | 16.1132 | 11.0146 | 19.5868 | 11.6112 | 0.8112 | 44.8580 | 33.8810 | 52.6086 | 35.2519 |
| 5000 | 0.4532 | 0.7438 | 10.3807 | 4.2809 | 14.1876 | 5.2090 | 0.3295 | 15.7676 | 10.8729 | 19.2134 | 11.4603 | 0.8051 | 44.2068 | 33.3323 | 51.3438 | 34.6898 |
| 6000 | 0.4496 | 0.7275 | 10.1725 | 4.1182 | 13.9785 | 5.1075 | 0.3247 | 15.3487 | 10.6600 | 18.8008 | 11.2647 | 0.8003 | 43.8399 | 33.1808 | 51.5810 | 34.5430 |
| 7000 | 0.4061 | 0.7164 | 10.0535 | 4.4190 | 13.8292 | 5.3829 | 0.3183 | 15.0975 | 10.5465 | 18.5450 | 11.1334 | 0.7788 | 43.4813 | 33.2975 | 51.5227 | 34.6075 |
| 8000 | 0.3531 | 0.7066 | 9.4587 | 4.0590 | 13.2616 | 4.9915 | 0.3088 | 15.0711 | 10.5598 | 18.4922 | 11.1406 | 0.7575 | 43.4318 | 33.3995 | 51.1192 | 34.7187 |
| 9000 | 0.3529 | 0.6867 | 10.0833 | 4.2612 | 14.2174 | 5.3684 | 0.3107 | 14.8659 | 10.4674 | 18.3762 | 11.0681 | 0.7651 | 41.5811 | 31.9552 | 49.5507 | 33.2483 |
| 10000 | 0.4180 | 0.6707 | 9.3099 | 4.2711 | 13.0526 | 5.2621 | 0.3090 | 14.9093 | 10.4711 | 18.3745 | 11.0540 | 0.7626 | 42.2530 | 32.3733 | 49.9251 | 33.5916 |
| 11000 | 0.3910 | 0.6874 | 9.7561 | 4.4881 | 13.9188 | 5.5859 | 0.3046 | 15.1792 | 10.7022 | 18.5484 | 11.2682 | 0.7605 | 42.7847 | 32.9575 | 50.4368 | 34.1589 |
| 12000 | 0.4032 | 0.6411 | 9.9048 | 4.4239 | 13.5006 | 5.4409 | 0.3052 | 14.7986 | 10.3955 | 18.1349 | 10.9770 | 0.7578 | 42.5663 | 32.5781 | 50.1914 | 33.8385 |

### Framework versions

- Transformers 4.31.0.dev0
- Datasets 2.13.0
- Tokenizers 0.13.3