Hubert-kakeiken-W-noisy-many

This model is a fine-tuned version of rinna/japanese-hubert-base on the ORIGINAL_KAKEIKEN_W_NOISY_MANY - JA dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0326
  • Wer: 0.9988
  • Cer: 1.0151
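
The card does not include a usage snippet; the following is a minimal inference sketch, not code from the authors. It assumes the checkpoint carries a CTC head (`HubertForCTC`) and ships processor files expecting 16 kHz mono audio, matching the rinna/japanese-hubert-base pretraining setup; `sample.wav` is a placeholder path.

```python
# Hedged inference sketch: assumes a CTC head and a bundled processor.
import torch
import librosa
from transformers import AutoProcessor, HubertForCTC

model_id = "utakumi/Hubert-kakeiken-W-noisy-many"
processor = AutoProcessor.from_pretrained(model_id)  # assumes processor files are in the repo
model = HubertForCTC.from_pretrained(model_id)
model.eval()

# "sample.wav" is a placeholder; resample to 16 kHz to match the base model.
speech, _ = librosa.load("sample.wav", sr=16_000)
inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")

with torch.no_grad():
    logits = model(inputs.input_values).logits

# Greedy CTC decoding.
pred_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(pred_ids)[0])
```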

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the TrainingArguments sketch after this list):

  • learning_rate: 3e-05
  • train_batch_size: 32
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 64
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 12500
  • num_epochs: 40.0
  • mixed_precision_training: Native AMP
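
For reference, these values map onto `transformers.TrainingArguments` roughly as sketched below. This is a reconstruction, not the authors' training script; `output_dir` is a placeholder.

```python
# Hedged sketch: the listed hyperparameters expressed as TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="Hubert-kakeiken-W-noisy-many",  # placeholder
    learning_rate=3e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=2,  # 32 * 2 = 64 effective train batch size
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=12500,
    num_train_epochs=40.0,
    fp16=True,  # "Native AMP" mixed precision
)
```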

Training results

| Training Loss | Epoch | Step   | Validation Loss | Wer    | Cer    |
|:-------------:|:-----:|:------:|:---------------:|:------:|:------:|
| 0.89          | 1.0   | 5736   | 0.5165          | 0.9997 | 1.1059 |
| 0.2119        | 2.0   | 11472  | 0.0847          | 0.9995 | 1.0547 |
| 0.1575        | 3.0   | 17208  | 0.1106          | 0.9991 | 1.0276 |
| 0.1299        | 4.0   | 22944  | 0.0377          | 0.9989 | 1.0168 |
| 0.1172        | 5.0   | 28680  | 0.0405          | 0.9989 | 1.0170 |
| 0.1083        | 6.0   | 34416  | 0.0385          | 0.9989 | 1.0173 |
| 0.0975        | 7.0   | 40152  | 0.0425          | 0.9988 | 1.0173 |
| 0.0996        | 8.0   | 45888  | 0.0766          | 0.9988 | 1.0153 |
| 0.0945        | 9.0   | 51624  | 0.0307          | 0.9988 | 1.0166 |
| 0.0901        | 10.0  | 57360  | 0.0503          | 0.9988 | 1.0151 |
| 0.0842        | 11.0  | 63096  | 0.0223          | 0.9989 | 1.0166 |
| 0.0835        | 12.0  | 68832  | 0.0251          | 0.9989 | 1.0158 |
| 0.078         | 13.0  | 74568  | 0.0912          | 0.9989 | 1.0149 |
| 0.0734        | 14.0  | 80304  | 0.0314          | 0.9988 | 1.0159 |
| 0.0717        | 15.0  | 86040  | 0.0464          | 0.9989 | 1.0157 |
| 0.0687        | 16.0  | 91776  | 0.0254          | 0.9988 | 1.0156 |
| 0.0685        | 17.0  | 97512  | 0.0349          | 0.9988 | 1.0153 |
| 0.0694        | 18.0  | 103248 | 0.0206          | 0.9990 | 1.0153 |
| 0.0645        | 19.0  | 108984 | 0.0773          | 0.9989 | 1.0145 |
| 0.0634        | 20.0  | 114720 | 0.0301          | 0.9988 | 1.0154 |
| 0.059         | 21.0  | 120456 | 0.0277          | 0.9990 | 1.0157 |
| 0.0545        | 22.0  | 126192 | 0.0235          | 0.9988 | 1.0157 |
| 0.0522        | 23.0  | 131928 | 0.0382          | 0.9988 | 1.0152 |
| 0.0561        | 24.0  | 137664 | 0.0371          | 0.9988 | 1.0150 |
| 0.0554        | 25.0  | 143400 | 0.0265          | 0.9988 | 1.0155 |
| 0.0543        | 26.0  | 149136 | 0.0262          | 0.9988 | 1.0153 |
| 0.0464        | 27.0  | 154872 | 0.0289          | 0.9989 | 1.0155 |
| 0.045         | 28.0  | 160608 | 0.0303          | 0.9988 | 1.0153 |
| 0.0514        | 29.0  | 166344 | 0.0399          | 0.9989 | 1.0150 |
| 0.0462        | 30.0  | 172080 | 0.0325          | 0.9990 | 1.0153 |
| 0.0447        | 31.0  | 177816 | 0.0365          | 0.9988 | 1.0151 |
| 0.0399        | 32.0  | 183552 | 0.0355          | 0.9988 | 1.0150 |
| 0.0431        | 33.0  | 189288 | 0.0396          | 0.9989 | 1.0153 |
| 0.047         | 34.0  | 195024 | 0.0328          | 0.9988 | 1.0154 |
| 0.0384        | 35.0  | 200760 | 0.0321          | 0.9988 | 1.0151 |
| 0.0384        | 36.0  | 206496 | 0.0324          | 0.9988 | 1.0152 |
| 0.0389        | 37.0  | 212232 | 0.0321          | 0.9989 | 1.0151 |
| 0.0399        | 38.0  | 217968 | 0.0326          | 0.9988 | 1.0150 |
| 0.0392        | 39.0  | 223704 | 0.0325          | 0.9990 | 1.0153 |
| 0.0377        | 40.0  | 229440 | 0.0320          | 0.9988 | 1.0151 |
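
The near-saturated Wer next to a low validation loss likely reflects Japanese references written without spaces, so each utterance scores as a single "word"; Cer is the more informative metric here. If you want to recompute the metrics, a minimal sketch with the `evaluate` library follows; the strings are illustrative placeholders, not data from this card.

```python
# Hedged sketch of Wer/Cer computation with the evaluate library.
import evaluate

wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")  # requires the jiwer package

predictions = ["きょうはいいてんきです"]  # placeholder hypothesis
references = ["今日はいい天気です"]      # placeholder reference

print("WER:", wer_metric.compute(predictions=predictions, references=references))
print("CER:", cer_metric.compute(predictions=predictions, references=references))
```

With spaceless strings, each side counts as one token, so any mismatch yields a Wer of 1.0 for that utterance, mirroring the pattern in the table above.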

Framework versions

  • Transformers 4.48.0
  • Pytorch 2.5.1+cu124
  • Datasets 3.1.0
  • Tokenizers 0.21.0