Hubert-kakeiken-W-closed_add

This model is a fine-tuned version of rinna/japanese-hubert-base on the ORIGINAL_KAKEIKEN_W_CLOSED_ADD - JA dataset. It achieves the following results on the evaluation set:

Loss: 0.0298
Wer: 0.9988
Cer: 1.0164
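
For readers interpreting these numbers, the following is a minimal sketch of how word and character error rates are commonly computed: edit distance divided by reference length. It is not the evaluation code used for this model, which the card does not describe. The helper names (`edit_distance`, `error_rate`, `wer`, `cer`) and the whitespace tokenization in `wer` are assumptions made for illustration. Because insertions count as errors, either rate can exceed 1.0 when the hypothesis is much longer than the reference.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(ref_units, hyp_units):
    """Edit distance normalized by the number of reference units."""
    if not ref_units:
        raise ValueError("reference must not be empty")
    return edit_distance(ref_units, hyp_units) / len(ref_units)


def wer(reference, hypothesis):
    # Assumption: words are separated by whitespace.
    return error_rate(reference.split(), hypothesis.split())


def cer(reference, hypothesis):
    # Every character, including spaces, counts as one unit.
    return error_rate(list(reference), list(hypothesis))


print(cer("abc", "abc"))             # 0.0
print(cer("abc", "xxyyzz"))          # 2.0 (3 substitutions + 3 insertions)
print(wer("こんにちは", "こんばんは"))  # 1.0 (the unsegmented string is one word)
print(cer("こんにちは", "こんばんは"))  # 0.4 (2 of 5 characters differ)
```

With whitespace tokenization, an unsegmented Japanese sentence counts as a single word, so one wrong character yields a WER of 1.0 for that utterance. The card does not say which tokenization was used for the figures above.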

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 3e-05
train_batch_size: 32
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 64
optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_steps: 12500
num_epochs: 40.0
mixed_precision_training: Native AMP
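
The hyperparameters above imply an effective batch size and a learning-rate curve that can be sketched in a few lines. This is an illustrative reimplementation of a linear-warmup, half-cosine decay schedule, not the training script itself. Two assumptions are made: a single training device (consistent with 32 × 2 = 64), and a total of 42360 optimizer steps, taken from the last step logged in the training results table. The names `lr_at`, `TOTAL_STEPS`, and `NUM_DEVICES` are hypothetical.

```python
import math

PEAK_LR = 3e-05
WARMUP_STEPS = 12500
TOTAL_STEPS = 42360   # assumption: last step logged in the results table
NUM_DEVICES = 1       # assumption: single-device training

# Samples consumed per optimizer step.
effective_batch_size = 32 * 2 * NUM_DEVICES  # train_batch_size * grad_accum * devices


def lr_at(step):
    """Learning rate at a given optimizer step."""
    if step < WARMUP_STEPS:
        # Linear warmup from 0 to the peak rate.
        return PEAK_LR * step / WARMUP_STEPS
    # Half-cosine decay from the peak rate to 0.
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


print(effective_batch_size)  # 64
for step in (0, 6250, 12500, 27430, 42360):
    print(step, lr_at(step))
```

Under these assumptions the warmup spans roughly the first 30% of training (12500 of 42360 steps), so the peak learning rate is reached during epoch 12.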

Training results

Training Loss	Epoch	Step	Validation Loss	Wer	Cer
11.3302	1.0	1060	10.0279	1.0	1.1284
7.0068	2.0	2120	5.3823	1.0	1.1283
3.682	3.0	3180	3.1894	1.0	1.1283
2.725	4.0	4240	2.1613	1.0	1.1284
1.4317	5.0	5300	0.7817	1.0	1.1232
0.5734	6.0	6360	0.2729	0.9991	1.0338
0.3485	7.0	7420	0.1662	0.9988	1.0259
0.2631	8.0	8480	0.0833	0.9988	1.0224
0.2068	9.0	9540	0.0616	0.9990	1.0210
0.1846	10.0	10600	0.0786	0.9988	1.0199
0.1792	11.0	11660	0.0472	0.9990	1.0198
0.1669	12.0	12720	0.0515	0.9988	1.0207
0.1609	13.0	13780	0.0529	0.9988	1.0219
0.1508	14.0	14840	0.0432	0.9988	1.0183
0.1427	15.0	15900	0.0860	0.9988	1.0148
0.1316	16.0	16960	0.0350	0.9988	1.0185
0.1296	17.0	18020	0.0449	0.9988	1.0191
0.1236	18.0	19080	0.0450	0.9988	1.0170
0.1149	19.0	20140	0.0358	0.9990	1.0181
0.1149	20.0	21200	0.0317	0.9990	1.0176
0.106	21.0	22260	0.0369	0.9988	1.0170
0.102	22.0	23320	0.0342	0.9988	1.0180
0.1011	23.0	24380	0.0411	0.9988	1.0179
0.0948	24.0	25440	0.0314	0.9988	1.0163
0.0934	25.0	26500	0.0302	0.9988	1.0175
0.0843	26.0	27560	0.0440	0.9988	1.0172
0.0833	27.0	28620	0.0341	0.9988	1.0167
0.0781	28.0	29680	0.0565	0.9988	1.0157
0.0741	29.0	30740	0.0357	0.9988	1.0161
0.0704	30.0	31800	0.0306	0.9988	1.0162
0.0678	31.0	32860	0.0289	0.9988	1.0159
0.067	32.0	33920	0.0279	0.9988	1.0159
0.0641	33.0	34980	0.0325	0.9988	1.0160
0.0595	34.0	36040	0.0330	0.9988	1.0166
0.055	35.0	37100	0.0309	0.9988	1.0164
0.055	36.0	38160	0.0303	0.9988	1.0167
0.0555	37.0	39220	0.0311	0.9988	1.0166
0.0533	38.0	40280	0.0311	0.9988	1.0166
0.0528	39.0	41340	0.0310	0.9988	1.0166
0.0509	39.9627	42360	0.0310	0.9988	1.0166
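
To pick out a checkpoint from a log like the one above, the tab-separated rows can be parsed directly. The sketch below uses an excerpt of five rows copied from the table. The function name `best_by_validation_loss` and the tuple layout are hypothetical, and the card does not state which checkpoint was actually published.

```python
# Excerpt of the table above: Training Loss, Epoch, Step, Validation Loss, Wer, Cer
ROWS = (
    "0.0704\t30.0\t31800\t0.0306\t0.9988\t1.0162\n"
    "0.0678\t31.0\t32860\t0.0289\t0.9988\t1.0159\n"
    "0.067\t32.0\t33920\t0.0279\t0.9988\t1.0159\n"
    "0.0641\t33.0\t34980\t0.0325\t0.9988\t1.0160\n"
    "0.0509\t39.9627\t42360\t0.0310\t0.9988\t1.0166\n"
)


def best_by_validation_loss(text):
    """Return (step, validation_loss) for the row with the lowest validation loss."""
    best = None
    for line in text.strip().splitlines():
        fields = line.split("\t")
        step, val_loss = int(fields[2]), float(fields[3])
        if best is None or val_loss < best[1]:
            best = (step, val_loss)
    return best


print(best_by_validation_loss(ROWS))  # (33920, 0.0279)
```

Run over the full table, the same selection also lands on step 33920 (epoch 32), the lowest validation loss logged.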

Framework versions

Transformers 4.48.0
Pytorch 2.5.1+cu124
Datasets 3.1.0
Tokenizers 0.21.0

utakumi/Hubert-kakeiken-W-closed_add
