modernbert-binary-disfluency

This model is a fine-tuned version of answerdotai/ModernBERT-large on an unspecified dataset. Its evaluation-set results at each logged checkpoint are given in the results table below.

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 48
eval_batch_size: 96
seed: 42
optimizer: OptimizerNames.ADAMW_8BIT (8-bit AdamW) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.1
num_epochs: 10
mixed_precision_training: Native AMP
label_smoothing_factor: 0.1
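
To make the scheduler settings above concrete, here is a minimal sketch of a linear schedule with warmup: the learning rate rises linearly from zero to the peak value, then decays linearly back to zero. The function name linear_schedule_lr is hypothetical. The steps-per-epoch figure is an assumption inferred from the results table (step 100 falls at epoch 1.7241), not a value stated in this card, and the warmup step count is rounded here for simplicity.

```python
def linear_schedule_lr(step, peak_lr, warmup_steps, total_steps):
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Assumption: steps per epoch inferred from the results table (step 100 at epoch 1.7241).
steps_per_epoch = round(100 / 1.7241)
total_steps = steps_per_epoch * 10        # num_epochs: 10
warmup_steps = round(total_steps * 0.1)   # lr_scheduler_warmup_ratio: 0.1

if __name__ == "__main__":
    for step in (0, warmup_steps, 100, 500, total_steps):
        lr = linear_schedule_lr(step, 1e-05, warmup_steps, total_steps)
        print(f"step {step:>3}: lr = {lr:.3e}")
```

Under these assumptions the peak learning rate of 1e-05 is reached at the end of the first epoch, so every checkpoint in the results table was logged during the decay phase.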

Training Loss	Epoch	Step	Validation Loss	Accuracy	Precision	Recall	F1	Specificity	True Positives	False Positives	True Negatives	False Negatives
0.0036	1.7241	100	0.0031	0.6845	0.2814	0.9494	0.4342	0.6458	582	1486	2709	31
0.002	3.4483	200	0.0021	0.8184	0.4090	0.9527	0.5723	0.7988	584	844	3351	29
0.0012	5.1724	300	0.0019	0.8902	0.5398	0.9396	0.6857	0.8830	576	491	3704	37
0.0008	6.8966	400	0.0024	0.9289	0.6581	0.9201	0.7673	0.9302	564	293	3902	49
0.0005	8.6207	500	0.0029	0.9349	0.6829	0.9135	0.7816	0.9380	560	260	3935	53
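
The metric columns in the table above can be recomputed from the four confusion-matrix counts in each row. The sketch below uses the standard definitions for binary classification and checks them against the step-500 row. The function name binary_metrics is hypothetical and is not taken from the training code.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "specificity": specificity,
    }


# Counts from the step-500 row of the table.
step_500 = binary_metrics(tp=560, fp=260, tn=3935, fn=53)

if __name__ == "__main__":
    for name, value in step_500.items():
        print(f"{name}: {value:.4f}")
```

Every row sums to 4808 evaluation examples, of which 613 are positive (about 13%). With that class imbalance, accuracy alone is a weak indicator, which is why precision, recall, and specificity are worth reading together.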