---
library_name: transformers
license: apache-2.0
base_model: answerdotai/ModernBERT-large
tags:
- generated_from_trainer
metrics:
- precision
- recall
- f1
- accuracy
model-index:
- name: modernbert-disfluency-optimized
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# modernbert-disfluency-optimized

This model is a fine-tuned version of [answerdotai/ModernBERT-large](https://huggingface.co/answerdotai/ModernBERT-large) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.0126
- Precision: 0.0827
- Recall: 0.4119
- F1: 0.1378
- Accuracy: 0.2107
- Partial Word F1: 0.0
- Partial Word Precision: 0.0
- Partial Word Recall: 0.0
- Pause F1: 0.6721
- Pause Precision: 0.5256
- Pause Recall: 0.9318
- Repetition F1: 0.0548
- Repetition Precision: 0.0350
- Repetition Recall: 0.1270
- Revision F1: 0.0079
- Revision Precision: 0.0042
- Revision Recall: 0.0833
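
Each F1 value above is the harmonic mean of the corresponding precision and recall. The following is a minimal sketch that recomputes two of the reported figures from their precision/recall pairs; the helper name `f1_from_pr` is illustrative and does not come from the training code. Because the inputs are already rounded to four decimals, the recomputed values can differ from the reported ones in the last digit.

```python
def f1_from_pr(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Per-class pair reported above: precision 0.5256, recall 0.9318.
pause_f1 = f1_from_pr(0.5256, 0.9318)

# Overall pair reported above: precision 0.0827, recall 0.4119.
overall_f1 = f1_from_pr(0.0827, 0.4119)

print(f"pause-class F1 ~ {pause_f1:.4f}, overall F1 ~ {overall_f1:.4f}")
```

The zero guard matters here: the class with precision and recall both at 0.0 would otherwise trigger a division by zero.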

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 24
- eval_batch_size: 48
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 48
- optimizer: AdamW (`OptimizerNames.ADAMW_TORCH`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 15
- mixed_precision_training: Native AMP
- label_smoothing_factor: 0.1
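
As a rough sketch, the list above can be written as keyword arguments for `transformers.TrainingArguments`. The keys are real `TrainingArguments` parameter names, but several details are assumptions rather than facts from this card: the `output_dir` value is hypothetical, `fp16=True` assumes that "Native AMP" refers to fp16 rather than bf16, and `num_devices = 1` is inferred from the per-step and total batch sizes. The dictionary is kept as plain Python so it runs without a GPU or the library installed.

```python
import math

# Keyword arguments mirroring the hyperparameter list, keyed by
# transformers.TrainingArguments parameter names.
training_kwargs = {
    "output_dir": "modernbert-disfluency-optimized",  # hypothetical path
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 24,
    "per_device_eval_batch_size": 48,
    "seed": 42,
    "gradient_accumulation_steps": 2,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.1,
    "num_train_epochs": 15,
    "fp16": True,  # assumption: "Native AMP" means fp16, not bf16
    "label_smoothing_factor": 0.1,
}

num_devices = 1  # assumption: single-device training

# Effective batch size per optimizer step.
effective_batch_size = (
    training_kwargs["per_device_train_batch_size"]
    * training_kwargs["gradient_accumulation_steps"]
    * num_devices
)

# The training results table ends at step 870 after 15 epochs.
total_steps = 870
warmup_steps = math.ceil(total_steps * training_kwargs["warmup_ratio"])

print(effective_batch_size, warmup_steps)
```

With these assumptions, the dictionary could be passed as `TrainingArguments(**training_kwargs)` on a machine with a CUDA device; the derived batch size matches the `total_train_batch_size` listed above.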

### Training results

| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1     | Accuracy | Partial Word F1 | Partial Word Precision | Partial Word Recall | Pause F1 | Pause Precision | Pause Recall | Repetition F1 | Repetition Precision | Repetition Recall | Revision F1 | Revision Precision | Revision Recall |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:------:|:--------:|:---------------:|:----------------------:|:-------------------:|:--------:|:---------------:|:------------:|:-------------:|:--------------------:|:-----------------:|:-----------:|:------------------:|:---------------:|
| 0.0513        | 1.0   | 58   | 0.0229          | 0.0192    | 0.1360 | 0.0336 | 0.1402   | 0.0            | 0.0                   | 0.0                | 0.0731  | 0.0422         | 0.2721      | 0.0223       | 0.0136              | 0.0619           | 0.0046     | 0.0025            | 0.0357         |
| 0.0524        | 2.0   | 116  | 0.0174          | 0.0391    | 0.2594 | 0.0680 | 0.1631   | 0.0            | 0.0                   | 0.0                | 0.1960  | 0.1172         | 0.5986      | 0.0229       | 0.0138              | 0.0670           | 0.0041     | 0.0022            | 0.0357         |
| 0.0463        | 3.0   | 174  | 0.0146          | 0.0501    | 0.3048 | 0.0861 | 0.1647   | 0.0            | 0.0                   | 0.0                | 0.3093  | 0.1974         | 0.7143      | 0.0245       | 0.0150              | 0.0670           | 0.0057     | 0.0030            | 0.0536         |
| 0.0166        | 4.0   | 232  | 0.0129          | 0.0593    | 0.3401 | 0.1010 | 0.1874   | 0.0            | 0.0                   | 0.0                | 0.4041  | 0.2708         | 0.7959      | 0.0286       | 0.0176              | 0.0773           | 0.0058     | 0.0030            | 0.0536         |
| 0.0707        | 5.0   | 290  | 0.0121          | 0.0651    | 0.3652 | 0.1106 | 0.1938   | 0.0            | 0.0                   | 0.0                | 0.5050  | 0.3580         | 0.8571      | 0.0302       | 0.0185              | 0.0825           | 0.0057     | 0.0030            | 0.0536         |
| 0.013         | 6.0   | 348  | 0.0115          | 0.0726    | 0.3829 | 0.1221 | 0.2020   | 0.0            | 0.0                   | 0.0                | 0.5586  | 0.4068         | 0.8912      | 0.0368       | 0.0230              | 0.0928           | 0.0058     | 0.0030            | 0.0536         |
| 0.032         | 7.0   | 406  | 0.0112          | 0.0788    | 0.4055 | 0.1320 | 0.1907   | 0.0            | 0.0                   | 0.0                | 0.6279  | 0.4770         | 0.9184      | 0.0434       | 0.0272              | 0.1082           | 0.0096     | 0.0051            | 0.0893         |
| 0.0293        | 8.0   | 464  | 0.0109          | 0.0789    | 0.4081 | 0.1322 | 0.2038   | 0.0            | 0.0                   | 0.0                | 0.6492  | 0.5            | 0.9252      | 0.0463       | 0.0288              | 0.1186           | 0.0058     | 0.0031            | 0.0536         |
| 0.0267        | 9.0   | 522  | 0.0107          | 0.0785    | 0.4055 | 0.1316 | 0.2049   | 0.0            | 0.0                   | 0.0                | 0.6667  | 0.5211         | 0.9252      | 0.0423       | 0.0263              | 0.1082           | 0.0077     | 0.0041            | 0.0714         |
| 0.0244        | 10.0  | 580  | 0.0106          | 0.0801    | 0.4106 | 0.1340 | 0.2026   | 0.0            | 0.0                   | 0.0                | 0.685   | 0.5415         | 0.9320      | 0.0448       | 0.0279              | 0.1134           | 0.0076     | 0.0040            | 0.0714         |
| 0.0104        | 11.0  | 638  | 0.0105          | 0.0818    | 0.4131 | 0.1366 | 0.2080   | 0.0            | 0.0                   | 0.0                | 0.6954  | 0.5547         | 0.9320      | 0.0477       | 0.0298              | 0.1186           | 0.0077     | 0.0041            | 0.0714         |
| 0.0352        | 12.0  | 696  | 0.0104          | 0.0836    | 0.4156 | 0.1392 | 0.2051   | 0.0            | 0.0                   | 0.0                | 0.7023  | 0.5610         | 0.9388      | 0.0487       | 0.0306              | 0.1186           | 0.0078     | 0.0041            | 0.0714         |
| 0.0216        | 13.0  | 754  | 0.0104          | 0.0827    | 0.4106 | 0.1376 | 0.2032   | 0.0            | 0.0                   | 0.0                | 0.7095  | 0.5702         | 0.9388      | 0.0443       | 0.0279              | 0.1082           | 0.0078     | 0.0041            | 0.0714         |
| 0.0211        | 14.0  | 812  | 0.0104          | 0.0828    | 0.4106 | 0.1378 | 0.2028   | 0.0            | 0.0                   | 0.0                | 0.7095  | 0.5702         | 0.9388      | 0.0443       | 0.0279              | 0.1082           | 0.0078     | 0.0041            | 0.0714         |
| 0.0208        | 15.0  | 870  | 0.0104          | 0.0828    | 0.4106 | 0.1378 | 0.2030   | 0.0            | 0.0                   | 0.0                | 0.7095  | 0.5702         | 0.9388      | 0.0444       | 0.0279              | 0.1082           | 0.0078     | 0.0041            | 0.0714         |


### Framework versions

- Transformers 4.48.3
- Pytorch 2.6.0+cu124
- Datasets 3.4.0
- Tokenizers 0.21.0