# distilbert-classn-LinearAlg-finetuned-pred-span-width-1
This model is a fine-tuned version of [dslim/distilbert-NER](https://huggingface.co/dslim/distilbert-NER) on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 0.6355
- Accuracy: 0.8333
- F1: 0.8279
- Precision: 0.8395
- Recall: 0.8333
## Model description
More information needed
## Intended uses & limitations
More information needed
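Since usage is not documented, here is a minimal loading sketch. It assumes the fine-tune carries a sequence-classification head (the "classn" in the model name and the accuracy/F1 metrics suggest sentence-level classification rather than token-level NER); the example input is illustrative and the label set is not documented.

```python
# Minimal inference sketch. Assumptions: sequence-classification head,
# undocumented label set; the example sentence is illustrative only.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "Heather-Driver/distilbert-classn-LinearAlg-finetuned-pred-span-width-1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

inputs = tokenizer("Find the eigenvalues of the matrix A.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
pred = logits.argmax(dim=-1).item()
# Label names may be generic (LABEL_0, ...) if none were set at training time.
print(model.config.id2label.get(pred, str(pred)))
```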
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (see the sketch after this list):
- learning_rate: 1e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 4
- optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 20
- mixed_precision_training: Native AMP
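These values map onto `transformers.TrainingArguments` roughly as follows; `output_dir` is illustrative, and only the hyperparameter values themselves come from this card.

```python
# Sketch of the training configuration implied by the list above.
# output_dir is illustrative; all other values are taken from the card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="distilbert-classn-LinearAlg-finetuned-pred-span-width-1",
    learning_rate=1e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=2,  # total train batch size: 4
    seed=42,
    optim="adamw_torch",            # AdamW, betas=(0.9, 0.999), eps=1e-08
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=20,
    fp16=True,                      # "Native AMP" mixed precision
)
```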
### Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 | Precision | Recall |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:--:|:---------:|:------:|
| 4.9827 | 0.6849 | 50 | 2.4530 | 0.0794 | 0.0322 | 0.0241 | 0.0794 |
| 4.8833 | 1.3699 | 100 | 2.4244 | 0.1349 | 0.0995 | 0.1629 | 0.1349 |
| 4.8962 | 2.0548 | 150 | 2.3684 | 0.1667 | 0.1471 | 0.1805 | 0.1667 |
| 4.7378 | 2.7397 | 200 | 2.3036 | 0.1508 | 0.1232 | 0.1131 | 0.1508 |
| 4.6513 | 3.4247 | 250 | 2.1886 | 0.2857 | 0.2656 | 0.4506 | 0.2857 |
| 4.3777 | 4.1096 | 300 | 2.0145 | 0.4921 | 0.4836 | 0.5143 | 0.4921 |
| 3.8864 | 4.7945 | 350 | 1.7826 | 0.5794 | 0.5554 | 0.5942 | 0.5794 |
| 3.4765 | 5.4795 | 400 | 1.5355 | 0.6429 | 0.6406 | 0.6735 | 0.6429 |
| 2.8014 | 6.1644 | 450 | 1.2879 | 0.6984 | 0.6874 | 0.7489 | 0.6984 |
| 2.3351 | 6.8493 | 500 | 1.0944 | 0.7222 | 0.7204 | 0.7597 | 0.7222 |
| 1.7767 | 7.5342 | 550 | 0.9561 | 0.7460 | 0.7445 | 0.8037 | 0.7460 |
| 1.2538 | 8.2192 | 600 | 0.8251 | 0.8016 | 0.8005 | 0.8411 | 0.8016 |
| 1.0022 | 8.9041 | 650 | 0.7631 | 0.7778 | 0.7735 | 0.8086 | 0.7778 |
| 0.7967 | 9.5890 | 700 | 0.7150 | 0.8095 | 0.8040 | 0.8197 | 0.8095 |
| 0.4946 | 10.2740 | 750 | 0.6704 | 0.8333 | 0.8257 | 0.8550 | 0.8333 |
| 0.4141 | 10.9589 | 800 | 0.6866 | 0.8254 | 0.8200 | 0.8588 | 0.8254 |
| 0.3052 | 11.6438 | 850 | 0.6306 | 0.8333 | 0.8286 | 0.8429 | 0.8333 |
| 0.2255 | 12.3288 | 900 | 0.6289 | 0.8492 | 0.8440 | 0.8734 | 0.8492 |
| 0.1486 | 13.0137 | 950 | 0.6209 | 0.8492 | 0.8437 | 0.8728 | 0.8492 |
| 0.1185 | 13.6986 | 1000 | 0.6115 | 0.8413 | 0.8361 | 0.8625 | 0.8413 |
| 0.0625 | 14.3836 | 1050 | 0.6161 | 0.8333 | 0.8313 | 0.8548 | 0.8333 |
| 0.079 | 15.0685 | 1100 | 0.6013 | 0.8333 | 0.8281 | 0.8500 | 0.8333 |
| 0.0689 | 15.7534 | 1150 | 0.6424 | 0.8413 | 0.8357 | 0.8574 | 0.8413 |
| 0.0386 | 16.4384 | 1200 | 0.6263 | 0.8413 | 0.8360 | 0.8492 | 0.8413 |
| 0.0423 | 17.1233 | 1250 | 0.6345 | 0.8413 | 0.8358 | 0.8603 | 0.8413 |
| 0.048 | 17.8082 | 1300 | 0.6386 | 0.8413 | 0.8366 | 0.8567 | 0.8413 |
| 0.0223 | 18.4932 | 1350 | 0.6386 | 0.8413 | 0.8361 | 0.8611 | 0.8413 |
| 0.0443 | 19.1781 | 1400 | 0.6320 | 0.8413 | 0.8355 | 0.8492 | 0.8413 |
| 0.0291 | 19.8630 | 1450 | 0.6355 | 0.8333 | 0.8279 | 0.8395 | 0.8333 |
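Recall equals accuracy in every row, which is exactly what weighted-average recall reduces to for single-label multiclass classification; that suggests the metrics were computed with weighted averaging. A `compute_metrics` function consistent with these numbers might look like the following sketch (the exact implementation used is not documented in this card).

```python
# Hedged sketch of a compute_metrics function consistent with the table above,
# assuming weighted-average precision/recall/F1 for multiclass classification.
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="weighted", zero_division=0
    )
    return {
        "accuracy": accuracy_score(labels, preds),  # == weighted recall here
        "f1": f1,
        "precision": precision,
        "recall": recall,
    }
```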
### Framework versions
- Transformers 4.48.3
- Pytorch 2.5.1+cu124
- Datasets 3.3.1
- Tokenizers 0.21.0