mdeberta-ner-ghtk-hirach_NER-first_1000_data-3090-15Nov

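This model is a fine-tuned version of microsoft/mdeberta-v3-base. The training dataset is not specified in this card. It achieves the following results on the evaluation set (these match the final checkpoint, epoch 40, in the training results):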

Loss: 0.0975
Accuracy: 0.9820
F1: 0.4359
Precision: 0.4857
Recall: 0.3953
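
The F1 above is the harmonic mean of the reported precision and recall. A minimal sketch to check the arithmetic follows; the function name `f1_from_pr` is illustrative, and the entity counts in the second half (17 correct out of 35 predicted and 43 gold entities) are an inference from the rounded figures, not something the card states.

```python
def f1_from_pr(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported for the evaluation set.
reported_f1 = f1_from_pr(0.4857, 0.3953)
print(round(reported_f1, 4))

# Assumption: counts inferred from the rounded metrics, not stated in the card.
true_pos, predicted, gold = 17, 35, 43
precision = true_pos / predicted
recall = true_pos / gold
print(round(precision, 4), round(recall, 4), round(f1_from_pr(precision, recall), 4))
```

The zero-division guard covers the early epochs in the training results, where precision and recall are both reported as 0.0 while accuracy is already 0.9825. Token-level accuracy is therefore a weak signal here; the entity-level F1 is the number to watch.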

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2.5e-05
train_batch_size: 4
eval_batch_size: 4
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 40
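
The step counts in the training results follow from these settings: 250 optimizer steps per epoch over 40 epochs gives 10,000 steps, and at a batch size of 4 that corresponds to roughly 1,000 training examples per epoch. A minimal sketch of the linear schedule follows, assuming zero warmup steps, a single device, and no gradient accumulation (the card lists none of these); `linear_lr` is an illustrative helper, not the Trainer's API.

```python
BASE_LR = 2.5e-05
TRAIN_BATCH_SIZE = 4
NUM_EPOCHS = 40
STEPS_PER_EPOCH = 250  # from the training results (epoch 1.0 ends at step 250)

TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS
# Assumption: single device, no gradient accumulation.
EXAMPLES_PER_EPOCH = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE


def linear_lr(step: int, warmup_steps: int = 0) -> float:
    """Linear warmup (if any) followed by linear decay to zero at TOTAL_STEPS."""
    if step < warmup_steps:
        return BASE_LR * step / max(1, warmup_steps)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - warmup_steps)


print(TOTAL_STEPS, EXAMPLES_PER_EPOCH)
for epoch in (0, 10, 20, 40):
    print(epoch, linear_lr(epoch * STEPS_PER_EPOCH))
```

Under these assumptions the learning rate is halved by epoch 20 and reaches zero at step 10,000.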

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	F1	Precision	Recall
No log	1.0	250	0.0903	0.9825	0.0	0.0	0.0
0.1391	2.0	500	0.0941	0.9825	0.0	0.0	0.0
0.1391	3.0	750	0.0933	0.9825	0.0	0.0	0.0
0.075	4.0	1000	0.0924	0.9825	0.0	0.0	0.0
0.075	5.0	1250	0.0894	0.9825	0.0	0.0	0.0
0.0634	6.0	1500	0.0870	0.9825	0.0851	0.5	0.0465
0.0634	7.0	1750	0.0846	0.9820	0.0833	0.4	0.0465
0.0508	8.0	2000	0.0799	0.9825	0.1224	0.5	0.0698
0.0508	9.0	2250	0.0794	0.9829	0.125	0.6	0.0698
0.0394	10.0	2500	0.0793	0.9800	0.0755	0.2	0.0465
0.0394	11.0	2750	0.0801	0.9808	0.2034	0.375	0.1395
0.0302	12.0	3000	0.0825	0.9812	0.2069	0.4	0.1395
0.0302	13.0	3250	0.0763	0.9829	0.2759	0.5333	0.1860
0.0232	14.0	3500	0.0755	0.9833	0.3692	0.5455	0.2791
0.0232	15.0	3750	0.0799	0.9829	0.3226	0.5263	0.2326
0.0176	16.0	4000	0.0785	0.9833	0.3692	0.5455	0.2791
0.0176	17.0	4250	0.0776	0.9825	0.3768	0.5	0.3023
0.0132	18.0	4500	0.0803	0.9833	0.3881	0.5417	0.3023
0.0132	19.0	4750	0.0826	0.9812	0.3611	0.4483	0.3023
0.0106	20.0	5000	0.0787	0.9825	0.4110	0.5	0.3488
0.0106	21.0	5250	0.0879	0.9816	0.3478	0.4615	0.2791
0.0085	22.0	5500	0.0848	0.9816	0.4156	0.4706	0.3721
0.0085	23.0	5750	0.0818	0.9825	0.4267	0.5	0.3721
0.0068	24.0	6000	0.0816	0.9833	0.4533	0.5312	0.3953
0.0068	25.0	6250	0.0819	0.9825	0.4267	0.5	0.3721
0.0056	26.0	6500	0.0848	0.9833	0.4533	0.5312	0.3953
0.0056	27.0	6750	0.0872	0.9833	0.4533	0.5312	0.3953
0.0049	28.0	7000	0.0844	0.9837	0.4595	0.5484	0.3953
0.0049	29.0	7250	0.0881	0.9820	0.4211	0.4848	0.3721
0.0042	30.0	7500	0.0925	0.9820	0.45	0.4865	0.4186
0.0042	31.0	7750	0.0924	0.9825	0.4267	0.5	0.3721
0.0038	32.0	8000	0.0938	0.9833	0.4675	0.5294	0.4186
0.0038	33.0	8250	0.0939	0.9825	0.4416	0.5	0.3953
0.0032	34.0	8500	0.0941	0.9833	0.4384	0.5333	0.3721
0.0032	35.0	8750	0.0942	0.9833	0.4675	0.5294	0.4186
0.0029	36.0	9000	0.0949	0.9820	0.4359	0.4857	0.3953
0.0029	37.0	9250	0.0961	0.9820	0.4359	0.4857	0.3953
0.0027	38.0	9500	0.0980	0.9820	0.4359	0.4857	0.3953
0.0027	39.0	9750	0.0972	0.9820	0.4359	0.4857	0.3953
0.0026	40.0	10000	0.0975	0.9820	0.4359	0.4857	0.3953
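
The reported evaluation numbers are those of the final epoch, but the table shows validation loss bottoming out earlier and F1 peaking before epoch 40. A minimal sketch that picks out those rows from a hand-copied subset of the table above; the tuple layout and variable names are illustrative.

```python
# (epoch, validation_loss, f1) copied from selected rows of the table above.
rows = [
    (1, 0.0903, 0.0),
    (14, 0.0755, 0.3692),
    (20, 0.0787, 0.4110),
    (28, 0.0844, 0.4595),
    (32, 0.0938, 0.4675),
    (35, 0.0942, 0.4675),
    (40, 0.0975, 0.4359),
]

lowest_loss = min(rows, key=lambda r: r[1])
best_f1 = max(rows, key=lambda r: r[2])  # the earliest row wins on ties
final = rows[-1]

print("lowest validation loss:", lowest_loss)
print("highest F1:", best_f1)
print("final checkpoint:", final)
```

Validation loss rises from 0.0755 at epoch 14 to 0.0975 at epoch 40 while training loss keeps falling to 0.0026, a pattern consistent with overfitting. F1 nonetheless peaks later, at 0.4675 in epochs 32 and 35, so the final checkpoint is not the best one by either criterion.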

Framework versions

Transformers 4.44.2
Pytorch 2.4.1+cu121
Datasets 3.1.0
Tokenizers 0.19.1
