	---
	license: mit
	base_model: microsoft/deberta-v3-base
	tags:
	- generated_from_trainer
	metrics:
	- accuracy
	model-index:
	- name: Intent-classification-DeBERTa-model-Ashuv2
	results: []
	---

	# Intent-classification-DeBERTa-model-Ashuv2

	This model is a fine-tuned version of [microsoft/deberta-v3-base](https://huggingface.co/microsoft/deberta-v3-base) on an unknown dataset.
	It achieves the following results on the evaluation set:
	- Loss: 0.1666
	- Accuracy: 0.9012
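
A minimal inference sketch, assuming the checkpoint is published on the Hub and loads through the standard `text-classification` pipeline. The repository id below is a placeholder (the namespace is not stated in this card), and `top_intent` and `classify` are hypothetical helper names introduced only for illustration. The label names returned depend on the `id2label` mapping saved with the checkpoint, which this card does not document.

```python
from typing import Any, Dict, List

# Placeholder repository id: the namespace is an assumption, not taken from this card.
MODEL_ID = "your-namespace/Intent-classification-DeBERTa-model-Ashuv2"


def top_intent(scores: List[Any]) -> Dict[str, Any]:
    """Return the highest-scoring {"label", "score"} entry."""
    # Some call patterns return one list per input; unwrap that case.
    if scores and isinstance(scores[0], list):
        scores = scores[0]
    return max(scores, key=lambda item: item["score"])


def classify(text: str) -> Dict[str, Any]:
    """Classify one utterance and return its most likely intent."""
    # Imported lazily so top_intent() stays usable without transformers installed.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=MODEL_ID)
    # top_k=None requests a score for every label, not only the best one.
    return top_intent(classifier(text, top_k=None))
```

Calling `classify("some user utterance")` downloads the weights on first use, so it needs network access and a valid repository id.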

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 5e-05
	- train_batch_size: 8
	- eval_batch_size: 8
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 5
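
The list above maps onto `transformers.TrainingArguments` roughly as sketched here. Several points are assumptions rather than facts from this card: a single training device (so the batch size of 8 is per device), no warmup steps, and a total of 405 optimizer steps, inferred from the Training results table where step 400 corresponds to epoch 4.94 (about 81 steps per epoch). `linear_lr` and `build_training_arguments` are hypothetical helpers that illustrate the linear decay schedule and the argument mapping.

```python
# Keyword arguments for transformers.TrainingArguments mirroring the list above.
TRAINING_KWARGS = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 8,  # assumes a single device
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 5,
}

# Assumption: about 81 steps per epoch (step 400 is epoch 4.94) times 5 epochs.
TOTAL_STEPS = 405


def linear_lr(step, total_steps=TOTAL_STEPS, base_lr=5e-5, warmup_steps=0):
    """Learning rate at `step` under linear warmup followed by linear decay to 0."""
    if step < warmup_steps:
        return base_lr * (step / max(1, warmup_steps))
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


def build_training_arguments(output_dir="out"):
    """Create TrainingArguments from the values above (requires transformers)."""
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)
```

Under these assumptions the learning rate starts at 5e-05 and is roughly halved by step 200.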

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Accuracy |
	|:-------------:|:-----:|:----:|:---------------:|:--------:|
	| 1.7819 | 0.12 | 10 | 1.7969 | 0.3665 |
	| 1.7316 | 0.25 | 20 | 1.6351 | 0.3975 |
	| 1.4279 | 0.37 | 30 | 1.2845 | 0.5776 |
	| 1.0181 | 0.49 | 40 | 0.8974 | 0.7143 |
	| 0.7285 | 0.62 | 50 | 0.6361 | 0.7640 |
	| 0.7265 | 0.74 | 60 | 0.4886 | 0.8509 |
	| 0.5108 | 0.86 | 70 | 0.3599 | 0.9006 |
	| 0.4413 | 0.99 | 80 | 0.2510 | 0.8944 |
	| 0.3556 | 1.11 | 90 | 0.2156 | 0.9130 |
	| 0.2884 | 1.23 | 100 | 0.2777 | 0.8944 |
	| 0.1914 | 1.36 | 110 | 0.2518 | 0.8944 |
	| 0.5051 | 1.48 | 120 | 0.2118 | 0.9130 |
	| 0.1151 | 1.6 | 130 | 0.1957 | 0.9130 |
	| 0.1745 | 1.73 | 140 | 0.2052 | 0.8820 |
	| 0.1987 | 1.85 | 150 | 0.2053 | 0.8882 |
	| 0.2467 | 1.98 | 160 | 0.1945 | 0.8944 |
	| 0.3075 | 2.1 | 170 | 0.2680 | 0.8944 |
	| 0.1732 | 2.22 | 180 | 0.2642 | 0.8882 |
	| 0.1627 | 2.35 | 190 | 0.1915 | 0.9068 |
	| 0.1766 | 2.47 | 200 | 0.1708 | 0.9130 |
	| 0.2563 | 2.59 | 210 | 0.1691 | 0.8944 |
	| 0.189 | 2.72 | 220 | 0.1941 | 0.9130 |
	| 0.1696 | 2.84 | 230 | 0.1907 | 0.9130 |
	| 0.1865 | 2.96 | 240 | 0.4247 | 0.9130 |
	| 0.3183 | 3.09 | 250 | 0.2251 | 0.8944 |
	| 0.185 | 3.21 | 260 | 0.2289 | 0.8882 |
	| 0.1636 | 3.33 | 270 | 0.1887 | 0.8944 |
	| 0.2432 | 3.46 | 280 | 0.2055 | 0.8882 |
	| 0.1518 | 3.58 | 290 | 0.2703 | 0.8944 |
	| 0.2371 | 3.7 | 300 | 0.2638 | 0.8944 |
	| 0.1742 | 3.83 | 310 | 0.2309 | 0.8944 |
	| 0.2269 | 3.95 | 320 | 0.2208 | 0.8882 |
	| 0.1404 | 4.07 | 330 | 0.2156 | 0.8820 |
	| 0.1056 | 4.2 | 340 | 0.2192 | 0.9006 |
	| 0.164 | 4.32 | 350 | 0.2282 | 0.9068 |
	| 0.1419 | 4.44 | 360 | 0.2380 | 0.9068 |
	| 0.1164 | 4.57 | 370 | 0.2438 | 0.9006 |
	| 0.2167 | 4.69 | 380 | 0.2429 | 0.9006 |
	| 0.1244 | 4.81 | 390 | 0.2363 | 0.8820 |
	| 0.2341 | 4.94 | 400 | 0.2344 | 0.8882 |
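
One way to read the table is to ask which evaluation step would make the best checkpoint. The sketch below copies a handful of rows from the table (not all of them) and picks the step with the lowest validation loss and the step with the highest accuracy, breaking accuracy ties by lower loss. `EvalRow`, `lowest_loss`, and `best_accuracy` are hypothetical names for illustration; whether checkpoints were actually saved at these steps is not stated in this card.

```python
from typing import List, NamedTuple


class EvalRow(NamedTuple):
    step: int
    val_loss: float
    accuracy: float


# A subset of rows copied from the table above.
ROWS: List[EvalRow] = [
    EvalRow(90, 0.2156, 0.9130),
    EvalRow(200, 0.1708, 0.9130),
    EvalRow(210, 0.1691, 0.8944),
    EvalRow(240, 0.4247, 0.9130),
    EvalRow(400, 0.2344, 0.8882),
]


def lowest_loss(rows: List[EvalRow]) -> EvalRow:
    """Row with the smallest validation loss."""
    return min(rows, key=lambda r: r.val_loss)


def best_accuracy(rows: List[EvalRow]) -> EvalRow:
    """Row with the highest accuracy; ties go to the lower validation loss."""
    return max(rows, key=lambda r: (r.accuracy, -r.val_loss))


print(lowest_loss(ROWS).step)    # 210
print(best_accuracy(ROWS).step)  # 200
```

Both selections land near the middle of training (steps 200 to 210), and the same holds over the full table, so the final step (400) is not the strongest row on either measure.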


	### Framework versions

	- Transformers 4.38.2
	- PyTorch 2.1.2+cpu
	- Datasets 2.1.0
	- Tokenizers 0.15.2
