metadata

library_name: transformers
tags:
  - generated_from_trainer
metrics:
  - accuracy
  - bleu
model-index:
  - name: duo-predict-gpt2-medium-wikitext
    results: []

duo-predict-gpt2-medium-wikitext

This model is a fine-tuned version of an unspecified base model (the card does not name it), trained on an unknown dataset. It achieves the following results on the evaluation set:

Loss: 1.5713
Accuracy: 0.0073
Perplexity: 4.8128
Bleu: 1.0
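
The reported loss and perplexity are consistent with the usual relation perplexity = exp(mean cross-entropy loss). The sketch below checks that relation against the numbers above. That the card's metric was computed exactly this way is an assumption, and `perplexity_from_loss` is a hypothetical helper name, not part of any library.

```python
import math


def perplexity_from_loss(mean_cross_entropy_loss: float) -> float:
    """Convert a mean per-token cross-entropy loss (in nats) to perplexity.

    Assumption: the loss is the natural-log cross-entropy averaged over tokens.
    """
    return math.exp(mean_cross_entropy_loss)


# Final evaluation loss reported on this card.
final_eval_loss = 1.5713
final_perplexity = perplexity_from_loss(final_eval_loss)

# Compare with the reported perplexity (4.8128); small differences are
# expected because the card rounds the loss to four decimals.
print(f"perplexity from loss: {final_perplexity:.4f}")
```

The same conversion applied to any Validation Loss entry in the training results should land close to the Perplexity reported for that step.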

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 64
eval_batch_size: 64
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.1
num_epochs: 5
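
To make the scheduler settings concrete, here is a minimal pure-Python sketch of a linear schedule with warmup: the learning rate ramps from 0 to the peak over the first 10% of steps, then decays linearly to 0. It illustrates the shape such a schedule takes and is not the trainer's own code. `linear_warmup_decay_lr` is a hypothetical helper, and `ASSUMED_TOTAL_STEPS` is a rough estimate inferred from the training results (step 17500 falls at about epoch 4.9116); the card does not state the total.

```python
import math

# Assumption: total optimizer steps, estimated from the results table
# (17500 steps at ~4.9116 epochs, scaled to 5 epochs). Not stated on the card.
ASSUMED_TOTAL_STEPS = 17_815


def linear_warmup_decay_lr(
    step: int,
    total_steps: int,
    base_lr: float = 1e-4,       # learning_rate from the card
    warmup_ratio: float = 0.1,   # lr_scheduler_warmup_ratio from the card
) -> float:
    """Learning rate at `step` under linear warmup followed by linear decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: fall linearly from base_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Print the learning rate at a few of the logged steps.
for logged_step in (500, 2000, 10000, 17500):
    lr = linear_warmup_decay_lr(logged_step, ASSUMED_TOTAL_STEPS)
    print(f"step {logged_step:>6}: lr = {lr:.3e}")
```

Under these assumptions the first logged step (500) is still inside warmup, so the model was training at well under the peak rate of 0.0001 at that point.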

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	Perplexity	Bleu
7.6316	0.1403	500	3.7041	0.0073	40.6115	1.0
6.4879	0.2807	1000	3.1196	0.0073	22.6384	0.9995
5.3189	0.4210	1500	2.5976	0.0073	13.4321	1.0
4.7557	0.5613	2000	2.3369	0.0073	10.3487	1.0
4.351	0.7017	2500	2.1234	0.0073	8.3592	1.0
4.0701	0.8420	3000	2.0021	0.0073	7.4045	1.0
3.9112	0.9823	3500	1.9280	0.0073	6.8756	1.0
3.7775	1.1226	4000	1.8769	0.0073	6.5331	1.0
3.703	1.2630	4500	1.8369	0.0073	6.2768	1.0
3.6443	1.4033	5000	1.8061	0.0073	6.0866	1.0
3.5659	1.5436	5500	1.7789	0.0073	5.9232	1.0
3.5303	1.6840	6000	1.7552	0.0073	5.7848	1.0
3.4926	1.8243	6500	1.7348	0.0073	5.6676	1.0
3.464	1.9646	7000	1.7167	0.0073	5.5660	1.0
3.3432	2.1050	7500	1.7016	0.0073	5.4826	1.0
3.3215	2.2453	8000	1.6883	0.0073	5.4104	1.0
3.3122	2.3856	8500	1.6768	0.0073	5.3483	1.0
3.2836	2.5260	9000	1.6651	0.0073	5.2860	1.0
3.2582	2.6663	9500	1.6541	0.0073	5.2281	1.0
3.2387	2.8066	10000	1.6434	0.0073	5.1726	1.0
3.223	2.9470	10500	1.6338	0.0073	5.1232	1.0
3.126	3.0873	11000	1.6289	0.0073	5.0984	1.0
3.1149	3.2276	11500	1.6212	0.0073	5.0590	1.0
3.1048	3.3679	12000	1.6149	0.0073	5.0276	1.0
3.0966	3.5083	12500	1.6074	0.0073	4.9897	1.0
3.0904	3.6486	13000	1.6013	0.0073	4.9597	1.0
3.0979	3.7889	13500	1.5959	0.0073	4.9326	1.0
3.0772	3.9293	14000	1.5907	0.0073	4.9074	1.0
2.9795	4.0696	14500	1.5887	0.0073	4.8975	1.0
2.9807	4.2099	15000	1.5847	0.0073	4.8779	1.0
2.976	4.3503	15500	1.5817	0.0073	4.8634	1.0
2.9795	4.4906	16000	1.5778	0.0073	4.8444	1.0
2.9594	4.6309	16500	1.5754	0.0073	4.8328	1.0
2.9698	4.7713	17000	1.5731	0.0073	4.8217	1.0
2.9604	4.9116	17500	1.5713	0.0073	4.8128	1.0

Framework versions

Transformers 4.49.0
PyTorch 2.6.0+cu124
Datasets 3.3.2
Tokenizers 0.21.0