ke-lly/47015772_2

+---
+license: mit
+base_model: openai-community/gpt2
+tags:
+- trl
+- sft
+- generated_from_trainer
+metrics:
+- accuracy
+model-index:
+- name: '47015772_2'
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# 47015772_2
+This model is a fine-tuned version of [openai-community/gpt2](https://huggingface.co/openai-community/gpt2) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.6622
+- Accuracy: 0.0001
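For context, the reported loss can be converted to a perplexity. The sketch below assumes the loss is a mean per-token cross-entropy in nats, which is what the Trainer typically logs for a causal language model. The card does not state this explicitly, so treat the result as a rough illustration rather than a reported metric.

```python
import math

eval_loss = 0.6622  # final evaluation loss taken from the card

# Assumption: the loss is a mean token-level cross-entropy measured in nats,
# so exponentiating it gives a per-token perplexity.
perplexity = math.exp(eval_loss)
print(f"perplexity ~ {perplexity:.2f}")
```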
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 1.41e-05
+- train_batch_size: 32
+- eval_batch_size: 4
+- seed: 42
+- distributed_type: multi-GPU
+- num_devices: 2
+- gradient_accumulation_steps: 4
+- total_train_batch_size: 256
+- total_eval_batch_size: 8
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- training_steps: 2000
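The totals in the list above follow from the per-device values, and the learning rate at any step follows from the linear schedule. The sketch below reproduces both. The function names are hypothetical, and the schedule assumes zero warmup steps because the card lists none.

```python
def effective_batch_size(per_device: int, num_devices: int, grad_accum_steps: int = 1) -> int:
    """Examples consumed per optimizer step (or per evaluation step)."""
    return per_device * num_devices * grad_accum_steps


def linear_lr(step: int, base_lr: float = 1.41e-05,
              total_steps: int = 2000, warmup_steps: int = 0) -> float:
    """Linear decay from base_lr to 0. Assumption: no warmup is configured."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


total_train = effective_batch_size(32, 2, 4)  # train_batch_size * num_devices * accumulation
total_eval = effective_batch_size(4, 2)       # evaluation uses no gradient accumulation

print(total_train, total_eval)
print(linear_lr(0), linear_lr(1000), linear_lr(2000))
```

Under these assumptions the learning rate is halved by step 1000 and reaches zero at step 2000, which is consistent with the flattening of the validation loss late in the run.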
+### Training results
+| Training Loss | Epoch | Step | Validation Loss | Accuracy |
+|:-------------:|:-----:|:----:|:---------------:|:--------:|
+| 1.0551        | 0.01  | 25   | 0.9816          | 0.0001   |
+| 0.8806        | 0.01  | 50   | 0.8239          | 0.0001   |
+| 0.8218        | 0.02  | 75   | 0.7796          | 0.0001   |
+| 0.7854        | 0.02  | 100  | 0.7578          | 0.0001   |
+| 0.7887        | 0.03  | 125  | 0.7440          | 0.0000   |
+| 0.7675        | 0.03  | 150  | 0.7347          | 0.0001   |
+| 0.7552        | 0.04  | 175  | 0.7275          | 0.0001   |
+| 0.7493        | 0.04  | 200  | 0.7230          | 0.0001   |
+| 0.7471        | 0.05  | 225  | 0.7187          | 0.0000   |
+| 0.7376        | 0.06  | 250  | 0.7147          | 0.0001   |
+| 0.7341        | 0.06  | 275  | 0.7119          | 0.0001   |
+| 0.7387        | 0.07  | 300  | 0.7091          | 0.0001   |
+| 0.7364        | 0.07  | 325  | 0.7069          | 0.0001   |
+| 0.7179        | 0.08  | 350  | 0.7039          | 0.0001   |
+| 0.7145        | 0.08  | 375  | 0.7016          | 0.0000   |
+| 0.7204        | 0.09  | 400  | 0.6989          | 0.0000   |
+| 0.7236        | 0.1   | 425  | 0.6963          | 0.0000   |
+| 0.7099        | 0.1   | 450  | 0.6944          | 0.0000   |
+| 0.7137        | 0.11  | 475  | 0.6927          | 0.0001   |
+| 0.7106        | 0.11  | 500  | 0.6900          | 0.0000   |
+| 0.7074        | 0.12  | 525  | 0.6884          | 0.0001   |
+| 0.7141        | 0.12  | 550  | 0.6871          | 0.0000   |
+| 0.7053        | 0.13  | 575  | 0.6856          | 0.0001   |
+| 0.7082        | 0.13  | 600  | 0.6849          | 0.0000   |
+| 0.704         | 0.14  | 625  | 0.6841          | 0.0001   |
+| 0.7057        | 0.15  | 650  | 0.6828          | 0.0001   |
+| 0.7027        | 0.15  | 675  | 0.6813          | 0.0001   |
+| 0.6951        | 0.16  | 700  | 0.6807          | 0.0001   |
+| 0.6943        | 0.16  | 725  | 0.6795          | 0.0001   |
+| 0.7063        | 0.17  | 750  | 0.6781          | 0.0001   |
+| 0.6888        | 0.17  | 775  | 0.6777          | 0.0001   |
+| 0.7           | 0.18  | 800  | 0.6769          | 0.0001   |
+| 0.6905        | 0.19  | 825  | 0.6762          | 0.0001   |
+| 0.7005        | 0.19  | 850  | 0.6756          | 0.0001   |
+| 0.6968        | 0.2   | 875  | 0.6747          | 0.0000   |
+| 0.6895        | 0.2   | 900  | 0.6743          | 0.0001   |
+| 0.6935        | 0.21  | 925  | 0.6732          | 0.0001   |
+| 0.7006        | 0.21  | 950  | 0.6727          | 0.0001   |
+| 0.6862        | 0.22  | 975  | 0.6722          | 0.0001   |
+| 0.6921        | 0.22  | 1000 | 0.6716          | 0.0001   |
+| 0.6875        | 0.23  | 1025 | 0.6710          | 0.0001   |
+| 0.6971        | 0.24  | 1050 | 0.6705          | 0.0001   |
+| 0.692         | 0.24  | 1075 | 0.6702          | 0.0001   |
+| 0.686         | 0.25  | 1100 | 0.6693          | 0.0001   |
+| 0.6876        | 0.25  | 1125 | 0.6692          | 0.0001   |
+| 0.6834        | 0.26  | 1150 | 0.6686          | 0.0001   |
+| 0.7024        | 0.26  | 1175 | 0.6683          | 0.0001   |
+| 0.6931        | 0.27  | 1200 | 0.6678          | 0.0001   |
+| 0.6764        | 0.28  | 1225 | 0.6677          | 0.0001   |
+| 0.6914        | 0.28  | 1250 | 0.6672          | 0.0001   |
+| 0.6916        | 0.29  | 1275 | 0.6667          | 0.0001   |
+| 0.6808        | 0.29  | 1300 | 0.6665          | 0.0001   |
+| 0.6789        | 0.3   | 1325 | 0.6661          | 0.0001   |
+| 0.6803        | 0.3   | 1350 | 0.6659          | 0.0001   |
+| 0.6892        | 0.31  | 1375 | 0.6656          | 0.0001   |
+| 0.6749        | 0.31  | 1400 | 0.6654          | 0.0001   |
+| 0.6823        | 0.32  | 1425 | 0.6648          | 0.0001   |
+| 0.6827        | 0.33  | 1450 | 0.6648          | 0.0001   |
+| 0.6826        | 0.33  | 1475 | 0.6647          | 0.0001   |
+| 0.6813        | 0.34  | 1500 | 0.6645          | 0.0001   |
+| 0.6864        | 0.34  | 1525 | 0.6641          | 0.0001   |
+| 0.6809        | 0.35  | 1550 | 0.6639          | 0.0001   |
+| 0.6779        | 0.35  | 1575 | 0.6639          | 0.0001   |
+| 0.687         | 0.36  | 1600 | 0.6635          | 0.0001   |
+| 0.6824        | 0.37  | 1625 | 0.6635          | 0.0001   |
+| 0.6816        | 0.37  | 1650 | 0.6631          | 0.0001   |
+| 0.6773        | 0.38  | 1675 | 0.6632          | 0.0001   |
+| 0.6788        | 0.38  | 1700 | 0.6630          | 0.0001   |
+| 0.6739        | 0.39  | 1725 | 0.6630          | 0.0001   |
+| 0.6773        | 0.39  | 1750 | 0.6629          | 0.0001   |
+| 0.6742        | 0.4   | 1775 | 0.6628          | 0.0001   |
+| 0.6797        | 0.4   | 1800 | 0.6626          | 0.0001   |
+| 0.6861        | 0.41  | 1825 | 0.6627          | 0.0001   |
+| 0.686         | 0.42  | 1850 | 0.6625          | 0.0001   |
+| 0.6711        | 0.42  | 1875 | 0.6624          | 0.0001   |
+| 0.6773        | 0.43  | 1900 | 0.6624          | 0.0001   |
+| 0.6844        | 0.43  | 1925 | 0.6623          | 0.0001   |
+| 0.6809        | 0.44  | 1950 | 0.6622          | 0.0001   |
+| 0.6766        | 0.44  | 1975 | 0.6622          | 0.0001   |
+| 0.6733        | 0.45  | 2000 | 0.6622          | 0.0001   |
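The table also allows a rough estimate of the training set size, which the card leaves unstated. The sketch below assumes that every optimizer step consumes `total_train_batch_size` examples and that the epoch column is rounded to two decimals. If the trainer packed several samples into one sequence, the estimate counts packed sequences rather than raw samples.

```python
steps = 2000                   # final step in the table
total_train_batch_size = 256   # from the hyperparameter list
reported_epoch = 0.45          # epoch value logged at the final step

# Examples seen over the whole run.
examples_seen = steps * total_train_batch_size

# Assumption: the epoch value is rounded to two decimals, so the true
# fraction lies somewhere between 0.445 and 0.455.
low_estimate = examples_seen / (reported_epoch + 0.005)
high_estimate = examples_seen / (reported_epoch - 0.005)

print(f"examples seen: {examples_seen}")
print(f"estimated training set size: {low_estimate:,.0f} to {high_estimate:,.0f}")
```

Under these assumptions the run covered less than half of a training set of a little over 1.1 million examples.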
+### Framework versions
+- Transformers 4.37.0
+- Pytorch 2.0.0+cu118
+- Datasets 2.16.1
+- Tokenizers 0.15.1

generation_config.json ADDED

	@@ -0,0 +1,6 @@

+{
+  "_from_model_config": true,
+  "bos_token_id": 50256,
+  "eos_token_id": 50256,
+  "transformers_version": "4.37.0"
+}
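In this file `bos_token_id` and `eos_token_id` share one value because the GPT-2 vocabulary has a single special token, `<|endoftext|>`, at id 50256. The sketch below parses the same JSON with the standard library and checks that property. The variable names are illustrative only.

```python
import json

# The JSON content from the diff above, without the leading "+" markers.
generation_config_text = """
{
  "_from_model_config": true,
  "bos_token_id": 50256,
  "eos_token_id": 50256,
  "transformers_version": "4.37.0"
}
"""

config = json.loads(generation_config_text)

GPT2_ENDOFTEXT_ID = 50256  # id of <|endoftext|> in the GPT-2 vocabulary

# True when BOS and EOS both map to the single GPT-2 special token.
shares_special_token = (
    config["bos_token_id"] == config["eos_token_id"] == GPT2_ENDOFTEXT_ID
)
print(shares_special_token)
```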