ke-lly/47015772_3

+---
+license: mit
+base_model: openai-community/gpt2
+tags:
+- trl
+- sft
+- generated_from_trainer
+metrics:
+- accuracy
+model-index:
+- name: '47015772_3'
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# 47015772_3
+This model is a fine-tuned version of [openai-community/gpt2](https://huggingface.co/openai-community/gpt2) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.6606
+- Accuracy: 0.0000
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 1.41e-05
+- train_batch_size: 32
+- eval_batch_size: 4
+- seed: 42
+- distributed_type: multi-GPU
+- num_devices: 2
+- gradient_accumulation_steps: 4
+- total_train_batch_size: 256
+- total_eval_batch_size: 8
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- training_steps: 2000
+### Training results
+| Training Loss | Epoch | Step | Validation Loss | Accuracy |
+|:-------------:|:-----:|:----:|:---------------:|:--------:|
+| 1.0942        | 0.0   | 25   | 0.9862          | 0.0001   |
+| 0.8794        | 0.0   | 50   | 0.8269          | 0.0001   |
+| 0.8241        | 0.0   | 75   | 0.7812          | 0.0001   |
+| 0.8085        | 0.0   | 100  | 0.7559          | 0.0001   |
+| 0.78          | 0.0   | 125  | 0.7417          | 0.0001   |
+| 0.7656        | 0.0   | 150  | 0.7316          | 0.0000   |
+| 0.759         | 0.0   | 175  | 0.7247          | 0.0000   |
+| 0.7558        | 0.0   | 200  | 0.7191          | 0.0000   |
+| 0.7434        | 0.0   | 225  | 0.7143          | 0.0000   |
+| 0.7419        | 0.0   | 250  | 0.7114          | 0.0000   |
+| 0.7383        | 0.0   | 275  | 0.7086          | 0.0000   |
+| 0.7395        | 0.0   | 300  | 0.7051          | 0.0000   |
+| 0.7324        | 0.0   | 325  | 0.7031          | 0.0000   |
+| 0.721         | 0.0   | 350  | 0.7008          | 0.0000   |
+| 0.7279        | 0.0   | 375  | 0.6982          | 0.0000   |
+| 0.7303        | 0.01  | 400  | 0.6959          | 0.0000   |
+| 0.7185        | 0.01  | 425  | 0.6932          | 0.0000   |
+| 0.7221        | 0.01  | 450  | 0.6914          | 0.0000   |
+| 0.6984        | 0.01  | 475  | 0.6887          | 0.0000   |
+| 0.7162        | 0.01  | 500  | 0.6875          | 0.0000   |
+| 0.712         | 0.01  | 525  | 0.6859          | 0.0000   |
+| 0.7057        | 0.01  | 550  | 0.6848          | 0.0000   |
+| 0.7058        | 0.01  | 575  | 0.6831          | 0.0000   |
+| 0.705         | 0.01  | 600  | 0.6820          | 0.0000   |
+| 0.6941        | 0.01  | 625  | 0.6810          | 0.0000   |
+| 0.6973        | 0.01  | 650  | 0.6798          | 0.0000   |
+| 0.7044        | 0.01  | 675  | 0.6784          | 0.0000   |
+| 0.6943        | 0.01  | 700  | 0.6775          | 0.0000   |
+| 0.6993        | 0.01  | 725  | 0.6766          | 0.0000   |
+| 0.698         | 0.01  | 750  | 0.6755          | 0.0000   |
+| 0.7054        | 0.01  | 775  | 0.6745          | 0.0000   |
+| 0.693         | 0.01  | 800  | 0.6743          | 0.0000   |
+| 0.6957        | 0.01  | 825  | 0.6732          | 0.0000   |
+| 0.7015        | 0.01  | 850  | 0.6725          | 0.0000   |
+| 0.6878        | 0.01  | 875  | 0.6719          | 0.0000   |
+| 0.6965        | 0.01  | 900  | 0.6715          | 0.0000   |
+| 0.6887        | 0.01  | 925  | 0.6712          | 0.0000   |
+| 0.6945        | 0.01  | 950  | 0.6704          | 0.0000   |
+| 0.6885        | 0.01  | 975  | 0.6695          | 0.0000   |
+| 0.6872        | 0.01  | 1000 | 0.6692          | 0.0000   |
+| 0.6908        | 0.01  | 1025 | 0.6688          | 0.0000   |
+| 0.6912        | 0.01  | 1050 | 0.6684          | 0.0000   |
+| 0.6729        | 0.01  | 1075 | 0.6676          | 0.0000   |
+| 0.6852        | 0.01  | 1100 | 0.6674          | 0.0000   |
+| 0.6869        | 0.01  | 1125 | 0.6669          | 0.0000   |
+| 0.6875        | 0.01  | 1150 | 0.6663          | 0.0000   |
+| 0.6884        | 0.02  | 1175 | 0.6659          | 0.0000   |
+| 0.6905        | 0.02  | 1200 | 0.6658          | 0.0000   |
+| 0.6817        | 0.02  | 1225 | 0.6653          | 0.0000   |
+| 0.6875        | 0.02  | 1250 | 0.6649          | 0.0000   |
+| 0.6897        | 0.02  | 1275 | 0.6644          | 0.0000   |
+| 0.6727        | 0.02  | 1300 | 0.6642          | 0.0000   |
+| 0.6859        | 0.02  | 1325 | 0.6640          | 0.0000   |
+| 0.6836        | 0.02  | 1350 | 0.6637          | 0.0000   |
+| 0.6788        | 0.02  | 1375 | 0.6636          | 0.0000   |
+| 0.6866        | 0.02  | 1400 | 0.6634          | 0.0000   |
+| 0.6847        | 0.02  | 1425 | 0.6632          | 0.0000   |
+| 0.6804        | 0.02  | 1450 | 0.6630          | 0.0000   |
+| 0.6861        | 0.02  | 1475 | 0.6628          | 0.0000   |
+| 0.6803        | 0.02  | 1500 | 0.6625          | 0.0000   |
+| 0.6829        | 0.02  | 1525 | 0.6622          | 0.0000   |
+| 0.6825        | 0.02  | 1550 | 0.6621          | 0.0000   |
+| 0.6892        | 0.02  | 1575 | 0.6620          | 0.0000   |
+| 0.6801        | 0.02  | 1600 | 0.6619          | 0.0000   |
+| 0.6811        | 0.02  | 1625 | 0.6616          | 0.0000   |
+| 0.6764        | 0.02  | 1650 | 0.6616          | 0.0000   |
+| 0.6698        | 0.02  | 1675 | 0.6613          | 0.0000   |
+| 0.6879        | 0.02  | 1700 | 0.6613          | 0.0000   |
+| 0.6704        | 0.02  | 1725 | 0.6612          | 0.0000   |
+| 0.6888        | 0.02  | 1750 | 0.6611          | 0.0000   |
+| 0.6735        | 0.02  | 1775 | 0.6611          | 0.0000   |
+| 0.6859        | 0.02  | 1800 | 0.6609          | 0.0000   |
+| 0.6827        | 0.02  | 1825 | 0.6610          | 0.0000   |
+| 0.6826        | 0.02  | 1850 | 0.6608          | 0.0000   |
+| 0.6779        | 0.02  | 1875 | 0.6608          | 0.0000   |
+| 0.6749        | 0.02  | 1900 | 0.6607          | 0.0000   |
+| 0.6791        | 0.02  | 1925 | 0.6607          | 0.0000   |
+| 0.6801        | 0.03  | 1950 | 0.6607          | 0.0000   |
+| 0.6921        | 0.03  | 1975 | 0.6606          | 0.0000   |
+| 0.6792        | 0.03  | 2000 | 0.6606          | 0.0000   |
+### Framework versions
+- Transformers 4.37.0
+- Pytorch 2.0.0+cu118
+- Datasets 2.16.1
+- Tokenizers 0.15.1
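
The derived totals in the hyperparameter list follow from the per-device values: 32 examples per device, times 2 devices, times 4 gradient accumulation steps gives 256, and 4 eval examples per device times 2 devices gives 8. The sketch below reproduces that arithmetic and shows what a linear schedule would produce over the 2000 training steps. The function names are hypothetical. The card lists no warmup setting, so zero warmup is an assumption here, not something the card confirms.

```python
# Sketch only: reproduces the derived values listed in the card.
# `effective_batch_size` and `linear_lr` are illustrative names, not library APIs.


def effective_batch_size(per_device: int, num_devices: int, grad_accum: int = 1) -> int:
    """Examples consumed per optimizer step (or per eval step when grad_accum is 1)."""
    return per_device * num_devices * grad_accum


def linear_lr(step: int, base_lr: float = 1.41e-05, total_steps: int = 2000,
              warmup_steps: int = 0) -> float:
    """Linear warmup followed by linear decay to zero.

    warmup_steps=0 is an assumption; the card does not list a warmup value.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


train_total = effective_batch_size(per_device=32, num_devices=2, grad_accum=4)
eval_total = effective_batch_size(per_device=4, num_devices=2)
print(train_total, eval_total)

# Learning rate at the start, midpoint, and end of training under the assumption above.
for step in (0, 1000, 2000):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate falls from 1.41e-05 at step 0 to zero at step 2000, which would be consistent with the validation loss flattening out near 0.6606 in the final rows of the table.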

generation_config.json ADDED

	@@ -0,0 +1,6 @@

+{
+  "_from_model_config": true,
+  "bos_token_id": 50256,
+  "eos_token_id": 50256,
+  "transformers_version": "4.37.0"
+}
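
The added config sets `bos_token_id` and `eos_token_id` to the same value, 50256, which is GPT-2's single `<|endoftext|>` token. The sketch below parses the JSON from the hunk above with the standard library and checks that property. It does not load the model, and the variable names are hypothetical.

```python
import json

# JSON copied from the generation_config.json hunk, without the diff "+" markers.
GENERATION_CONFIG = """
{
  "_from_model_config": true,
  "bos_token_id": 50256,
  "eos_token_id": 50256,
  "transformers_version": "4.37.0"
}
"""

config = json.loads(GENERATION_CONFIG)

# GPT-2 uses one boundary token for both beginning and end of text.
shares_boundary_token = config["bos_token_id"] == config["eos_token_id"]

# No pad token is defined in this file, so batched generation may need one set explicitly.
has_pad_token = "pad_token_id" in config

print(shares_boundary_token, has_pad_token)
```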