vignesh2404/vignesh-gpt
README.md CHANGED

@@ -1,7 +1,7 @@
 ---
+base_model: openai-community/gpt2
 library_name: peft
 license: mit
-base_model: openai-community/gpt2
 tags:
 - generated_from_trainer
 model-index:
@@ -15,6 +15,8 @@ should probably proofread and complete it, then remove this comment. -->
 # shawgpt-ft

 This model is a fine-tuned version of [openai-community/gpt2](https://huggingface.co/openai-community/gpt2) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 4.0473

 ## Model description

@@ -39,16 +41,32 @@ The following hyperparameters were used during training:
 - seed: 42
 - gradient_accumulation_steps: 4
 - total_train_batch_size: 16
-- optimizer:
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 2
 - num_epochs: 10
 - mixed_precision_training: Native AMP

+### Training results
+
+| Training Loss | Epoch  | Step | Validation Loss |
+|:-------------:|:------:|:----:|:---------------:|
+| 5.1305        | 0.9231 | 3    | 4.3292          |
+| 5.0569        | 1.8462 | 6    | 4.2858          |
+| 4.9684        | 2.7692 | 9    | 4.2404          |
+| 3.7013        | 4.0    | 13   | 4.1879          |
+| 4.8439        | 4.9231 | 16   | 4.1508          |
+| 4.7853        | 5.8462 | 19   | 4.1179          |
+| 4.7509        | 6.7692 | 22   | 4.0882          |
+| 3.5014        | 8.0    | 26   | 4.0591          |
+| 4.6408        | 8.9231 | 29   | 4.0487          |
+| 3.1714        | 9.2308 | 30   | 4.0473          |
+
+
 ### Framework versions

 - PEFT 0.13.2
-- Transformers 4.
-- Pytorch 2.5.1+
-- Datasets 3.1
-- Tokenizers 0.
+- Transformers 4.44.2
+- Pytorch 2.5.1+cu124
+- Datasets 3.0.1
+- Tokenizers 0.19.1
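For reference, the hyperparameters listed in this hunk map onto `transformers.TrainingArguments` roughly as sketched below. This is a minimal sketch, not the exact configuration: `output_dir`, `learning_rate`, and the per-device batch size are assumptions (only the total batch size of 16 with 4 accumulation steps is visible above), and "Native AMP" mixed precision is expressed as `fp16=True`.

```python
# Minimal sketch of a TrainingArguments matching the hyperparameters in the
# README diff. output_dir, learning_rate, and per_device_train_batch_size are
# assumptions; only total_train_batch_size=16 with 4 accumulation steps is
# stated above (16 / 4 = 4 per device).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="shawgpt-ft",        # assumed; matches the model card title
    seed=42,
    per_device_train_batch_size=4,  # assumed: 4 * grad accumulation 4 = total 16
    gradient_accumulation_steps=4,
    learning_rate=2e-4,             # assumed; not shown in this diff hunk
    lr_scheduler_type="linear",
    warmup_steps=2,
    num_train_epochs=10,
    fp16=True,                      # "Native AMP" mixed precision
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08 corresponds to the
    # default adam_beta1 / adam_beta2 / adam_epsilon values.
)
```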
adapter_config.json CHANGED

@@ -20,7 +20,7 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
-    "
+    "attn.c_attn"
   ],
   "task_type": "CAUSAL_LM",
   "use_dora": false,
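The `target_modules` entry now points at `attn.c_attn`, GPT-2's fused query/key/value projection. A hedged sketch of how such an adapter configuration might be declared with PEFT is shown below; the rank, alpha, and dropout values are assumptions, since they are not part of this hunk.

```python
# Sketch of a PEFT LoRA setup targeting GPT-2's fused QKV projection
# (attn.c_attn), as in the updated adapter_config.json. r, lora_alpha, and
# lora_dropout are assumptions not shown in the diff.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained("openai-community/gpt2")

lora_config = LoraConfig(
    task_type="CAUSAL_LM",
    target_modules=["attn.c_attn"],
    r=8,                # assumed rank
    lora_alpha=16,      # assumed scaling
    lora_dropout=0.05,  # assumed dropout
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the LoRA matrices are trainable
```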
adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:7699930acfcbb3f63e02ed2eec801f308352c1b7598181911c47f879615a93cd
+size 1182680
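The LFS pointer above now references the uploaded LoRA adapter weights (about 1.2 MB). A typical way to use such an adapter is to load it on top of the base model with PEFT; the sketch below assumes this repo id from the page header and uses an illustrative prompt.

```python
# Usage sketch: load the published adapter on top of the gpt2 base model.
# The repo id is taken from the page header; the prompt is illustrative.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("openai-community/gpt2")
base_model = AutoModelForCausalLM.from_pretrained("openai-community/gpt2")
model = PeftModel.from_pretrained(base_model, "vignesh2404/vignesh-gpt")
model.eval()

inputs = tokenizer("Hello, world!", return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=30,
        pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no dedicated pad token
    )
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```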
runs/Dec03_17-22-20_decoders-HP-Z2-Tower-G9-Workstation-Desktop-PC/events.out.tfevents.1733226740.decoders-HP-Z2-Tower-G9-Workstation-Desktop-PC.242060.0 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5b8e2da5908d988a7935d61f6876eb86071ae5b8ea495ba4fda0392abb7b24bf
+size 10272
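The added file is a TensorBoard event log from the training run. If you want to read the logged scalars without launching TensorBoard, something like the sketch below works; the local directory path and the tag name are assumptions about how the run was logged.

```python
# Sketch: read logged scalars from the added TensorBoard event file.
# The local path and the tag name are assumptions.
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

run_dir = "runs/Dec03_17-22-20_decoders-HP-Z2-Tower-G9-Workstation-Desktop-PC"  # assumed local checkout path
accumulator = EventAccumulator(run_dir)
accumulator.Reload()

print(accumulator.Tags()["scalars"])            # available scalar tags
for event in accumulator.Scalars("eval/loss"):  # assumed tag name
    print(event.step, event.value)
```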
training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:368acd6aa94c97d36e51a0616003ee5154600fa692ff01938ec9e416adfdfe4a
 size 5240
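training_args.bin is the pickled `TrainingArguments` object that the Trainer saves alongside the model. After downloading it, the full training configuration can be inspected with `torch.load`; the local filename below is an assumption.

```python
# Sketch: inspect the saved TrainingArguments from training_args.bin.
# The local path is an assumption; weights_only=False is needed because the
# file is a pickled Python object, not a plain tensor checkpoint.
import torch

args = torch.load("training_args.bin", weights_only=False)
print(args.num_train_epochs, args.lr_scheduler_type, args.warmup_steps)
```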