End of training

Files changed (6)

README.md CHANGED

@@ -14,8 +14,6 @@ should probably proofread and complete it, then remove this comment. -->
 # trained_model
 This model is a fine-tuned version of [google/flan-t5-large](https://huggingface.co/google/flan-t5-large) on an unknown dataset.
-It achieves the following results on the evaluation set:
-- Loss: 0.9780
 ## Model description
@@ -34,23 +32,21 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 0.0005811623641719214
-- train_batch_size: 8
 - eval_batch_size: 8
 - seed: 42
-- gradient_accumulation_steps: 7
-- total_train_batch_size: 56
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs: 2
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| No log        | 0.52  | 100  | 1.0140          |
-| No log        | 1.05  | 200  | 0.9792          |
-| No log        | 1.57  | 300  | 0.9780          |
 ### Framework versions

 # trained_model
 This model is a fine-tuned version of [google/flan-t5-large](https://huggingface.co/google/flan-t5-large) on an unknown dataset.
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 0.0006741100367675095
+- train_batch_size: 52
 - eval_batch_size: 8
 - seed: 42
+- gradient_accumulation_steps: 5
+- total_train_batch_size: 260
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
+- num_epochs: 4
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| No log        | 2.43  | 100  | 0.9711          |
 ### Framework versions
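
Note: the `total_train_batch_size` values on both sides of this diff are consistent with the per-device batch size multiplied by `gradient_accumulation_steps`. A minimal sketch of that arithmetic follows. It assumes a single training device, since the device count is not shown in the diff, and `effective_batch_size` is a hypothetical helper name, not part of any library.

```python
def effective_batch_size(per_device_batch, grad_accum_steps, num_devices=1):
    """Number of samples consumed per optimizer step.

    num_devices=1 is an assumption; the README does not state the device count.
    """
    return per_device_batch * grad_accum_steps * num_devices


# Values taken from the removed (old) and added (new) sides of the README diff.
old_total = effective_batch_size(per_device_batch=8, grad_accum_steps=7)
new_total = effective_batch_size(per_device_batch=52, grad_accum_steps=5)

print(old_total, new_total)
```

Under that assumption the results match the logged `total_train_batch_size` entries for the old and new runs.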

adapter_config.json CHANGED

@@ -16,8 +16,8 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
-    "q",
-    "v"
   ],
   "task_type": "SEQ_2_SEQ_LM"
 }

   "rank_pattern": {},
   "revision": null,
   "target_modules": [
+    "v",
+    "q"
   ],
   "task_type": "SEQ_2_SEQ_LM"
 }
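
Note: only the order of the `target_modules` entries changes in this file; the entries themselves are the same. The reordering is most likely a serialization artifact rather than a configuration change, though that is an assumption this diff does not confirm. A quick order-insensitive comparison of the two sides, using the values shown above:

```python
# target_modules as they appear on the removed and added sides of the diff.
old_modules = ["q", "v"]
new_modules = ["v", "q"]

# Compare as sets so that ordering differences are ignored.
same_modules = set(old_modules) == set(new_modules)

print(same_modules)
```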

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:6d2554237ab2c5c5fcb57ac013a6288d5504accf79ebc68d638e29ad54e2bad2
 size 18915328

 version https://git-lfs.github.com/spec/v1
+oid sha256:4d1b2057d531f2a5429fcc79d54a8884c543e3f0e7359a624ad94a407a665485
 size 18915328
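
Note: the entries above are Git LFS pointer files, not the weights themselves. The `oid` changes while `size` stays the same, which is what one would expect when adapter tensors of identical shape are retrained. A small sketch for comparing two pointers follows; `parse_lfs_pointer` is a hypothetical helper written for this example, and the pointer text is copied from the diff.

```python
def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    return fields


old_pointer = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:6d2554237ab2c5c5fcb57ac013a6288d5504accf79ebc68d638e29ad54e2bad2
size 18915328
""")

new_pointer = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:4d1b2057d531f2a5429fcc79d54a8884c543e3f0e7359a624ad94a407a665485
size 18915328
""")

# A different oid means different file contents; size is compared separately.
content_changed = old_pointer["oid"] != new_pointer["oid"]
same_size = old_pointer["size"] == new_pointer["size"]

print(content_changed, same_size)
```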

runs/Nov06_02-21-32_288b75d43cd4/events.out.tfevents.1699237305.288b75d43cd4.1909.1 ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:b037751356eaea6c8c967fa225a104b9acef35af6d7a41205921ca7dc6b37409
+size 5394

trainer_state.json CHANGED

@@ -1,52 +1,36 @@
 {
   "best_metric": null,
   "best_model_checkpoint": null,
-  "epoch": 1.9970126960418222,
   "eval_steps": 100,
-  "global_step": 382,
   "is_hyper_param_search": false,
   "is_local_process_zero": true,
   "is_world_process_zero": true,
   "log_history": [
     {
-      "epoch": 0.52,
-      "eval_loss": 1.0140411853790283,
-      "eval_runtime": 1465.4694,
-      "eval_samples_per_second": 7.31,
-      "eval_steps_per_second": 0.914,
       "step": 100
     },
     {
-      "epoch": 1.05,
-      "eval_loss": 0.9791701436042786,
-      "eval_runtime": 1462.2287,
-      "eval_samples_per_second": 7.326,
-      "eval_steps_per_second": 0.916,
-      "step": 200
-    },
-    {
-      "epoch": 1.57,
-      "eval_loss": 0.9779573678970337,
-      "eval_runtime": 1465.9156,
-      "eval_samples_per_second": 7.308,
-      "eval_steps_per_second": 0.914,
-      "step": 300
-    },
-    {
-      "epoch": 2.0,
-      "step": 382,
-      "total_flos": 4.961368405337702e+16,
-      "train_loss": 0.8977117887966296,
-      "train_runtime": 14787.7789,
-      "train_samples_per_second": 1.449,
-      "train_steps_per_second": 0.026
     }
   ],
   "logging_steps": 500,
-  "max_steps": 382,
-  "num_train_epochs": 2,
   "save_steps": 500,
-  "total_flos": 4.961368405337702e+16,
   "trial_name": null,
   "trial_params": null
 }

 {
   "best_metric": null,
   "best_model_checkpoint": null,
+  "epoch": 3.9805825242718447,
   "eval_steps": 100,
+  "global_step": 164,
   "is_hyper_param_search": false,
   "is_local_process_zero": true,
   "is_world_process_zero": true,
   "log_history": [
     {
+      "epoch": 2.43,
+      "eval_loss": 0.9710711240768433,
+      "eval_runtime": 1487.3043,
+      "eval_samples_per_second": 7.203,
+      "eval_steps_per_second": 0.901,
       "step": 100
     },
     {
+      "epoch": 3.98,
+      "step": 164,
+      "total_flos": 9.889339416772608e+16,
+      "train_loss": 1.1787047967678164,
+      "train_runtime": 21689.2536,
+      "train_samples_per_second": 1.976,
+      "train_steps_per_second": 0.008
     }
   ],
   "logging_steps": 500,
+  "max_steps": 164,
+  "num_train_epochs": 4,
   "save_steps": 500,
+  "total_flos": 9.889339416772608e+16,
   "trial_name": null,
   "trial_params": null
 }
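
Note: the new run takes fewer optimizer steps (164 versus 382) despite training for more epochs, because each step now covers a larger effective batch. The logged `train_steps_per_second` values are consistent with `global_step / train_runtime` rounded to three decimals. The sketch below checks that relationship with the numbers from both sides of this diff. The dictionary layout and the helper names are illustrative and do not reflect the structure of `trainer_state.json`.

```python
# Summary values copied from the final log_history entry on each side of the diff.
old_run = {"global_step": 382, "train_runtime": 14787.7789, "logged_steps_per_second": 0.026}
new_run = {"global_step": 164, "train_runtime": 21689.2536, "logged_steps_per_second": 0.008}


def steps_per_second(run):
    """Optimizer steps per second, rounded to three decimals like the logged value."""
    return round(run["global_step"] / run["train_runtime"], 3)


def seconds_per_step(run):
    """Average wall-clock seconds spent per optimizer step."""
    return run["train_runtime"] / run["global_step"]


for name, run in (("old", old_run), ("new", new_run)):
    matches = steps_per_second(run) == run["logged_steps_per_second"]
    print(name, steps_per_second(run), round(seconds_per_step(run), 1), matches)
```

Seconds per optimizer step is the more readable figure when comparing the two runs, since the rounded steps-per-second values carry only one or two significant digits.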

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:78c9fe5062705c374b5b532489d2b6c07ba7a6c5eec5526fda5268bc5595c0a7
 size 4536

 version https://git-lfs.github.com/spec/v1
+oid sha256:3afdb64a91e6b71d89a1d36421804cef1fa3e521341ea328400dcf79b3449517
 size 4536