Ai-Marshal/5k_5epoch

Files changed (4)

README.md CHANGED

@@ -19,6 +19,8 @@ should probably proofread and complete it, then remove this comment. -->
 # Mixtral_Alpace_v2
 This model is a fine-tuned version of [mistralai/Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1) on the generator dataset.
 ## Model description
@@ -44,15 +46,22 @@ The following hyperparameters were used during training:
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 0.03
-- training_steps: 1
 ### Training results
 ### Framework versions
-- PEFT 0.8.2
 - Transformers 4.38.1
 - Pytorch 2.2.1+cu121
 - Datasets 2.17.1

 # Mixtral_Alpace_v2
 This model is a fine-tuned version of [mistralai/Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1) on the generator dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.4450
 ## Model description
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 0.03
+- num_epochs: 5
 ### Training results
+| Training Loss | Epoch | Step | Validation Loss |
+|:-------------:|:-----:|:----:|:---------------:|
+| 0.5074        | 1.0   | 198  | 0.4895          |
+| 0.4477        | 2.0   | 396  | 0.4579          |
+| 0.4243        | 3.0   | 594  | 0.4498          |
+| 0.5102        | 4.0   | 792  | 0.4464          |
+| 0.4003        | 5.0   | 990  | 0.4450          |
 ### Framework versions
+- PEFT 0.9.0
 - Transformers 4.38.1
 - Pytorch 2.2.1+cu121
 - Datasets 2.17.1
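
The training results table added on the new side can be sanity-checked with a few lines of Python. This is a minimal sketch that uses only the rows shown in the diff; the variable names are illustrative and not part of the repository.

```python
# Rows copied from the "Training results" table in the README diff:
# (training_loss, epoch, step, validation_loss)
rows = [
    (0.5074, 1.0, 198, 0.4895),
    (0.4477, 2.0, 396, 0.4579),
    (0.4243, 3.0, 594, 0.4498),
    (0.5102, 4.0, 792, 0.4464),
    (0.4003, 5.0, 990, 0.4450),
]

# Optimizer steps per epoch, derived independently from every row.
steps_per_epoch = {round(step / epoch) for _, epoch, step, _ in rows}

# Check whether validation loss strictly decreases from epoch to epoch.
val_losses = [row[3] for row in rows]
val_monotonic = all(a > b for a, b in zip(val_losses, val_losses[1:]))

# Row with the lowest validation loss.
best = min(rows, key=lambda row: row[3])

print(steps_per_epoch, val_monotonic, best)
```

Every row gives the same steps-per-epoch value, validation loss falls at each epoch, and the lowest value is the final-epoch 0.4450 reported as the evaluation loss. The logged training loss is noisier: it rises to 0.5102 at epoch 4 before dropping again.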

adapter_config.json CHANGED

@@ -19,15 +19,16 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
-    "down_proj",
     "k_proj",
-    "gate_proj",
     "q_proj",
-    "o_proj",
-    "v_proj",
     "up_proj",
-    "lm_head"
   ],
   "task_type": "CAUSAL_LM",
   "use_rslora": false
 }

   "rank_pattern": {},
   "revision": null,
   "target_modules": [
+    "v_proj",
     "k_proj",
     "q_proj",
     "up_proj",
+    "lm_head",
+    "down_proj",
+    "gate_proj",
+    "o_proj"
   ],
   "task_type": "CAUSAL_LM",
+  "use_dora": false,
   "use_rslora": false
 }
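
The `target_modules` hunk looks like a large change, but comparing the two sides as sets suggests it is only a reordering. A minimal sketch, with both lists copied from the diff above:

```python
# target_modules as listed on the old side of the diff.
old_targets = {
    "down_proj", "k_proj", "gate_proj", "q_proj",
    "o_proj", "v_proj", "up_proj", "lm_head",
}

# target_modules as listed on the new side of the diff.
new_targets = {
    "v_proj", "k_proj", "q_proj", "up_proj",
    "lm_head", "down_proj", "gate_proj", "o_proj",
}

added = new_targets - old_targets      # modules present only in the new config
removed = old_targets - new_targets    # modules present only in the old config

print(sorted(added), sorted(removed), len(new_targets))
```

Both differences are empty, so the same eight modules are targeted before and after. The only substantive change in this file is the new `"use_dora": false` key, which is consistent with the PEFT 0.8.2 to 0.9.0 bump in the README. The reordering would be explained if PEFT serialises `target_modules` from an unordered set, but that is an assumption and not something the diff confirms.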

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:c3b3409edb1dbf2b32c10b1af5fb5d015b104c3d19b079c5dd3f7e35795c676c
-size 56875144

 version https://git-lfs.github.com/spec/v1
+oid sha256:39e10d29f4277e5bfcdfa414b4c79e7cff02dfec182fa1a6d72892f436a23fca
+size 581163872
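
The two Git LFS pointers above record only a hash and a byte count, so the size change is easiest to read after parsing them. The sketch below uses a small hypothetical helper, `parse_lfs_pointer`, written for this note; it assumes each pointer line is a key and a value separated by a single space, as in the pointers shown.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into a dict (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        # Each line is "<key> <value>"; split on the first space only.
        key, _, value = line.partition(" ")
        fields[key] = value
    fields["size"] = int(fields["size"])  # size is given in bytes
    return fields


old_pointer = "\n".join([
    "version https://git-lfs.github.com/spec/v1",
    "oid sha256:c3b3409edb1dbf2b32c10b1af5fb5d015b104c3d19b079c5dd3f7e35795c676c",
    "size 56875144",
])
new_pointer = "\n".join([
    "version https://git-lfs.github.com/spec/v1",
    "oid sha256:39e10d29f4277e5bfcdfa414b4c79e7cff02dfec182fa1a6d72892f436a23fca",
    "size 581163872",
])

old = parse_lfs_pointer(old_pointer)
new = parse_lfs_pointer(new_pointer)

# How many times larger the new adapter file is than the old one.
ratio = new["size"] / old["size"]
print(f"{old['size'] / 1e6:.1f} MB -> {new['size'] / 1e6:.1f} MB ({ratio:.2f}x)")
```

The adapter grows from roughly 56.9 MB to roughly 581.2 MB, a little over ten times larger. The diff does not show what caused the increase.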

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:bcaaca2609d3d8e304562764b32ecca1ada368b9a9f89aa08774715ce259cba5
 size 4856

 version https://git-lfs.github.com/spec/v1
+oid sha256:822b1d19df4c767ef76685b8b6db5511c50ae1bedb392a51d35199926cfb8d5b
 size 4856