Model save
README.md
CHANGED
@@ -18,7 +18,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [meta-llama/Llama-3.1-70B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.
+- Loss: 1.8331
 
 ## Model description
 
@@ -53,8 +53,8 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| No log | 1.0 | 1 | 1.
-| No log | 2.0 | 2 | 1.
+| No log | 1.0 | 1 | 1.9804 |
+| No log | 2.0 | 2 | 1.8331 |
 
 
 ### Framework versions
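The updated card fills in the final evaluation loss (1.8331 after two epochs). As a hedged usage sketch, assuming the standard `transformers`/`peft` loading path (the adapter repo id below is a placeholder, not this repo's actual name):

```python
# Minimal sketch, assuming `transformers` and `peft` are installed.
# "your-org/your-lora-adapter" is a placeholder for this repo's id.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-3.1-70B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(base, "your-org/your-lora-adapter")
```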
adapter_config.json
CHANGED
@@ -26,10 +26,10 @@
     "up_proj",
     "down_proj",
     "v_proj",
-    "gate_proj",
-    "o_proj",
     "q_proj",
-    "k_proj"
+    "k_proj",
+    "gate_proj",
+    "o_proj"
   ],
   "task_type": "CAUSAL_LM",
   "use_dora": false,
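The only substantive change here is the order of `target_modules`; PEFT matches module names against the list as a set, so the reordering does not change which projection layers carry LoRA weights. A sketch of a `LoraConfig` that would serialize to these fields (rank and alpha do not appear in the diff, so the values below are placeholders):

```python
from peft import LoraConfig

# Sketch only: r and lora_alpha are placeholders; the diff shows
# target_modules, task_type, and use_dora but not the remaining fields.
config = LoraConfig(
    r=16,            # placeholder rank
    lora_alpha=32,   # placeholder scaling factor
    target_modules=[
        "up_proj", "down_proj", "v_proj",
        "q_proj", "k_proj", "gate_proj", "o_proj",
    ],
    task_type="CAUSAL_LM",
    use_dora=False,
)
```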
adapter_model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:b1a9d08cda6d175af2ff1ade011f87902ef2b31bc2b4d8d3961b23836aeab469
 size 1656902648
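This file is a Git LFS pointer, not the weights themselves: three lines giving the spec version, the SHA-256 of the stored blob, and its size in bytes (here about 1.65 GB). A sketch for verifying a downloaded file against the pointer's oid:

```python
import hashlib

def lfs_sha256(path: str) -> str:
    """Stream a file and return its SHA-256 hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

# Should print the oid recorded in the pointer:
# b1a9d08cda6d175af2ff1ade011f87902ef2b31bc2b4d8d3961b23836aeab469
print(lfs_sha256("adapter_model.safetensors"))
```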
logs/training_log.txt
CHANGED
@@ -37,3 +37,22 @@
 2025-01-08 18:44:59,225 - INFO - Model saved to mistral-sft-lora-fsdp2/checkpoint-2/pytorch_model_fsdp_0
 2025-01-08 18:45:05,105 - INFO - Saving Optimizer state to mistral-sft-lora-fsdp2/checkpoint-2/optimizer_0
 2025-01-08 18:45:11,104 - INFO - Optimizer state saved in mistral-sft-lora-fsdp2/checkpoint-2/optimizer_0
+2025-01-08 18:45:36,527 - INFO - Loss improved from 1.98041 to 1.83309
+2025-01-08 18:45:36,527 - INFO - Loss improved from 1.98041 to 1.83309
+2025-01-08 18:45:36,527 - INFO - Loss improved from 1.98041 to 1.83309
+2025-01-08 18:45:36,528 - INFO - Step 2/2 (100.0%), epoch: 2.0000, step_time: 501.73s, elapsed_time: 1073.05s
+2025-01-08 18:45:36,529 - INFO - Evaluation Results:
+eval_loss: 1.8331
+eval_runtime: 25.1685
+eval_samples_per_second: 0.3180
+eval_steps_per_second: 0.0790
+epoch: 2.0000
+elapsed_time: 1073.05s
+step_time: 501.73s
+2025-01-08 18:45:36,529 - INFO - Loss improved from 1.98041 to 1.83309
+2025-01-08 18:48:59,163 - INFO - Saving model to mistral-sft-lora-fsdp2/checkpoint-2/pytorch_model_fsdp_0
+2025-01-08 18:49:02,615 - INFO - Model saved to mistral-sft-lora-fsdp2/checkpoint-2/pytorch_model_fsdp_0
+2025-01-08 18:49:08,850 - INFO - Saving Optimizer state to mistral-sft-lora-fsdp2/checkpoint-2/optimizer_0
+2025-01-08 18:49:15,280 - INFO - Optimizer state saved in mistral-sft-lora-fsdp2/checkpoint-2/optimizer_0
+2025-01-08 18:49:15,799 - INFO - Step 2/2 (100.0%), epoch: 2.0000, step_time: 219.27s, elapsed_time: 1292.32s
+2025-01-08 18:49:15,801 - INFO - Training completed in 1292.32 seconds
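Two details worth noting in the new log lines: the "Loss improved" message repeats with identical timestamps, which is consistent with each FSDP rank logging independently (an assumption; the log itself does not say so), and the eval figures are self-consistent, since 0.318 samples/s × 25.17 s of runtime ≈ 8 evaluation samples. A sketch for pulling the loss trajectory out of a log shaped like this one:

```python
import re

# Sketch: extract "Loss improved from X to Y" events from the log above.
pattern = re.compile(r"Loss improved from ([\d.]+) to ([\d.]+)")
with open("logs/training_log.txt") as f:
    for line in f:
        m = pattern.search(line)
        if m:
            prev, curr = map(float, m.groups())
            print(f"eval loss: {prev:.5f} -> {curr:.5f}")
```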
runs/Jan08_18-27-42_gpu-server/events.out.tfevents.1736361245.gpu-server.1036251.0
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:23dda5dd66595f86c984462556a1003e2f338a7ffc3bf2c5606502ca86a2c3eb
+size 6487
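The `runs/` entry is the TensorBoard event file for this run, also stored as an LFS pointer (only 6487 bytes, so it holds just a handful of scalars). A sketch for inspecting it once fetched, assuming the `tensorboard` package is installed:

```python
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

# Point the accumulator at the run directory and dump the logged scalars.
acc = EventAccumulator("runs/Jan08_18-27-42_gpu-server")
acc.Reload()
for tag in acc.Tags().get("scalars", []):
    for event in acc.Scalars(tag):
        print(tag, event.step, event.value)
```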