Adil1567 committed on
Commit efaeb13 · verified · 1 Parent(s): f4a5a66

Model save
README.md CHANGED
@@ -18,7 +18,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [meta-llama/Llama-3.1-70B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.8343
+- Loss: 1.8331
 
 ## Model description
 
@@ -53,8 +53,8 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| No log | 1.0 | 1 | 1.9798 |
-| No log | 2.0 | 2 | 1.8343 |
+| No log | 1.0 | 1 | 1.9804 |
+| No log | 2.0 | 2 | 1.8331 |
 
 
 ### Framework versions
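The updated README table records the validation loss dropping from 1.9804 at epoch 1 to 1.8331 at epoch 2. A quick sanity check of the absolute and relative improvement, using only the values shown in the diff:

```python
# Validation losses per epoch, copied from the updated README table above
losses = {1: 1.9804, 2: 1.8331}

# Absolute and relative improvement from epoch 1 to epoch 2
delta = losses[1] - losses[2]
pct = 100 * delta / losses[1]
print(f"absolute drop: {delta:.4f}, relative: {pct:.2f}%")
```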
adapter_config.json CHANGED
@@ -26,10 +26,10 @@
         "up_proj",
         "down_proj",
         "v_proj",
-        "gate_proj",
-        "o_proj",
         "q_proj",
-        "k_proj"
+        "k_proj",
+        "gate_proj",
+        "o_proj"
     ],
     "task_type": "CAUSAL_LM",
     "use_dora": false,
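The adapter_config.json change only reorders the `target_modules` list; PEFT matches modules by name rather than by list position, so this reordering alone should not change which projection layers receive LoRA adapters. A minimal check (plain Python, no PEFT required) that the before/after lists cover the same modules:

```python
# target_modules before and after this commit, copied from the diff above
before = ["up_proj", "down_proj", "v_proj", "gate_proj", "o_proj", "q_proj", "k_proj"]
after = ["up_proj", "down_proj", "v_proj", "q_proj", "k_proj", "gate_proj", "o_proj"]

# Order differs, but the set of adapted layers is identical
print(sorted(before) == sorted(after))
```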
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:08a13c9bcf2f18a7fc083503c10c21f34393d99111b5deca6147519caae65646
+oid sha256:b1a9d08cda6d175af2ff1ade011f87902ef2b31bc2b4d8d3961b23836aeab469
 size 1656902648
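adapter_model.safetensors is stored as a Git LFS pointer: a small text stub holding the spec version, the SHA-256 of the real payload, and its byte size. Here the oid changed while the size stayed at 1,656,902,648 bytes, consistent with retrained adapter weights of the same shape. A minimal sketch of parsing such a pointer (format per the git-lfs spec URL embedded in the pointer itself):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into a key -> value dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

# Pointer contents after this commit, copied from the diff above
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:b1a9d08cda6d175af2ff1ade011f87902ef2b31bc2b4d8d3961b23836aeab469
size 1656902648"""

fields = parse_lfs_pointer(pointer)
print(fields["oid"], fields["size"])
```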
logs/training_log.txt CHANGED
@@ -37,3 +37,22 @@
 2025-01-08 18:44:59,225 - INFO - Model saved to mistral-sft-lora-fsdp2/checkpoint-2/pytorch_model_fsdp_0
 2025-01-08 18:45:05,105 - INFO - Saving Optimizer state to mistral-sft-lora-fsdp2/checkpoint-2/optimizer_0
 2025-01-08 18:45:11,104 - INFO - Optimizer state saved in mistral-sft-lora-fsdp2/checkpoint-2/optimizer_0
+2025-01-08 18:45:36,527 - INFO - Loss improved from 1.98041 to 1.83309
+2025-01-08 18:45:36,527 - INFO - Loss improved from 1.98041 to 1.83309
+2025-01-08 18:45:36,527 - INFO - Loss improved from 1.98041 to 1.83309
+2025-01-08 18:45:36,528 - INFO - Step 2/2 (100.0%), epoch: 2.0000, step_time: 501.73s, elapsed_time: 1073.05s
+2025-01-08 18:45:36,529 - INFO - Evaluation Results:
+eval_loss: 1.8331
+eval_runtime: 25.1685
+eval_samples_per_second: 0.3180
+eval_steps_per_second: 0.0790
+epoch: 2.0000
+elapsed_time: 1073.05s
+step_time: 501.73s
+2025-01-08 18:45:36,529 - INFO - Loss improved from 1.98041 to 1.83309
+2025-01-08 18:48:59,163 - INFO - Saving model to mistral-sft-lora-fsdp2/checkpoint-2/pytorch_model_fsdp_0
+2025-01-08 18:49:02,615 - INFO - Model saved to mistral-sft-lora-fsdp2/checkpoint-2/pytorch_model_fsdp_0
+2025-01-08 18:49:08,850 - INFO - Saving Optimizer state to mistral-sft-lora-fsdp2/checkpoint-2/optimizer_0
+2025-01-08 18:49:15,280 - INFO - Optimizer state saved in mistral-sft-lora-fsdp2/checkpoint-2/optimizer_0
+2025-01-08 18:49:15,799 - INFO - Step 2/2 (100.0%), epoch: 2.0000, step_time: 219.27s, elapsed_time: 1292.32s
+2025-01-08 18:49:15,801 - INFO - Training completed in 1292.32 seconds
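The appended log lines repeat the same "Loss improved" message several times (plausibly once per rank or callback under FSDP) and emit the evaluation metrics as bare `key: value` lines rather than as structured output. A small sketch of recovering those metrics from such a log chunk with a regex, using an excerpt copied from the diff above:

```python
import re

# Excerpt of the evaluation block from logs/training_log.txt above
log = """2025-01-08 18:45:36,529 - INFO - Evaluation Results:
eval_loss: 1.8331
eval_runtime: 25.1685
eval_samples_per_second: 0.3180
eval_steps_per_second: 0.0790
epoch: 2.0000"""

# Match bare "key: number" lines; timestamped INFO lines fail the pattern
metrics = {k: float(v) for k, v in re.findall(r"^(\w+):\s*([\d.]+)$", log, re.M)}
print(metrics["eval_loss"])
```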
runs/Jan08_18-27-42_gpu-server/events.out.tfevents.1736361245.gpu-server.1036251.0 CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:05f4fd9fe7a80d6e1c4c5a86f03cb20099d95e1a97be4d604cd0ff2fd23d5717
-size 5873
+oid sha256:23dda5dd66595f86c984462556a1003e2f338a7ffc3bf2c5606502ca86a2c3eb
+size 6487