sedrickkeh committed
Commit 8528f4a · verified · 1 Parent(s): a74468c

Model save

Files changed (2)
  1. README.md +152 -0
  2. generation_config.json +9 -0
README.md ADDED
@@ -0,0 +1,152 @@
+ ---
+ library_name: transformers
+ license: llama3.1
+ base_model: meta-llama/Meta-Llama-3.1-8B
+ tags:
+ - llama-factory
+ - generated_from_trainer
+ model-index:
+ - name: oh_scale_x.125_compute_equal
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # oh_scale_x.125_compute_equal
+
+ This model is a fine-tuned version of [meta-llama/Meta-Llama-3.1-8B](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B) on an unknown dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 2.0839
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 5e-06
+ - train_batch_size: 8
+ - eval_batch_size: 8
+ - seed: 42
+ - distributed_type: multi-GPU
+ - num_devices: 8
+ - gradient_accumulation_steps: 8
+ - total_train_batch_size: 512
+ - total_eval_batch_size: 64
49
+ - lr_scheduler_type: constant
50
+ - num_epochs: 89.0
51
+
52
+ ### Training results
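+ For anyone reproducing this setup outside LLaMA-Factory, the sketch below maps the list above onto Hugging Face `TrainingArguments`. It is an inferred illustration, not the exact launch configuration used for this run: `output_dir` is a placeholder and `bf16` is an assumption about the mixed-precision mode, which the card does not state.
+
+ ```python
+ from transformers import TrainingArguments
+
+ # Hedged reconstruction of the hyperparameters listed above.
+ # Effective train batch size: 8 per device x 8 devices x 8 accumulation steps = 512.
+ training_args = TrainingArguments(
+     output_dir="oh_scale_x.125_compute_equal",  # placeholder
+     learning_rate=5e-6,
+     per_device_train_batch_size=8,
+     per_device_eval_batch_size=8,
+     gradient_accumulation_steps=8,
+     seed=42,
+     optim="adamw_torch",
+     adam_beta1=0.9,
+     adam_beta2=0.999,
+     adam_epsilon=1e-8,
+     lr_scheduler_type="constant",
+     num_train_epochs=89.0,
+     bf16=True,  # assumption: precision mode is not stated on the card
+ )
+ ```
+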
53
+
54
+ | Training Loss | Epoch | Step | Validation Loss |
55
+ |:-------------:|:-------:|:----:|:---------------:|
56
+ | 0.8588 | 0.9973 | 47 | 0.8431 |
57
+ | 0.7685 | 1.9947 | 94 | 0.8078 |
58
+ | 0.7039 | 2.9920 | 141 | 0.8061 |
59
+ | 0.6431 | 3.9894 | 188 | 0.8146 |
60
+ | 0.6047 | 4.9867 | 235 | 0.8365 |
61
+ | 0.5574 | 5.9841 | 282 | 0.8701 |
62
+ | 0.5092 | 6.9814 | 329 | 0.8984 |
63
+ | 0.4572 | 8.0 | 377 | 0.9556 |
64
+ | 0.4085 | 8.9973 | 424 | 1.0193 |
65
+ | 0.349 | 9.9947 | 471 | 1.1014 |
66
+ | 0.2917 | 10.9920 | 518 | 1.1841 |
67
+ | 0.2371 | 11.9894 | 565 | 1.2766 |
68
+ | 0.1947 | 12.9867 | 612 | 1.4154 |
69
+ | 0.1574 | 13.9841 | 659 | 1.5165 |
70
+ | 0.1248 | 14.9814 | 706 | 1.6125 |
71
+ | 0.0949 | 16.0 | 754 | 1.7871 |
72
+ | 0.072 | 16.9973 | 801 | 1.8431 |
73
+ | 0.0557 | 17.9947 | 848 | 1.8931 |
74
+ | 0.0476 | 18.9920 | 895 | 1.8831 |
75
+ | 0.0389 | 19.9894 | 942 | 2.0265 |
76
+ | 0.0326 | 20.9867 | 989 | 2.0191 |
77
+ | 0.0289 | 21.9841 | 1036 | 2.0776 |
78
+ | 0.0241 | 22.9814 | 1083 | 2.1365 |
79
+ | 0.0224 | 24.0 | 1131 | 2.1633 |
80
+ | 0.0186 | 24.9973 | 1178 | 2.1493 |
81
+ | 0.0168 | 25.9947 | 1225 | 2.1881 |
82
+ | 0.0165 | 26.9920 | 1272 | 2.2118 |
83
+ | 0.0149 | 27.9894 | 1319 | 2.1890 |
84
+ | 0.0138 | 28.9867 | 1366 | 2.2228 |
85
+ | 0.0124 | 29.9841 | 1413 | 2.2381 |
86
+ | 0.0099 | 30.9814 | 1460 | 2.2632 |
87
+ | 0.0082 | 32.0 | 1508 | 2.3145 |
88
+ | 0.0074 | 32.9973 | 1555 | 2.3310 |
89
+ | 0.0063 | 33.9947 | 1602 | 2.2894 |
90
+ | 0.0058 | 34.9920 | 1649 | 2.3082 |
91
+ | 0.0051 | 35.9894 | 1696 | 2.3288 |
92
+ | 0.0048 | 36.9867 | 1743 | 2.3887 |
93
+ | 0.0047 | 37.9841 | 1790 | 2.3353 |
94
+ | 0.0046 | 38.9814 | 1837 | 2.3314 |
95
+ | 0.0046 | 40.0 | 1885 | 2.3529 |
96
+ | 0.0046 | 40.9973 | 1932 | 2.2960 |
97
+ | 0.0044 | 41.9947 | 1979 | 2.2470 |
98
+ | 0.0046 | 42.9920 | 2026 | 2.2445 |
99
+ | 0.0047 | 43.9894 | 2073 | 2.1857 |
100
+ | 0.0046 | 44.9867 | 2120 | 2.2821 |
101
+ | 0.0044 | 45.9841 | 2167 | 2.1947 |
102
+ | 0.0046 | 46.9814 | 2214 | 2.2448 |
103
+ | 0.0046 | 48.0 | 2262 | 2.2752 |
104
+ | 0.0045 | 48.9973 | 2309 | 2.1920 |
105
+ | 0.0043 | 49.9947 | 2356 | 2.2769 |
106
+ | 0.0046 | 50.9920 | 2403 | 2.1450 |
107
+ | 0.0047 | 51.9894 | 2450 | 2.1438 |
108
+ | 0.0045 | 52.9867 | 2497 | 2.2089 |
109
+ | 0.0046 | 53.9841 | 2544 | 2.1234 |
110
+ | 0.0043 | 54.9814 | 2591 | 2.0988 |
111
+ | 0.0042 | 56.0 | 2639 | 2.2262 |
112
+ | 0.0041 | 56.9973 | 2686 | 2.1830 |
113
+ | 0.0043 | 57.9947 | 2733 | 2.0565 |
114
+ | 0.0044 | 58.9920 | 2780 | 2.1350 |
115
+ | 0.0042 | 59.9894 | 2827 | 2.1475 |
116
+ | 0.004 | 60.9867 | 2874 | 2.1590 |
117
+ | 0.0039 | 61.9841 | 2921 | 2.1752 |
118
+ | 0.0043 | 62.9814 | 2968 | 2.0756 |
119
+ | 0.0038 | 64.0 | 3016 | 2.1629 |
120
+ | 0.0038 | 64.9973 | 3063 | 2.1522 |
121
+ | 0.0036 | 65.9947 | 3110 | 2.1449 |
122
+ | 0.0035 | 66.9920 | 3157 | 2.1889 |
123
+ | 0.0035 | 67.9894 | 3204 | 2.0248 |
124
+ | 0.0034 | 68.9867 | 3251 | 2.1538 |
125
+ | 0.0034 | 69.9841 | 3298 | 2.1202 |
126
+ | 0.0035 | 70.9814 | 3345 | 2.0326 |
127
+ | 0.0035 | 72.0 | 3393 | 2.1360 |
128
+ | 0.0036 | 72.9973 | 3440 | 2.1404 |
129
+ | 0.0036 | 73.9947 | 3487 | 2.0651 |
130
+ | 0.0035 | 74.9920 | 3534 | 2.0982 |
131
+ | 0.0033 | 75.9894 | 3581 | 2.1032 |
132
+ | 0.0034 | 76.9867 | 3628 | 2.1028 |
133
+ | 0.0032 | 77.9841 | 3675 | 2.1282 |
134
+ | 0.0031 | 78.9814 | 3722 | 2.0912 |
135
+ | 0.0035 | 80.0 | 3770 | 2.0766 |
136
+ | 0.0033 | 80.9973 | 3817 | 2.0286 |
137
+ | 0.0033 | 81.9947 | 3864 | 2.0421 |
138
+ | 0.0034 | 82.9920 | 3911 | 2.1121 |
139
+ | 0.0033 | 83.9894 | 3958 | 2.0832 |
140
+ | 0.0033 | 84.9867 | 4005 | 2.0629 |
141
+ | 0.0034 | 85.9841 | 4052 | 2.1398 |
142
+ | 0.0032 | 86.9814 | 4099 | 2.1203 |
143
+ | 0.0032 | 88.0 | 4147 | 2.1025 |
144
+ | 0.0035 | 88.7639 | 4183 | 2.0839 |
145
+
146
+
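+ The validation loss bottoms out at epoch 3 (0.8061) and then climbs steadily while the training loss collapses toward zero, the classic overfitting pattern for a constant learning rate held over 89 epochs; the final checkpoint's 2.0839 reflects that divergence. A minimal sketch for visualizing the curve from this card, assuming the card is saved locally as `README.md`:
+
+ ```python
+ import re
+
+ import matplotlib.pyplot as plt
+
+ # Pull (epoch, validation loss) pairs out of the markdown table above.
+ rows = []
+ with open("README.md", encoding="utf-8") as f:
+     for line in f:
+         m = re.match(r"\|\s*[\d.]+\s*\|\s*([\d.]+)\s*\|\s*\d+\s*\|\s*([\d.]+)\s*\|", line.strip())
+         if m:
+             rows.append((float(m.group(1)), float(m.group(2))))
+
+ epochs, val_loss = zip(*rows)
+ plt.plot(epochs, val_loss, marker=".")
+ plt.xlabel("epoch")
+ plt.ylabel("validation loss")
+ plt.show()
+ ```
+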
+ ### Framework versions
+
+ - Transformers 4.46.1
+ - PyTorch 2.3.0
+ - Datasets 3.1.0
+ - Tokenizers 0.20.3
generation_config.json ADDED
@@ -0,0 +1,9 @@
+ {
+   "_from_model_config": true,
+   "bos_token_id": 128000,
+   "do_sample": true,
+   "eos_token_id": 128001,
+   "temperature": 0.6,
+   "top_p": 0.9,
+   "transformers_version": "4.46.1"
+ }