Upload 4 files

Files changed (4)

README.md +40 -40
adapter_config.json +4 -4
adapter_model.safetensors +1 -1
training_args.bin +1 -1

README.md CHANGED

@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [beomi/gemma-ko-2b](https://huggingface.co/beomi/gemma-ko-2b) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.2933
 ## Model description
@@ -35,7 +35,7 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 5e-06
 - train_batch_size: 2
 - eval_batch_size: 2
 - seed: 42
@@ -51,44 +51,44 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 1.5845        | 0.26  | 20   | 1.5589          |
-| 1.4251        | 0.53  | 40   | 1.3136          |
-| 1.161         | 0.79  | 60   | 0.9521          |
-| 0.8854        | 1.05  | 80   | 0.7201          |
-| 0.6741        | 1.32  | 100  | 0.5536          |
-| 0.5857        | 1.58  | 120  | 0.4771          |
-| 0.5005        | 1.84  | 140  | 0.4275          |
-| 0.4307        | 2.11  | 160  | 0.3931          |
-| 0.4336        | 2.37  | 180  | 0.3706          |
-| 0.4179        | 2.63  | 200  | 0.3568          |
-| 0.4112        | 2.89  | 220  | 0.3459          |
-| 0.3462        | 3.16  | 240  | 0.3372          |
-| 0.3858        | 3.42  | 260  | 0.3322          |
-| 0.36          | 3.68  | 280  | 0.3248          |
-| 0.3117        | 3.95  | 300  | 0.3203          |
-| 0.3458        | 4.21  | 320  | 0.3158          |
-| 0.3454        | 4.47  | 340  | 0.3128          |
-| 0.3515        | 4.74  | 360  | 0.3147          |
-| 0.3313        | 5.0   | 380  | 0.3076          |
-| 0.3199        | 5.26  | 400  | 0.3067          |
-| 0.3006        | 5.53  | 420  | 0.3048          |
-| 0.3009        | 5.79  | 440  | 0.3017          |
-| 0.3285        | 6.05  | 460  | 0.3008          |
-| 0.3195        | 6.32  | 480  | 0.2995          |
-| 0.3543        | 6.58  | 500  | 0.3049          |
-| 0.3084        | 6.84  | 520  | 0.3037          |
-| 0.2809        | 7.11  | 540  | 0.2970          |
-| 0.28          | 7.37  | 560  | 0.2954          |
-| 0.3184        | 7.63  | 580  | 0.3062          |
-| 0.3017        | 7.89  | 600  | 0.2963          |
-| 0.3137        | 8.16  | 620  | 0.2943          |
-| 0.3046        | 8.42  | 640  | 0.2945          |
-| 0.2915        | 8.68  | 660  | 0.2947          |
-| 0.2808        | 8.95  | 680  | 0.2935          |
-| 0.2949        | 9.21  | 700  | 0.2926          |
-| 0.3169        | 9.47  | 720  | 0.2938          |
-| 0.2865        | 9.74  | 740  | 0.2939          |
-| 0.2978        | 10.0  | 760  | 0.2933          |
 ### Framework versions

 This model is a fine-tuned version of [beomi/gemma-ko-2b](https://huggingface.co/beomi/gemma-ko-2b) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.4381
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 5e-05
 - train_batch_size: 2
 - eval_batch_size: 2
 - seed: 42
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| 1.3751        | 0.26  | 20   | 0.9839          |
+| 0.7078        | 0.53  | 40   | 0.5444          |
+| 0.4456        | 0.79  | 60   | 0.3339          |
+| 0.3351        | 1.05  | 80   | 0.3121          |
+| 0.3153        | 1.32  | 100  | 0.2869          |
+| 0.319         | 1.58  | 120  | 0.2829          |
+| 0.2942        | 1.84  | 140  | 0.2747          |
+| 0.2376        | 2.11  | 160  | 0.2624          |
+| 0.222         | 2.37  | 180  | 0.2828          |
+| 0.2228        | 2.63  | 200  | 0.2766          |
+| 0.2474        | 2.89  | 220  | 0.2596          |
+| 0.1617        | 3.16  | 240  | 0.2756          |
+| 0.1555        | 3.42  | 260  | 0.2814          |
+| 0.1426        | 3.68  | 280  | 0.2802          |
+| 0.1313        | 3.95  | 300  | 0.2846          |
+| 0.1086        | 4.21  | 320  | 0.3006          |
+| 0.0975        | 4.47  | 340  | 0.3048          |
+| 0.103         | 4.74  | 360  | 0.3009          |
+| 0.0891        | 5.0   | 380  | 0.2962          |
+| 0.0627        | 5.26  | 400  | 0.3190          |
+| 0.0487        | 5.53  | 420  | 0.3338          |
+| 0.0609        | 5.79  | 440  | 0.3345          |
+| 0.0559        | 6.05  | 460  | 0.3346          |
+| 0.0326        | 6.32  | 480  | 0.3618          |
+| 0.0363        | 6.58  | 500  | 0.3490          |
+| 0.0334        | 6.84  | 520  | 0.3548          |
+| 0.0277        | 7.11  | 540  | 0.3791          |
+| 0.0301        | 7.37  | 560  | 0.3836          |
+| 0.0188        | 7.63  | 580  | 0.3937          |
+| 0.0203        | 7.89  | 600  | 0.3914          |
+| 0.0173        | 8.16  | 620  | 0.4117          |
+| 0.0141        | 8.42  | 640  | 0.4143          |
+| 0.0131        | 8.68  | 660  | 0.4305          |
+| 0.0141        | 8.95  | 680  | 0.4252          |
+| 0.0099        | 9.21  | 700  | 0.4335          |
+| 0.0096        | 9.47  | 720  | 0.4351          |
+| 0.008         | 9.74  | 740  | 0.4368          |
+| 0.008         | 10.0  | 760  | 0.4381          |
 ### Framework versions
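
The two validation-loss columns above behave differently: with `learning_rate: 5e-06` the loss is still near its minimum at step 760, while with `learning_rate: 5e-05` it bottoms out early and then climbs as the training loss falls toward zero. The sketch below is only a reading aid, not part of the commit. It copies the validation losses from both tables and reports the lowest value in each run. The helper name `best_checkpoint` and the 20-step logging interval (read off the Step column) are assumptions made for illustration.

```python
# Validation losses copied from the "-" rows (old run, learning_rate 5e-06).
OLD_VAL_LOSS = [
    1.5589, 1.3136, 0.9521, 0.7201, 0.5536, 0.4771, 0.4275, 0.3931,
    0.3706, 0.3568, 0.3459, 0.3372, 0.3322, 0.3248, 0.3203, 0.3158,
    0.3128, 0.3147, 0.3076, 0.3067, 0.3048, 0.3017, 0.3008, 0.2995,
    0.3049, 0.3037, 0.2970, 0.2954, 0.3062, 0.2963, 0.2943, 0.2945,
    0.2947, 0.2935, 0.2926, 0.2938, 0.2939, 0.2933,
]

# Validation losses copied from the "+" rows (new run, learning_rate 5e-05).
NEW_VAL_LOSS = [
    0.9839, 0.5444, 0.3339, 0.3121, 0.2869, 0.2829, 0.2747, 0.2624,
    0.2828, 0.2766, 0.2596, 0.2756, 0.2814, 0.2802, 0.2846, 0.3006,
    0.3048, 0.3009, 0.2962, 0.3190, 0.3338, 0.3345, 0.3346, 0.3618,
    0.3490, 0.3548, 0.3791, 0.3836, 0.3937, 0.3914, 0.4117, 0.4143,
    0.4305, 0.4252, 0.4335, 0.4351, 0.4368, 0.4381,
]


def best_checkpoint(losses, log_every=20):
    """Return (step, loss) for the lowest validation loss.

    Assumes one evaluation every `log_every` steps, starting at step
    `log_every`, which matches the Step column in the tables.
    """
    idx = min(range(len(losses)), key=losses.__getitem__)
    return (idx + 1) * log_every, losses[idx]


for label, losses in (("old", OLD_VAL_LOSS), ("new", NEW_VAL_LOSS)):
    step, loss = best_checkpoint(losses)
    print(f"{label}: best {loss:.4f} at step {step}, final {losses[-1]:.4f}")
```

Under these assumptions the new run's lowest validation loss comes well before the final step, so the uploaded final-step weights may not be the best-generalizing ones; whether an earlier checkpoint was kept is not stated in this commit.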

adapter_config.json CHANGED

@@ -19,13 +19,13 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
     "gate_proj",
     "q_proj",
-    "k_proj",
-    "v_proj",
     "up_proj",
-    "o_proj",
-    "down_proj"
   ],
   "task_type": "CAUSAL_LM",
   "use_rslora": false

   "rank_pattern": {},
   "revision": null,
   "target_modules": [
+    "k_proj",
+    "o_proj",
     "gate_proj",
+    "down_proj",
     "q_proj",
     "up_proj",
+    "v_proj"
   ],
   "task_type": "CAUSAL_LM",
   "use_rslora": false
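
The `target_modules` hunk above reports four changed lines, but the entries look like a reordering rather than a change in which layers are adapted. A minimal check, with both lists copied by hand from the two sides of the diff (the variable names are illustrative):

```python
# target_modules as listed on the "-" side of the diff.
old_target_modules = [
    "gate_proj", "q_proj", "k_proj", "v_proj",
    "up_proj", "o_proj", "down_proj",
]

# target_modules as listed on the "+" side of the diff.
new_target_modules = [
    "k_proj", "o_proj", "gate_proj", "down_proj",
    "q_proj", "up_proj", "v_proj",
]

# Compare membership while ignoring order.
added = sorted(set(new_target_modules) - set(old_target_modules))
removed = sorted(set(old_target_modules) - set(new_target_modules))
same_modules = not added and not removed

print("added:", added)
print("removed:", removed)
print("same set of modules:", same_modules)
```

Both sides name the same seven projection layers, so this part of the diff appears to be an ordering difference only.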

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:155066eb381721c11813992beaa4864ea212fe32c684812ed9a7ed746a102140
 size 78480072

 version https://git-lfs.github.com/spec/v1
+oid sha256:0718e32e2d88816212e093a355ef3f46677ae2f2bcc18e637450c4d541f5d8a1
 size 78480072

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f51af84f9fcf0ff68705532a04f133a2401ec8d49d06a67f4acfcf64b8b48866
 size 4920

 version https://git-lfs.github.com/spec/v1
+oid sha256:90bc184d5a05142346f40469aaefe1ed504f50c16c38b1616315d56d4992b333
 size 4920
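
The two binary files are stored as Git LFS pointers, so their diffs only show a new `oid` with an unchanged `size`. The sketch below parses the pointer text for `adapter_model.safetensors` from both sides of the diff and compares the fields. `parse_lfs_pointer` is a hypothetical helper written for this illustration, not part of any library.

```python
OLD_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:155066eb381721c11813992beaa4864ea212fe32c684812ed9a7ed746a102140
size 78480072
"""

NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:0718e32e2d88816212e093a355ef3f46677ae2f2bcc18e637450c4d541f5d8a1
size 78480072
"""


def parse_lfs_pointer(text):
    """Parse 'key value' lines of an LFS pointer file into a dict."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    # Convert size to an int and split the oid into algorithm and digest.
    fields["size"] = int(fields["size"])
    algo, _, digest = fields["oid"].partition(":")
    fields["oid_algo"], fields["oid_digest"] = algo, digest
    return fields


old = parse_lfs_pointer(OLD_POINTER)
new = parse_lfs_pointer(NEW_POINTER)

print("size unchanged:", old["size"] == new["size"])
print("content changed:", old["oid_digest"] != new["oid_digest"])
```

An identical size with a different SHA-256 digest is what one would expect when adapter weights of the same shape are retrained, though the pointer alone does not say what changed inside the file.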