lsmille/lora_evo_ta_all_layers_6

Generated from Trainer


lsmille committed on May 28, 2024

Commit 272bf56 · verified · 1 Parent(s): 15cb3a1

Update README.md

Files changed (1)

README.md +22 -2

@@ -20,7 +20,27 @@ It achieves the following results on the evaluation set:
 ## Model description
-More information needed
+lora_alpha = 32
+lora_dropout = 0.05
+lora_r = 16
+epochs = 3
+learning rate = 3e-4
+warmup_steps=0.5
+gradient_accumulation_steps = 8
+train_batch = 1
+eval_batch = 1
+Training of only last 40 linear modules [120:160] instead of [0:160] <------
+This changes the # of trainable params to 8,914,944
 ## Intended uses & limitations
@@ -28,7 +48,7 @@ More information needed
 ## Training and evaluation data
-More information needed
+look at files
 ## Training procedure
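
The commit restricts LoRA to the last 40 of 160 linear modules ([120:160]) and reports 8,914,944 trainable parameters. As a rough illustration, the sketch below shows that slice selection, the standard LoRA parameter count of r × (in_features + out_features) per adapted linear layer, and the effective batch size implied by the listed settings. The module names and layer shapes are hypothetical placeholders, since the commit lists neither, so the sketch does not attempt to reproduce the reported total.

```python
# Hyperparameters copied from the commit's README changes.
LORA_R = 16
TRAIN_BATCH = 1
GRAD_ACCUM_STEPS = 8


def select_target_modules(linear_module_names, start=120, stop=160):
    """Return the slice of linear modules that would receive LoRA adapters."""
    return linear_module_names[start:stop]


def lora_trainable_params(shapes, r):
    """Count LoRA parameters: each adapted layer adds A (r x in) and B (out x r)."""
    return sum(r * (in_features + out_features) for in_features, out_features in shapes)


# Hypothetical stand-ins for the model's 160 linear module names.
all_linear_modules = [f"linear_{i}" for i in range(160)]
targets = select_target_modules(all_linear_modules)

# Assumed shapes for illustration only; the real layer sizes are not in the commit.
example_shapes = [(4096, 4096)] * len(targets)
example_params = lora_trainable_params(example_shapes, LORA_R)

# One optimizer step sees train_batch * gradient_accumulation_steps samples per device.
effective_batch = TRAIN_BATCH * GRAD_ACCUM_STEPS

print(len(targets), targets[0], targets[-1])
print(example_params)
print(effective_batch)
```

Substituting the real (in_features, out_features) pairs of the 40 selected modules for `example_shapes` would allow the reported parameter count to be checked against this formula.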