neavo/modern_bert_multilingual


neavo committed on Jan 31

Commit 1a0b784 · verified · 1 Parent(s): 6cfea74

Update README.md

Files changed (1)

README.md +1 -1


@@ -17,7 +17,7 @@ license: apache-2.0
 - Trained for approximately `100` hours on `L40*7` devices, with a training volume of about `60B` tokens.
 - Main training parameters:
   - Batch Size: 1792
-  - Learning Rate: 4e-05
+  - Learning Rate: 5e-04
   - Maximum Sequence Length: 512
   - Optimizer: adamw_torch
   - LR Scheduler: warmup_stable_decay
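
For context on the value this commit changes, the following is a minimal sketch of how a warmup-stable-decay schedule could apply the documented peak learning rate of `5e-04`. It is not the author's training code. The function name `wsd_lr`, the 10% warmup and 10% decay fractions, and the linear decay shape are all assumptions chosen for illustration; the README states only the scheduler name. The step estimate assumes that the batch size counts sequences and that every sequence is filled to the maximum length of 512 tokens, so it is a lower bound on the number of steps.

```python
# Values taken from the README hunk above.
PEAK_LR = 5e-4
BATCH_SIZE = 1792
MAX_SEQ_LEN = 512
TOTAL_TOKENS = 60e9  # "about 60B tokens"

# Assumption: batch size is in sequences and every sequence is full length.
TOKENS_PER_STEP = BATCH_SIZE * MAX_SEQ_LEN
EST_STEPS = int(TOTAL_TOKENS / TOKENS_PER_STEP)


def wsd_lr(step, total_steps, peak_lr=PEAK_LR,
           warmup_frac=0.1, decay_frac=0.1, min_lr=0.0):
    """Illustrative warmup-stable-decay schedule.

    warmup_frac, decay_frac, and min_lr are hypothetical; the README
    does not state them.
    """
    warmup_steps = int(total_steps * warmup_frac)
    decay_steps = int(total_steps * decay_frac)
    stable_end = total_steps - decay_steps

    if step < warmup_steps:
        # Linear warmup from near zero up to the peak.
        return peak_lr * (step + 1) / warmup_steps
    if step < stable_end:
        # Stable phase: hold the peak learning rate.
        return peak_lr
    # Linear decay from the peak down to min_lr (assumed shape).
    progress = min(1.0, (step - stable_end) / max(1, decay_steps))
    return peak_lr + (min_lr - peak_lr) * progress


if __name__ == "__main__":
    print(f"tokens per step: {TOKENS_PER_STEP}")
    print(f"estimated steps: {EST_STEPS}")
    for s in (0, EST_STEPS // 2, EST_STEPS):
        print(s, wsd_lr(s, EST_STEPS))
```

Under these assumptions the learning rate sits at `5e-04` for most of training, which is why the value recorded in the README matters more here than it would under a schedule that decays from the first step.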