CocoRoF committed ba0e57c (verified) · 1 parent: 6382050

cc-100-pro-16-18 Done

Files changed (1): README.md (+9 −6)
README.md CHANGED
@@ -40,8 +40,8 @@ The following hyperparameters were used during training:
  - seed: 42
  - distributed_type: multi-GPU
  - num_devices: 8
- - gradient_accumulation_steps: 32
- - total_train_batch_size: 4096
+ - gradient_accumulation_steps: 16
+ - total_train_batch_size: 2048
  - total_eval_batch_size: 64
  - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
  - lr_scheduler_type: linear
@@ -49,10 +49,13 @@ The following hyperparameters were used during training:
 
 ### Training results
 
-| Training Loss | Epoch  | Step  | Validation Loss |
-|:-------------:|:------:|:-----:|:---------------:|
-| 50.521        | 0.3616 | 2500  | 1.5967          |
-| 0.0           | 0.7233 | 5000  | nan             |
+| Training Loss | Epoch  | Step  | Validation Loss |
+|:-------------:|:------:|:-----:|:---------------:|
+| 25.4012       | 0.1808 | 2500  | 1.6019          |
+| 25.151        | 0.3616 | 5000  | 1.6015          |
+| 25.0652       | 0.5424 | 7500  | 1.6032          |
+| 24.9933       | 0.7233 | 10000 | 1.5984          |
+| 0.0           | 0.9041 | 12500 | nan             |
 
 
 ### Framework versions
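The batch-size change in this commit is consistent with the usual Trainer relationship `total_train_batch_size = per_device_batch_size × num_devices × gradient_accumulation_steps`. A minimal sketch of that arithmetic, assuming a per-device batch size of 16 (inferred from the listed values, not stated in the diff):

```python
def total_train_batch_size(per_device_batch_size: int,
                           num_devices: int,
                           gradient_accumulation_steps: int) -> int:
    """Effective train batch size: per-device size x devices x accumulation steps."""
    return per_device_batch_size * num_devices * gradient_accumulation_steps

# Before this commit: accumulation 32 -> total 4096 (assuming per-device size 16)
assert total_train_batch_size(16, 8, 32) == 4096
# After this commit: accumulation 16 -> total 2048
assert total_train_batch_size(16, 8, 16) == 2048
```

Halving `gradient_accumulation_steps` from 32 to 16 with the other factors unchanged is what halves the total train batch size from 4096 to 2048.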