CocoRoF committed
Commit 2bc8d15 · verified · 1 Parent(s): 5bb1ba4

cc-100-done Done

Files changed (1)
  1. README.md +7 -5
README.md CHANGED
@@ -14,7 +14,7 @@ should probably proofread and complete it, then remove this comment. -->
 
  This model was trained from scratch on the None dataset.
  It achieves the following results on the evaluation set:
- - Loss: 1.6233
+ - Loss: 1.6150
 
  ## Model description
 
@@ -33,14 +33,14 @@ More information needed
  ### Training hyperparameters
 
  The following hyperparameters were used during training:
- - learning_rate: 5e-07
- - train_batch_size: 32
+ - learning_rate: 1e-07
+ - train_batch_size: 16
  - eval_batch_size: 8
  - seed: 42
  - distributed_type: multi-GPU
  - num_devices: 8
  - gradient_accumulation_steps: 16
- - total_train_batch_size: 4096
+ - total_train_batch_size: 2048
  - total_eval_batch_size: 64
  - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
  - lr_scheduler_type: linear
@@ -50,7 +50,9 @@ The following hyperparameters were used during training:
 
  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:------:|:----:|:---------------:|
- | 25.4141 | 0.5424 | 2500 | 1.6233 |
+ | 25.6476 | 0.2712 | 2500 | 1.6198 |
+ | 25.4537 | 0.5424 | 5000 | 1.6171 |
+ | 25.8678 | 0.8137 | 7500 | 1.6150 |
 
 
  ### Framework versions
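
For reference, the updated total_train_batch_size in the hunk above is consistent with the other hyperparameters: per-device batch size × gradient accumulation steps × number of devices. A minimal sketch of that arithmetic in plain Python (variable names are illustrative, not taken from the training code):

```python
# Effective (total) train batch size implied by the "+" lines of the diff above.
train_batch_size = 16             # per-device batch size (updated line)
gradient_accumulation_steps = 16  # unchanged context line
num_devices = 8                   # multi-GPU setup, unchanged context line

total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
print(total_train_batch_size)  # 2048, matching "+ total_train_batch_size: 2048"
# The removed values check out the same way: 32 * 16 * 8 = 4096.
```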