Commit 3a8ce2b (verified) · Parent: e36e962 · committed by csb05

Model save

Files changed (1):
  1. README.md +13 -11
README.md CHANGED
@@ -19,11 +19,11 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on an unknown dataset.
 It achieves the following results on the evaluation set:
 - Loss: nan
-- Rouge1: 5.1047
-- Rouge2: 1.1324
-- Rougel: 4.1953
-- Rougelsum: 4.1537
-- Gen Len: 15.8333
+- Rouge1: 5.6456
+- Rouge2: 1.2152
+- Rougel: 4.5164
+- Rougelsum: 4.5226
+- Gen Len: 15.7143
 
 ## Model description
 
@@ -43,25 +43,27 @@ More information needed
 
 The following hyperparameters were used during training:
 - learning_rate: 2e-05
-- train_batch_size: 4
-- eval_batch_size: 4
+- train_batch_size: 2
+- eval_batch_size: 2
 - seed: 42
-- gradient_accumulation_steps: 2
+- gradient_accumulation_steps: 4
 - total_train_batch_size: 8
-- optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
+- optimizer: Use OptimizerNames.ADAFACTOR and the args are:
+No additional optimizer arguments
 - lr_scheduler_type: linear
 - num_epochs: 1
+- mixed_precision_training: Native AMP
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
 |:-------------:|:------:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
-| 0.0 | 0.9811 | 26 | nan | 5.1047 | 1.1324 | 4.1953 | 4.1537 | 15.8333 |
+| 0.0 | 0.9905 | 26 | nan | 5.6456 | 1.2152 | 4.5164 | 4.5226 | 15.7143 |
 
 
 ### Framework versions
 
 - Transformers 4.47.1
-- Pytorch 2.6.0+cu124
+- Pytorch 2.5.1+cu124
 - Datasets 3.2.0
 - Tokenizers 0.21.0
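In terms of the transformers API, the updated hyperparameters map onto `Seq2SeqTrainingArguments` roughly as follows. This is a minimal sketch assuming the standard `Seq2SeqTrainer` workflow: the `output_dir` value is a placeholder, and `predict_with_generate` is an assumption (it is what would produce the ROUGE and Gen Len numbers above).

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of training arguments matching the card's updated hyperparameters.
# "flan-t5-base-finetuned" is a placeholder output directory.
training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-base-finetuned",
    learning_rate=2e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=4,  # effective batch size: 2 * 4 = 8
    seed=42,
    optim="adafactor",              # OptimizerNames.ADAFACTOR
    lr_scheduler_type="linear",
    num_train_epochs=1,
    fp16=True,                      # "Native AMP" mixed-precision training
    predict_with_generate=True,     # assumption: needed for ROUGE / Gen Len eval
)
```

For what it's worth, a nan validation loss alongside a 0.0 training loss is a commonly reported failure mode when T5-family checkpoints are fine-tuned under fp16 autocast (activation overflow); training in bf16 or full fp32 is the usual workaround.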
 
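The Rouge1/Rouge2/Rougel/Rougelsum columns match the key names returned by the 🤗 `evaluate` ROUGE metric, scaled to percentages. A minimal sketch of that computation, with placeholder predictions and references (requires the `evaluate` and `rouge_score` packages):

```python
import evaluate

rouge = evaluate.load("rouge")

# Placeholder decoded model outputs and reference summaries.
predictions = ["the cat sat on the mat"]
references = ["a cat was sitting on the mat"]

scores = rouge.compute(
    predictions=predictions,
    references=references,
    use_stemmer=True,
)
# Scale to percentages, matching how the card reports them (e.g. Rouge1: 5.6456).
print({k: round(v * 100, 4) for k, v in scores.items()})
```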