mjschock committed
Commit 49474e2 · verified · 1 parent: ba76b20

Model save

Files changed (1)
1. README.md +16 -14
README.md CHANGED
@@ -18,20 +18,20 @@ should probably proofread and complete it, then remove this comment. -->
 
 # TinyLlama-1.1B-Chat-v1.0-sft-chat_threads
 
-This model is a fine-tuned version of [mjschock/TinyLlama-1.1B-Chat-v1.0](https://huggingface.co/mjschock/TinyLlama-1.1B-Chat-v1.0) on the mjschock/chat_threads dataset.
+This model is a fine-tuned version of [mjschock/TinyLlama-1.1B-Chat-v1.0](https://huggingface.co/mjschock/TinyLlama-1.1B-Chat-v1.0) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.7796
-- Bleu: 0.6746
-- Precisions: 0.6878
-- Brevity Penalty: 0.9963
-- Length Ratio: 0.9965
-- Translation Length: 580.7962
+- Loss: 0.5567
+- Bleu: 0.7600
+- Precisions: 0.7673
+- Brevity Penalty: 0.9978
+- Length Ratio: 0.9985
+- Translation Length: 582.4604
 - Reference Length: 582.9104
-- Meteor: 0.6913
-- Rouge1: 0.7195
-- Rouge2: 0.4376
-- Rougel: 0.6323
-- Rougelsum: 0.7091
+- Meteor: 0.7381
+- Rouge1: 0.7955
+- Rouge2: 0.5621
+- Rougel: 0.7316
+- Rougelsum: 0.7895
 
 ## Model description
 
@@ -58,14 +58,16 @@ The following hyperparameters were used during training:
 - total_train_batch_size: 16
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs: 1.0
+- num_epochs: 3.0
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Meteor | Rouge1 | Rouge2 | Rougel | Rougelsum |
 |:-------------:|:------:|:----:|:---------------:|:------:|:----------:|:---------------:|:------------:|:------------------:|:----------------:|:------:|:------:|:------:|:------:|:---------:|
 | No log | 0 | 0 | 0.8976 | 0.6391 | 0.6567 | 0.9934 | 0.9936 | 579.7720 | 582.9104 | 0.6775 | 0.6912 | 0.3881 | 0.5809 | 0.6813 |
-| 0.821 | 0.9630 | 13 | 0.7796 | 0.6746 | 0.6878 | 0.9963 | 0.9965 | 580.7962 | 582.9104 | 0.6913 | 0.7195 | 0.4376 | 0.6323 | 0.7091 |
+| 0.7597 | 0.9630 | 13 | 0.7155 | 0.6980 | 0.7093 | 0.9966 | 0.9968 | 581.1118 | 582.9104 | 0.7011 | 0.7382 | 0.4695 | 0.6634 | 0.7289 |
+| 0.6302 | 2.0 | 27 | 0.5975 | 0.7435 | 0.7516 | 0.9977 | 0.9977 | 581.8998 | 582.9104 | 0.7320 | 0.7835 | 0.5336 | 0.7091 | 0.7776 |
+| 0.5719 | 2.8889 | 39 | 0.5567 | 0.7600 | 0.7673 | 0.9978 | 0.9985 | 582.4604 | 582.9104 | 0.7381 | 0.7955 | 0.5621 | 0.7316 | 0.7895 |
 
 
 ### Framework versions
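The metric names above match the output keys of the Hugging Face `evaluate` implementations of BLEU, ROUGE, and METEOR, so the card's numbers were plausibly produced along the lines of the sketch below. This is an illustration only: `predictions` and `references` are hypothetical placeholders, not the actual chat_threads evaluation split.

```python
# Minimal sketch, assuming the Hugging Face `evaluate` library produced the
# card's metrics. The example texts are hypothetical placeholders.
import evaluate

predictions = ["Hello! How can I help you today?"]  # model outputs (placeholder)
references = ["Hi! How can I help you today?"]      # gold replies (placeholder)

bleu = evaluate.load("bleu")      # keys: bleu, precisions, brevity_penalty,
                                  #       length_ratio, translation_length,
                                  #       reference_length
rouge = evaluate.load("rouge")    # keys: rouge1, rouge2, rougeL, rougeLsum
meteor = evaluate.load("meteor")  # keys: meteor

scores = {}
scores.update(bleu.compute(predictions=predictions, references=references))
scores.update(rouge.compute(predictions=predictions, references=references))
scores.update(meteor.compute(predictions=predictions, references=references))
print(scores)
```

Two reading notes: BLEU reports four n-gram precisions while the card lists a single `Precisions` value, and the translation/reference lengths are fractional, both of which suggest per-example scores averaged over the evaluation set. Under that averaging, corpus-level BLEU identities (e.g. brevity penalty = exp(1 - reference_length / translation_length) when the candidate is shorter than the reference, else 1) need not hold exactly for the aggregated numbers.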
 
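On the training side, the hyperparameters in the second hunk map directly onto `transformers.TrainingArguments`. A minimal sketch follows, assuming single-GPU training; the learning rate and the per-device/accumulation split are not visible in this hunk, so the batch-size values below are assumptions chosen only to be consistent with the stated total train batch size of 16, and the learning rate is left at the library default.

```python
# Minimal sketch, assuming transformers' Trainer API. Values marked
# "assumption" do not appear in the diff above.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="TinyLlama-1.1B-Chat-v1.0-sft-chat_threads",
    num_train_epochs=3.0,           # from the diff: num_epochs 1.0 -> 3.0
    lr_scheduler_type="linear",     # from the card
    adam_beta1=0.9,                 # optimizer: Adam with betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-08,             # ... and epsilon=1e-08
    per_device_train_batch_size=8,  # assumption: 8 per device x
    gradient_accumulation_steps=2,  # 2 accumulation steps x 1 GPU = 16 total
)
```

The step counts in the results table corroborate the effective batch size: step 27 at epoch 2.0 means roughly 13.5 optimizer steps per epoch, which at 16 examples per step suggests a training split on the order of 216 examples.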