mjschock/TinyLlama-1.1B-Chat-v1.0-sft-chat_threads

Tags: PEFT, Safetensors, trl, sft, Generated from Trainer


mjschock committed on Nov 10, 2024

Commit 0ac4467 (verified) · 1 parent: b67b6c9

Model save


Files changed (1)

README.md +15 -15


@@ -18,20 +18,20 @@ should probably proofread and complete it, then remove this comment. -->
 # TinyLlama-1.1B-Chat-v1.0-sft-chat_threads
-This model is a fine-tuned version of [mjschock/TinyLlama-1.1B-Chat-v1.0](https://huggingface.co/mjschock/TinyLlama-1.1B-Chat-v1.0) on the mjschock/chat_threads dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.5610
-- Bleu: 0.7610
-- Precisions: 0.7680
-- Brevity Penalty: 0.9979
-- Length Ratio: 0.9983
-- Translation Length: 582.2670
 - Reference Length: 582.9104
-- Meteor: 0.7384
-- Rouge1: 0.7948
-- Rouge2: 0.5627
-- Rougel: 0.7303
-- Rougelsum: 0.7881
 ## Model description
@@ -65,9 +65,9 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch  | Step | Validation Loss | Bleu   | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Meteor | Rouge1 | Rouge2 | Rougel | Rougelsum |
 |:-------------:|:------:|:----:|:---------------:|:------:|:----------:|:---------------:|:------------:|:------------------:|:----------------:|:------:|:------:|:------:|:------:|:---------:|
 | No log        | 0      | 0    | 0.8976          | 0.6391 | 0.6567     | 0.9934          | 0.9936       | 579.7720           | 582.9104         | 0.6775 | 0.6912 | 0.3881 | 0.5809 | 0.6813    |
-| 0.7629        | 0.9630 | 13   | 0.7182          | 0.6966 | 0.7074     | 0.9976          | 0.9980       | 581.6733           | 582.9104         | 0.6974 | 0.7402 | 0.4682 | 0.6619 | 0.7309    |
-| 0.6338        | 2.0    | 27   | 0.6010          | 0.7477 | 0.7559     | 0.9972          | 0.9972       | 581.5872           | 582.9104         | 0.7316 | 0.7866 | 0.5410 | 0.7128 | 0.7804    |
-| 0.576         | 2.8889 | 39   | 0.5610          | 0.7610 | 0.7680     | 0.9979          | 0.9983       | 582.2670           | 582.9104         | 0.7384 | 0.7948 | 0.5627 | 0.7303 | 0.7881    |
 ### Framework versions

 # TinyLlama-1.1B-Chat-v1.0-sft-chat_threads
+This model is a fine-tuned version of [mjschock/TinyLlama-1.1B-Chat-v1.0](https://huggingface.co/mjschock/TinyLlama-1.1B-Chat-v1.0) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.5586
+- Bleu: 0.7572
+- Precisions: 0.7641
+- Brevity Penalty: 0.9983
+- Length Ratio: 0.9986
+- Translation Length: 582.3552
 - Reference Length: 582.9104
+- Meteor: 0.7364
+- Rouge1: 0.7900
+- Rouge2: 0.5570
+- Rougel: 0.7250
+- Rougelsum: 0.7838
 ## Model description
 | Training Loss | Epoch  | Step | Validation Loss | Bleu   | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Meteor | Rouge1 | Rouge2 | Rougel | Rougelsum |
 |:-------------:|:------:|:----:|:---------------:|:------:|:----------:|:---------------:|:------------:|:------------------:|:----------------:|:------:|:------:|:------:|:------:|:---------:|
 | No log        | 0      | 0    | 0.8976          | 0.6391 | 0.6567     | 0.9934          | 0.9936       | 579.7720           | 582.9104         | 0.6775 | 0.6912 | 0.3881 | 0.5809 | 0.6813    |
+| 0.7612        | 0.9630 | 13   | 0.7168          | 0.6941 | 0.7056     | 0.9969          | 0.9973       | 581.2681           | 582.9104         | 0.7030 | 0.7375 | 0.4604 | 0.6572 | 0.7281    |
+| 0.6321        | 2.0    | 27   | 0.5992          | 0.7420 | 0.7498     | 0.9981          | 0.9981       | 582.0161           | 582.9104         | 0.7312 | 0.7780 | 0.5342 | 0.7069 | 0.7720    |
+| 0.5738        | 2.8889 | 39   | 0.5586          | 0.7572 | 0.7641     | 0.9983          | 0.9986       | 582.3552           | 582.9104         | 0.7364 | 0.7900 | 0.5570 | 0.7250 | 0.7838    |
 ### Framework versions
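
The change in the final evaluation metrics between the two sides of this diff can be tabulated with a short script. This is a minimal sketch: the values are copied from the removed (`-`) and added (`+`) summary lines above, and the helper name `metric_deltas` is hypothetical, not part of any library. Reference Length is omitted because it is unchanged at 582.9104.

```python
# Final evaluation metrics copied from the diff above.
# "old" comes from the removed (-) lines, "new" from the added (+) lines.
old = {
    "Loss": 0.5610, "Bleu": 0.7610, "Precisions": 0.7680,
    "Brevity Penalty": 0.9979, "Length Ratio": 0.9983,
    "Translation Length": 582.2670, "Meteor": 0.7384,
    "Rouge1": 0.7948, "Rouge2": 0.5627, "Rougel": 0.7303,
    "Rougelsum": 0.7881,
}
new = {
    "Loss": 0.5586, "Bleu": 0.7572, "Precisions": 0.7641,
    "Brevity Penalty": 0.9983, "Length Ratio": 0.9986,
    "Translation Length": 582.3552, "Meteor": 0.7364,
    "Rouge1": 0.7900, "Rouge2": 0.5570, "Rougel": 0.7250,
    "Rougelsum": 0.7838,
}


def metric_deltas(before, after):
    """Return after - before for every metric, rounded to 4 decimals.

    Hypothetical helper written for this illustration only.
    """
    return {name: round(after[name] - before[name], 4) for name in before}


deltas = metric_deltas(old, new)
for name, delta in deltas.items():
    print(f"{name:<20} {old[name]:>9.4f} -> {new[name]:>9.4f}  ({delta:+.4f})")
```

Under these numbers, the validation loss drops slightly (0.5610 to 0.5586) while BLEU, METEOR, and the ROUGE scores each move down by a few thousandths, so the two runs are close on every metric.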