SystemAdmin123 committed
Commit c1d9fe3 (verified) · 1 parent: 9397831

End of training

Files changed (1): README.md (+15 −18)

README.md CHANGED
@@ -36,7 +36,7 @@ datasets:
 system_prompt: ''
 device_map: auto
 eval_sample_packing: false
-eval_steps: 40
+eval_steps: 50
 flash_attention: true
 gradient_checkpointing: true
 group_by_length: true
@@ -54,7 +54,7 @@ output_dir: /root/.sn56/axolotl/tmp/tiny-random-LlamaForCausalLM
 pad_to_sequence_len: true
 resize_token_embeddings_to_32x: false
 sample_packing: true
-save_steps: 20
+save_steps: 50
 save_total_limit: 2
 sequence_len: 2048
 tokenizer_type: LlamaTokenizerFast
@@ -77,7 +77,7 @@ warmup_ratio: 0.05
 
 This model is a fine-tuned version of [trl-internal-testing/tiny-random-LlamaForCausalLM](https://huggingface.co/trl-internal-testing/tiny-random-LlamaForCausalLM) on the argilla/databricks-dolly-15k-curated-en dataset.
 It achieves the following results on the evaluation set:
-- Loss: 9.1944
+- Loss: 9.1943
 
 ## Model description
 
@@ -114,21 +114,18 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-------:|:----:|:---------------:|
 | No log | 0.0769 | 1 | 10.3764 |
-| 10.3522 | 3.0769 | 40 | 10.3366 |
-| 10.1177 | 6.1538 | 80 | 10.0885 |
-| 9.8887 | 9.2308 | 120 | 9.8677 |
-| 9.688 | 12.3077 | 160 | 9.6724 |
-| 9.5151 | 15.3846 | 200 | 9.5050 |
-| 9.3725 | 18.4615 | 240 | 9.3687 |
-| 9.2678 | 21.5385 | 280 | 9.2734 |
-| 9.2101 | 24.6154 | 320 | 9.2205 |
-| 9.186 | 27.6923 | 360 | 9.2018 |
-| 9.18 | 30.7692 | 400 | 9.1964 |
-| 9.1787 | 33.8462 | 440 | 9.1945 |
-| 9.1768 | 36.9231 | 480 | 9.1941 |
-| 9.1775 | 40.0 | 520 | 9.1938 |
-| 9.1784 | 43.0769 | 560 | 9.1949 |
-| 9.1762 | 46.1538 | 600 | 9.1944 |
+| 10.3159 | 3.8462 | 50 | 10.2852 |
+| 9.998 | 7.6923 | 100 | 9.9738 |
+| 9.7359 | 11.5385 | 150 | 9.7190 |
+| 9.5151 | 15.3846 | 200 | 9.5042 |
+| 9.3407 | 19.2308 | 250 | 9.3411 |
+| 9.2338 | 23.0769 | 300 | 9.2415 |
+| 9.1896 | 26.9231 | 350 | 9.2039 |
+| 9.18 | 30.7692 | 400 | 9.1960 |
+| 9.1777 | 34.6154 | 450 | 9.1957 |
+| 9.1781 | 38.4615 | 500 | 9.1931 |
+| 9.1761 | 42.3077 | 550 | 9.1936 |
+| 9.1762 | 46.1538 | 600 | 9.1943 |
 
 
 ### Framework versions
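
As a quick consistency check on the updated eval table, the (epoch, step) pairs logged every `eval_steps: 50` should all imply the same number of optimizer steps per epoch. A minimal sketch (the rows are transcribed from the table in this diff; the variable names are illustrative, not from the training code):

```python
# Sample (epoch, step) pairs from the new evaluation table.
rows = [(3.8462, 50), (7.6923, 100), (15.3846, 200), (46.1538, 600)]

# If logging happens on a fixed step interval, step / epoch should be
# constant across rows (up to the table's rounding of the epoch values).
steps_per_epoch = {round(step / epoch) for epoch, step in rows}
print(steps_per_epoch)  # every row agrees: 13 steps per epoch
```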