Alepach committed
Commit e58ee9e · verified · 1 Parent(s): 6f46a8f

Model save
README.md CHANGED
@@ -6,31 +6,30 @@ tags:
  - generated_from_trainer
  - trl
  - sft
- license: apache-2.0
- datasets:
- - OpenAssistant/oasst1
+ licence: license
  ---
 
- # notHumpback-M0
+ # Model Card for notHumpback-M0
 
- This model follows the Humpback architecture, proposed in the paper [Self-Alignment with Instruction Backtranslation](https://arxiv.org/pdf/2308.06259)
- by Li et al.
+ This model is a fine-tuned version of [meta-llama/Llama-3.2-3B](https://huggingface.co/meta-llama/Llama-3.2-3B).
+ It has been trained using [TRL](https://github.com/huggingface/trl).
 
- It represents the "seed model", which is trained on a small amount of gold data and then
- used to score the instruction-response pairs
- generated by the ["backward model"](https://huggingface.co/Alepach/notHumpback-Myx).
+ ## Quick start
 
- Humpback uses instruction backtranslation on a web corpus to generate input-output pairs (self-augmentation),
- creating a richer dataset for fine-tuning models without the need for additional manual annotation.
- The model then iteratively curates the created dataset, scoring the pairs by quality, and is then finetuned on the resulting subset
- of all pairs with the highest possible score (self-curation).
+ ```python
+ from transformers import pipeline
+
+ question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
+ generator = pipeline("text-generation", model="Alepach/notHumpback-M0", device="cuda")
+ output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
+ print(output["generated_text"])
+ ```
 
- Varying from the original paper, this model is a fine-tuned version of [meta-llama/Llama-3.2-3B](https://huggingface.co/meta-llama/Llama-3.2-3B).
- It has been trained using [TRL](https://github.com/huggingface/trl).
+ ## Training procedure
 
- The dataset used to train this model has been sampled from the [oasst1](https://huggingface.co/datasets/OpenAssistant/oasst1) dataset.
- To enable the model to judge and score the generated pairs, the model undergoes basic instruction-tuning on the input-output
- pairs contained in the dataset.
+ This model was trained with SFT.
 
  ### Framework versions
 
@@ -42,18 +41,7 @@ pairs contained in the dataset.
 
  ## Citations
 
- Original paper:
-
- ```bibtex
- @misc{li2023selfalignment,
-     title={Self-Alignment with Instruction Backtranslation},
-     author={Xian Li and Ping Yu and Chunting Zhou and Timo Schick and Luke Zettlemoyer and Omer Levy and Jason Weston and Mike Lewis},
-     year={2023},
-     eprint={2308.06259},
-     archivePrefix={arXiv},
-     primaryClass={cs.CL}
- }
- ```
 
  Cite TRL as:

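The README text removed in the first hunk described Humpback's self-curation step: this "seed model" scores the candidate instruction-response pairs produced by the backward model, and only the top-scored subset is kept for the next round of fine-tuning. A minimal sketch of that scoring loop; the paper has the seed model grade pairs on a 1-5 quality scale, but the `RATING_PROMPT` wording below is a hypothetical stand-in, not the prompt from the paper:

```python
from transformers import pipeline

# Hypothetical rating prompt: the paper asks the seed model to grade each
# candidate pair on a 1-5 quality scale and keeps only the top-rated pairs.
RATING_PROMPT = (
    "Rate the quality of the following answer on a scale from 1 to 5.\n"
    "Question: {instruction}\nAnswer: {response}\nScore:"
)

scorer = pipeline("text-generation", model="Alepach/notHumpback-M0", device="cuda")

def score_pair(instruction: str, response: str) -> str:
    """Return the model's raw score text for one instruction-response pair."""
    prompt = RATING_PROMPT.format(instruction=instruction, response=response)
    out = scorer(prompt, max_new_tokens=4, return_full_text=False)[0]
    return out["generated_text"].strip()
```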
model-00001-of-00002.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:7ffe3b2120490b45a390cc9048ce99f3602a62df7ad04f0b4755b183f57d5caa
+ oid sha256:d06535c06f6b2aad587adc2d53e896c4a7fb309bf104507225784cd0992c731f
  size 4965799096
model-00002-of-00002.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:f780b5a8be7f7da8f94d1315c3d896b3bfa658fb665eb4e67c9a3b3d709f080d
+ oid sha256:4b7873c402b3ded5eaf4837e845eb6fcd63611690605207852cfc21df2810bed
  size 1459729952
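Both shard files above are Git LFS pointers rather than raw weights: the repository records only a `version` line, a SHA-256 `oid`, and the byte `size`, and this commit swaps each `oid` while the sizes stay unchanged. A minimal sketch (local filename is illustrative) for checking a downloaded shard against its pointer:

```python
import hashlib

def verify_lfs_object(path: str, expected_oid: str, expected_size: int) -> bool:
    """Hash the file in chunks and compare against the LFS pointer fields."""
    digest = hashlib.sha256()
    size = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
            size += len(chunk)
    return digest.hexdigest() == expected_oid and size == expected_size

# Values taken from the new pointer for shard 1 of 2 above.
assert verify_lfs_object(
    "model-00001-of-00002.safetensors",
    "d06535c06f6b2aad587adc2d53e896c4a7fb309bf104507225784cd0992c731f",
    4965799096,
)
```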
special_tokens_map.json CHANGED
@@ -13,11 +13,5 @@
    "rstrip": false,
    "single_word": false
  },
- "pad_token": {
-   "content": "<|finetune_right_pad_id|>",
-   "lstrip": false,
-   "normalized": false,
-   "rstrip": false,
-   "single_word": false
- }
+ "pad_token": "<|finetune_right_pad_id|>"
 }
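The edit above collapses `pad_token` from the expanded `AddedToken` form, which pins the `lstrip`/`rstrip`/`normalized`/`single_word` flags explicitly, down to a bare string; both serializations resolve to the same padding token once the tokenizer is loaded. A quick check, assuming the tokenizer is pulled from this repository:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("Alepach/notHumpback-M0")
# The string form and the AddedToken form both yield this pad token.
assert tok.pad_token == "<|finetune_right_pad_id|>"
print(tok.pad_token_id)  # integer id resolved from the Llama 3.2 vocab
```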
tokenizer_config.json CHANGED
@@ -2053,15 +2053,11 @@
  "chat_template": "{{- bos_token }}\n{% set ns = namespace(system_message='') %}\n{%- for message in messages %}\n {%- if message['role'] == 'system' %}\n {% set ns.system_message = message['content'].strip() %}\n {%- elif message['role'] == 'user' %}\n {{- '<|start_header_id|>user<|end_header_id|>' + ns.system_message + '\\n' + message['content'].strip() + '<|eot_id|>' }}\n {%- elif message['role'] == 'assistant' %}\n {{- '<|start_header_id|>assistant<|end_header_id|>' + message['content'] + '<|eot_id|>' }}\n {%- endif %}\n{%- endfor %}\n",
  "clean_up_tokenization_spaces": true,
  "eos_token": "<|end_of_text|>",
- "max_length": 131072,
  "model_input_names": [
    "input_ids",
    "attention_mask"
  ],
  "model_max_length": 131072,
  "pad_token": "<|finetune_right_pad_id|>",
- "stride": 0,
- "tokenizer_class": "PreTrainedTokenizerFast",
- "truncation_side": "right",
- "truncation_strategy": "longest_first"
+ "tokenizer_class": "PreTrainedTokenizerFast"
 }
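The `chat_template` retained in this file is a Jinja template: it emits `bos_token`, stashes any system message and prepends it to user turns, and wraps each turn in `<|start_header_id|>role<|end_header_id|>` markers closed by `<|eot_id|>`. A short sketch of how it renders, assuming the tokenizer is loaded from this repository:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("Alepach/notHumpback-M0")
messages = [
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hi there."},
]
# Render as text only: bos_token, then each turn wrapped in
# <|start_header_id|>role<|end_header_id|> ... <|eot_id|>.
print(tok.apply_chat_template(messages, tokenize=False))
```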
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:6058e7bbc92865f32405379f79f3236570d8717a003ddb1c256d6b3837479765
+ oid sha256:949ffd2edc274bd55a1b61c4a8ad2fd6c7cd62b02414efc8a46b21604165e7e2
  size 5560