Update README.md
README.md CHANGED
@@ -23,7 +23,7 @@ The Mistral-7B-Instruct-v0.1 Large Language Model (LLM) is a pretrained generati

## Model Details

-This model was built via parameter-efficient finetuning of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on the first 5k rows of [ehartford/dolphin](https://huggingface.co/datasets/ehartford/dolphin). Finetuning was executed on 1x A100 (40 GB SXM) for roughly 1 hour on Google Colab.

- **Developed by:** Daniel Furman
- **Model type:** Decoder-only
@@ -49,8 +49,6 @@ We use Eleuther.AI's [Language Model Evaluation Harness](https://github.com/Eleu

## Basic Usage

-*Note*: Use the code below to get started with the sft models herein, as ran on 1x A100.
-
```python
!pip install -q -U transformers peft torch accelerate bitsandbytes einops sentencepiece
@@ -64,38 +62,49 @@ from transformers import (
```

```python
-peft_model_id = "dfurman/
config = PeftConfig.from_pretrained(peft_model_id)

model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    device_map="auto",
    trust_remote_code=True,
)
-    use_fast=True,
-    trust_remote_code=True,
)
-model = PeftModel.from_pretrained(model, peft_model_id)
-format_template = "You are a helpful assistant. Write a response that appropriately completes the request. {query}\n"
```

```python
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.cuda()
with torch.autocast("cuda", dtype=torch.bfloat16):
    output = model.generate(
        input_ids=input_ids,
-        max_new_tokens=
        do_sample=True,
-        temperature=0.
        return_dict_in_generate=True,
        eos_token_id=tokenizer.eos_token_id,
        pad_token_id=tokenizer.pad_token_id,
@@ -103,33 +112,34 @@ with torch.autocast("cuda", dtype=torch.bfloat16):
        no_repeat_ngram_size=5,
    )
```

<details>

<summary>Output</summary>

-**Prompt**:

-I hope you can join me for a fun evening at my place this Friday! We'll have delicious food, great conversation, and maybe even some games if we feel like it. Please RSVP by Wednesday night so I know who will be there.
-Looking forward to seeing you all soon!

-Your Name

</details>

## Speeds, Sizes, Times

| runtime / 50 tokens (sec) | GPU | attn | torch dtype | VRAM (GB) |
@@ -138,17 +148,13 @@ Remember, when writing emails, always keep in mind your audience and their prefe

## Training

-It took ~

Prompt format:
-This model uses the [

```
-<|im_start|>system
-You are a helpful assistant.<|im_end|>
-<|im_start|>user
-{prompt}<|im_end|>
-<|im_start|>assistant
```

## Training Hyperparameters

## Model Details

+This model was built via parameter-efficient finetuning of [mistralai/Mistral-7B-Instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1) on the first 5k rows of [ehartford/dolphin](https://huggingface.co/datasets/ehartford/dolphin). Finetuning was executed on 1x A100 (40 GB SXM) for roughly 1 hour on Google Colab.

- **Developed by:** Daniel Furman
- **Model type:** Decoder-only
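
As background on the parameter-efficient finetuning mentioned above: a small adapter (most likely LoRA-style, given the `peft` usage in the Basic Usage section) is trained on top of the frozen base model, so only a tiny fraction of the weights is updated. The sketch below is illustrative only, with placeholder hyperparameters; see the "Training Hyperparameters" section for the configuration actually used.

```python
# Illustrative LoRA setup with peft (placeholder hyperparameters, not the exact training config).
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.1",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
lora_config = LoraConfig(
    r=16,                # low-rank update rank (placeholder)
    lora_alpha=32,       # scaling factor (placeholder)
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the adapter parameters are trainable
```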

## Basic Usage

```python
!pip install -q -U transformers peft torch accelerate bitsandbytes einops sentencepiece
```
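
The import cell that sits between the install command above and the snippets below is elided in this diff. A minimal sketch of the imports those snippets rely on (inferred from the calls they make, not the verbatim cell):

```python
# Imports assumed by the usage snippets below (a sketch, inferred from the code).
import torch
from peft import PeftConfig, PeftModel
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
)
```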

```python
+peft_model_id = "dfurman/Yi-6B-instruct-v0.1"
config = PeftConfig.from_pretrained(peft_model_id)

+tokenizer = AutoTokenizer.from_pretrained(
+    peft_model_id,
+    use_fast=True,
+    trust_remote_code=True,
+)
+bnb_config = BitsAndBytesConfig(
+    load_in_4bit=True,
+    bnb_4bit_quant_type="nf4",
+    bnb_4bit_compute_dtype=torch.bfloat16,
+)
model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
+    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)
+model = PeftModel.from_pretrained(
+    model,
+    peft_model_id
)
```
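
With the base weights loaded in 4-bit NF4 and a bfloat16 compute dtype, the assembled model fits on a single A100 (40 GB). As an optional sanity check, `get_memory_footprint()` (a standard `transformers` helper) reports the approximate usage:

```python
# Optional: approximate memory used by the 4-bit base model plus the adapter.
print(f"memory footprint: {model.get_memory_footprint() / 1e9:.2f} GB")
```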

```python
+messages = [
+    {"role": "system", "content": "You are a helpful assistant. Respond as briefly as possible."},
+    {"role": "user", "content": "Tell me a recipe for a mai tai."},
+]
+
+print("\n\n*** Prompt:")
+prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+print(prompt)

+print("\n\n*** Generate:")
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.cuda()
with torch.autocast("cuda", dtype=torch.bfloat16):
    output = model.generate(
        input_ids=input_ids,
+        max_new_tokens=1024,
        do_sample=True,
+        temperature=0.7,
        return_dict_in_generate=True,
        eos_token_id=tokenizer.eos_token_id,
        pad_token_id=tokenizer.pad_token_id,
        no_repeat_ngram_size=5,
    )

+response = tokenizer.decode(
+    output["sequences"][0][len(input_ids[0]):],
+    skip_special_tokens=True
+)
+print(response)
```

<details>

<summary>Output</summary>

+**Prompt**: <|im_start|>system
+You are a helpful assistant. Respond as briefly as possible.<|im_end|>
+<|im_start|>user
+Tell me a recipe for a mai tai.<|im_end|>
+<|im_start|>assistant

+**Generation**: Here's one simple version of the classic Mai Tai cocktail:

+1 oz White Rum (Bacardi, Don Papa, etc.) ➕ ½ oz Coconut Cream Liqueur (Malibu or Coco Lopez)
+2 tsp Simple Syrup ➕ Dash Orange Bitters
+3-4 Ice Cubes

+Shake all ingredients in a shaker filled with ice until well chilled and strain into an old fashioned glass over fresh crushed ice. Garnish with mint leaves if desired. Enjoy!

</details>

## Speeds, Sizes, Times

| runtime / 50 tokens (sec) | GPU | attn | torch dtype | VRAM (GB) |

## Training

+It took ~3 hours to train 3 epochs on 1x A100 (40 GB SXM).

Prompt format:
+This model uses the same prompt format as [mistralai/Mistral-7B-Instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1). This model does **not** expect a system prompt.

```
+[INST] {prompt} [/INST]
```
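
For example, a single-turn prompt in this format can be written by hand or produced with the base model tokenizer's chat template (a sketch for illustration; the template typically prepends the `<s>` BOS token):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")

# Hand-written prompt in the format shown above.
manual_prompt = "[INST] Tell me a recipe for a mai tai. [/INST]"

# Equivalent prompt via the chat template shipped with the base tokenizer.
messages = [{"role": "user", "content": "Tell me a recipe for a mai tai."}]
templated_prompt = tokenizer.apply_chat_template(messages, tokenize=False)
print(templated_prompt)  # expected to resemble: <s>[INST] Tell me a recipe for a mai tai. [/INST]
```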
## Training Hyperparameters
|