dfurman committed
Commit 0eb132f · 1 Parent(s): 00679cb

Update README.md

Files changed (1):
  1. README.md (+46, -40)
README.md CHANGED
@@ -23,7 +23,7 @@ The Mistral-7B-Instruct-v0.1 Large Language Model (LLM) is a pretrained generati
 
 ## Model Details
 
-This model was built via parameter-efficient finetuning of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on the first 5k rows of [ehartford/dolphin](https://huggingface.co/datasets/ehartford/dolphin). Finetuning was executed on 1x A100 (40 GB SXM) for roughly 1 hour on Google Colab.
+This model was built via parameter-efficient finetuning of [mistralai/Mistral-7B-Instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1) on the first 5k rows of [ehartford/dolphin](https://huggingface.co/datasets/ehartford/dolphin). Finetuning was executed on 1x A100 (40 GB SXM) for roughly 1 hour on Google Colab.
 
 - **Developed by:** Daniel Furman
 - **Model type:** Decoder-only
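
For orientation, the "parameter-efficient finetuning" named above corresponds to a QLoRA-style recipe: load the base model in 4-bit and train LoRA adapters on top. The sketch below is illustrative only, since the commit does not include the training script; the LoRA hyperparameters shown are assumptions, not the card's actual values.

```python
# Illustrative QLoRA-style setup consistent with the card's description;
# NOT the author's actual training script.
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# First 5k rows of the finetuning data named in the card
# (may require pointing at a specific data file within the repo).
dataset = load_dataset("ehartford/dolphin", split="train[:5000]")

# 4-bit NF4 quantization, matching the loading code shown later in the card
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.1",
    quantization_config=bnb_config,
    device_map="auto",
)
base_model = prepare_model_for_kbit_training(base_model)

# Hypothetical LoRA settings; the card does not state the values used.
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, peft_config)
model.print_trainable_parameters()
```

Training only the adapter weights is what lets a 7B finetune fit the single 40 GB A100 and the roughly one-hour budget cited above.
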
@@ -49,8 +49,6 @@ We use Eleuther.AI's [Language Model Evaluation Harness](https://github.com/Eleu
 
 ## Basic Usage
 
-*Note*: Use the code below to get started with the sft models herein, as ran on 1x A100.
-
 ```python
 !pip install -q -U transformers peft torch accelerate bitsandbytes einops sentencepiece
 
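
The usage cells that follow assume an import block that the diff shows only via its truncated hunk-header context, `from transformers import (`. Based on the names actually used in the code below, it presumably looks like this (a sketch, not part of the commit):

```python
import torch
from peft import PeftConfig, PeftModel
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
)
```
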
@@ -64,38 +62,49 @@ from transformers import (
 ```
 
 ```python
-peft_model_id = "dfurman/Mistral-7B-Instruct-DRF-v0.1"
+peft_model_id = "dfurman/Yi-6B-instruct-v0.1"
 config = PeftConfig.from_pretrained(peft_model_id)
 
+tokenizer = AutoTokenizer.from_pretrained(
+    peft_model_id,
+    use_fast=True,
+    trust_remote_code=True,
+)
+bnb_config = BitsAndBytesConfig(
+    load_in_4bit=True,
+    bnb_4bit_quant_type="nf4",
+    bnb_4bit_compute_dtype=torch.bfloat16,
+)
 model = AutoModelForCausalLM.from_pretrained(
     config.base_model_name_or_path,
-    torch_dtype=torch.bfloat16,
+    quantization_config=bnb_config,
     device_map="auto",
     trust_remote_code=True,
 )
-
-tokenizer = AutoTokenizer.from_pretrained(
-    config.base_model_name_or_path,
-    use_fast=True,
-    trust_remote_code=True,
+model = PeftModel.from_pretrained(
+    model,
+    peft_model_id
 )
-
-model = PeftModel.from_pretrained(model, peft_model_id)
-
-format_template = "You are a helpful assistant. Write a response that appropriately completes the request. {query}\n"
 ```
 
 ```python
-query = "Write a short email inviting my friends to a dinner party on Friday. Respond succinctly."
-prompt = format_template.format(query=query)
+messages = [
+    {"role": "system", "content": "You are a helpful assistant. Respond as briefly as possible."},
+    {"role": "user", "content": "Tell me a recipe for a mai tai."},
+]
+
+print("\n\n*** Prompt:")
+prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+print(prompt)
 
+print("\n\n*** Generate:")
 input_ids = tokenizer(prompt, return_tensors="pt").input_ids.cuda()
 with torch.autocast("cuda", dtype=torch.bfloat16):
     output = model.generate(
         input_ids=input_ids,
-        max_new_tokens=512,
+        max_new_tokens=1024,
         do_sample=True,
-        temperature=0.1,
+        temperature=0.7,
         return_dict_in_generate=True,
         eos_token_id=tokenizer.eos_token_id,
         pad_token_id=tokenizer.pad_token_id,
@@ -103,33 +112,34 @@ with torch.autocast("cuda", dtype=torch.bfloat16):
         no_repeat_ngram_size=5,
     )
 
-print("\n\n*** Generate:")
-print(tokenizer.decode(output["sequences"][0][len(input_ids[0]):], skip_special_tokens=True))
+response = tokenizer.decode(
+    output["sequences"][0][len(input_ids[0]):],
+    skip_special_tokens=True
+)
+print(response)
 ```
 
 <details>
 
 <summary>Output</summary>
 
-**Prompt**: Write a short email inviting my friends to a dinner party on Friday. Respond succinctly.
-
-**Generation**: The invitation should be brief and to-the-point, so it's best to use simple language and avoid unnecessary details or long explanations. Here is an example of a concise invitation:
-
-Dear Friends,
-
-I hope you can join me for a fun evening at my place this Friday! We'll have delicious food, great conversation, and maybe even some games if we feel like it. Please RSVP by Wednesday night so I know who will be there.
-
-Looking forward to seeing you all soon!
+**Prompt**: <|im_start|>system
+You are a helpful assistant. Respond as briefly as possible.<|im_end|>
+<|im_start|>user
+Tell me a recipe for a mai tai.<|im_end|>
+<|im_start|>assistant
 
-Best regards,
-Your Name
+**Generation**: Here's one simple version of the classic Mai Tai cocktail:
 
-This message clearly communicates the essential information about the event while maintaining a friendly tone. It also includes a specific date (Friday) and timeframe (evening), as well as a clear call to action (RSVP). The closing line adds a personal touch and expresses excitement for the gathering. Overall, this invitation strikes a good balance between being informative and engaging without overwhelming the reader with too much text.
+1 oz White Rum (Bacardi, Don Papa, etc.) ½ oz Coconut Cream Liqueur (Malibu or Coco Lopez)
+2 tsp Simple Syrup ➕ Dash Orange Bitters
+3-4 Ice Cubes
 
-Remember, when writing emails, always keep in mind your audience and their preferences. If they prefer more detailed information or additional context, adjust accordingly. However, try not to make the invitation overly complicated or lengthy – simplicity often makes for a better experience. Happy emailing!
+Shake all ingredients in a shaker filled with ice until well chilled and strain into an old fashioned glass over fresh crushed ice. Garnish with mint leaves if desired. Enjoy!
 
 </details>
 
+
 ## Speeds, Sizes, Times
 
 | runtime / 50 tokens (sec) | GPU | attn | torch dtype | VRAM (GB) |
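
A detail worth flagging in the decoding change above: `output["sequences"][0]` holds the prompt tokens followed by the newly generated ones, so the slice keeps only the model's continuation. In other words:

```python
# Drop the prompt ids, keep only the newly generated ids
# (assumes the `output` and `input_ids` variables from the usage code above).
new_token_ids = output["sequences"][0][len(input_ids[0]):]
```
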
@@ -138,17 +148,13 @@ Remember, when writing emails, always keep in mind your audience and their prefe
 
 ## Training
 
-It took ~1 hour to train 1 epoch on 1x A100.
+It took ~3 hours to train 3 epochs on 1x A100 (40 GB SXM).
 
 Prompt format:
-This model uses the [ChatML](https://huggingface.co/docs/transformers/chat_templating#what-template-should-i-use) prompt format, which was developed by OpenAI.
+This model uses the same prompt format as [mistralai/Mistral-7B-Instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1). This model does **not** expect a system prompt.
 
 ```
-<|im_start|>system
-You are a helpful assistant.<|im_end|>
-<|im_start|>user
-{prompt}<|im_end|>
-<|im_start|>assistant
+[INST] {prompt} [/INST]
 ```
 
 ## Training Hyperparameters
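
As a usage note on the new prompt format: the base model's tokenizer ships a chat template that emits exactly this `[INST]` wrapping, so the string need not be assembled by hand. A minimal sketch (exact special-token placement can vary slightly across `transformers` versions):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")

# No system role: this template expects alternating user/assistant turns only.
messages = [{"role": "user", "content": "Tell me a recipe for a mai tai."}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False)
print(prompt)  # -> <s>[INST] Tell me a recipe for a mai tai. [/INST]
```
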
 