dfurman committed on
Commit
70c8f6f
·
1 Parent(s): ac01a3e

Update README.md

Files changed (1)
  1. README.md +19 -56
README.md CHANGED
@@ -20,7 +20,7 @@ General instruction-following llm finetuned from [mistralai/Mistral-7B-v0.1](htt

 ### Model Description

- This instruction-following llm was built via parameter-efficient QLoRA finetuning of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on the first 200k rows of [ehartford/dolphin](https://huggingface.co/datasets/ehartford/dolphin). Finetuning was executed on 1x A100 (40 GB SXM) for roughly 20 hours on Google Colab. **Only** the `peft` adapter weights are included in this model repo, alonside the tokenizer.

 - **Developed by:** Daniel Furman
 - **Model type:** Decoder-only

@@ -32,7 +32,7 @@ This instruction-following llm was built via parameter-efficient QLoRA finetunin

 - **Repository:** [github.com/daniel-furman/sft-demos](https://github.com/daniel-furman/sft-demos/blob/main/src/sft/one_gpu/mistral/sft-mistral-7b-instruct-peft.ipynb)

- ### Evaluation Results

 | Metric | Value |
 |-----------------------|-------|
@@ -44,64 +44,21 @@ This instruction-following llm was built via parameter-efficient QLoRA finetunin

 We use Eleuther.AI's [Language Model Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness) to run the benchmark tests above, the same version as Hugging Face's [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard).

- ## Uses
-
- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-
- ### Direct Use
-
- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-
- [More Information Needed]
-
- ### Downstream Use
-
- <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-
- [More Information Needed]
-
- ### Out-of-Scope Use
-
- <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-
- [More Information Needed]
-
- ## Bias, Risks, and Limitations
-
- <!-- This section is meant to convey both technical and sociotechnical limitations. -->
-
- [More Information Needed]
-
- ### Recommendations
-
- <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-
- Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
-
- ## How to Get Started with the Model
-
- Use the code below to get started with the model.
-
- [More Information Needed]
-
- ## Training Details
-
- ### Training Data
-
- <!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-
- [More Information Needed]
-
- ### Preprocessing
-
- [More Information Needed]
-
 ### Training Hyperparameters

- We used the [`SFTTrainer` from TRL library](https://huggingface.co/docs/trl/main/en/sft_trainer) that gives a wrapper around transformers `Trainer` to easily fine-tune models on instruction based datasets.

 The following `TrainingArguments` config was used:

@@ -130,6 +87,12 @@ The following `bitsandbytes` quantization config was used:
 - bnb_4bit_use_double_quant: False
 - bnb_4bit_compute_dtype: bfloat16

 ### Speeds, Sizes, Times

 | runtime / 50 tokens (sec) | GPU | attn | torch dtype | VRAM (GB) |
 
 ### Model Description

+ This instruction-following llm was built via parameter-efficient QLoRA finetuning of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on the first 5k rows of [ehartford/dolphin](https://huggingface.co/datasets/ehartford/dolphin). Finetuning was executed on 1x A100 (40 GB SXM) for roughly 1 hour on Google Colab. **Only** the `peft` adapter weights are included in this model repo, alongside the tokenizer.
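Because only the adapter weights ship with this repo, inference means loading the base model and attaching the adapter on top. A minimal sketch under stated assumptions: the repo id `your-username/your-adapter-repo` is a placeholder for this model's actual Hub id, and `accelerate` is assumed installed for `device_map="auto"`.

```python
# Sketch: load the Mistral-7B base model, then attach the PEFT adapter.
# "your-username/your-adapter-repo" is a placeholder, not this repo's real id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "mistralai/Mistral-7B-v0.1"
adapter_id = "your-username/your-adapter-repo"  # placeholder

# The card notes the tokenizer is included alongside the adapter weights.
tokenizer = AutoTokenizer.from_pretrained(adapter_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)  # wraps base with the LoRA weights
model.eval()
```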
 
 - **Developed by:** Daniel Furman
 - **Model type:** Decoder-only

 - **Repository:** [github.com/daniel-furman/sft-demos](https://github.com/daniel-furman/sft-demos/blob/main/src/sft/one_gpu/mistral/sft-mistral-7b-instruct-peft.ipynb)

+ ### Evaluation

 | Metric | Value |
 |-----------------------|-------|
 
 We use Eleuther.AI's [Language Model Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness) to run the benchmark tests above, the same version as Hugging Face's [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard).
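For reference, a hedged sketch of running the harness from the command line. The task list and batch size here are illustrative, not the exact leaderboard configuration, and the flags vary between harness versions.

```shell
# Illustrative invocation of the LM Evaluation Harness (flags vary by version).
git clone https://github.com/EleutherAI/lm-evaluation-harness
cd lm-evaluation-harness
pip install -e .

python main.py \
    --model hf-causal \
    --model_args pretrained=mistralai/Mistral-7B-v0.1 \
    --tasks arc_challenge,hellaswag,truthfulqa_mc \
    --batch_size 8
```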
 
+ ## Training

+ It took ~1 hour to train 1 epoch on 1x A100.
+ Prompt format:
+ This model (and all my future releases) uses the [ChatML](https://huggingface.co/docs/transformers/chat_templating#what-template-should-i-use) prompt format.
+ ```
+ <|im_start|>system
+ You are a helpful assistant.<|im_end|>
+ <|im_start|>user
+ {prompt}<|im_end|>
+ <|im_start|>assistant
+ ```
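For programmatic use, the ChatML template above can be assembled with a small helper. The function name `build_chatml_prompt` is ours, not part of the repo.

```python
# Hypothetical helper that renders the ChatML template shown above.
def build_chatml_prompt(prompt: str, system: str = "You are a helpful assistant.") -> str:
    """Wrap a user prompt in ChatML, ending at the assistant turn."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{prompt}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

text = build_chatml_prompt("What is 2 + 2?")
print(text)
```

The string ends right after `<|im_start|>assistant` so that generation begins with the assistant's reply.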
 ### Training Hyperparameters

+ We use the [`SFTTrainer`](https://huggingface.co/docs/trl/main/en/sft_trainer) from 🤗's TRL package to easily fine-tune LLMs on instruction-following datasets.

 The following `TrainingArguments` config was used:
 
 
 - bnb_4bit_use_double_quant: False
 - bnb_4bit_compute_dtype: bfloat16
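The flags above map onto a `transformers` `BitsAndBytesConfig`. A sketch of the equivalent object, assuming `transformers` and `bitsandbytes` are installed; only the two `bnb_4bit_*` values are taken from this card, the other settings are assumed QLoRA defaults.

```python
# Sketch: the listed flags expressed as a BitsAndBytesConfig.
# load_in_4bit and the "nf4" quant type are assumptions; only the two
# bnb_4bit_* values below come from the config list in this card.
import torch
from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # assumed for QLoRA
    bnb_4bit_quant_type="nf4",              # assumed QLoRA default
    bnb_4bit_use_double_quant=False,        # from the config list
    bnb_4bit_compute_dtype=torch.bfloat16,  # from the config list
)
```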
 
+ ## How to Get Started with the Model
+
+ Use the code below to get started with the model.
+
+ [More Information Needed]
+
 ### Speeds, Sizes, Times

 | runtime / 50 tokens (sec) | GPU | attn | torch dtype | VRAM (GB) |