dbands commited on
Commit
d261b91
·
verified ·
1 Parent(s): d78fe50

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +35 -1
README.md CHANGED
@@ -1,3 +1,37 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
  ---
2
  base_model: unsloth/qwen2.5-3b-instruct-unsloth-bnb-4bit
3
  tags:
@@ -19,4 +53,4 @@ language:
19
 
20
  This qwen2 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
21
 
22
- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
 
1
+ ---
2
+ base_model: unsloth/qwen2.5-3b-instruct-unsloth-bnb-4bit
3
+ tags:
4
+ - text-generation-inference
5
+ - transformers
6
+ - unsloth
7
+ - qwen2
8
+ - gguf
9
+ license: apache-2.0
10
+ language:
11
+ - en
12
+ datasets:
13
+ - openai/gsm8k
14
+ ---
15
+ This is my first reasoning model. It is fairly small and yes it still gets the answer wrong to how many r's in the word strawberry.
16
+
17
+ You are welcome to use the model as you whish.
18
+
19
+ In your system prompt use:
20
+ Respond in the following format:
21
+ <reasoning>
22
+ ...
23
+ </reasoning>
24
+ <answer>
25
+ ...
26
+ </answer>
27
+
28
+ I fine tuned the model using openai/gsm8k and to ensure costs do not go insane, I used a single A100.
29
+
30
+ Enjoy, but please note that this model is experimental and I used it to define my pipeline.
31
+
32
+ I will be testing fine tuning larger more capable models. I suspect they would add more value in the short term.
33
+
34
+
35
  ---
36
  base_model: unsloth/qwen2.5-3b-instruct-unsloth-bnb-4bit
37
  tags:
 
53
 
54
  This qwen2 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
55
 
56
+ [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)