---
base_model: unsloth/qwen2.5-3b-instruct-unsloth-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- qwen2
- gguf
license: apache-2.0
language:
- en
datasets:
- openai/gsm8k
---

# My Reasoning Model

This is my first reasoning model. It is fairly small, and yes, it still gets the answer wrong when asked how many r's are in the word "strawberry."

You are welcome to use the model as you wish.

## System Prompt Format

Respond in the following format:

```
<reasoning>
...
</reasoning>
<answer>
...
</answer>
```
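Since the model emits its output inside these tags, downstream code usually needs to pull the two spans back out. Below is a minimal sketch of such a parser; `parse_completion` is a hypothetical helper name, not part of the model or any library.

```python
import re

def parse_completion(text: str) -> dict:
    """Extract the <reasoning> and <answer> blocks from a completion.

    Returns None for a block that is missing or malformed.
    """
    def grab(tag: str):
        # Non-greedy match so nested text cannot swallow the closing tag.
        m = re.search(rf"<{tag}>(.*?)</{tag}>", text, re.DOTALL)
        return m.group(1).strip() if m else None

    return {"reasoning": grab("reasoning"), "answer": grab("answer")}

example = "<reasoning>\n2 + 2 = 4\n</reasoning>\n<answer>\n4\n</answer>"
print(parse_completion(example)["answer"])  # -> 4
```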

I fine-tuned the model on `openai/gsm8k`, and to keep costs under control, I trained on a single A100.
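For anyone reproducing the setup: GSM8K ground-truth solutions end with a line of the form `#### <number>`, so scoring a completion reduces to comparing extracted final answers. A small sketch, with `extract_gold_answer` as an illustrative helper name:

```python
def extract_gold_answer(solution: str) -> str:
    """Pull the final answer from a GSM8K solution string.

    GSM8K reference solutions terminate with '#### <answer>'.
    """
    return solution.split("####")[-1].strip()

sample = "She sells 16 - 3 - 4 = 9 duck eggs a day.\n9 * 2 = 18\n#### 18"
print(extract_gold_answer(sample))  # -> 18
```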

Enjoy, but please note that this model is experimental; I used it primarily to define my training pipeline.

Next, I will be testing fine-tuning larger, more capable models. I suspect they would add more value in the short term.


---

# Uploaded model

- **Developed by:** dbands
- **License:** apache-2.0
- **Finetuned from model:** unsloth/qwen2.5-3b-instruct-unsloth-bnb-4bit

This qwen2 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)