My Reasoning Model
This is my first reasoning model. It is fairly small, and yes, it still gets the answer wrong to how many r's are in the word "strawberry."
You are welcome to use the model as you wish.
System Prompt Format
Respond in the following format:
<reasoning>
...
</reasoning>
<answer>
...
</answer>
I fine-tuned the model using openai/gsm8k
, and to ensure costs do not go insane, I used a single A100.
Enjoy, but please note that this model is experimental and I used it to define my pipeline.
I will be testing fine tuning larger more capable models. I suspect they would add more value in the short term.
---
base_model: unsloth/qwen2.5-3b-instruct-unsloth-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- qwen2
- gguf
license: apache-2.0
language:
- en
---
# Uploaded model
- **Developed by:** dbands
- **License:** apache-2.0
- **Finetuned from model :** unsloth/qwen2.5-3b-instruct-unsloth-bnb-4bit
This qwen2 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
- Downloads last month
- 74
Inference Providers
NEW
This model is not currently available via any of the supported third-party Inference Providers, and
HF Inference API was unable to determine this model’s pipeline type.