dbands
/

Qwen2.5-3B-Instruct-reason-gguf

text-generation-inference

Inference Endpoints

Model card Files Files and versions Community

dbands commited on 4 days ago

Commit

d261b91

·

verified ·

1 Parent(s): d78fe50

Update README.md

Files changed (1) hide show

README.md +35 -1

README.md CHANGED Viewed

@@ -1,3 +1,37 @@
 ---
 base_model: unsloth/qwen2.5-3b-instruct-unsloth-bnb-4bit
 tags:
@@ -19,4 +53,4 @@ language:
 This qwen2 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
-[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

+---
+base_model: unsloth/qwen2.5-3b-instruct-unsloth-bnb-4bit
+tags:
+- text-generation-inference
+- transformers
+- unsloth
+- qwen2
+- gguf
+license: apache-2.0
+language:
+- en
+datasets:
+- openai/gsm8k
+---
+This is my first reasoning model.  It is fairly small and yes it still gets the answer wrong to how many r's in the word strawberry.
+You are welcome to use the model as you whish.
+In your system prompt use:
+Respond in the following format:
+<reasoning>
+...
+</reasoning>
+<answer>
+...
+</answer>
+I fine tuned the model using openai/gsm8k and to ensure costs do not go insane, I used a single A100.
+Enjoy, but please note that this model is experimental and I used it to define my pipeline.
+I will be testing fine tuning larger more capable models.  I suspect they would add more value in the short term.
 ---
 base_model: unsloth/qwen2.5-3b-instruct-unsloth-bnb-4bit
 tags:
 This qwen2 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
+[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)