Update README.md
README.md
CHANGED
@@ -13,9 +13,18 @@ tags:
 licence: license
 ---
 
-#
+# About:
 
-
+**This GRPO-trained model is a fine-tuned version of [deepseek-ai/DeepSeek-R1-Distill-Qwen-7B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B) on the [DigitalLearningGmbH/MATH-lighteval](https://huggingface.co/datasets/DigitalLearningGmbH/MATH-lighteval) dataset.**
+
+*Special thanks to Dongwei for fine-tuning this version of DeepSeek-R1-Distill-Qwen-7B. More information about it can be found here:*
+[https://huggingface.co/Dongwei/DeepSeek-R1-Distill-Qwen-7B-GRPO_Math](https://huggingface.co/Dongwei/DeepSeek-R1-Distill-Qwen-7B-GRPO_Math)
+
+I converted it to MLX format with 8-bit quantization for better performance on Apple Silicon Macs (M1, M2, M3, M4 chips).
+
+# Alejandroolmedo/DeepSeek-R1-Distill-Qwen-7B-GRPO_Math-8bit-mlx
+
+The model [Alejandroolmedo/DeepSeek-R1-Distill-Qwen-7B-GRPO_Math-8bit-mlx](https://huggingface.co/Alejandroolmedo/DeepSeek-R1-Distill-Qwen-7B-GRPO_Math-8bit-mlx) was converted to MLX format from [Dongwei/DeepSeek-R1-Distill-Qwen-7B-GRPO_Math](https://huggingface.co/Dongwei/DeepSeek-R1-Distill-Qwen-7B-GRPO_Math) using mlx-lm version **0.20.5**.
 
 ## Use with mlx
 
@@ -26,7 +35,7 @@ pip install mlx-lm
 ```python
 from mlx_lm import load, generate
 
-model, tokenizer = load("Alejandroolmedo/DeepSeek-R1-Distill-Qwen-7B-GRPO_Math-
+model, tokenizer = load("Alejandroolmedo/DeepSeek-R1-Distill-Qwen-7B-GRPO_Math-8bit-mlx")
 
 prompt="hello"
 