### What is this model?
This is a **lightly fine-tuned** version of the Mistral 24B base model, designed as an accessible and adaptable foundation for further fine-tuning, and as merge fodder. Key modifications include:
- **ChatML-ified**, with no additional tokens introduced. **Update:** I made a small oopsie. To summarize, I tuned different parts of the base separately and merged them with mergekit, and in one of those parts I used the unmodified tokenizer, so the extra ChatML tokens were added anyway (see the sketch after this list for a quick way to check).
- **High-quality private instruct**—not generated by ChatGPT or Claude, ensuring no slop and good markdown understanding.
- **No refusals**—since it’s a base model, refusals should be minimal to non-existent, though, in early testing, occasional warnings still appear (I assume some were baked into the pre-train).
- **High-quality private creative writing dataset**—mainly to dilute the baked-in slop further, but it can actually write some stories; not bad for a loss of ~8.
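
Below is a minimal sketch of the ChatML prompt framing this tune targets, plus a quick way to check whether the tokenizer oopsie mentioned above left `<|im_start|>`/`<|im_end|>` in as added tokens. The repo ids are placeholders (swap in the actual tune and its base), and the snippet assumes the `transformers` library; it only loads tokenizers, so it's cheap to run.

```python
from transformers import AutoTokenizer

# Placeholder repo ids -- substitute the actual tune and its base model.
TUNED_ID = "your-org/mistral-24b-chatml-tune"
BASE_ID = "mistralai/Mistral-Small-24B-Base-2501"

# Standard ChatML framing: each turn is wrapped in <|im_start|>/<|im_end|>.
prompt = (
    "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\nWrite a two-line poem about merges.<|im_end|>\n"
    "<|im_start|>assistant\n"
)

tuned_tok = AutoTokenizer.from_pretrained(TUNED_ID)
base_tok = AutoTokenizer.from_pretrained(BASE_ID)

# If an unmodified tokenizer slipped into the merge, the ChatML markers may
# show up as added special tokens and the vocab will be larger than the base's.
print("tuned vocab:", len(tuned_tok), "| base vocab:", len(base_tok))
print("added tokens:", tuned_tok.get_added_vocab())

# Whether <|im_start|> encodes to a single id or a multi-token string tells
# you which tokenizer variant you actually ended up with.
print(tuned_tok.encode("<|im_start|>", add_special_tokens=False))
```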