mkshing committed
Commit e250fea · verified · 1 Parent(s): c762803

Update README.md

Files changed (1): README.md +2 -2
README.md CHANGED
@@ -7,11 +7,11 @@ library_name: transformers
 base_model:
 - Qwen/Qwen2.5-1.5B-Instruct
 ---
-# Smol-Swallow-1.5B
+# SmolSwallow-1.5B
 
 🤗 [Models](https://huggingface.co/SakanaAI) | 📚 [Paper](https://arxiv.org/abs/TODO) | 📝 [Blog](https://sakana.ai/taid/) | 🐦 [Twitter](https://twitter.com/SakanaAILabs)
 
-**Smol-Swallow-1.5B** is a Japanese compact language model created through TAID (Temporally Adaptive Interpolated Distillation), our new knowledge distillation method.
+**SmolSwallow-1.5B** is a Japanese compact language model created through TAID (Temporally Adaptive Interpolated Distillation), our new knowledge distillation method.
 We used [Qwen2.5-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-32B-Instruct) as the teacher model and [Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct) as the student model.
 The model has been further pre-trained on Japanese text data to enhance its Japanese language capabilities.
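The card text names TAID's central idea: a distillation target that is interpolated between the student's and the teacher's distributions as training progresses. As a toy sketch only — the function names and the simple linear mixing here are illustrative assumptions, not the paper's actual schedule or loss:

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D logit vector."""
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def interpolated_target(student_logits, teacher_logits, t):
    """Toy TAID-style target: a mixture that moves from the student's own
    distribution (t=0) toward the teacher's distribution (t=1) over training.
    The linear schedule in t is an assumption for illustration."""
    p_student = softmax(student_logits)
    p_teacher = softmax(teacher_logits)
    return (1.0 - t) * p_student + t * p_teacher

# Early in training (small t) the target stays close to the student,
# easing the capacity gap to the 32B teacher; later (t near 1) it
# approaches the teacher's distribution.
student = np.array([1.0, 2.0, 0.5])
teacher = np.array([0.1, 3.0, 1.0])
early = interpolated_target(student, teacher, 0.1)
late = interpolated_target(student, teacher, 0.9)
```

Both `early` and `late` remain valid probability distributions (non-negative, summing to 1), which is what lets a standard KL-divergence distillation loss be applied against them at every stage.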