**Smol-Swallow-1.5B** is a compact Japanese language model created with TAID (Temporally Adaptive Interpolated Distillation), our new knowledge distillation method.
We used [Qwen2.5-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-32B-Instruct) as the teacher model and [Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct) as the student model.
The model has been further pre-trained on Japanese text data to enhance its Japanese language capabilities.
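
For intuition, the core idea of TAID can be sketched in a few lines. The snippet below is a minimal illustration, not the actual training code: it assumes linear interpolation of output probabilities with a stop-gradient on the student term and a fixed scalar `t`, whereas the full method adapts `t` over the course of training (see the TAID paper for details); `taid_loss` is a hypothetical helper name.

```python
import torch
import torch.nn.functional as F

def taid_loss(student_logits: torch.Tensor, teacher_logits: torch.Tensor, t: float) -> torch.Tensor:
    """Sketch of an interpolated distillation objective (assumptions noted above)."""
    p_teacher = F.softmax(teacher_logits, dim=-1)
    q_student = F.softmax(student_logits, dim=-1)
    # Time-dependent intermediate teacher: starts near the student (t ~ 0)
    # and drifts toward the full teacher as t -> 1 over training.
    p_t = t * p_teacher + (1.0 - t) * q_student.detach()
    # KL(p_t || q_theta) against the student's log-probabilities.
    log_q = F.log_softmax(student_logits, dim=-1)
    return torch.sum(p_t * (torch.log(p_t + 1e-9) - log_q), dim=-1).mean()
```

The intent of interpolating the target this way is to keep the distillation signal close to what the student can already match early in training, easing the large capacity gap between the 32B teacher and the 1.5B student.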
## Usage
Use the code below to get started with the model.
<details>
<summary> Click to expand </summary>
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
# 1. load model
device = "cuda" if torch.cuda.is_available() else "cpu"
repo_id = "SakanaAI/Smol-Swallow-1.5B"
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model.to(device)
# 2. prepare inputs
text = "拝啓\n"  # "Dear Sir or Madam" – the opening of a formal Japanese letter
inputs = tokenizer(text, return_tensors="pt")
# 3. generate
output_ids = model.generate(**inputs.to(device))
output_ids = output_ids[:, inputs.input_ids.shape[1]:]  # drop the prompt tokens, keep only the generated continuation
generated_text = tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]
print(generated_text)
```
</details>
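
Note that `model.generate` is called with the library defaults here, so the continuation will be short; standard generation arguments such as `max_new_tokens`, `do_sample`, and `temperature` can be passed to control length and sampling, e.g. `model.generate(**inputs.to(device), max_new_tokens=128)`.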
## Model Details