mkshing committed on
Commit 9886505 · verified · 1 Parent(s): 15205d9

Update README.md

Files changed (1)
  1. README.md +0 -33
README.md CHANGED
@@ -14,39 +14,6 @@ base_model:
  **Smol-Swallow-1.5B** is a Japanese compact language model created through TAID (Temporally Adaptive Interpolated Distillation), our new knowledge distillation method.
  We used [Qwen2.5-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-32B-Instruct) as the teacher model and [Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct) as the student model.
  The model has been further pre-trained on Japanese text data to enhance its Japanese language capabilities.
-
- ## Usage
-
- Use the code below to get started with the model.
-
- <details>
- <summary> Click to expand </summary>
-
- ```python
- import torch
- from transformers import AutoModelForCausalLM, AutoTokenizer
-
-
- # 1. load model
- device = "cuda" if torch.cuda.is_available() else "cpu"
- repo_id = "SakanaAI/Smol-Swallow-1.5B"
- model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype="auto")
- tokenizer = AutoTokenizer.from_pretrained(repo_id)
- model.to(device)
-
- # 2. prepare inputs
- text = "拝啓\n"
- inputs = tokenizer(text, return_tensors="pt")
-
- # 3. generate
- output_ids = model.generate(**inputs.to(device))
- output_ids = output_ids[:, inputs.input_ids.shape[1] :]
- generated_text = tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]
- print(generated_text)
- ```
-
- </details>
-
 
  ## Model Details
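The README text retained in the hunk above credits the model to TAID, an interpolated knowledge distillation method with Qwen2.5-32B-Instruct as teacher and Qwen2.5-1.5B-Instruct as student. As a rough sketch of that idea only (not part of this commit and not SakanaAI's released training code), a distillation loss against a student/teacher interpolation might look like the following, where `student_logits`, `teacher_logits`, and the schedule value `alpha` are hypothetical names:

```python
import torch
import torch.nn.functional as F


def interpolated_distillation_loss(student_logits, teacher_logits, alpha):
    """Hypothetical sketch of an interpolated distillation objective.

    The target distribution mixes the (detached) student distribution with
    the teacher distribution; `alpha` is assumed to ramp from 0 toward 1
    over training so the target drifts from the student toward the teacher.
    Tensor names and shapes (batch, seq_len, vocab) are assumptions, not
    the authors' implementation.
    """
    student_log_probs = F.log_softmax(student_logits, dim=-1)
    with torch.no_grad():
        student_probs = F.softmax(student_logits, dim=-1)
        teacher_probs = F.softmax(teacher_logits, dim=-1)
        # Interpolated target: (1 - alpha) * student + alpha * teacher
        target_probs = (1.0 - alpha) * student_probs + alpha * teacher_probs
    # KL divergence from the interpolated target to the student predictions
    return F.kl_div(student_log_probs, target_probs, reduction="batchmean")
```

A training loop would then update `alpha` each step from training progress (e.g., a simple linear ramp, or the adaptive schedule the method's name suggests) before calling this loss.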
 
 