# smollm-turkish-base
A Turkish base language model, trained with early stopping.
## Model Description
- Model Type: LLaMA Architecture
- Training Framework: Nanotron
- Base Tokenizer: bonur/gpt2-turkish-tokenizer
- Context Length: 4096
- Vocab Size: 52000
- Hidden Size: 576
- Number of Layers: 30
- Number of Attention Heads: 9
- Number of Key/Value Heads: 3
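
These hyperparameters can be checked directly against the checkpoint's configuration; below is a minimal sketch, assuming the repository id above and standard LLaMA config field names:

```python
from transformers import AutoConfig

# Load only the configuration; no model weights are downloaded.
config = AutoConfig.from_pretrained("bonur/smollm-turkish-base")

# Standard LLaMA config fields; names assume a LlamaConfig-compatible checkpoint.
print(config.model_type)               # expected: "llama"
print(config.max_position_embeddings)  # expected: 4096
print(config.vocab_size)               # expected: 52000
print(config.hidden_size)              # expected: 576
print(config.num_hidden_layers)        # expected: 30
print(config.num_attention_heads)      # expected: 9
print(config.num_key_value_heads)      # expected: 3
```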
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("bonur/smollm-turkish-base")
tokenizer = AutoTokenizer.from_pretrained("bonur/smollm-turkish-base")

# The GPT-2-style tokenizer has no pad token by default; reuse EOS for padding.
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

text = "Your prompt here"
inputs = tokenizer(text, return_tensors="pt", padding=True)

# Sample up to 100 new tokens with nucleus sampling.
outputs = model.generate(
    inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)

result = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(result)
```
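
On a machine with a GPU, the model and inputs can be moved to the device before generation; a minimal sketch, assuming a CUDA-capable setup:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"

model = AutoModelForCausalLM.from_pretrained("bonur/smollm-turkish-base").to(device)
tokenizer = AutoTokenizer.from_pretrained("bonur/smollm-turkish-base")

# Tokenize the prompt and move the tensors to the same device as the model.
inputs = tokenizer("Your prompt here", return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_new_tokens=100, do_sample=True, temperature=0.7, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```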