smollm-turkish-base

A Turkish base language model from an early-stopped training run.

Model Description

  • Model Type: LLaMA Architecture
  • Training Framework: Nanotron
  • Base Tokenizer: bonur/gpt2-turkish-tokenizer
  • Context Length: 4096
  • Vocab Size: 52000
  • Hidden Size: 576
  • Number of Layers: 30
  • Number of Attention Heads: 9
  • Number of Key/Value Heads: 3
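These hyperparameters can be checked against the exported configuration. The snippet below is a minimal sketch, assuming the checkpoint ships a standard LlamaConfig with the usual field names:

from transformers import AutoConfig

config = AutoConfig.from_pretrained("bonur/smollm-turkish-base")
print(config.vocab_size)               # expected: 52000
print(config.max_position_embeddings)  # expected: 4096
print(config.hidden_size)              # expected: 576
print(config.num_hidden_layers)        # expected: 30
print(config.num_attention_heads)      # expected: 9
print(config.num_key_value_heads)      # expected: 3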

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("bonur/smollm-turkish-base")
tokenizer = AutoTokenizer.from_pretrained("bonur/smollm-turkish-base")

# GPT-2-style tokenizers often define no pad token; fall back to EOS so padding works
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

text = "Your prompt here"
inputs = tokenizer(text, return_tensors="pt", padding=True)
outputs = model.generate(
    inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_new_tokens=100,
    do_sample=True,   # sample instead of greedy decoding
    temperature=0.7,  # soften the output distribution
    top_p=0.9         # nucleus sampling
)
result = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(result)
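Because this is a base model rather than an instruction-tuned one, prompts should be written as text to continue, not as chat-style instructions. As an alternative to calling generate directly, the same generation can be run through the text-generation pipeline; the sketch below is illustrative, and the Turkish prompt is an example, not from the original card:

from transformers import pipeline

# Minimal sketch: plain text continuation via the pipeline API
generator = pipeline("text-generation", model="bonur/smollm-turkish-base")
output = generator("Türkiye'nin başkenti", max_new_tokens=50, do_sample=True, temperature=0.7)
print(output[0]["generated_text"])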