Latest Version: 232,775 Steps

  • 14,897,600 tokens seen.
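The token count can be checked against the step count. The sketch below assumes that every training step consumes one full batch of `batch_size` sequences of `max_length` tokens (values taken from the Config section); the card does not state this accounting explicitly.

```python
# Values copied from the model card. The assumption that each step sees
# batch_size * max_length tokens is ours, not stated by the card.
steps = 232_775
batch_size = 1
max_length = 64

tokens_seen = steps * batch_size * max_length
print(f"{tokens_seen:,}")  # prints 14,897,600
```

Under that assumption the product matches the reported figure exactly.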

Model Info:

  • Trained from scratch.
  • 6.86M parameters.
  • Context length: 64 tokens.

Config:

batch_size: 1
dropout: 0
learning_rate: 0.0001
max_length: 64
n_embed: 256
n_head: 8
n_layer: 8
vocab_size: 2048
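As a rough sanity check on the 6.86M figure, the sketch below estimates the parameter count from the config. It assumes a GPT-2-style decoder: learned position embeddings, a 4x MLP expansion, biases on all linear layers, two LayerNorms per block plus a final one, and an output head tied to the token embedding. The card does not describe the architecture, so these are assumptions, and `estimate_params` is a hypothetical helper written for illustration.

```python
def estimate_params(vocab_size: int, max_length: int, n_embed: int, n_layer: int) -> int:
    """Estimate parameters for an ASSUMED GPT-2-style decoder with tied output head."""
    d = n_embed
    token_emb = vocab_size * d           # also reused as the output head (assumed tied)
    pos_emb = max_length * d             # learned position embeddings (assumed)
    attn = 4 * d * d + 4 * d             # QKV and output projections, with biases
    mlp = 8 * d * d + 5 * d              # d -> 4d -> d, with biases
    norms = 2 * 2 * d                    # two LayerNorms per block (weight + bias)
    final_norm = 2 * d
    return token_emb + pos_emb + n_layer * (attn + mlp + norms) + final_norm


# Values from the Config section. n_head is not needed: it splits n_embed
# across heads without changing the number of parameters.
total = estimate_params(vocab_size=2048, max_length=64, n_embed=256, n_layer=8)
print(f"{total:,} (~{total / 1e6:.2f}M)")  # prints 6,859,264 (~6.86M)
```

The estimate rounds to the reported 6.86M, which is consistent with the assumed layout but does not confirm it.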