Athspi LLM

🧠 A small but capable language model for creative story generation, trained on the TinyStories dataset.

Model Details

Architecture

  • Model Type: Transformer-based language model
  • Layers: 4
  • Embedding Dim: 384
  • Heads: 6
  • Sequence Length: 128 tokens
  • Parameters: ~28M
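As a rough sanity check on these figures, the parameter count can be estimated from the hyperparameters above. This is a minimal sketch, assuming a GPT-2-style decoder (learned position embeddings, biased linear layers, a 4x MLP expansion); the vocabulary size and the `tied_embeddings` flag are assumptions, since the card does not state them, so the result will not necessarily match the reported count.

```python
from dataclasses import dataclass


@dataclass
class ModelConfig:
    n_layers: int = 4        # from the card
    d_model: int = 384       # from the card (embedding dim)
    n_heads: int = 6         # from the card
    seq_len: int = 128       # from the card
    vocab_size: int = 50_257         # ASSUMPTION: not stated in the card
    tied_embeddings: bool = True     # ASSUMPTION: not stated in the card


def head_dim(cfg: ModelConfig) -> int:
    """Per-head dimension; d_model must divide evenly across heads."""
    if cfg.d_model % cfg.n_heads != 0:
        raise ValueError("d_model must be divisible by n_heads")
    return cfg.d_model // cfg.n_heads


def count_parameters(cfg: ModelConfig) -> int:
    """Estimate parameters for an assumed GPT-2-style decoder stack."""
    d = cfg.d_model
    # Attention: Q, K, V and output projections, each d x d plus a bias of size d.
    attn = 4 * d * d + 4 * d
    # MLP: d -> 4d and 4d -> d, with biases of size 4d and d.
    mlp = 2 * d * (4 * d) + 4 * d + d
    # Two LayerNorms per block, each with a weight and a bias of size d.
    norms = 2 * 2 * d
    per_layer = attn + mlp + norms
    # Token embeddings plus learned position embeddings.
    embeddings = cfg.vocab_size * d + cfg.seq_len * d
    final_norm = 2 * d
    # An untied output head adds a second vocab_size x d matrix.
    head = 0 if cfg.tied_embeddings else cfg.vocab_size * d
    return cfg.n_layers * per_layer + embeddings + final_norm + head


if __name__ == "__main__":
    cfg = ModelConfig()
    print("head dim:", head_dim(cfg))
    print("tied:", count_parameters(cfg))
    print("untied:", count_parameters(ModelConfig(tied_embeddings=False)))
```

With only 4 layers at width 384, the embedding tables dominate the total, so the estimate depends mostly on the assumed vocabulary size and on whether the output head shares weights with the token embeddings.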

Training Data

  • Dataset: TinyStories
  • Training Coverage: 5% of dataset (~100k samples)
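The card does not say how the 5% subset was chosen. The sketch below shows one reproducible way to draw such a subset by index; the function name `subset_indices`, the seed, and the total of 2,000,000 samples (implied by 100k being about 5%) are illustrative assumptions, not details from the original training run.

```python
import random


def subset_indices(total_samples: int, fraction: float = 0.05, seed: int = 0) -> list[int]:
    """Return a sorted, reproducible random subset of dataset indices.

    `fraction`, `seed`, and random (rather than head-of-file) selection
    are assumptions made for illustration.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    k = round(total_samples * fraction)
    rng = random.Random(seed)  # local RNG so global random state is untouched
    return sorted(rng.sample(range(total_samples), k))


if __name__ == "__main__":
    # ASSUMPTION: about 2,000,000 stories in total, so 5% is about 100k.
    indices = subset_indices(2_000_000, fraction=0.05, seed=0)
    print(len(indices))
```

Fixing the seed keeps the subset identical across runs, which makes a partial-coverage training run easier to reproduce or extend later.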

Usage

Installation

pip install torch transformers sentencepiece
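Once the packages are installed, generation might look like the sketch below. It assumes the checkpoint at `Athspi/athspi-llm` loads through the standard `AutoTokenizer` and `AutoModelForCausalLM` classes, which the card does not confirm; the helper names, the sampling temperature, and the default token budget are hypothetical. The 128-token limit comes from the sequence length listed under Architecture.

```python
MODEL_ID = "Athspi/athspi-llm"  # repository id as shown on the model page
MAX_CONTEXT = 128               # sequence length from the Architecture section


def new_token_budget(prompt_len: int, requested: int, context: int = MAX_CONTEXT) -> int:
    """Cap generated tokens so prompt + output fits in the context window."""
    return max(0, min(requested, context - prompt_len))


def generate_story(prompt: str, max_new_tokens: int = 100) -> str:
    """Generate a story continuation (hypothetical helper).

    ASSUMPTION: the checkpoint is compatible with the Auto* classes.
    """
    # Imported here so the module can be loaded without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    model.eval()

    input_ids = tokenizer(prompt, return_tensors="pt")["input_ids"]
    budget = new_token_budget(input_ids.shape[1], max_new_tokens)
    if budget == 0:
        raise ValueError("prompt already fills the 128-token context")

    with torch.no_grad():
        output = model.generate(
            input_ids=input_ids,
            max_new_tokens=budget,
            do_sample=True,    # sampling suits creative story generation
            temperature=0.8,   # ASSUMPTION: illustrative value
        )
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

Calling `generate_story("Once upon a time")` downloads the weights on first use. If the repository ships a custom model class instead, loading will need that class rather than `AutoModelForCausalLM`.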
Model size (Safetensors): 45.7M params, tensor type F32