Pretrained from scratch using the GPT-2 architecture on a dataset of Latin texts (Corpus Corporum). The model uses a 64-token context window and reached a loss of 4.5 after training for 1 epoch on 492 million tokens. The GPT-2 style tokenizer was trained with a min_frequency of 2000.
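
As a rough illustration of the tokenizer setup, the sketch below trains a GPT-2 style byte-level BPE tokenizer with the min_frequency of 2000 mentioned above. The corpus file path and vocabulary size are assumptions for illustration, not values taken from this card.

```python
# Sketch of training a GPT-2 style byte-level BPE tokenizer.
# The file path and vocab_size are assumptions; min_frequency=2000 is from the card.
from tokenizers import ByteLevelBPETokenizer

tokenizer = ByteLevelBPETokenizer()
tokenizer.train(
    files=["corpus_corporum.txt"],   # hypothetical path to the Latin text corpus
    vocab_size=50257,                # assumed; GPT-2's default vocabulary size
    min_frequency=2000,              # only merge pairs seen at least 2000 times
    special_tokens=["<|endoftext|>"],
)
tokenizer.save_model("cicero-tokenizer")
```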

Output tends to be repetitive and not very coherent, due to the model's small size and limited training data.
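
A minimal generation sketch with the transformers library is shown below. The prompt and decoding parameters (sampling, top_p, repetition_penalty) are illustrative assumptions, not settings prescribed by this card; a repetition penalty may partly mitigate the repetitive output noted above.

```python
# Minimal sketch of loading gaodrew/cicero and generating Latin text.
# Decoding parameters are assumed values, not taken from the model card.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gaodrew/cicero")
model = AutoModelForCausalLM.from_pretrained("gaodrew/cicero")

inputs = tokenizer("Gallia est omnis divisa in partes tres", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_length=64,            # the model was trained with a 64-token context
    do_sample=True,
    top_p=0.95,               # assumed sampling setting
    repetition_penalty=1.2,   # assumed; helps reduce repetitive output
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```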

Model size: 99.3M parameters (F32, safetensors)
