**BEE-spoke-data/smol_llama-101M-GQA** • Text Generation • 4.32k downloads • 28 likes
Small-scale pretraining experiments of mine.
Note: smol_llama-220M-GQA, continued pretraining (CPT) on fineweb-edu for 10 billion tokens.
Note: this is a mid-training checkpoint of what is now smol_llama-220M.
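A minimal sketch of trying the listed checkpoint with the Hugging Face `transformers` library, assuming the model id above and using an arbitrary prompt and generation settings chosen for illustration:

```python
# Load the 101M-parameter checkpoint via the standard text-generation pipeline.
# Model id is taken from the listing above; prompt and max_new_tokens are illustrative.
from transformers import pipeline

generator = pipeline("text-generation", model="BEE-spoke-data/smol_llama-101M-GQA")

# Greedy decoding for a deterministic continuation.
out = generator("The llama walked down to the", max_new_tokens=16, do_sample=False)
print(out[0]["generated_text"])
```

The pipeline downloads the weights from the Hub on first use; at ~101M parameters this is small enough to run on CPU.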