This is the trained model file for Ch2 - LLMs are MultiTask Learners. This chapter builds a GPT-2 124M from scratch for text generation. Please use the best_model.pt checkpoint for inference. Since we pre-trained on a small amount of data, the model has overfitted, but it can still generate sensible text.

Plots

Loss (Train): ch2_05_train_epoch_loss.png

Perplexity (Train): [plot image]

Loss (Val): [plot image]

Perplexity (Val): [plot image]
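The perplexity curves above follow directly from the loss curves: perplexity is the exponential of the mean cross-entropy loss (in nats), which is the standard definition.

```python
import math

def perplexity(cross_entropy_loss: float) -> float:
    # Perplexity = exp(mean cross-entropy in nats).
    # A loss of 0 gives perplexity 1 (perfect prediction).
    return math.exp(cross_entropy_loss)
```

For example, a cross-entropy loss of ln(10) ≈ 2.303 corresponds to a perplexity of 10, i.e. the model is as uncertain as a uniform choice over 10 tokens.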


Dataset used to train s1lv3rj1nx/ch2