This is the trained model file for Ch2 - LLMs are MultiTask Learners. This chapter builds a GPT-2 124M from scratch for text generation. Please use the best_model.pt checkpoint for inference. Since we pre-trained on a small amount of data, the model has overfitted, but it can still generate sensible text.

Plots

Loss (Train): ch2_05_train_epoch_loss.png

Perplexity (Train): [plot image]

Loss (Val): [plot image]

Perplexity (Val): [plot image]
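The perplexity curves above follow directly from the loss curves: perplexity is the exponential of the mean cross-entropy loss (in nats), which is the standard definition.

```python
import math

def perplexity(cross_entropy_loss: float) -> float:
    # Perplexity = exp(mean cross-entropy in nats).
    # A loss of 0 gives perplexity 1 (perfect prediction).
    return math.exp(cross_entropy_loss)
```

For example, a cross-entropy loss of ln(10) ≈ 2.303 corresponds to a perplexity of 10, i.e. the model is as uncertain as a uniform choice over 10 tokens.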


Dataset used to train s1lv3rj1nx/ch2