This is the trained model file for Ch2 - LLMs are Multitask Learners.
This chapter builds a GPT-2 (124M) from scratch for text generation. Please use the best_model.pt
checkpoint for inference.
Since the model was pre-trained on a small amount of data, it has overfitted, but it can still generate sensible text.
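A minimal sketch of how the best_model.pt checkpoint might be used for inference. The `GPT2` class name, its constructor arguments, and the checkpoint layout are assumptions (substitute the chapter's actual model class and tokenizer); the sampling loop itself is a standard autoregressive decode.

```python
import torch

@torch.no_grad()
def generate(model, idx, max_new_tokens, context_len=1024, temperature=1.0):
    """Autoregressive sampling: feed the running sequence back into the
    model, sample the next token from the softmaxed logits, append, repeat."""
    model.eval()
    for _ in range(max_new_tokens):
        idx_cond = idx[:, -context_len:]        # crop to the context window
        logits = model(idx_cond)[:, -1, :]      # logits at the last position
        probs = torch.softmax(logits / temperature, dim=-1)
        next_tok = torch.multinomial(probs, num_samples=1)
        idx = torch.cat([idx, next_tok], dim=1)
    return idx

if __name__ == "__main__":
    # Assumed usage -- the GPT2 class and its hyperparameters are guesses:
    # model = GPT2(vocab_size=50257, n_layer=12, n_head=12, n_embd=768)
    # model.load_state_dict(torch.load("best_model.pt", map_location="cpu"))
    # out = generate(model, torch.tensor([[50256]]), max_new_tokens=50)
    pass
```

If the checkpoint was saved with `torch.save(model.state_dict(), ...)`, `load_state_dict` is the matching load call; if the whole model object was saved instead, `torch.load` alone returns it.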
Plots
- Loss (Train)
- Loss (Val)
- Perplexity (Val)