---
license: apache-2.0
datasets:
- sgoel9/paul_graham_essays
---
This is the trained model file for Ch2 - LLMs are Multitask Learners.
This chapter builds a GPT-2 124M model from scratch for text generation. Please use the `best_model.pt` checkpoint for inference.
Since we pre-trained on a small amount of data, the model has overfitted, but it can still generate sensible text.
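A minimal sketch of loading the checkpoint for inference. The actual `GPT` model class and its config come from the chapter's code; the tiny stand-in module below is hypothetical and only illustrates the `state_dict` round-trip you would do with `best_model.pt`:

```python
import os
import tempfile

import torch
import torch.nn as nn


# Stand-in for the chapter's GPT-2 model class (hypothetical; the real
# class and hyperparameters come from the chapter's code).
class TinyGPT(nn.Module):
    def __init__(self, vocab_size=50257, d_model=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.head = nn.Linear(d_model, vocab_size, bias=False)

    def forward(self, idx):
        return self.head(self.embed(idx))


# Training side: save the best checkpoint (here to a temp dir for the demo).
model = TinyGPT()
ckpt_path = os.path.join(tempfile.mkdtemp(), "best_model.pt")
torch.save(model.state_dict(), ckpt_path)

# Inference side: restore the weights and switch to eval mode.
restored = TinyGPT()
restored.load_state_dict(torch.load(ckpt_path, map_location="cpu"))
restored.eval()

with torch.no_grad():
    logits = restored(torch.tensor([[0, 1, 2]]))
print(logits.shape)  # torch.Size([1, 3, 50257])
```

For the real checkpoint, replace `TinyGPT` with the chapter's GPT-2 124M class and point `torch.load` at the downloaded `best_model.pt`.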
## Plots
Loss (Train):

Perplexity (Train):

Loss (Val):

Perplexity (Val):
