s1lv3rj1nx committed on
Commit 88345c9 · verified · 1 Parent(s): 9e03249

Create README.md

  1. README.md +26 -0
README.md ADDED
@@ -0,0 +1,26 @@
+ ---
+ license: apache-2.0
+ datasets:
+ - sgoel9/paul_graham_essays
+ ---
+ This is the trained model file for Ch2 - LLMs are MultiTask Learners.
+ This chapter builds a GPT-2 124M model from scratch for text generation. Please use the `best_model.pt` checkpoint for inference.
+ Since we pre-trained on a small amount of data, the model has overfit, but it can still generate sensible text.
+
+ ## Plots
+ Loss (Train):
+
+ ![ch2_05_train_epoch_loss.png](https://cdn-uploads.huggingface.co/production/uploads/62790519541f3d2dfa79a6cb/Ht1Tfjuoqywbf5GF06jMx.png)
+
+ Perplexity (Train):
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/62790519541f3d2dfa79a6cb/psCddxI08z64FKzPH3ADk.png)
+
+ Loss (Val):
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/62790519541f3d2dfa79a6cb/Ul5sRV2g0HT2CTCU1FQBT.png)
+
+ Perplexity (Val):
+
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/62790519541f3d2dfa79a6cb/TmZ6cn7g48q3sAjgsECI5.png)
+
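
The loss and perplexity plots in this README are likely two views of the same quantity. Assuming the plotted loss is the mean per-token cross-entropy in nats (the usual convention for language-model training, though the README does not state it), perplexity is simply `exp(loss)`. A minimal sketch, using made-up loss values rather than numbers read from the plots:

```python
import math


def perplexity(mean_cross_entropy: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_cross_entropy)


# Hypothetical loss values for illustration only; they are not taken from the plots.
for loss in (0.0, 1.0, 3.0):
    print(f"loss={loss:.1f} -> perplexity={perplexity(loss):.2f}")
```

Under the same assumption, a validation perplexity that stays well above the training perplexity would be consistent with the overfitting the README mentions.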