---
datasets:
- wikitext
- wikitext-103-v1
language:
- en
metrics:
- perplexity
- cross_entropy
---
**(!) _Don't forget to preprocess the unknown tokens and replace them with <|endoftext|>. Otherwise, each \<unk\> token in the dataset will be split into the '<', 'unk' and '>' tokens._**
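
The preprocessing step could look roughly like the sketch below. It is only an illustration, not the exact script used for this model: the function name `replace_unk` and the sample sentence are hypothetical, and it assumes a GPT-2-style tokenizer in which `<|endoftext|>` is a single special token.

```python
UNK = "<unk>"            # unknown-word marker as it appears in the raw WikiText files
EOT = "<|endoftext|>"    # assumed to be a single special token in the tokenizer


def replace_unk(text: str, unk: str = UNK, replacement: str = EOT) -> str:
    """Replace every literal unknown-word marker before tokenization."""
    return text.replace(unk, replacement)


# Hypothetical sample line in WikiText style.
sample = "the <unk> of the game is <unk> ."
print(replace_unk(sample))  # the <|endoftext|> of the game is <|endoftext|> .
```

Apply the replacement to the raw text of every split before it is passed to the tokenizer.
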
- Full context (1024) perplexity on test set: **13.68**
**Dependence of the cross entropy loss on the length of the context for prediction**
- context length = x-axis value × 128
- y-axis = cross entropy
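
To read the plot next to the perplexity figure above, the following sketch converts between the two quantities. It assumes the cross entropy is the mean per-token loss in nats (natural logarithm), which this card does not state explicitly; the helper names are hypothetical.

```python
import math


def perplexity(cross_entropy_nats: float) -> float:
    """Perplexity is the exponential of the mean per-token cross entropy (in nats)."""
    return math.exp(cross_entropy_nats)


def context_length(x_value: int, step: int = 128) -> int:
    """Map an x-axis value of the plot to a context length (x-axis * 128)."""
    return x_value * step


# The reported full-context perplexity corresponds to this cross entropy, under the nats assumption.
full_context_ce = math.log(13.68)
print(context_length(8), round(perplexity(full_context_ce), 2))
```

If the loss were measured in bits instead, the conversion would be `2 ** cross_entropy` rather than `math.exp`.
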
