The smaller the stride, the more context the model has when making each prediction,
and the better (that is, lower) the reported perplexity will typically be.
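
To make the effect of the stride concrete, the following is a minimal sketch in plain Python, not the actual evaluation code. It assumes a sliding-window evaluation in which the model sees at most `max_length` tokens at a time and each window scores only the tokens that no earlier window has scored. The names `sliding_windows`, `mean_context`, `seq_len`, and `max_length` are hypothetical and chosen for illustration only.

```python
def sliding_windows(seq_len, max_length, stride):
    """Return (begin, end, n_scored) for each evaluation window.

    Assumption: each window covers tokens [begin, end) and scores only
    the last n_scored tokens, i.e. those not scored by an earlier window.
    """
    windows = []
    prev_end = 0
    for begin in range(0, seq_len, stride):
        end = min(begin + max_length, seq_len)
        n_scored = end - prev_end  # tokens newly covered by this window
        windows.append((begin, end, n_scored))
        prev_end = end
        if end == seq_len:
            break  # the whole sequence has been scored
    return windows


def mean_context(seq_len, max_length, stride):
    """Average number of preceding tokens visible when scoring a token."""
    total = 0
    for begin, end, n_scored in sliding_windows(seq_len, max_length, stride):
        for pos in range(end - n_scored, end):
            total += pos - begin  # tokens in the window before position pos
    return total / seq_len


if __name__ == "__main__":
    # Toy numbers: 8 tokens, window of 4, three different strides.
    for stride in (4, 2, 1):
        print(stride, mean_context(seq_len=8, max_length=4, stride=stride))
```

In this toy setting the average context per scored token grows as the stride shrinks, which is the reason the reported perplexity tends to improve. The cost is that a smaller stride needs more windows, and therefore more forward passes.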