5fa1a76
1
2
The smaller the stride, the more context the model will have in making each prediction, and the better the reported perplexity will typically be.