File size: 476 Bytes
57bdca5 |
1 2 3 4 |
If we have a tokenized sequence \(X = (x_0, x_1, \dots, x_t)\), then the perplexity of \(X\) is, $$\text{PPL}(X) = \exp \left{ {-\frac{1}{t}\sum_i^t \log p_\theta (x_i|x_{<i}) } \right}$$ where \(\log p_\theta (x_i|x_{<i})\) is the log-likelihood of the ith token conditioned on the preceding tokens \(x_{<i}\) according to our model. Intuitively, it can be thought of as an evaluation of the model's ability to predict uniformly among the set of specified tokens in a corpus. |