This allows computation to proceed much faster while still giving the model a large context to make
predictions at each step.
## Example: Calculating perplexity with GPT-2 in 🤗 Transformers
Let's demonstrate this process with GPT-2.
```python
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

device = "cuda"  # use "cpu" if no GPU is available
model_id = "openai-community/gpt2-large"
model = GPT2LMHeadModel.from_pretrained(model_id).to(device)
tokenizer = GPT2TokenizerFast.from_pretrained(model_id)
```
We'll load the WikiText-2 dataset and evaluate perplexity using a few different sliding-window strategies, as sketched below.
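As a sketch of what that evaluation can look like, the snippet below loads the test split with 🤗 Datasets and runs one strided window over it. The choice of `stride = 512` and the token accounting around the model's internal label shift are simplifications for illustration, not the only possible settings.

```python
import torch
from datasets import load_dataset
from tqdm import tqdm

# Load the WikiText-2 test split and tokenize it as one long sequence.
test = load_dataset("wikitext", "wikitext-2-raw-v1", split="test")
encodings = tokenizer("\n\n".join(test["text"]), return_tensors="pt")

max_length = model.config.n_positions  # 1024 for GPT-2
stride = 512  # how far the window advances each step
seq_len = encodings.input_ids.size(1)

nlls = []
prev_end_loc = 0
for begin_loc in tqdm(range(0, seq_len, stride)):
    end_loc = min(begin_loc + max_length, seq_len)
    trg_len = end_loc - prev_end_loc  # tokens not yet scored by an earlier window
    input_ids = encodings.input_ids[:, begin_loc:end_loc].to(device)
    target_ids = input_ids.clone()
    # Mask tokens already scored in a previous window with -100 so the loss
    # only covers the trg_len new tokens at the end of this window.
    target_ids[:, :-trg_len] = -100

    with torch.no_grad():
        outputs = model(input_ids, labels=target_ids)
        # outputs.loss is the mean negative log-likelihood over the unmasked
        # target tokens; rescaling by trg_len (approximately, given the
        # internal label shift) turns it back into a sum.
        neg_log_likelihood = outputs.loss * trg_len

    nlls.append(neg_log_likelihood)
    prev_end_loc = end_loc
    if end_loc == seq_len:
        break

ppl = torch.exp(torch.stack(nlls).sum() / end_loc)
print(f"Perplexity: {ppl.item():.2f}")
```

When `stride` equals `max_length` the windows don't overlap, which is fastest but gives each predicted token the least context; shrinking the stride gives the model more context for each prediction at the cost of more forward passes.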