```python
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # any causal LM tokenizer works here

# Concatenate the WikiText-2 test split into one long tokenized sequence
test = load_dataset("wikitext", "wikitext-2-raw-v1", split="test")
encodings = tokenizer("\n\n".join(test["text"]), return_tensors="pt")
```
With 🤗 Transformers, we can simply pass the `input_ids` as the `labels` to our model, and the average negative log-likelihood per token is returned as the loss.
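
For example, here is a minimal sketch of this step, assuming a causal LM such as GPT-2 loaded via `AutoModelForCausalLM` (the 512-token window size is an arbitrary choice for illustration):

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")  # assumed model; any causal LM works

# Passing labels=input_ids makes the model compute the shift-by-one
# cross-entropy loss internally, i.e. the average NLL per predicted token.
input_ids = encodings.input_ids[:, :512]
with torch.no_grad():
    outputs = model(input_ids, labels=input_ids)

perplexity = torch.exp(outputs.loss)
print(f"Perplexity over this window: {perplexity.item():.2f}")
```

Exponentiating the average negative log-likelihood then gives the perplexity for that window.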