
# GPT Neo

## Overview

The GPT Neo model was released in the EleutherAI/gpt-neo repository by Sid Black, Stella Biderman, Leo Gao, Phil Wang, and Connor Leahy. It is a GPT-2-like causal language model trained on the Pile dataset.

The architecture is similar to GPT-2, except that GPT Neo uses local attention in every other layer, with a window size of 256 tokens.
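A minimal sketch (plain Python, no library dependencies; the function name is hypothetical, not part of the library) of the masking rule that local causal attention implies: token `i` may attend to token `j` only if `j` is not in the future and lies within the sliding window ending at `i`.

```python
def local_causal_mask(seq_len, window):
    """Boolean mask: mask[i][j] is True when token i may attend to token j.

    Local causal attention: j must not be in the future (j <= i) and must
    lie within the window of `window` tokens ending at i.
    """
    return [
        [j <= i and i - j < window for j in range(seq_len)]
        for i in range(seq_len)
    ]

# With GPT Neo's window of 256, token 300 can see tokens 45..300 but not 44.
mask = local_causal_mask(512, 256)
```

In GPT Neo this pattern alternates with ordinary global causal attention from layer to layer, so information can still propagate across the full context.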

This model was contributed by valhalla.

## Generation

The `generate()` method can be used to generate text with the GPT Neo model.

```python
>>> from transformers import GPTNeoForCausalLM, GPT2Tokenizer

>>> model = GPTNeoForCausalLM.from_pretrained("EleutherAI/gpt-neo-1.3B")
>>> tokenizer = GPT2Tokenizer.from_pretrained("EleutherAI/gpt-neo-1.3B")

>>> prompt = (
...     "In a shocking finding, scientists discovered a herd of unicorns living in a remote, "
...     "previously unexplored valley, in the Andes Mountains. Even more surprising to the "
...     "researchers was the fact that the unicorns spoke perfect English."
... )

>>> input_ids = tokenizer(prompt, return_tensors="pt").input_ids

>>> gen_tokens = model.generate(
...     input_ids,
...     do_sample=True,
...     temperature=0.9,
...     max_length=100,
... )
>>> gen_text = tokenizer.batch_decode(gen_tokens)[0]
```
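Conceptually, `do_sample=True` with `temperature=0.9` divides each token's logit by the temperature before the softmax and then samples from the resulting distribution instead of taking the argmax. A minimal sketch of that single sampling step, assuming a toy three-token vocabulary (the function name is illustrative, not a library API):

```python
import math
import random

def sample_next_token(logits, temperature=0.9, rng=random):
    """Sample one token id from raw logits after temperature scaling + softmax."""
    scaled = [logit / temperature for logit in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Inverse-CDF sampling: walk the cumulative distribution.
    r = rng.random()
    acc = 0.0
    for token_id, p in enumerate(probs):
        acc += p
        if r < acc:
            return token_id
    return len(probs) - 1

token = sample_next_token([2.0, 1.0, 0.1])
```

Temperatures below 1 sharpen the distribution toward the highest-logit token; temperatures above 1 flatten it, producing more varied (and riskier) continuations.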

## Documentation resources

## GPTNeoConfig

[[autodoc]] GPTNeoConfig

## GPTNeoModel

[[autodoc]] GPTNeoModel
    - forward

## GPTNeoForCausalLM

[[autodoc]] GPTNeoForCausalLM
    - forward

## GPTNeoForQuestionAnswering

[[autodoc]] GPTNeoForQuestionAnswering
    - forward

## GPTNeoForSequenceClassification

[[autodoc]] GPTNeoForSequenceClassification
    - forward

## GPTNeoForTokenClassification

[[autodoc]] GPTNeoForTokenClassification
    - forward

## FlaxGPTNeoModel

[[autodoc]] FlaxGPTNeoModel
    - __call__

## FlaxGPTNeoForCausalLM

[[autodoc]] FlaxGPTNeoForCausalLM
    - __call__