XLNet
----------------------------------------------------

Overview
~~~~~~~~~~~~~~~~~~~~~

The XLNet model was proposed in `XLNet: Generalized Autoregressive Pretraining for Language Understanding <https://arxiv.org/abs/1906.08237>`_
by Zhilin Yang*, Zihang Dai*, Yiming Yang, Jaime Carbonell, Ruslan Salakhutdinov, Quoc V. Le.
XLNet is an extension of the Transformer-XL model pre-trained using an autoregressive method
to learn bidirectional contexts by maximizing the expected likelihood over all permutations
of the input sequence factorization order.
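
As a sketch of the objective described above, written in the notation of the paper: let :math:`\mathcal{Z}_T` be the set of all permutations of the index sequence :math:`[1, \ldots, T]`, and let :math:`z_t` and :math:`\mathbf{z}_{<t}` denote the :math:`t`-th element and the first :math:`t-1` elements of a permutation :math:`\mathbf{z}`. The model parameters :math:`\theta` are shared across all factorization orders.

.. math::

    \max_{\theta} \; \mathbb{E}_{\mathbf{z} \sim \mathcal{Z}_T}
    \left[ \sum_{t=1}^{T} \log p_{\theta}\left(x_{z_t} \mid \mathbf{x}_{\mathbf{z}_{<t}}\right) \right]
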

The abstract from the paper is the following:

*With the capability of modeling bidirectional contexts, denoising autoencoding based pretraining like BERT achieves
better performance than pretraining approaches based on autoregressive language modeling. However, relying on
corrupting the input with masks, BERT neglects dependency between the masked positions and suffers from a
pretrain-finetune discrepancy. In light of these pros and cons, we propose XLNet, a generalized autoregressive
pretraining method that (1) enables learning bidirectional contexts by maximizing the expected likelihood over
all permutations of the factorization order and (2) overcomes the limitations of BERT thanks to its autoregressive
formulation. Furthermore, XLNet integrates ideas from Transformer-XL, the state-of-the-art autoregressive model,
into pretraining. Empirically, under comparable experiment settings, XLNet outperforms BERT on 20 tasks, often by
a large margin, including question answering, natural language inference, sentiment analysis, and document ranking.*

Tips:

- The specific attention pattern can be controlled at training and test time using the `perm_mask` input.
- Because training a fully autoregressive model over many factorization orders is difficult,
  XLNet is pretrained using only a subset of the output tokens as targets, which are selected
  with the `target_mapping` input.
- To use XLNet for sequential decoding (i.e. not in a fully bidirectional setting), use the `perm_mask` and
  `target_mapping` inputs to control the attention span and outputs (see examples in `examples/run_generation.py`).
- XLNet is one of the few models that have no sequence length limit.
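
The following is a minimal sketch of how the `perm_mask` and `target_mapping` inputs from the tips above could be constructed. It uses plain Python lists so that it runs without any dependency. The helper names (`build_perm_mask`, `build_target_mapping`, `last_token_inputs`) are hypothetical and are not part of the library. The sketch assumes the documented convention that a value of 1 in `perm_mask[i][j]` means position `i` does not attend to position `j`, and that a value of 1 in `target_mapping[k][j]` means the `k`-th prediction is made for position `j`.

```python
def build_perm_mask(order):
    """Build a seq_len x seq_len mask for one factorization order.

    order[t] is the position predicted at step t (hypothetical helper).
    mask[i][j] == 1.0 means position i does NOT attend to position j;
    0.0 means it does.
    """
    seq_len = len(order)
    # rank[pos] is the step at which `pos` appears in the factorization order.
    rank = [0] * seq_len
    for step, pos in enumerate(order):
        rank[pos] = step
    # A position may only see positions that come strictly earlier in the order.
    return [
        [0.0 if rank[j] < rank[i] else 1.0 for j in range(seq_len)]
        for i in range(seq_len)
    ]


def build_target_mapping(targets, seq_len):
    """Build a num_predict x seq_len one-hot matrix selecting target positions.

    mapping[k][j] == 1.0 means the k-th prediction is made for position j.
    """
    return [
        [1.0 if j == pos else 0.0 for j in range(seq_len)]
        for pos in targets
    ]


def last_token_inputs(seq_len):
    """Masks for predicting only the final token from the preceding tokens."""
    # Hide the last position from every position, including itself.
    perm_mask = [
        [1.0 if j == seq_len - 1 else 0.0 for j in range(seq_len)]
        for _ in range(seq_len)
    ]
    # Request a single prediction, located at the last position.
    target_mapping = build_target_mapping([seq_len - 1], seq_len)
    return perm_mask, target_mapping
```

To pass these to a model, each structure would still need a leading batch dimension and a float tensor type, for example `torch.tensor([perm_mask], dtype=torch.float)`, giving shapes `(batch_size, seq_len, seq_len)` and `(batch_size, num_predict, seq_len)`.
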

XLNetConfig
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.XLNetConfig
    :members:


XLNetTokenizer
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.XLNetTokenizer
    :members:

XLNetModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.XLNetModel
    :members:


XLNetLMHeadModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.XLNetLMHeadModel
    :members:


XLNetForSequenceClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.XLNetForSequenceClassification
    :members:


XLNetForTokenClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.XLNetForTokenClassification
    :members:


XLNetForMultipleChoice
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.XLNetForMultipleChoice
    :members:


XLNetForQuestionAnsweringSimple
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.XLNetForQuestionAnsweringSimple
    :members:


XLNetForQuestionAnswering
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.XLNetForQuestionAnswering
    :members:

TFXLNetModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFXLNetModel
    :members:


TFXLNetLMHeadModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFXLNetLMHeadModel
    :members:


TFXLNetForSequenceClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFXLNetForSequenceClassification
    :members:


TFXLNetForQuestionAnsweringSimple
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFXLNetForQuestionAnsweringSimple
    :members: