<!--Copyright 2022 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# XLM-RoBERTa-XL

## Overview

The XLM-RoBERTa-XL model was proposed in [Larger-Scale Transformers for Multilingual Masked Language Modeling](https://arxiv.org/abs/2105.00572) by Naman Goyal, Jingfei Du, Myle Ott, Giri Anantharaman, Alexis Conneau.

The abstract from the paper is the following:

*Recent work has demonstrated the effectiveness of cross-lingual language model pretraining for cross-lingual understanding. In this study, we present the results of two larger multilingual masked language models, with 3.5B and 10.7B parameters. Our two new models dubbed XLM-R XL and XLM-R XXL outperform XLM-R by 1.8% and 2.4% average accuracy on XNLI. Our model also outperforms the RoBERTa-Large model on several English tasks of the GLUE benchmark by 0.3% on average while handling 99 more languages. This suggests pretrained models with larger capacity may obtain both strong performance on high-resource languages while greatly improving low-resource languages. We make our code and models publicly available.*
Tips:

- XLM-RoBERTa-XL is a multilingual model trained on 100 different languages. Unlike some XLM multilingual models, it does
  not require `lang` tensors to understand which language is used, and should be able to determine the correct
  language from the input ids (see the sketch after this list).
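Below is a minimal sketch of masked-token prediction illustrating that point: no `lang` tensor is passed anywhere. It assumes the `facebook/xlm-roberta-xl` checkpoint from the Hub; the example sentence and decoding logic are illustrative only.

```python
import torch
from transformers import AutoTokenizer, XLMRobertaXLForMaskedLM

# Assumes the facebook/xlm-roberta-xl checkpoint is available on the Hub.
tokenizer = AutoTokenizer.from_pretrained("facebook/xlm-roberta-xl")
model = XLMRobertaXLForMaskedLM.from_pretrained("facebook/xlm-roberta-xl")

# No `lang` tensor is needed; the model infers the language from the input ids.
inputs = tokenizer("Paris is the <mask> of France.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Find the masked position and take the highest-scoring token there.
mask_index = (inputs.input_ids == tokenizer.mask_token_id)[0].nonzero(as_tuple=True)[0]
predicted_id = logits[0, mask_index].argmax(dim=-1)
print(tokenizer.decode(predicted_id))
```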
This model was contributed by [Soonhwan-Kwon](https://github.com/Soonhwan-Kwon) and [stefan-it](https://huggingface.co/stefan-it). The original code can be found [here](https://github.com/pytorch/fairseq/tree/master/examples/xlmr).
## Documentation resources

- [Text classification task guide](../tasks/sequence_classification)
- [Token classification task guide](../tasks/token_classification)
- [Question answering task guide](../tasks/question_answering)
- [Causal language modeling task guide](../tasks/language_modeling)
- [Masked language modeling task guide](../tasks/masked_language_modeling)
- [Multiple choice task guide](../tasks/multiple_choice)
## XLMRobertaXLConfig

[[autodoc]] XLMRobertaXLConfig
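As a quick sketch, a model can also be built from a custom configuration rather than a pretrained checkpoint; the hyperparameter values below are deliberately scaled down for illustration and are not the released XL/XXL settings.

```python
from transformers import XLMRobertaXLConfig, XLMRobertaXLModel

# Illustrative, scaled-down configuration (randomly initialized weights).
config = XLMRobertaXLConfig(
    hidden_size=256,
    num_hidden_layers=4,
    num_attention_heads=4,
    intermediate_size=1024,
)
model = XLMRobertaXLModel(config)
print(model.config.hidden_size)  # 256
```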
## XLMRobertaXLModel

[[autodoc]] XLMRobertaXLModel
    - forward

## XLMRobertaXLForCausalLM

[[autodoc]] XLMRobertaXLForCausalLM
    - forward

## XLMRobertaXLForMaskedLM

[[autodoc]] XLMRobertaXLForMaskedLM
    - forward

## XLMRobertaXLForSequenceClassification

[[autodoc]] XLMRobertaXLForSequenceClassification
    - forward

## XLMRobertaXLForMultipleChoice

[[autodoc]] XLMRobertaXLForMultipleChoice
    - forward

## XLMRobertaXLForTokenClassification

[[autodoc]] XLMRobertaXLForTokenClassification
    - forward

## XLMRobertaXLForQuestionAnswering

[[autodoc]] XLMRobertaXLForQuestionAnswering
    - forward