<!--Copyright 2021 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# mLUKE | |
## Overview | |

The mLUKE model was proposed in [mLUKE: The Power of Entity Representations in Multilingual Pretrained Language Models](https://arxiv.org/abs/2110.08151) by Ryokan Ri, Ikuya Yamada, and Yoshimasa Tsuruoka. It is a multilingual extension
of the [LUKE model](https://arxiv.org/abs/2010.01057): it is based on XLM-RoBERTa and adds entity embeddings, which
helps improve performance on various downstream tasks involving reasoning about entities, such as named entity
recognition, extractive question answering, relation classification, and cloze-style knowledge completion.

The abstract from the paper is the following:

*Recent studies have shown that multilingual pretrained language models can be effectively improved with cross-lingual
alignment information from Wikipedia entities. However, existing methods only exploit entity information in pretraining
and do not explicitly use entities in downstream tasks. In this study, we explore the effectiveness of leveraging
entity representations for downstream cross-lingual tasks. We train a multilingual language model with 24 languages
with entity representations and show the model consistently outperforms word-based pretrained models in various
cross-lingual transfer tasks. We also analyze the model and the key insight is that incorporating entity
representations into the input allows us to extract more language-agnostic features. We also evaluate the model with a
multilingual cloze prompt task with the mLAMA dataset. We show that entity-based prompt elicits correct factual
knowledge more likely than using only word representations.*

One can directly plug in the weights of mLUKE into a LUKE model, like so:

```python
from transformers import LukeModel

model = LukeModel.from_pretrained("studio-ousia/mluke-base")
```
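
Because the architectures match, the same checkpoint can also initialize LUKE's task-specific classes. A minimal sketch (the classification head here is newly initialized and would still need fine-tuning; the `num_labels` value is illustrative):

```python
from transformers import LukeForEntityClassification

# Loads the mLUKE encoder weights into a LUKE entity-classification model;
# the classification head is randomly initialized and must be fine-tuned.
model = LukeForEntityClassification.from_pretrained("studio-ousia/mluke-base", num_labels=2)
```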

Note that mLUKE has its own tokenizer, [`MLukeTokenizer`]. You can initialize it as follows:

```python
from transformers import MLukeTokenizer

tokenizer = MLukeTokenizer.from_pretrained("studio-ousia/mluke-base")
```
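
As with LUKE, the tokenizer optionally accepts character-level `entity_spans`, and the model then returns contextualized entity representations alongside the word representations. A minimal sketch (the example sentence and span are illustrative):

```python
from transformers import LukeModel, MLukeTokenizer

tokenizer = MLukeTokenizer.from_pretrained("studio-ousia/mluke-base")
model = LukeModel.from_pretrained("studio-ousia/mluke-base")

text = "ISO 639-3 uses the code fas for the dialects spoken across Iran and Afghanistan."
# Character offsets of the entity mention "Iran" in the text
entity_spans = [(text.index("Iran"), text.index("Iran") + len("Iran"))]

inputs = tokenizer(text, entity_spans=entity_spans, return_tensors="pt")
outputs = model(**inputs)

word_states = outputs.last_hidden_state            # contextualized word representations
entity_states = outputs.entity_last_hidden_state   # contextualized entity representations
```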

As mLUKE's architecture is equivalent to that of LUKE, one can refer to [LUKE's documentation page](luke) for all
tips, code examples and notebooks.

This model was contributed by [ryo0634](https://huggingface.co/ryo0634). The original code can be found [here](https://github.com/studio-ousia/luke).

## MLukeTokenizer

[[autodoc]] MLukeTokenizer
    - __call__
    - save_vocabulary