Commit 9a1c754
Parent(s): 549b3a0
Add README.md
README.md CHANGED
@@ -8,8 +8,8 @@ datasets:
 - bookcorpus
 - wikipedia
 ---
-# MultiBERTs Seed
-Seed
+# MultiBERTs Seed 0 (uncased)
+Seed 0 pretrained BERT model on English language using a masked language modeling (MLM) objective. It was introduced in
 [this paper](https://arxiv.org/pdf/2106.16163.pdf) and first released in
 [this repository](https://github.com/google-research/language/tree/master/language/multiberts). This model is uncased: it does not make a difference
 between english and English.

@@ -42,7 +42,7 @@ generation you should look at model like GPT2.
 Here is how to use this model to get the features of a given text in PyTorch:
 ```python
 from transformers import BertTokenizer, BertModel
-tokenizer = BertTokenizer.from_pretrained('multiberts-seed
+tokenizer = BertTokenizer.from_pretrained('multiberts-seed-0')
 model = BertModel.from_pretrained("bert-base-uncased")
 text = "Replace me by any text you'd like."
 encoded_input = tokenizer(text, return_tensors='pt')