ksyang committed · Commit 5334aa7 · 1 Parent(s): d0e7094

Update README.md

Files changed (1): README.md (+37 -0)
README.md CHANGED

---
license: cc-by-sa-4.0
language:
- ko
tags:
- korean
---

# **KoBigBird-RoBERTa-large**

This is a large-sized Korean BigBird model introduced in our [paper]() to be presented at IJCNLP-AACL 2023.
The model draws heavily on the parameters of [klue/roberta-large](https://huggingface.co/klue/roberta-large) to ensure high performance
and employs the BigBird architecture to extend its maximum input length.
Its position embeddings are extended with TAPER, which enhances the language model's extrapolation capabilities.

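As a quick sanity check (not part of the original card), the configured maximum input length of the released checkpoint can be inspected through the standard `transformers` config; this is only an illustrative sketch.

```python
from transformers import AutoConfig

# Illustrative: inspect how long an input the checkpoint is configured to accept.
config = AutoConfig.from_pretrained("vaiv/kobigbird-roberta-large")
print(config.model_type)                # architecture family of the checkpoint
print(config.max_position_embeddings)   # maximum supported sequence length
```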

### How to Use

```python
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("vaiv/kobigbird-roberta-large")
model = AutoModelForMaskedLM.from_pretrained("vaiv/kobigbird-roberta-large")
```
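
The snippet below is a minimal usage sketch rather than part of the model card: it runs a longer Korean input through the loaded tokenizer and model and inspects the masked-LM logits. The example text and the 4096-token cap are assumptions about the extended input length, so verify the actual limit via `model.config.max_position_embeddings`.

```python
import torch

# Hypothetical long input; replace with a real Korean document.
text = "한국어 장문 문서를 처리하는 예시입니다. " * 200

# max_length=4096 is an assumed extended input length; check the model config.
inputs = tokenizer(text, truncation=True, max_length=4096, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

print(outputs.logits.shape)  # (batch_size, sequence_length, vocab_size)
```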

### Hyperparameters

![image/png](https://cdn-uploads.huggingface.co/production/uploads/62ce3886a9be5c195564fd71/bhuidw3bNQZbE2tzVcZw_.png)

### Results

Measured on the validation sets of the KLUE benchmark datasets.

![image/png](https://cdn-uploads.huggingface.co/production/uploads/62ce3886a9be5c195564fd71/50jMYggkGVUM06n2v1Hxm.png)

### Limitations

While our model achieves strong results without any further pretraining, direct pretraining could refine the position representations further and make the model even more precise.

## Citation Information

To Be Announced