ksyang committed · Commit 5334aa7 · 1 Parent(s): d0e7094

Update README.md

Files changed (1): README.md (+37 -0)
README.md CHANGED

---
license: cc-by-sa-4.0
language:
- ko
tags:
- korean
---

# **KoBigBird-RoBERTa-large**

This is a large-sized Korean BigBird model introduced in our [paper]() to be presented at IJCNLP-AACL 2023.
The model draws heavily on the parameters of [klue/roberta-large](https://huggingface.co/klue/roberta-large) to ensure high performance
and employs the BigBird architecture to extend its maximum input length.
Its position embeddings are extended with TAPER, which enhances the language model's extrapolation capabilities.

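As a quick sanity check (not part of the original card), the configured maximum input length of the released checkpoint can be inspected through the standard `transformers` config; this is only an illustrative sketch.

```python
from transformers import AutoConfig

# Illustrative: inspect how long an input the checkpoint is configured to accept.
config = AutoConfig.from_pretrained("vaiv/kobigbird-roberta-large")
print(config.model_type)                # architecture family of the checkpoint
print(config.max_position_embeddings)   # maximum supported sequence length
```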

### How to Use

```python
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("vaiv/kobigbird-roberta-large")
model = AutoModelForMaskedLM.from_pretrained("vaiv/kobigbird-roberta-large")
```
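
The snippet below is a minimal usage sketch rather than part of the model card: it runs a longer Korean input through the loaded tokenizer and model and inspects the masked-LM logits. The example text and the 4096-token cap are assumptions about the extended input length, so verify the actual limit via `model.config.max_position_embeddings`.

```python
import torch

# Hypothetical long input; replace with a real Korean document.
text = "한국어 장문 문서를 처리하는 예시입니다. " * 200

# max_length=4096 is an assumed extended input length; check the model config.
inputs = tokenizer(text, truncation=True, max_length=4096, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

print(outputs.logits.shape)  # (batch_size, sequence_length, vocab_size)
```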

### Hyperparameters

![image/png](https://cdn-uploads.huggingface.co/production/uploads/62ce3886a9be5c195564fd71/bhuidw3bNQZbE2tzVcZw_.png)

### Results

Measured on the validation sets of the KLUE benchmark datasets.

![image/png](https://cdn-uploads.huggingface.co/production/uploads/62ce3886a9be5c195564fd71/50jMYggkGVUM06n2v1Hxm.png)

### Limitations

While our model achieves strong results without any further pretraining, direct pretraining could refine the position representations further and make the model even more precise.

## Citation Information

To Be Announced