# Chinese CPT-Base

## Model description

This is an implementation of CPT-Base. To use CPT, please import the file `modeling_cpt.py` (**download** [here](https://github.com/fastnlp/CPT/blob/master/finetune/modeling_cpt.py)), which defines the CPT architecture, into your project.
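
A minimal sketch of that download step (the raw-file URL is an assumption derived from the repository link above), fetching `modeling_cpt.py` into your working directory so the import in the Usage section resolves:

```python
# Hypothetical download step: fetch modeling_cpt.py next to your script so
# `from modeling_cpt import ...` resolves.
import urllib.request

# Raw-file URL assumed from the GitHub blob link in this README.
url = "https://raw.githubusercontent.com/fastnlp/CPT/master/finetune/modeling_cpt.py"
urllib.request.urlretrieve(url, "modeling_cpt.py")
```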

[**CPT: A Pre-Trained Unbalanced Transformer for Both Chinese Language Understanding and Generation**](https://arxiv.org/pdf/2109.05729.pdf)

Yunfan Shao, Zhichao Geng, Yitao Liu, Junqi Dai, Fei Yang, Li Zhe, Hujun Bao, Xipeng Qiu

**Github Link:** https://github.com/fastnlp/CPT

## Usage

```python
>>> from modeling_cpt import CPTForConditionalGeneration
>>> from transformers import BertTokenizer
>>> tokenizer = BertTokenizer.from_pretrained("fnlp/cpt-base")
>>> model = CPTForConditionalGeneration.from_pretrained("fnlp/cpt-base")
>>> inputs = tokenizer.encode("北京是[MASK]的首都", return_tensors='pt')
>>> pred_ids = model.generate(inputs, num_beams=4, max_length=20)
>>> print(tokenizer.convert_ids_to_tokens(pred_ids[0]))
['[SEP]', '[CLS]', '北', '京', '是', '中', '国', '的', '首', '都', '[SEP]']
```

**Note: Please use `BertTokenizer` for the model vocabulary. DO NOT use the original `BartTokenizer`.**
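
If you want a plain string rather than a token list, here is a minimal sketch (an addition, not from the original README) using the standard `transformers` `decode` API:

```python
from modeling_cpt import CPTForConditionalGeneration
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("fnlp/cpt-base")
model = CPTForConditionalGeneration.from_pretrained("fnlp/cpt-base")

inputs = tokenizer.encode("北京是[MASK]的首都", return_tensors="pt")
pred_ids = model.generate(inputs, num_beams=4, max_length=20)

# skip_special_tokens drops [CLS]/[SEP]; BertTokenizer separates Chinese
# characters with spaces, so strip them for readable output.
text = tokenizer.decode(pred_ids[0], skip_special_tokens=True)
print(text.replace(" ", ""))  # 北京是中国的首都
```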

## Citation

```bibtex
@article{shao2021cpt,
  title={CPT: A Pre-Trained Unbalanced Transformer for Both Chinese Language Understanding and Generation},
  author={Yunfan Shao and Zhichao Geng and Yitao Liu and Junqi Dai and Fei Yang and Li Zhe and Hujun Bao and Xipeng Qiu},
  journal={arXiv preprint arXiv:2109.05729},
  year={2021}
}
```