fairnlp
/

bert-cda

 ---
+language: en
 license: apache-2.0
+datasets:
+  - wikipedia
 ---
+# BERT Large Uncased (CDA) - Counterfactual Data Augmentation
+A BERT Large (uncased) model pretrained on English text with a masked language modeling (MLM) objective. The BERT architecture was introduced
+in [this paper](https://arxiv.org/abs/1810.04805), and this checkpoint was first released
+in [this repository](https://github.com/google-research-datasets/Zari). The model is pretrained from scratch on
+Wikipedia with counterfactual data augmentation (CDA). Word substitutions for the augmentation are taken from the word lists provided
+at [corefBias](https://github.com/uclanlp/corefBias) ([Zhao et al. (2018)](https://arxiv.org/abs/1804.06876)).
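To make the word-substitution idea concrete, here is a minimal sketch of counterfactual data augmentation on raw sentences. It is an illustration only, not the pipeline used to build this checkpoint: the `PAIRS` list is a tiny hypothetical stand-in for the corefBias word lists, the helper names are made up for this example, and keeping the original sentence alongside its swapped copy is an assumption.

```python
import re

# Hypothetical, tiny stand-in for the corefBias word lists (assumption).
PAIRS = [("he", "she"), ("man", "woman"), ("father", "mother"), ("king", "queen")]

# Build a symmetric lookup so each word maps to its counterpart.
SWAP = {}
for a, b in PAIRS:
    SWAP[a] = b
    SWAP[b] = a

# Split text into alternating word and non-word chunks so that
# joining the chunks reproduces spacing and punctuation exactly.
TOKEN = re.compile(r"\w+|\W+")


def swap_word(tok):
    """Return the counterpart of a listed word, preserving simple casing."""
    repl = SWAP.get(tok.lower())
    if repl is None:
        return tok
    if tok.isupper():
        return repl.upper()
    if tok[0].isupper():
        return repl.capitalize()
    return repl


def counterfactual(text):
    """Swap every listed word in the text for its counterpart."""
    return "".join(swap_word(t) for t in TOKEN.findall(text))


def augment(corpus):
    """Keep each sentence and add its counterfactual when it differs."""
    out = []
    for sentence in corpus:
        out.append(sentence)
        swapped = counterfactual(sentence)
        if swapped != sentence:
            out.append(swapped)
    return out


if __name__ == "__main__":
    for line in augment(["He thanked the man.", "The sky is blue."]):
        print(line)
```

Matching on whole words keeps "the" from being treated as containing "he". Ambiguous pairs such as "her" (which can map to "him" or "his") are left out of the sketch because they need part-of-speech information to resolve.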
+Disclaimer: The team releasing BERT did not write a model card for this model, so this model card was written by
+the FairNLP team.
+### BibTeX entry and citation info
+```
+@misc{zari,
+      title={Measuring and Reducing Gendered Correlations in Pre-trained Models},
+      author={Kellie Webster and Xuezhi Wang and Ian Tenney and Alex Beutel and Emily Pitler and Ellie Pavlick and Jilin Chen and Slav Petrov},
+      year={2020},
+      eprint={2010.06032},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL}
+}
+```