TimSchopf committed on
Commit 050697b · 1 Parent(s): 4f00bca

update training dataset information in README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -18,7 +18,7 @@ tags:
  This is a pre-trained BERT language model to classify NLP-related research papers according to concepts included in the [NLP taxonomy](#nlp-taxonomy).
  It is a multi-label classifier that can predict concepts from all levels of the NLP taxonomy.
  If the model identifies a lower-level concept, it did learn to predict both the lower-level concept and its hypernyms in the NLP taxonomy.
- The model is fine-tuned on a weakly labeled dataset of 178,521 scientific papers from the ACL Anthology and the arXiv cs.CL domain.
+ The model is fine-tuned on a weakly labeled dataset of 178,521 scientific papers from the ACL Anthology, the arXiv cs.CL domain, and Scopus.
  Prior to fine-tuning, the model is initialized with weights from [allenai/specter2](https://huggingface.co/allenai/specter2).
 
  Paper: [Exploring the Landscape of Natural Language Processing Research (RANLP 2023)](tbp).
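The README text above describes a multi-label classifier: rather than picking a single class with a softmax, each taxonomy concept receives an independent sigmoid score, and every concept above a threshold is predicted, which is how a lower-level concept and its hypernyms can be emitted together. A minimal sketch of that decision rule, using made-up concept names, logits, and a 0.5 threshold (the model's actual label set and outputs are not shown on this page):

```python
import math

def multi_label_predict(logits, labels, threshold=0.5):
    """Return every label whose independent sigmoid probability meets the threshold."""
    probs = {label: 1.0 / (1.0 + math.exp(-logit)) for label, logit in zip(labels, logits)}
    return [label for label, p in probs.items() if p >= threshold]

# Hypothetical taxonomy concepts and raw classifier outputs, for illustration only.
labels = ["Machine Translation", "Multilinguality", "Text Generation"]
logits = [2.3, 1.1, -1.7]

# Both "Machine Translation" and its hypernym "Multilinguality" clear the
# threshold, so both are predicted; "Text Generation" is suppressed.
print(multi_label_predict(logits, labels))
```

The key design point is that thresholding each sigmoid independently (instead of taking an argmax) lets the number of predicted concepts vary per paper, which is what allows hypernym and hyponym labels to co-occur.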