update training dataset information in README.md
README.md CHANGED
@@ -18,7 +18,7 @@ tags:
This is a pre-trained BERT language model to classify NLP-related research papers according to concepts included in the [NLP taxonomy](#nlp-taxonomy).
It is a multi-label classifier that can predict concepts from all levels of the NLP taxonomy.
If the model identifies a lower-level concept, it did learn to predict both the lower-level concept and its hypernyms in the NLP taxonomy.
- The model is fine-tuned on a weakly labeled dataset of 178,521 scientific papers from the ACL Anthology
+ The model is fine-tuned on a weakly labeled dataset of 178,521 scientific papers from the ACL Anthology, the arXiv cs.CL domain, and Scopus.
Prior to fine-tuning, the model is initialized with weights from [allenai/specter2](https://huggingface.co/allenai/specter2).

Paper: [Exploring the Landscape of Natural Language Processing Research (RANLP 2023)](tbp).
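Since the updated README describes a multi-label classifier over the NLP taxonomy, a minimal inference sketch with the transformers library may help readers. The model ID, input format, and 0.5 threshold below are assumptions for illustration; the diff itself does not specify them, so substitute the actual values from the model card.

```python
# Minimal sketch: multi-label classification of a paper with the described model.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Assumption: hypothetical Hub ID, replace with the ID from this model card.
model_id = "TimSchopf/nlp_taxonomy_classifier"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Assumption: title and abstract of an NLP paper concatenated as the input text.
paper = (
    "Attention Is All You Need. We propose a new network architecture, "
    "the Transformer, based solely on attention mechanisms."
)

inputs = tokenizer(paper, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits

# Multi-label setting: apply a per-label sigmoid and threshold each label
# independently, rather than a softmax over all labels.
probs = torch.sigmoid(logits).squeeze(0)
predicted = [model.config.id2label[i] for i, p in enumerate(probs) if p > 0.5]
print(predicted)
```

Because the README states that the model learned to predict a concept together with its hypernyms, the predicted list should already contain the higher-level taxonomy concepts; no separate hypernym expansion step is needed in this sketch.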