Update README.md
README.md
CHANGED
@@ -15,7 +15,7 @@ widget:
 ---
 # DrLongformer
 
-<span style="font-size:larger;">**DrLongformer**</span> is a French pretrained Longformer model based on Clinical-Longformer that was further pretrained on the NACHOS dataset (same dataset as [DrBERT](https://github.com/qanastek/DrBERT)). This model allows up to 4,096 tokens as input. DrLongformer consistently outperforms medical BERT-based models across most downstream tasks regardless of sequence length, except on NER tasks. Evaluated downstream tasks cover named entity recognition (NER), question answering (MCQA), Semantic textual similarity (STS) and text classification tasks (CLS). For more details, please refer to our paper: [Adaptation of Biomedical and Clinical Pretrained Models to French Long Documents: A Comparative Study]().
+<span style="font-size:larger;">**DrLongformer**</span> is a French pretrained Longformer model based on Clinical-Longformer that was further pretrained on the NACHOS dataset (same dataset as [DrBERT](https://github.com/qanastek/DrBERT)). This model allows up to 4,096 tokens as input. DrLongformer consistently outperforms medical BERT-based models across most downstream tasks regardless of sequence length, except on NER tasks. Evaluated downstream tasks cover named entity recognition (NER), question answering (MCQA), Semantic textual similarity (STS) and text classification tasks (CLS) from [DrBenchmark](https://huggingface.co/DrBenchmark). For more details, please refer to our paper: [Adaptation of Biomedical and Clinical Pretrained Models to French Long Documents: A Comparative Study]().
 
 ### Model pretraining
 We explored multiple strategies for the adaptation of Longformer models to the French medical domain: