CLARA-MeD/mdeberta-v3-base-finetuned-ner

lcampillos committed 22 days ago

Commit 2b04062 · verified · 1 Parent(s): 983437e

Create README.md

Files changed (1)

README.md +35 -0

README.md ADDED

	@@ -0,0 +1,35 @@

+---
+license: cc-by-nc-4.0
+language:
+- es
+tags:
+- simplification
+- NER
+---
+This is a model for **complex word identification (CWI)** in Spanish medical texts, based on
+[multilingual DeBERTa v3 (mDeBERTa)](https://huggingface.co/microsoft/mdeberta-v3-base).
+The model was fine-tuned on a corpus of 225 texts for patients (162,575 tokens) to identify **complex words** (**CW**).
+**Results (test set)**
+| Class |   Precision   |     Recall    |       F1      |    Accuracy   |
+|:-----:|:-------------:|:-------------:|:-------------:|:-------------:|
+|  CW   | 79.05 (±1.39) | 79.01 (±0.70) | 79.02 (±0.65) | 94.86 (±0.22) |
+*Results are the average of 3 experimental rounds.
+If you use this model, or want more details about the experiments and training, please see our article:
+```
+@article{2025CWI,
+  title={Complex Word Identification for Lexical Simplification in Spanish Texts for Patients},
+  author={Ortega-Riba, Federico and Campillos-Llanos, Leonardo and Samy, Doaa},
+  journal={Procesamiento del lenguaje natural},
+  volume={74},
+  year={2025}
+}
+```
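
The card does not show how to run the model. Below is a minimal sketch, assuming the checkpoint loads with the standard `transformers` token-classification pipeline (the `NER` tag suggests this, but the card does not confirm it). The helper names `load_tagger` and `complex_words`, the 0.5 score threshold, and the sample sentence are hypothetical and chosen only for illustration; the label names the model emits are not documented here, so the helper does not filter on them.

```python
from typing import Dict, List

# Repository name as shown on this page.
MODEL_ID = "CLARA-MeD/mdeberta-v3-base-finetuned-ner"


def load_tagger(model_id: str = MODEL_ID):
    """Build a token-classification pipeline (downloads the model on first use)."""
    # Imported lazily so complex_words() can be used without transformers installed.
    from transformers import pipeline

    # aggregation_strategy="simple" merges sub-word pieces into whole spans.
    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",
    )


def complex_words(predictions: List[Dict], min_score: float = 0.5) -> List[str]:
    """Return the text of predicted spans whose confidence is at least min_score.

    Assumption: each prediction is a dict with "word" and "score" keys, as
    produced by the pipeline above. The 0.5 threshold is illustrative only.
    """
    return [p["word"].strip() for p in predictions if p["score"] >= min_score]


# Usage (needs `pip install transformers torch` and network access):
#   tagger = load_tagger()
#   text = "El paciente presenta hipertensión arterial y disnea."  # made-up sentence
#   print(complex_words(tagger(text)))
```

Keeping the post-processing in a separate function makes it possible to inspect or adjust the threshold without reloading the model.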