lcampillos committed
Commit 2b04062 · verified · 1 Parent(s): 983437e

Create README.md

Files changed (1)
  1. README.md +35 -0
README.md ADDED
@@ -0,0 +1,35 @@
+ ---
+ license: cc-by-nc-4.0
+ language:
+ - es
+ tags:
+ - simplification
+ - NER
+ ---
+
+ This is a model for **complex word identification (CWI)** in Spanish medical texts, based on
+ [multilingual DeBERTa v3 (mDeBERTa)](https://huggingface.co/microsoft/mdeberta-v3-base).
+
+ The model was fine-tuned on a corpus of 225 texts for patients (162,575 tokens) to identify **complex words** (**CW**).
+
+ **Results (test set)**
+
+ | Class | Precision | Recall | F1 | Accuracy |
+ |:-----:|:-------------:|:-------------:|:-------------:|:-------------:|
+ | CW | 79.05 (±1.39) | 79.01 (±0.70) | 79.02 (±0.65) | 94.86 (±0.22) |
+
+ Results are averaged over 3 experimental rounds.
+
+ If you use this model, or want more details about the experiments and training, see our article:
+
+ ```
+ @article{2025CWI,
+ title={Complex Word Identification for Lexical Simplification in Spanish Texts for Patients},
+ author={Ortega-Riba, Federico and Campillos-Llanos, Leonardo and Samy, Doaa},
+ journal={Procesamiento del lenguaje natural},
+ volume={74},
+ year={2025}
+ }
+ ```
+
+
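For readers who want to relate the reported precision, recall, F1, and accuracy to raw predictions, the following is a minimal sketch of how such scores could be computed at the token level. It is not taken from the README or the article: the function name `cw_metrics`, the label strings `"CW"` and `"O"`, the choice of token-level (rather than word- or span-level) scoring, and the toy data are all assumptions made for illustration.

```python
from typing import Dict, Sequence


def cw_metrics(gold: Sequence[str], pred: Sequence[str], positive: str = "CW") -> Dict[str, float]:
    """Token-level precision, recall and F1 for the positive class, plus overall accuracy (all in %).

    Hypothetical helper: the label name and token-level scoring are assumptions.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    if not gold:
        raise ValueError("gold and pred must not be empty")

    pairs = list(zip(gold, pred))
    tp = sum(g == positive and p == positive for g, p in pairs)  # complex, predicted complex
    fp = sum(g != positive and p == positive for g, p in pairs)  # not complex, predicted complex
    fn = sum(g == positive and p != positive for g, p in pairs)  # complex, missed
    correct = sum(g == p for g, p in pairs)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0

    return {
        "precision": 100 * precision,
        "recall": 100 * recall,
        "f1": 100 * f1,
        "accuracy": 100 * correct / len(pairs),
    }


# Toy labels, invented for illustration only (not from the corpus).
gold = ["O", "CW", "CW", "O", "O", "CW", "O", "O"]
pred = ["O", "CW", "O", "O", "CW", "CW", "O", "O"]
scores = cw_metrics(gold, pred)
print({name: round(value, 2) for name, value in scores.items()})
```

Because most tokens in a text are typically not complex words, accuracy tends to sit well above the class-level F1, which is consistent with the gap between the two in the results table.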