smaximo committed on
Commit 59b9606 · 1 Parent(s): 64a1c47

Update app.py

Files changed (1)
  1. app.py +13 -12
app.py CHANGED
@@ -5,11 +5,11 @@ import torch
  title = "Extractive QA Biomedicine"
  description = """
  <p style="text-align: justify;">
- Taking into account the existence of masked language models trained on Spanish Biomedical corpus, the objective of this project is to use them to generate extractice QA models for Biomedicine and compare their effectiveness with general masked language models.
+ Recent research has made available Spanish Language Models trained on Biomedical corpora. This project explores the use of these new models to generate extractive QA models for Biomedicine and compares their effectiveness with that of general masked language models.
 
  The models were trained on the <a href="https://huggingface.co/datasets/squad_es">SQUAD_ES Dataset</a> (automatic translation of the Stanford Question Answering Dataset into Spanish). SQUAD v2 version was chosen in order to include questions that cannot be answered based on a provided context.
 
- The models were evaluated on <a href="https://huggingface.co/datasets/hackathon-pln-es/biomed_squad_es_v2">BIOMED_SQUAD_ES_V2 Dataset</a> , a subset of the SQUAD_ES dev dataset containing questions related to the Biomedical domain.
+ The models were evaluated on <a href="https://huggingface.co/datasets/hackathon-pln-es/biomed_squad_es_v2">BIOMED_SQUAD_ES_V2 Dataset</a>, a subset of the SQUAD_ES evaluation dataset containing questions related to the Biomedical domain.
  </p>
  """
  article = """
@@ -58,23 +58,24 @@ article = """
  <tr>
  <td><a href="https://huggingface.co/hackathon-pln-es/biomedtra-small-es-squad2-es">hackathon-pln-es/biomedtra-small-es-squad2-es</a></td>
  <td>Biomedical</td>
- <td align="right">29.6394</td>
- <td align="right">36.317</td>
- <td align="right">32.2064</td>
- <td align="right">45.716</td>
- <td align="right">27.1304</td>
- <td align="right">27.1304</td>
+ <td align="right">34.4767</td>
+ <td align="right">44.3294</td>
+ <td align="right">45.3737</td>
+ <td align="right">65.307</td>
+ <td align="right">23.8261</td>
+ <td align="right">23.8261</td>
  </tr>
  </tbody></table>
  <h3>Conclusion and Future Work</h3>
- If F1 score is considered, the results show that there may be no advantage in using domain-specific masked language models to generate Biomedical QA models.
- In any case, the scores reported for the biomedical roberta-based models are not far below from those of the general roberta-based model.
+ If F1 Score is considered, the results show that there may be no advantage in using domain-specific masked language models to generate Biomedical QA models.
+ However, the F1 Scores reported for the Biomedical roberta-based models are not far below those of the general roberta-based model.
 
- However, if only unanswerable questions are taken into account, the model with the best F1 score is hackathon-pln-es/roberta-base-biomedical-es-squad2-es.
+ If only unanswerable questions are taken into account, the model with the best F1 Score is <a href="https://huggingface.co/hackathon-pln-es/roberta-base-biomedical-es-squad2-es">hackathon-pln-es/roberta-base-biomedical-es-squad2-es</a>.
+ The model <a href="https://huggingface.co/hackathon-pln-es/biomedtra-small-es-squad2-es">hackathon-pln-es/biomedtra-small-es-squad2-es</a>, by contrast, is unable to correctly identify unanswerable questions.
 
  As future work, the following experiments could be carried out:
  <ul>
- <li>Use Biomedical masked-language models that were not trained from scratch from a Biomedical corpus but have been adapted from a general model, so as not to lose words and features of Spanish that are also present in Biomedical questions and articles.
+ <li>Create Biomedical masked-language models adapted from a general model, to preserve words and features of Spanish that are also present in Biomedical questions and articles. The Biomedical base models used in the project were trained from scratch on a Biomedical corpus.
  <li>Create a Biomedical training dataset with SQUAD v2 format.
  <li>Generate a new and larger Spanish Biomedical validation dataset, not translated from English as in the case of SQUAD_ES Dataset.
  <li>Ensamble different models.
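The conclusion text in this commit compares models by F1 Score, including on unanswerable questions only. As a rough illustration of how such scores are usually computed under SQuAD v2 conventions, here is a minimal sketch in plain Python. It is not the evaluation code used by the project: the helper names `normalize`, `token_f1` and `exact_match` are hypothetical, and the normalization is simplified (the official SQuAD script also strips English articles, which is omitted here). The column headers of the results table are not visible in this hunk, but if the last two cells of the row are exact match and F1 on unanswerable questions, the sketch would explain why they are identical (for example 23.8261 and 23.8261): with an empty gold answer, F1 can only be 0 or 1.

```python
import string
from collections import Counter


def normalize(text: str) -> str:
    """Simplified normalization (assumption): lowercase, drop ASCII
    punctuation, collapse whitespace. Article stripping is omitted."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def token_f1(prediction: str, gold: str) -> float:
    """Token-overlap F1 between a predicted and a gold answer.

    An unanswerable question is represented by an empty gold answer.
    In that case the score is 1.0 only if the prediction is also empty.
    """
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    if not pred_tokens or not gold_tokens:
        # Either side is empty: F1 degenerates to exact match.
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

For instance, `token_f1("la insulina", "insulina")` gives partial credit for the overlapping token, while `token_f1("insulina", "")` is 0.0 because the model answered a question that has no answer in the context.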