Sheshera Mysore committed
Commit 2ac5213 · 1 Parent(s): 55fc5c8

Language and small clarifications.

Files changed (1): README.md (+4 -4)
README.md CHANGED
@@ -35,7 +35,7 @@ The model was trained with the Adam Optimizer and a learning rate of 2e-5 with 1
 
 ### Intended uses & limitations
 
- This model is trained for document similarity tasks in biomedical scientific text using a single vector per document. Here, the documents are the title and abstract of a paper. With appropriate fine-tuning, the model can also be used for other tasks such as classification. Since the training data comes primarily from biomedicine, performance on other domains may be poorer.
+ This model is trained for document similarity tasks in **biomedical** scientific text using a single vector per document. Here, the documents are the title and abstract of a paper. With appropriate fine-tuning, the model can also be used for other tasks such as classification. Since the training data comes primarily from biomedicine, performance on other domains may be poorer.
 
 ### How to use
 
@@ -56,19 +56,19 @@ clsrep = result.last_hidden_state[:,0,:]
 **`aspire-biencoder-biomed-scib-full`** can be used as follows: 1) download [`aspire-biencoder-biomed-scib-full.zip`](https://drive.google.com/file/d/1X6S5qwaKUlI3N3RDQSG-tJCzMBWAnqxP/view?usp=sharing), and 2) use it per this example usage script: [`aspire/examples/ex_aspire_bienc.py`](https://github.com/allenai/aspire/blob/main/examples/ex_aspire_bienc.py)
 
 ### Variables and metrics
- This model is evaluated on information retrieval datasets with document-level queries. Here we report performance on RELISH and TRECCOVID. These are detailed on [github](https://github.com/allenai/aspire) and in our [paper](https://arxiv.org/abs/2111.08366). These datasets represent an abstract-level retrieval task, where, given a query scientific abstract, the task requires the retrieval of relevant candidate abstracts.
+ This model is evaluated on information retrieval datasets with document-level queries. Here we report performance on RELISH (biomedical/English) and TRECCOVID (biomedical/English). These are detailed on [github](https://github.com/allenai/aspire) and in our [paper](https://arxiv.org/abs/2111.08366). These datasets represent an abstract-level retrieval task, where, given a query scientific abstract, the task requires the retrieval of relevant candidate abstracts.
 
 We rank documents by the L2 distance between the query and candidate documents.
 
 ### Evaluation results
 
- The released model `aspire-biencoder-biomed-scib` (and `aspire-biencoder-biomed-scib-full`) is compared against `allenai/specter`. `aspire-biencoder-biomed-scib`<sup>*</sup> is the performance reported in our paper, averaged over 3 re-runs of the model. The released models `aspire-biencoder-biomed-scib` and `aspire-biencoder-biomed-scib-full` are the single best run among the 3 re-runs.
+ The released model `aspire-biencoder-biomed-scib` (and `aspire-biencoder-biomed-scib-full`) is compared against `allenai/specter`. `aspire-biencoder-biomed-scib-full`<sup>*</sup> is the performance reported in our paper, averaged over 3 re-runs of the model. The released models `aspire-biencoder-biomed-scib` and `aspire-biencoder-biomed-scib-full` are the single best run among the 3 re-runs.
 
 |                                                  | TRECCOVID | TRECCOVID | RELISH | RELISH  |
 |-------------------------------------------------:|:---------:|:---------:|:------:|:-------:|
 |                                                  |    MAP    |  NDCG%20  |  MAP   | NDCG%20 |
 | `specter`                                        |   28.24   |   59.28   | 60.62  |  77.20  |
- | `aspire-biencoder-biomed-scib`<sup>*</sup>      |   30.60   |   62.07   | 61.43  |  78.01  |
+ | `aspire-biencoder-biomed-scib-full`<sup>*</sup> |   30.60   |   62.07   | 61.43  |  78.01  |
 | `aspire-biencoder-biomed-scib`                   |   30.74   |   60.16   | 61.52  |  78.07  |
 | `aspire-biencoder-biomed-scib-full`              |   31.45   |   63.15   | 61.34  |  77.89  |
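The second hunk's header preserves the CLS-pooling line from the README's usage snippet (`clsrep = result.last_hidden_state[:,0,:]`). As a minimal, non-authoritative sketch of that single-vector usage with Hugging Face `transformers`: the checkpoint identifier and the title-[SEP]-abstract concatenation below are assumptions for illustration, not part of this commit; the repository's [`ex_aspire_bienc.py`](https://github.com/allenai/aspire/blob/main/examples/ex_aspire_bienc.py) script remains the authoritative example.

```python
# Minimal sketch (assumptions noted): encode a paper's title + abstract into one CLS vector,
# mirroring the `clsrep = result.last_hidden_state[:,0,:]` line shown in the hunk header.
from transformers import AutoModel, AutoTokenizer

# Assumed checkpoint identifier; see the model card / example script for the exact name.
MODEL_NAME = 'allenai/aspire-biencoder-biomed-scib'
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME)

title = "An example paper title"
abstract = "An example abstract describing the paper."
# Assumption: the document is the title and abstract joined by the tokenizer's separator token.
doc = title + tokenizer.sep_token + abstract

inputs = tokenizer([doc], padding=True, truncation=True, max_length=512, return_tensors='pt')
result = model(**inputs)
# Single vector per document: the [CLS] token representation.
clsrep = result.last_hidden_state[:, 0, :]
```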
 
 
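The README also states that documents are ranked by the L2 distance between query and candidate vectors. A short sketch of that ranking step, assuming the vectors are CLS representations produced as above (names and dimensions are illustrative):

```python
# Sketch: rank candidates by L2 distance to the query vector (smaller distance = more relevant).
import torch


def rank_by_l2(query_vec: torch.Tensor, cand_vecs: torch.Tensor):
    """query_vec: [hidden]; cand_vecs: [num_candidates, hidden], e.g. CLS vectors from the encoder."""
    dists = torch.linalg.norm(cand_vecs - query_vec.unsqueeze(0), dim=1)  # per-candidate L2 distance
    order = torch.argsort(dists)  # ascending: closest (most relevant) candidates first
    return order, dists


# Illustrative stand-ins for encoded query/candidate abstracts (768-d vectors assumed).
query_vec = torch.randn(768)
cand_vecs = torch.randn(5, 768)
order, dists = rank_by_l2(query_vec, cand_vecs)
print(order.tolist(), dists[order].tolist())
```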