Sheshera Mysore
committed on
Commit
·
00f38c1
1
Parent(s):
53c0932
Usage instructions update.
README.md
CHANGED
|
@@ -39,21 +39,7 @@ This model is trained for document similarity tasks in **computer science** scie
|
|
| 39 |
|
| 40 |
### How to use
|
| 41 |
|
| 42 |
-
|
| 43 |
-
|
| 44 |
-
```
|
| 45 |
-
from transformers import AutoModel, AutoTokenizer
|
| 46 |
-
aspire_bienc = AutoModel.from_pretrained('allenai/aspire-biencoder-compsci-spec')
|
| 47 |
-
aspire_tok = AutoTokenizer.from_pretrained('allenai/aspire-biencoder-compsci-spec')
|
| 48 |
-
title = "Multi-Vector Models with Textual Guidance for Fine-Grained Scientific Document Similarity"
|
| 49 |
-
abstract = "We present a new scientific document similarity model based on matching fine-grained aspects of texts."
|
| 50 |
-
d=[title+aspire_tok.sep_token+abstract]
|
| 51 |
-
inputs = aspire_tok(d, padding=True, truncation=True, return_tensors="pt", max_length=512)
|
| 52 |
-
result = aspire_bienc(**inputs)
|
| 53 |
-
clsrep = result.last_hidden_state[:,0,:]
|
| 54 |
-
```
|
| 55 |
-
|
| 56 |
-
**`aspire-biencoder-compsci-spec-full`**, can be used as follows: 1) Download the [`aspire-biencoder-compsci-spec-full.zip`](https://drive.google.com/file/d/1AHtzyEpyn7DeFYOdt86ik4n0tGaG5kMC/view?usp=sharing), and 2) Use it per this example usage script: [`aspire/examples/ex_aspire_bienc.py`](https://github.com/allenai/aspire/blob/main/examples/ex_aspire_bienc.py)
|
| 57 |
|
| 58 |
### Variable and metrics
|
| 59 |
This model is evaluated on information retrieval datasets with document level queries. Performance here is reported on CSFCube (computer science/English). This is detailed on [github](https://github.com/allenai/aspire) and in our [paper](https://arxiv.org/abs/2111.08366). CSFCube presents a finer-grained query via selected sentences in a query abstract based on which a finer-grained retrieval must be made from candidate abstracts. The bi-encoder above ignores the finer grained query sentences and uses the whole abstract - this presents a baseline in the paper.
|
| 39 |
|
| 40 |
### How to use
|
| 41 |
|
| 42 |
+
Follow instructions for use detailed on the model github repo: https://github.com/allenai/aspire#specter-cocite
|
| 43 |
|
| 44 |
### Variable and metrics
|
| 45 |
This model is evaluated on information retrieval datasets with document level queries. Performance here is reported on CSFCube (computer science/English). This is detailed on [github](https://github.com/allenai/aspire) and in our [paper](https://arxiv.org/abs/2111.08366). CSFCube presents a finer-grained query via selected sentences in a query abstract based on which a finer-grained retrieval must be made from candidate abstracts. The bi-encoder above ignores the finer grained query sentences and uses the whole abstract - this presents a baseline in the paper.
|
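The evaluation text above describes retrieval with whole-abstract queries: each document is encoded to a single vector (the `clsrep` in the removed usage snippet) and candidates are ranked against the query. A minimal sketch of that ranking step follows. It assumes L2 distance as the ranking function, and the vectors, document ids, and helper names (`l2_distance`, `rank_candidates`) are hypothetical stand-ins, not output from the model or code from the repo. The linked github repo remains the authoritative usage reference.

```python
import math


def l2_distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def rank_candidates(query_vec, candidate_vecs):
    """Return (candidate_id, distance) pairs, closest candidate first.

    Assumption: smaller L2 distance means more similar documents.
    """
    scored = [(cid, l2_distance(query_vec, vec)) for cid, vec in candidate_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Toy 3-dimensional vectors standing in for real document embeddings.
query = [0.1, 0.3, 0.5]
candidates = {
    "paper_a": [0.1, 0.3, 0.4],
    "paper_b": [0.9, 0.1, 0.0],
    "paper_c": [0.2, 0.2, 0.5],
}

ranking = rank_candidates(query, candidates)
for cid, dist in ranking:
    print(f"{cid}\t{dist:.4f}")
```

With real embeddings, each toy list would be replaced by one document's vector converted to a plain list of floats.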