yczhou001 committed on
Commit
9f8e064
·
1 Parent(s): 1ee0ccc
Files changed (1)
  1. README.md +29 -0
README.md CHANGED
@@ -1,3 +1,32 @@
1
  ---
2
  license: mit
3
  ---
4
+ # Cross-Encoder for MS Marco
5
+
6
+ This model was trained on the [MS Marco Passage Ranking](https://github.com/microsoft/MSMARCO-Passage-Ranking) task.
7
+
8
+ The model can be used for information retrieval: given a query, encode the query with each candidate passage (e.g. retrieved with ElasticSearch), then sort the passages by their scores in decreasing order. See our paper [R2ANKER](https://arxiv.org/pdf/2206.08063.pdf) for more details.
9
+
10
+ ## Usage with Transformers
11
+
12
+ ```python
13
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification
14
+ import torch
15
+ tokenizer = AutoTokenizer.from_pretrained("YCZhou/R2ANKER")
16
+ model = AutoModelForSequenceClassification.from_pretrained("YCZhou/R2ANKER")
17
+ features = tokenizer(['How many people live in Berlin?', 'How many people live in Berlin?'], ['Berlin has a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers.', 'New York City is famous for the Metropolitan Museum of Art.'], padding=True, truncation=True, return_tensors="pt")
18
+ model.eval()
19
+ with torch.no_grad():
20
+     scores = model(**features).logits
20
+     print(scores)
22
+ ```
23
+
24
+ ## Citation
25
+ ```
26
+ @article{zhou2022towards,
27
+ title={Towards robust ranker for text retrieval},
28
+ author={Zhou, Yucheng and Shen, Tao and Geng, Xiubo and Tao, Chongyang and Xu, Can and Long, Guodong and Jiao, Binxing and Jiang, Daxin},
29
+ journal={arXiv preprint arXiv:2206.08063},
30
+ year={2022}
31
+ }
32
+ ```
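
The README describes re-ranking as scoring every query-passage pair and then sorting the passages in decreasing order, but the usage snippet stops at printing the scores. A minimal sketch of the sorting step follows. It assumes the model emits one relevance logit per pair and that the logits have already been converted to plain floats (for example with something like `scores.squeeze(-1).tolist()`, which is an assumption about the output shape, not something the commit states). The helper name `rank_passages` and the score values are hypothetical and only illustrate the ordering.

```python
from typing import List, Tuple


def rank_passages(passages: List[str], scores: List[float]) -> List[Tuple[str, float]]:
    """Pair each passage with its score and sort by score, highest first.

    `rank_passages` is a hypothetical helper, not part of the R2ANKER repository.
    """
    if len(passages) != len(scores):
        raise ValueError("passages and scores must have the same length")
    # sorted() is stable, so passages with equal scores keep their input order.
    return sorted(zip(passages, scores), key=lambda pair: pair[1], reverse=True)


passages = [
    "New York City is famous for the Metropolitan Museum of Art.",
    "Berlin has a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers.",
]
# Made-up scores for illustration only; real values come from model(**features).logits.
example_scores = [-4.2, 8.1]

ranked = rank_passages(passages, example_scores)
for passage, score in ranked:
    print(f"{score:.2f}\t{passage}")
```

In a real pipeline the candidate list would come from a first-stage retriever, and only the top few entries of `ranked` would be passed on.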