---
license: mit
datasets:
- philipphager/baidu-ultr-pretrain
- philipphager/baidu-ultr_uva-mlm-ctr
metrics:
- log-likelihood
- dcg@1
- dcg@3
- dcg@5
- dcg@10
- ndcg@10
- mrr@10
---

# Two Tower MonoBERT trained on Baidu-ULTR
A Flax-based MonoBERT cross-encoder trained on the [Baidu-ULTR](https://arxiv.org/abs/2207.03051) dataset with an **additive two-tower architecture** as suggested by [Yan et al.](https://research.google/pubs/revisiting-two-tower-models-for-unbiased-learning-to-rank/). Similar to a position-based click model (PBM), a two-tower model jointly learns item relevance (with a BERT model) and position bias (in our case, using a single embedding per rank). For more info, [read our paper](https://arxiv.org/abs/2404.02543) and [find the code for this model here](https://github.com/philipphager/baidu-bert-model).

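The additive combination can be summarized in a few lines. The sketch below is illustrative only, with assumed names and mock values rather than the repository's implementation: the BERT tower produces one relevance logit per document, the position tower contributes one learned bias term per rank, and the two are summed in logit space to predict a click.

```Python
import jax
import jax.numpy as jnp

# Illustrative values: in the real model, `relevance_logits` comes from the BERT tower
# and `position_bias` is a learned embedding with one entry per rank.
relevance_logits = jnp.array([2.1, 0.3, -0.5, 1.2])  # one logit per document
positions = jnp.array([1, 2, 3, 4])                   # 1-based SERP positions
position_bias = jnp.linspace(1.5, -1.5, 10)           # mock bias for 10 ranks

# Additive two-tower model: click logit = relevance logit + position bias
click_logits = relevance_logits + position_bias[positions - 1]
click_probabilities = jax.nn.sigmoid(click_logits)
```
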
## Test Results on Baidu-ULTR Expert Annotations

## Usage
Here is an example of downloading the model and calling it for inference on a mock batch of input data. For more details on how to use the model on the Baidu-ULTR dataset, take a look at our [training](https://github.com/philipphager/baidu-bert-model/blob/main/main.py) and [evaluation scripts](https://github.com/philipphager/baidu-bert-model/blob/main/eval.py) in our code repository.

```Python
import jax.numpy as jnp

from src.model import PBMCrossEncoder

model = PBMCrossEncoder.from_pretrained(
    "philipphager/baidu-ultr_uva-bert_ips-pointwise",
)

# Mock batch following Baidu-ULTR with 4 documents, each with 8 tokens
batch = {
    # Query id for each document
    "query_id": jnp.array([1, 1, 1, 1]),
    # Document position in SERP
    "positions": jnp.array([1, 2, 3, 4]),
    # Token ids for: [CLS] Query [SEP] Document
    "tokens": jnp.array([
        [2, 21448, 21874, 21436, 1, 20206, 4012, 2860],
        [2, 21448, 21874, 21436, 1, 16794, 4522, 2082],
        [2, 21448, 21874, 21436, 1, 20206, 10082, 9773],
        [2, 21448, 21874, 21436, 1, 2618, 8520, 2860],
    ]),
    # Specify if a token id belongs to the query (0) or document (1)
    "token_types": jnp.array([
        [0, 0, 0, 0, 1, 1, 1, 1],
        [0, 0, 0, 0, 1, 1, 1, 1],
        [0, 0, 0, 0, 1, 1, 1, 1],
        [0, 0, 0, 0, 1, 1, 1, 1],
    ]),
    # Marks if a token should be attended to (True) or ignored, e.g., padding tokens (False)
    "attention_mask": jnp.array([
        [True, True, True, True, True, True, True, True],
        [True, True, True, True, True, True, True, True],
        [True, True, True, True, True, True, True, True],
        [True, True, True, True, True, True, True, True],
    ]),
}

outputs = model(batch, train=False)
print(outputs)
```
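
Which fields `outputs` exposes depends on the model class, so the `relevance` scores below are an assumption for illustration; inspect `outputs` or the evaluation script linked above for the exact names. A minimal sketch of turning per-document scores into a ranking and a DCG@k value (the metric family listed in the card header) could look like this:

```Python
import jax.numpy as jnp

# Hypothetical: assume one de-biased relevance score per document,
# e.g., taken from the model output for the mock batch above.
relevance = jnp.array([0.7, 1.9, -0.2, 0.4])

# Rank documents from most to least relevant
ranking = jnp.argsort(-relevance)

# DCG@k with linear gains against (mock) expert annotations on a 0-4 scale
labels = jnp.array([1, 4, 0, 2])
k = 3
gains = labels[ranking][:k]
discounts = jnp.log2(jnp.arange(2, k + 2))
print(ranking, jnp.sum(gains / discounts))
```
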

## Reference
```
@inproceedings{Hager2024BaiduULTR,
  author = {Philipp Hager and Romain Deffayet and Jean-Michel Renders and Onno Zoeter and Maarten de Rijke},
  title = {Unbiased Learning to Rank Meets Reality: Lessons from Baidu’s Large-Scale Search Dataset},
  booktitle = {Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '24)},
  organization = {ACM},
  year = {2024},
}
```