|
--- |
|
base_model: |
|
- google-bert/bert-base-uncased |
|
datasets: |
|
- microsoft/ms_marco |
|
language: |
|
- en |
|
library_name: transformers |
|
license: apache-2.0 |
|
pipeline_tag: feature-extraction |
|
--- |
|
|
|
# Model Card |
|
This is the official model from the paper [Hypencoder: Hypernetworks for Information Retrieval](https://arxiv.org/abs/2502.05364). |
|
|
|
|
|
## Model Details |
|
This is a Hypencoder Dual Encoder. It contains two trunks: a text encoder and a Hypencoder. The text encoder converts items into 768-dimensional vectors, while the Hypencoder converts queries into a small neural network that takes the 768-dimensional vector from the text encoder as input and outputs a relevance score. To use this model, please see the [GitHub](https://github.com/jfkback/hypencoder-paper) repository, which contains the required code and details on how to run the model.
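As a conceptual illustration of the scoring flow described above, the sketch below applies a query-specific q-net to a 768-dimensional item vector in plain PyTorch. This is not the repository's API: the function name, weight shapes, activations, and the two-hidden-layer configuration are assumptions made for the example; the actual loading and inference code is in the GitHub repository linked above.

```python
import torch

# Conceptual sketch (not the official API): a q-net produced by the
# Hypencoder is a small MLP whose weights are generated per query.
# It takes the 768-dimensional item vector from the text encoder and
# returns a scalar relevance score.

hidden_dim = 768

# Hypothetical per-query weights that a Hypencoder might emit for a
# q-net with two hidden layers.
qnet_weights = {
    "w1": torch.randn(hidden_dim, hidden_dim),
    "b1": torch.zeros(hidden_dim),
    "w2": torch.randn(hidden_dim, hidden_dim),
    "b2": torch.zeros(hidden_dim),
    "w_out": torch.randn(hidden_dim, 1),
    "b_out": torch.zeros(1),
}

def qnet_score(item_vector: torch.Tensor, w: dict) -> torch.Tensor:
    """Apply the query-specific q-net to a 768-d item embedding."""
    h = torch.relu(item_vector @ w["w1"] + w["b1"])
    h = torch.relu(h @ w["w2"] + w["b2"])
    return h @ w["w_out"] + w["b_out"]  # scalar relevance score

item_vector = torch.randn(hidden_dim)  # stand-in for a text-encoder output
score = qnet_score(item_vector, qnet_weights)
print(score.shape)  # torch.Size([1])
```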
|
|
|
|
|
### Model Variants |
|
We released the four models used in the paper. The models are identical except that their small neural networks, which we refer to as q-nets, have different numbers of hidden layers.
|
|
|
| Huggingface Repo | Number of Layers |
|:------------------:|:------------------:|
| [jfkback/hypencoder.2_layer](https://huggingface.co/jfkback/hypencoder.2_layer) | 2 |
| [jfkback/hypencoder.4_layer](https://huggingface.co/jfkback/hypencoder.4_layer) | 4 |
| [jfkback/hypencoder.6_layer](https://huggingface.co/jfkback/hypencoder.6_layer) | 6 |
| [jfkback/hypencoder.8_layer](https://huggingface.co/jfkback/hypencoder.8_layer) | 8 |
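As a rough illustration of what the layer count refers to, the toy module below builds a q-net-like MLP with a configurable number of hidden layers. The class name, ReLU activations, and hidden width are assumptions for the example only; the exact q-net architecture generated by each released checkpoint is defined in the GitHub repository.

```python
import torch
from torch import nn

class ToyQNet(nn.Module):
    """Toy stand-in for a q-net with a configurable number of hidden layers.

    In Hypencoder the weights of this small network are generated per query
    by the Hypencoder trunk rather than learned directly as done here.
    """

    def __init__(self, input_dim: int = 768, num_hidden_layers: int = 2):
        super().__init__()
        layers = []
        for _ in range(num_hidden_layers):
            layers += [nn.Linear(input_dim, input_dim), nn.ReLU()]
        layers.append(nn.Linear(input_dim, 1))  # final scalar relevance score
        self.net = nn.Sequential(*layers)

    def forward(self, item_vector: torch.Tensor) -> torch.Tensor:
        return self.net(item_vector)

# The four released variants correspond to 2, 4, 6, and 8 hidden layers.
for n in (2, 4, 6, 8):
    qnet = ToyQNet(num_hidden_layers=n)
    print(n, qnet(torch.randn(768)).shape)  # torch.Size([1]) each
```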
|
|
|
## Citation |
|
**BibTeX:** |
|
``` |
|
@misc{killingback2025hypencoderhypernetworksinformationretrieval, |
|
title={Hypencoder: Hypernetworks for Information Retrieval}, |
|
author={Julian Killingback and Hansi Zeng and Hamed Zamani}, |
|
year={2025}, |
|
eprint={2502.05364}, |
|
archivePrefix={arXiv}, |
|
primaryClass={cs.IR}, |
|
url={https://arxiv.org/abs/2502.05364}, |
|
} |
|
``` |