---
license: apache-2.0
datasets:
- sentence-transformers/quora-duplicates
language:
- en
base_model:
- distilbert/distilroberta-base
pipeline_tag: text-ranking
library_name: sentence-transformers
tags:
- transformers
---
# Cross-Encoder for Quora Duplicate Questions Detection
This model was trained using the [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class.
## Training Data
This model was trained on the [Quora Duplicate Questions](https://www.quora.com/q/quoradata/First-Quora-Dataset-Release-Question-Pairs) dataset. The model predicts a score between 0 and 1 indicating how likely it is that the two given questions are duplicates.
Note: The model is not suitable for estimating the general similarity of questions. For example, the two questions "How to learn Java" and "How to learn Python" will receive a rather low score, as they are related but not duplicates.
## Usage and Performance
Pre-trained models can be used like this:
```python
from sentence_transformers import CrossEncoder

model = CrossEncoder('cross-encoder/quora-distilroberta-base')

# Each pair receives a score between 0 and 1; higher means more likely duplicates
scores = model.predict([('Question 1', 'Question 2'), ('Question 3', 'Question 4')])
```
You can also use this model without sentence_transformers, loading it directly with the Transformers ``AutoModelForSequenceClassification`` class.
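A minimal sketch of this route, assuming the checkpoint carries a single-logit sequence-classification head (as SentenceTransformers cross-encoders do), so a sigmoid maps the logit to a 0-1 duplicate score; the question strings are placeholders:
```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained('cross-encoder/quora-distilroberta-base')
tokenizer = AutoTokenizer.from_pretrained('cross-encoder/quora-distilroberta-base')

# Tokenize question pairs: the first list holds the first question of each pair,
# the second list the corresponding second question
features = tokenizer(['Question 1', 'Question 3'], ['Question 2', 'Question 4'],
                     padding=True, truncation=True, return_tensors='pt')

model.eval()
with torch.no_grad():
    # Sigmoid over the single logit yields a duplicate score per pair
    scores = torch.sigmoid(model(**features).logits)
    print(scores)
```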