npip99 committed
Commit 9aea531 · verified · 1 parent: 9d066bb

Update README.md

Files changed (1)
  1. README.md +14 -7
README.md CHANGED
@@ -12,17 +12,22 @@ tags:
  - stem
  - medical
  ---
- # zerank-1: ZeroEntropy Inc.'s SoTA reranker
+ <img src="https://i.imgur.com/oxvhvQu.png"/>

- <!-- Provide a quick summary of what the model is/does. -->
+ # Releasing zeroentropy/zerank-1

- This model is an open-weights reranker model meant to be integrated into RAG applications to rerank results from preliminary search methods such as embeddings, BM25, and hybrid search.
+ In search engines, [rerankers are crucial](https://www.zeroentropy.dev/blog/what-is-a-reranker-and-do-i-need-one) for improving the accuracy of your retrieval system.

- This reranker outperforms other popular rerankers such as cohere-rerank-v3.5 and Salesforce/Llama-rank-v1 across a wide variety of task domains, including on finance, legal, code, STEM, medical, and conversational data. See [this post](https://evals_blog_post) for more details.
- This model is trained on an innovative multi-stage pipeline that models query-document relevance scores using adjusted Elo-like ratings. See [this post](https://technical_blog_post) and our Technical Report (Coming soon!) for more details.
+ However, SOTA rerankers are closed-source and proprietary. At ZeroEntropy, we've trained a SOTA reranker that outperforms closed-source competitors, and we're launching our model here on Hugging Face.
+
+ This reranker outperforms proprietary rerankers such as `cohere-rerank-v3.5` and `Salesforce/LlamaRank-v1` across a wide variety of domains, including finance, legal, code, STEM, medical, and conversational data.
+
+ At ZeroEntropy we've developed an innovative multi-stage pipeline that models query-document relevance scores as adjusted [Elo ratings](https://en.wikipedia.org/wiki/Elo_rating_system). See our Technical Report (Coming soon!) for more details.
+
+ Since we're a small company, this model is only released under a non-commercial license. If you'd like a commercial license, please contact us at [email protected] and we'll get you a license ASAP.
+
- For this model's smaller twin, see [zerank-1-small](https://huggingface.co/zeroentropy/zerank-1-small)
+ For this model's smaller twin, see [zerank-1-small](https://huggingface.co/zeroentropy/zerank-1-small), which we've fully open-sourced under an Apache 2.0 License.

  ## How to Use

@@ -41,9 +46,11 @@ scores = model.predict(query_documents)
  print(scores)
  ```

+ The model can also be queried via ZeroEntropy's [/models/rerank](https://docs.zeroentropy.dev/api-reference/models/rerank) endpoint.
+
  ## Evaluations

- Comparing NDCG@10 starting from top 100 documents by embedding (using text-3-embedding-small):
+ NDCG@10 scores for `zerank-1` and competing closed-source proprietary rerankers. Since we are evaluating rerankers, OpenAI's `text-embedding-3-small` is used as an initial retriever to select the top 100 candidate documents.

  | Task | Embedding | cohere-rerank-v3.5 | Salesforce/Llama-rank-v1 | zerank-1-small | **zerank-1** |
  |----------------|-----------|--------------------|--------------------------|----------------|--------------|
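The commit above describes modeling query-document relevance as adjusted Elo ratings. As a toy illustration of the underlying idea only (a hypothetical sketch, not ZeroEntropy's actual training pipeline), a standard Elo update applied to pairwise "document A is more relevant than document B" judgments looks like:

```python
def elo_update(r_a, r_b, a_wins, k=32.0):
    """One standard Elo update for a pairwise preference between A and B."""
    expected_a = 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))
    score_a = 1.0 if a_wins else 0.0
    delta = k * (score_a - expected_a)
    return r_a + delta, r_b - delta  # zero-sum: total rating is conserved

# Toy example: doc0 is judged more relevant than doc1 in repeated comparisons,
# so its rating climbs while doc1's falls.
ratings = [1000.0, 1000.0]
for _ in range(20):
    ratings[0], ratings[1] = elo_update(ratings[0], ratings[1], a_wins=True)
print(ratings)
```

Repeated pairwise wins push the preferred document's rating up monotonically, which is the property that lets pairwise judgments be converted into per-document relevance scores.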
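The evaluation table reports NDCG@10. For readers unfamiliar with the metric, a minimal self-contained sketch of how NDCG@k is computed from graded relevance labels (illustrative only, not the authors' evaluation harness):

```python
import math

def dcg(relevances):
    """Discounted cumulative gain: graded relevance discounted by log2 of rank."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))

def ndcg_at_k(relevances, k=10):
    """NDCG@k: DCG of the top-k ranking divided by the best achievable DCG@k."""
    ideal = sorted(relevances, reverse=True)
    denom = dcg(ideal[:k])
    return dcg(relevances[:k]) / denom if denom > 0 else 0.0

# Relevance labels of documents in the order the reranker returned them.
ranked_rels = [3, 2, 3, 0, 1, 2]
print(ndcg_at_k(ranked_rels, k=10))
```

A perfect ranking scores 1.0; placing relevant documents lower in the list is penalized logarithmically, which is why improvements in the top few positions dominate the metric.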