jxm/cde-small-v2

Jack Morris committed on Jan 16

Commit b74493f · 1 Parent(s): a8d6b4e

update again

Files changed (1)

README.md +6 -0

@@ -8650,6 +8650,12 @@ model-index:
 # Contextual Document Embeddings (CDE)
 <a href="github.com/jxmorris12/cde">Github</a>
 Our new model that naturally integrates "context tokens" into the embedding process. As of January 13th, 2025, `cde-small-v2` is the best small model (under 400M params) on the [MTEB leaderboard](https://huggingface.co/spaces/mteb/leaderboard) for text embedding models, with an average score of 65.58.

 # Contextual Document Embeddings (CDE)
+<div style="background-color: #f8f9fa; border-left: 6px solid #007bff; padding: 10px 20px; margin: 20px; font-family: Arial, sans-serif; line-height: 1.6;">
+    <p><strong>Note on parameter count: </strong>Although HuggingFace reports the size of this model as 281M params, it's really closer to 140M. That's because our weights actually contain the weights of two models (dubbed "first stage" and "second stage"), and only the second-stage model is used to compute embeddings at search time.</p>
+</div>
+**Note on parameter count**:
 <a href="github.com/jxmorris12/cde">Github</a>
 Our new model that naturally integrates "context tokens" into the embedding process. As of January 13th, 2025, `cde-small-v2` is the best small model (under 400M params) on the [MTEB leaderboard](https://huggingface.co/spaces/mteb/leaderboard) for text embedding models, with an average score of 65.58.
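
The parameter-count note added in this hunk comes down to simple arithmetic: the reported total sums two sub-models, while only one of them is used at search time. A minimal sketch of that breakdown follows. The submodule names `first_stage_model` and `second_stage_model` and all tensor shapes are illustrative assumptions, not values read from the actual checkpoint.

```python
from collections import defaultdict
from math import prod


def params_by_submodule(shapes):
    """Sum parameter counts per top-level submodule prefix.

    `shapes` maps a dotted parameter name to its tensor shape.
    """
    totals = defaultdict(int)
    for name, shape in shapes.items():
        prefix = name.split(".", 1)[0]  # text before the first dot
        totals[prefix] += prod(shape)
    return dict(totals)


# Toy shapes for illustration only; these are NOT the real cde-small-v2
# tensors, and the submodule names are hypothetical.
toy_shapes = {
    "first_stage_model.embed.weight": (1000, 64),
    "first_stage_model.proj.weight": (64, 64),
    "second_stage_model.embed.weight": (1000, 64),
    "second_stage_model.proj.weight": (64, 64),
}

totals = params_by_submodule(toy_shapes)
reported = sum(totals.values())                    # what a whole-checkpoint count shows
used_at_search_time = totals["second_stage_model"]  # the half the note refers to

print(totals)
print(f"reported={reported}, used at search time={used_at_search_time}")
```

For a loaded PyTorch model, the same mapping could presumably be built with `{n: tuple(p.shape) for n, p in model.named_parameters()}`. Whether the top-level prefixes match the names assumed here depends on the model's implementation.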