jinaai/jina-embedding-s-en-v1

Sentence Similarity

sentence-transformers

feature-extraction

bwang0911 committed on Jul 6, 2023

Commit 95f65b5 · 1 Parent(s): f5bccc2

Update README.md

Files changed (1)

README.md +16 -2

@@ -15,9 +15,23 @@ license: apache-2.0
 The text embedding suit trained by [Jina AI](https://github.com/jina-ai), [Finetuner team](https://github.com/jina-ai/finetuner).
-## Intented Usage
-## Model Info
+## Intented Usage & Model Info
+`jina-embedding-s-en-v1` is a language model that has been trained using Jina AI's Linnaeus-Clean dataset.
+This dataset consists of 380 million pairs of sentences, which include both query-document pairs.
+These pairs were obtained from various domains and were carefully selected through a thorough cleaning process.
+The Linnaeus-Full dataset, from which the Linnaeus-Clean dataset is derived, originally contained 1.6 billion sentence pairs.
+The model has a range of use cases, including information retrieval, semantic textual similarity, text reranking, and more.
+With a compact size of just 35 million parameters,
+the model enables lightning-fast inference while still delivering impressive performance.
+Additionally, we provide the following options:
+- jina-embedding-b-en-v1: 110 million parameters.
+- jina-embedding-l-en-v1: 800 million parameters.
+- jina-embedding-xl-en-v1: 3 billion parameters.
+- jina-embedding-xxl-en-v1: 11 billion parameters.
 ## Data & Parameters