dangvantuan committed
Commit 40ac503 (verified) · Parent: b51d203

Update README.md

Files changed (1): README.md (+4 −4)
README.md CHANGED
@@ -9,7 +9,7 @@ tags:
 - sentence-embedding
 - mteb
 model-index:
-- name: bilingual-embedding-large-8k
+- name: bilingual-document-embedding
   results:
   - task:
       type: Clustering
@@ -1527,9 +1527,9 @@ metrics:
 - spearmanr
 ---
 
-# [bilingual-embedding-large](https://huggingface.co/Lajavaness/bilingual-embedding-large)
+# [bilingual-document-embedding](https://huggingface.co/Lajavaness/bilingual-document-embedding)
 
-bilingual-embedding is the Embedding Model for bilingual language: french and english. This model is a specialized sentence-embedding trained specifically for the bilingual language, leveraging the robust capabilities of [BGE M3](https://huggingface.co/BAAI/bge-m3), a pre-trained language model larged on the [BGE M3](https://huggingface.co/BAAI/bge-m3) architecture. The model utilizes xlm-roberta to encode english-french sentences into a 1024-dimensional vector space, facilitating a wide range of applications from semantic search to text clustering. The embeddings capture the nuanced meanings of english-french sentences, reflecting both the lexical and contextual layers of the language.
+bilingual-document-embedding is an embedding model for documents in French and English, with a context length of up to 8096 tokens. It is a specialized sentence-embedding model trained for this bilingual setting, leveraging the robust capabilities of [BGE M3](https://huggingface.co/BAAI/bge-m3), a pre-trained language model built on the XLM-RoBERTa architecture. The model uses XLM-RoBERTa to encode English-French sentences into a 1024-dimensional vector space, supporting a wide range of applications from semantic search to text clustering. The embeddings capture the nuanced meanings of English-French sentences, reflecting both their lexical and contextual layers.
 
 
 ## Full Model Architecture
@@ -1568,7 +1568,7 @@ from sentence_transformers import SentenceTransformer
 
 sentences = ["Paris est une capitale de la France", "Paris is a capital of France"]
 
-model = SentenceTransformer('Lajavaness/bilingual-embedding-large-8k', trust_remote_code=True)
+model = SentenceTransformer('Lajavaness/bilingual-document-embedding', trust_remote_code=True)
 print(embeddings)
 
 ```
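The usage snippet in the diff prints an `embeddings` variable that is never assigned. Below is a minimal, self-contained sketch of the intended usage, assuming the standard `sentence-transformers` API; the `model.encode(...)` call, the cosine-similarity check, and the expected 1024-dimensional output are inferred from the model description above rather than taken from the diff itself.

```python
from sentence_transformers import SentenceTransformer, util

# French/English sentence pair from the README example
sentences = ["Paris est une capitale de la France", "Paris is a capital of France"]

# trust_remote_code=True matches the README snippet; the model ships custom modeling code
model = SentenceTransformer("Lajavaness/bilingual-document-embedding", trust_remote_code=True)

# Encode both sentences into dense vectors (1024-dimensional per the model description)
embeddings = model.encode(sentences)
print(embeddings.shape)  # expected: (2, 1024)

# The two sentences are translations of each other, so their cosine similarity should be high
print(util.cos_sim(embeddings[0], embeddings[1]))
print(embeddings)
```

Because the model embeds French and English text into one shared vector space, cross-lingual paraphrases land close together, which is what the description above highlights for semantic search and text clustering.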