metadata
language:
  - en
license: apache-2.0
tags:
  - biencoder
  - sentence-transformers
  - text-classification
  - sentence-pair-classification
  - semantic-similarity
  - semantic-search
  - retrieval
  - reranking
  - generated_from_trainer
  - dataset_size:9233417
  - loss:ArcFaceInBatchLoss
base_model: Alibaba-NLP/gte-modernbert-base
widget:
  - source_sentence: >-
      Hayley Vaughan portrayed Ripa on the ABC daytime soap opera , `` All My
      Children `` , between 1990 and 2002 .
    sentences:
      - >-
        Traxxpad is a music application for Sony 's PlayStation Portable
        published by Definitive Studios and developed by Eidos Interactive .
      - >-
        Between 1990 and 2002 , Hayley Vaughan Ripa portrayed in the ABC soap
        opera `` All My Children `` .
      - >-
        Between 1990 and 2002 , Ripa Hayley portrayed Vaughan in the ABC soap
        opera `` All My Children `` .
  - source_sentence: >-
      Olivella monilifera is a species of dwarf sea snail , small gastropod
      mollusk in the family Olivellidae , the marine olives .
    sentences:
      - >-
        Olivella monilifera is a species of the dwarf - sea snail , small
        gastropod mollusk in the Olivellidae family , the marine olives .
      - >-
        He was cut by the Browns after being signed by the Bills in 2013 . He
        was later released .
      - >-
        Olivella monilifera is a kind of sea snail , marine gastropod mollusk in
        the Olivellidae family , the dwarf olives .
  - source_sentence: >-
      Hayashi said that Mackey `` is a sort of `` of the original model for
      Tenchi .
    sentences:
      - >-
        In the summer of 2009 , Ellick shot a documentary about Malala Yousafzai
        .
      - >-
        Hayashi said that Mackey is `` sort of `` the original model for Tenchi
        .
      - >-
        Mackey said that Hayashi is `` sort of `` the original model for Tenchi
        .
  - source_sentence: >-
      Much of the film was shot on location in Los Angeles and in nearby Burbank
      and Glendale .
    sentences:
      - >-
        Much of the film was shot on location in Los Angeles and in nearby
        Burbank and Glendale .
      - >-
        Much of the film was shot on site in Burbank and Glendale and in the
        nearby Los Angeles .
      - >-
        Traxxpad is a music application for the Sony PlayStation Portable
        developed by the Definitive Studios and published by Eidos Interactive .
  - source_sentence: >-
      According to him , the earth is the carrier of his artistic work , which
      is only integrated into the creative process by minimal changes .
    sentences:
      - National players are Bold players .
      - >-
        According to him , earth is the carrier of his artistic work being
        integrated into the creative process only by minimal changes .
      - >-
        According to him , earth is the carrier of his creative work being
        integrated into the artistic process only by minimal changes .
datasets:
  - redis/langcache-sentencepairs-v2
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
  - cosine_accuracy@1
  - cosine_precision@1
  - cosine_recall@1
  - cosine_ndcg@10
  - cosine_mrr@1
  - cosine_map@100
model-index:
  - name: Redis fine-tuned BiEncoder model for semantic caching on LangCache
    results:
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: test
          type: test
        metrics:
          - type: cosine_accuracy@1
            value: 0.5861241448475948
            name: Cosine Accuracy@1
          - type: cosine_precision@1
            value: 0.5861241448475948
            name: Cosine Precision@1
          - type: cosine_recall@1
            value: 0.5679885764966713
            name: Cosine Recall@1
          - type: cosine_ndcg@10
            value: 0.7729838064849864
            name: Cosine Ndcg@10
          - type: cosine_mrr@1
            value: 0.5861241448475948
            name: Cosine Mrr@1
          - type: cosine_map@100
            value: 0.7216697804426214
            name: Cosine Map@100
Redis fine-tuned BiEncoder model for semantic caching on LangCache
This is a sentence-transformers model fine-tuned from Alibaba-NLP/gte-modernbert-base on the LangCache Sentence Pairs (all) dataset. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used to score sentence-pair similarity, for example when matching queries for semantic caching.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: Alibaba-NLP/gte-modernbert-base
- Maximum Sequence Length: 100 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
- Training Dataset: LangCache Sentence Pairs (all)
- Language: en
- License: apache-2.0
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
  (0): Transformer({'max_seq_length': 100, 'do_lower_case': False, 'architecture': 'ModernBertModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
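The Pooling module above enables only `pooling_mode_cls_token`, so the sentence embedding is the output vector of the first token rather than an average over all tokens. A minimal sketch of the difference in plain Python, using made-up toy values (the function names `cls_pooling` and `mean_pooling` are illustrative and are not library APIs):

```python
from typing import List


def cls_pooling(token_embeddings: List[List[float]]) -> List[float]:
    """Use the first token's vector as the sentence embedding."""
    if not token_embeddings:
        raise ValueError("expected at least one token embedding")
    return list(token_embeddings[0])


def mean_pooling(token_embeddings: List[List[float]]) -> List[float]:
    """Average all token vectors; shown only for contrast."""
    if not token_embeddings:
        raise ValueError("expected at least one token embedding")
    n = len(token_embeddings)
    dim = len(token_embeddings[0])
    return [sum(tok[d] for tok in token_embeddings) / n for d in range(dim)]


# Toy input: 3 tokens with 4 dimensions each (the real model uses 768).
toy_tokens = [
    [0.1, 0.2, 0.3, 0.4],  # first token
    [0.5, 0.5, 0.5, 0.5],
    [0.9, 0.8, 0.7, 0.6],
]

cls_vector = cls_pooling(toy_tokens)
mean_vector = mean_pooling(toy_tokens)
print(cls_vector)
print(mean_vector)
```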
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("redis/langcache-embed-v3")
# Run inference
sentences = [
    'According to him , the earth is the carrier of his artistic work , which is only integrated into the creative process by minimal changes .',
    'According to him , earth is the carrier of his artistic work being integrated into the creative process only by minimal changes .',
    'According to him , earth is the carrier of his creative work being integrated into the artistic process only by minimal changes .',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.9961, 0.9922],
#         [0.9961, 1.0000, 0.9961],
#         [0.9922, 0.9961, 0.9961]], dtype=torch.bfloat16)
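For semantic caching, the similarity score is typically compared against a threshold to decide whether a stored response can be reused. The sketch below illustrates that decision with toy 3-dimensional vectors instead of real embeddings. The `cache_lookup` helper, the cache layout, and the `0.9` threshold are all assumptions made for illustration. All the scores in the example output above exceed 0.99, so a usable threshold would have to be tuned on real data.

```python
import math
from typing import List, Optional, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def cache_lookup(
    query_vec: List[float],
    cache: List[Tuple[List[float], str]],
    threshold: float = 0.9,  # hypothetical value, tune on your own data
) -> Optional[str]:
    """Return the best-matching cached response, or None on a cache miss."""
    best_score, best_response = -1.0, None
    for vec, response in cache:
        score = cosine_similarity(query_vec, vec)
        if score > best_score:
            best_score, best_response = score, response
    return best_response if best_score >= threshold else None


# Toy cache of (embedding, response) pairs.
toy_cache = [
    ([1.0, 0.0, 0.0], "cached answer A"),
    ([0.0, 1.0, 0.0], "cached answer B"),
]

hit = cache_lookup([0.9, 0.1, 0.0], toy_cache)   # close to the first entry
miss = cache_lookup([0.5, 0.5, 0.7], toy_cache)  # not close to any entry
print(hit, miss)
```

With real embeddings, `query_vec` and the cached vectors would come from `model.encode(...)`.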
Evaluation
Metrics
Information Retrieval
- Dataset: test
- Evaluated with InformationRetrievalEvaluator
| Metric | Value | 
|---|---|
| cosine_accuracy@1 | 0.5861 | 
| cosine_precision@1 | 0.5861 | 
| cosine_recall@1 | 0.568 | 
| cosine_ndcg@10 | 0.773 | 
| cosine_mrr@1 | 0.5861 | 
| cosine_map@100 | 0.7217 | 
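The sketch below computes these metrics for a single query under their standard textbook definitions with binary relevance. The evaluator's exact implementation may differ in details. Under these definitions accuracy@1, precision@1, and MRR@1 always coincide, which matches the identical 0.5861 values in the table. Recall@1 can be lower whenever a query has more than one relevant document. The document IDs are made up for illustration.

```python
import math
from typing import List, Set


def hits_at_k(ranked: List[str], relevant: Set[str], k: int) -> int:
    """Number of relevant documents among the top-k results."""
    return sum(1 for doc in ranked[:k] if doc in relevant)


def accuracy_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    return 1.0 if hits_at_k(ranked, relevant, k) > 0 else 0.0


def precision_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    return hits_at_k(ranked, relevant, k) / k


def recall_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    return hits_at_k(ranked, relevant, k) / len(relevant)


def mrr_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Reciprocal rank of the first relevant document within the top k."""
    for rank, doc in enumerate(ranked[:k], start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Binary-relevance nDCG: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(i + 2)
        for i, doc in enumerate(ranked[:k])
        if doc in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(i + 2) for i in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


# One query with two relevant documents; the top result is relevant.
ranked_docs = ["d1", "d3", "d2"]
relevant_docs = {"d1", "d2"}

print(accuracy_at_k(ranked_docs, relevant_docs, 1))
print(precision_at_k(ranked_docs, relevant_docs, 1))
print(recall_at_k(ranked_docs, relevant_docs, 1))
print(mrr_at_k(ranked_docs, relevant_docs, 1))
print(ndcg_at_k(ranked_docs, relevant_docs, 10))
```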
Training Details
Training Dataset
LangCache Sentence Pairs (all)
- Dataset: LangCache Sentence Pairs (all)
- Size: 126,938 training samples
- Columns: anchor, positive, and negative
- Approximate statistics based on the first 1000 samples:

| | anchor | positive | negative |
|---|---|---|---|
| type | string | string | string |
| details | min: 8 tokens, mean: 27.27 tokens, max: 49 tokens | min: 8 tokens, mean: 27.27 tokens, max: 48 tokens | min: 7 tokens, mean: 26.54 tokens, max: 61 tokens |

- Samples:

| anchor | positive | negative |
|---|---|---|
| The newer Punts are still very much in existence today and race in the same fleets as the older boats . | The newer punts are still very much in existence today and run in the same fleets as the older boats . | how can I get financial freedom as soon as possible? |
| The newer punts are still very much in existence today and run in the same fleets as the older boats . | The newer Punts are still very much in existence today and race in the same fleets as the older boats . | The older Punts are still very much in existence today and race in the same fleets as the newer boats . |
| Turner Valley , was at the Turner Valley Bar N Ranch Airport , southwest of the Turner Valley Bar N Ranch , Alberta , Canada . | Turner Valley , , was located at Turner Valley Bar N Ranch Airport , southwest of Turner Valley Bar N Ranch , Alberta , Canada . | Turner Valley Bar N Ranch Airport , , was located at Turner Valley Bar N Ranch , southwest of Turner Valley , Alberta , Canada . |

- Loss: losses.ArcFaceInBatchLoss with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim", "gather_across_devices": false }
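The card names the loss but does not define it. As a rough illustration only, the sketch below shows one common way to combine an ArcFace-style additive angular margin with in-batch negatives. Each anchor is scored against every positive in the batch, the matching pair's angle is penalised by a margin, and scaled cross-entropy is applied. The `margin` value and the function name are assumptions. Only `scale = 20.0` and cosine similarity come from the parameters listed above, and the actual implementation may differ.

```python
import math
from typing import List


def _cos(a: List[float], b: List[float]) -> float:
    """Cosine similarity, clamped to [-1, 1] so acos stays defined."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return max(-1.0, min(1.0, dot / (na * nb)))


def arcface_in_batch_loss(
    anchors: List[List[float]],
    positives: List[List[float]],
    scale: float = 20.0,   # from the listed parameters
    margin: float = 0.1,   # hypothetical; not given in the card
) -> float:
    """Mean cross-entropy over a batch, with an angular margin on true pairs."""
    n = len(anchors)
    total = 0.0
    for i in range(n):
        logits = []
        for j in range(n):
            c = _cos(anchors[i], positives[j])
            if i == j:
                # Make the true pair harder: cos(theta + margin) <= cos(theta)
                # as long as theta + margin stays within [0, pi].
                c = math.cos(math.acos(c) + margin)
            logits.append(scale * c)
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_sum - logits[i]
    return total / n


# Toy batch of two (anchor, positive) pairs in 2 dimensions.
toy_anchors = [[1.0, 0.0], [0.0, 1.0]]
toy_positives = [[0.9, 0.1], [0.1, 0.9]]

loss_plain = arcface_in_batch_loss(toy_anchors, toy_positives, margin=0.0)
loss_margin = arcface_in_batch_loss(toy_anchors, toy_positives, margin=0.1)
print(loss_plain, loss_margin)
```

With `margin=0.0` this reduces to a plain scaled in-batch softmax loss. A positive margin increases the loss for the same batch, which pushes matching pairs closer together during training.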
Evaluation Dataset
LangCache Sentence Pairs (all)
- Dataset: LangCache Sentence Pairs (all)
- Size: 126,938 evaluation samples
- Columns: anchor, positive, and negative
- Approximate statistics based on the first 1000 samples:

| | anchor | positive | negative |
|---|---|---|---|
| type | string | string | string |
| details | min: 8 tokens, mean: 27.27 tokens, max: 49 tokens | min: 8 tokens, mean: 27.27 tokens, max: 48 tokens | min: 7 tokens, mean: 26.54 tokens, max: 61 tokens |

- Samples:

| anchor | positive | negative |
|---|---|---|
| The newer Punts are still very much in existence today and race in the same fleets as the older boats . | The newer punts are still very much in existence today and run in the same fleets as the older boats . | how can I get financial freedom as soon as possible? |
| The newer punts are still very much in existence today and run in the same fleets as the older boats . | The newer Punts are still very much in existence today and race in the same fleets as the older boats . | The older Punts are still very much in existence today and race in the same fleets as the newer boats . |
| Turner Valley , was at the Turner Valley Bar N Ranch Airport , southwest of the Turner Valley Bar N Ranch , Alberta , Canada . | Turner Valley , , was located at Turner Valley Bar N Ranch Airport , southwest of Turner Valley Bar N Ranch , Alberta , Canada . | Turner Valley Bar N Ranch Airport , , was located at Turner Valley Bar N Ranch , southwest of Turner Valley , Alberta , Canada . |

- Loss: losses.ArcFaceInBatchLoss with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim", "gather_across_devices": false }
Training Logs
| Epoch | Step | test_cosine_ndcg@10 | 
|---|---|---|
| -1 | -1 | 0.7730 | 
Framework Versions
- Python: 3.12.3
- Sentence Transformers: 5.1.0
- Transformers: 4.56.0
- PyTorch: 2.8.0+cu128
- Accelerate: 1.10.1
- Datasets: 4.0.0
- Tokenizers: 0.22.0
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

