---
language:
  - en
license: apache-2.0
tags:
  - biencoder
  - sentence-transformers
  - text-classification
  - sentence-pair-classification
  - semantic-similarity
  - semantic-search
  - retrieval
  - reranking
  - generated_from_trainer
  - dataset_size:1460771
  - loss:ArcFaceInBatchLoss
base_model: Alibaba-NLP/gte-modernbert-base
widget:
  - source_sentence: >-
      "How much would I need to narrate a ""Let's Play"" video in order to make
      money from it on YouTube?"
    sentences:
      - How much money do people make from YouTube videos with 1 million views?
      - >-
        "How much would I need to narrate a ""Let's Play"" video in order to
        make money from it on YouTube?"
      - '"Does the sentence, ""I expect to be disappointed,"" make sense?"'
  - source_sentence: '"I appreciate that.'
    sentences:
      - >-
        "How is the Mariner rewarded in ""The Rime of the Ancient Mariner"" by
        Samuel Taylor Coleridge?"
      - '"I appreciate that.'
      - I can appreciate that.
  - source_sentence: >-
      """It is very easy to defeat someone, but too hard to win some one"". What
      does the previous sentence mean?"
    sentences:
      - '"How can you use the word ""visceral"" in a sentence?"'
      - >-
        """It is very easy to defeat someone, but too hard to win some one"".
        What does the previous sentence mean?"
      - >-
        "What does ""The loudest one in the room is the weakest one in the
        room."" Mean?"
  - source_sentence: >-
      " We condemn this raid which is in our view illegal and morally and
      politically unjustifiable , " London-based NCRI official Ali Safavi told
      Reuters by telephone .
    sentences:
      - >-
        London-based NCRI official Ali Safavi told Reuters : " We condemn this
        raid , which is in our view illegal and morally and politically
        unjustifiable . "
      - >-
        The social awkwardness is complicated by the fact that Marianne is a
        white girl living with a black family .
      - art's cause, this in my opinion
  - source_sentence: >-
      "If you click ""like"" on an old post that someone made on your wall yet
      you're no longer Facebook friends, will they still receive a
      notification?"
    sentences:
      - >-
        "Is there is any two wheeler having a gear box which has the feature
        ""automatic neutral"" when the engine is off while it is in gear?"
      - >-
        "If you click ""like"" on an old post that someone made on your wall yet
        you're no longer Facebook friends, will they still receive a
        notification?"
      - >-
        "If your teenage son posted ""La commedia e finita"" on his Facebook
        wall, would you be concerned?"
datasets:
  - redis/langcache-sentencepairs-v2
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
  - cosine_accuracy@1
  - cosine_precision@1
  - cosine_recall@1
  - cosine_ndcg@10
  - cosine_mrr@1
  - cosine_map@100
  - cosine_auc_precision_cache_hit_ratio
  - cosine_auc_similarity_distribution
model-index:
  - name: Redis fine-tuned BiEncoder model for semantic caching on LangCache
    results:
      - task:
          type: custom-information-retrieval
          name: Custom Information Retrieval
        dataset:
          name: test
          type: test
        metrics:
          - type: cosine_accuracy@1
            value: 0.5880558568329718
            name: Cosine Accuracy@1
          - type: cosine_precision@1
            value: 0.5880558568329718
            name: Cosine Precision@1
          - type: cosine_recall@1
            value: 0.5707119922832199
            name: Cosine Recall@1
          - type: cosine_ndcg@10
            value: 0.771771481653434
            name: Cosine Ndcg@10
          - type: cosine_mrr@1
            value: 0.5880558568329718
            name: Cosine Mrr@1
          - type: cosine_map@100
            value: 0.7214095423928245
            name: Cosine Map@100
          - type: cosine_auc_precision_cache_hit_ratio
            value: 0.35287530778716975
            name: Cosine Auc Precision Cache Hit Ratio
          - type: cosine_auc_similarity_distribution
            value: 0.16742922746173
            name: Cosine Auc Similarity Distribution
---

# Redis fine-tuned BiEncoder model for semantic caching on LangCache

This is a sentence-transformers model finetuned from Alibaba-NLP/gte-modernbert-base on the LangCache Sentence Pairs (all) dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for sentence pair similarity.

## Model Details

### Model Description

- **Model Type:** Sentence Transformer (bi-encoder)
- **Base model:** [Alibaba-NLP/gte-modernbert-base](https://huggingface.co/Alibaba-NLP/gte-modernbert-base)
- **Maximum Sequence Length:** 100 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
- **Training Dataset:** [LangCache Sentence Pairs (all)](https://huggingface.co/datasets/redis/langcache-sentencepairs-v2)
- **Language:** en
- **License:** apache-2.0

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://www.sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 100, 'do_lower_case': False, 'architecture': 'ModernBertModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (mlp_hidden): Dense({'in_features': 768, 'out_features': 768, 'bias': True, 'activation_function': 'torch.nn.modules.activation.ReLU'})
  (mlp_out): Dense({'in_features': 768, 'out_features': 768, 'bias': True, 'activation_function': 'torch.nn.modules.linear.Identity'})
)
```
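
The stack is a ModernBERT encoder with CLS-token pooling followed by a two-layer MLP head (ReLU, then identity). Purely as an illustrative sketch (the published checkpoint already ships these modules with trained weights, so loading `redis/langcache-embed-v3` directly is the right way to use it), the same stack can be assembled from stock `sentence_transformers.models` components:

```python
import torch.nn as nn

from sentence_transformers import SentenceTransformer
from sentence_transformers.models import Dense, Pooling, Transformer

# Illustration only: this rebuilds the module stack with untrained head weights.
transformer = Transformer("Alibaba-NLP/gte-modernbert-base", max_seq_length=100)
pooling = Pooling(transformer.get_word_embedding_dimension(), pooling_mode="cls")
mlp_hidden = Dense(768, 768, activation_function=nn.ReLU())   # (mlp_hidden)
mlp_out = Dense(768, 768, activation_function=nn.Identity())  # (mlp_out)

model = SentenceTransformer(modules=[transformer, pooling, mlp_hidden, mlp_out])
print(model)  # mirrors the architecture printout above
```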

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.

```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("redis/langcache-embed-v3")
# Run inference
sentences = [
    '"If you click ""like"" on an old post that someone made on your wall yet you\'re no longer Facebook friends, will they still receive a notification?"',
    '"If you click ""like"" on an old post that someone made on your wall yet you\'re no longer Facebook friends, will they still receive a notification?"',
    '"If your teenage son posted ""La commedia e finita"" on his Facebook wall, would you be concerned?"',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 1.0000, 0.2617],
#         [1.0000, 1.0000, 0.2617],
#         [0.2617, 0.2617, 1.0000]])
```
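
For semantic caching, these similarity scores are typically compared against a threshold to decide whether a new query counts as a cache hit. The following is a minimal sketch of that pattern; the 0.9 threshold, the `lookup` helper, and the in-memory dictionary are illustrative assumptions, and a production deployment would use a vector store rather than brute-force comparison.

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("redis/langcache-embed-v3")

# Hypothetical in-memory cache mapping a previously seen query to its response.
cache = {
    "How much money do people make from YouTube videos with 1 million views?":
        "<cached LLM response>",
}

def lookup(query: str, threshold: float = 0.9):
    """Return the cached response for the closest cached query above the threshold."""
    cached_queries = list(cache)
    scores = model.similarity(
        model.encode([query]), model.encode(cached_queries)
    )[0]  # cosine similarities against every cached query
    best = int(scores.argmax())
    if float(scores[best]) >= threshold:
        return cache[cached_queries[best]]  # cache hit: reuse the stored response
    return None  # cache miss: call the LLM and store the new pair

print(lookup("How much do people earn from YouTube videos with 1M views?"))
```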

## Evaluation

### Metrics

#### Custom Information Retrieval

  • Dataset: `test`
  • Evaluated with `ir_evaluator.CustomInformationRetrievalEvaluator`
| Metric                               | Value  |
|:-------------------------------------|:-------|
| cosine_accuracy@1                    | 0.5881 |
| cosine_precision@1                   | 0.5881 |
| cosine_recall@1                      | 0.5707 |
| cosine_ndcg@10                       | 0.7718 |
| cosine_mrr@1                         | 0.5881 |
| cosine_map@100                       | 0.7214 |
| cosine_auc_precision_cache_hit_ratio | 0.3529 |
| cosine_auc_similarity_distribution   | 0.1674 |
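
`CustomInformationRetrievalEvaluator` is not part of the public sentence-transformers API. The standard ranking metrics in this table (accuracy@k, NDCG@10, MRR, MAP) can be approximated with the stock `InformationRetrievalEvaluator`; the toy queries, corpus, and relevance judgments below are placeholders, not the actual test split.

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("redis/langcache-embed-v3")

# Placeholder data; the numbers in the table above come from the test split.
queries = {"q1": "How do I make money narrating Let's Play videos on YouTube?"}
corpus = {
    "d1": "How much would I need to narrate a Let's Play video to make money from it on YouTube?",
    "d2": "Should India buy the Russian S400 air defence missile system?",
}
relevant_docs = {"q1": {"d1"}}  # query id -> set of relevant corpus ids

evaluator = InformationRetrievalEvaluator(
    queries=queries, corpus=corpus, relevant_docs=relevant_docs, name="test"
)
print(evaluator(model))  # dict with keys like test_cosine_accuracy@1, test_cosine_ndcg@10
```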

## Training Details

### Training Dataset

#### LangCache Sentence Pairs (all)

  • Dataset: LangCache Sentence Pairs (all)
  • Size: 132,354 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:

    |         | anchor | positive | negative |
    |:--------|:-------|:---------|:---------|
    | type    | string | string   | string   |
    | details | min: 4 tokens, mean: 25.33 tokens, max: 100 tokens | min: 4 tokens, mean: 24.98 tokens, max: 100 tokens | min: 5 tokens, mean: 19.06 tokens, max: 68 tokens |
  • Samples:

    | anchor | positive | negative |
    |:-------|:---------|:---------|
    | What high potential jobs are there other than computer science? | What high potential jobs are there other than computer science? | Why IT or Computer Science jobs are being over rated than other Engineering jobs? |
    | Would India ever be able to develop a missile system like S300 or S400 missile? | Would India ever be able to develop a missile system like S300 or S400 missile? | Should India buy the Russian S400 air defence missile system? |
    | water from the faucet is being drunk by a yellow dog | A yellow dog is drinking water from the faucet | Childlessness is low in Eastern European countries. |
  • Loss: `losses.ArcFaceInBatchLoss` with these parameters:

    ```json
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false
    }
    ```
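
`ArcFaceInBatchLoss` is a custom loss and is not shipped with the sentence-transformers library. As a rough sketch of the (anchor, positive, negative) training setup, the stock `MultipleNegativesRankingLoss`, which also uses in-batch negatives and shares the `scale`/`cos_sim` parameters, can stand in; the `train` split name is an assumption.

```python
from datasets import load_dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import MultipleNegativesRankingLoss
from sentence_transformers.util import cos_sim

model = SentenceTransformer("Alibaba-NLP/gte-modernbert-base")

# Assumed split name; the dataset provides anchor/positive/negative columns.
train_dataset = load_dataset("redis/langcache-sentencepairs-v2", split="train")

# Stand-in for the custom ArcFaceInBatchLoss: an in-batch-negatives loss with
# the same scale and cosine similarity function.
loss = MultipleNegativesRankingLoss(model, scale=20.0, similarity_fct=cos_sim)

trainer = SentenceTransformerTrainer(model=model, train_dataset=train_dataset, loss=loss)
trainer.train()
```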

### Evaluation Dataset

#### LangCache Sentence Pairs (all)

  • Dataset: LangCache Sentence Pairs (all)
  • Size: 132,354 evaluation samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:

    |         | anchor | positive | negative |
    |:--------|:-------|:---------|:---------|
    | type    | string | string   | string   |
    | details | min: 4 tokens, mean: 25.33 tokens, max: 100 tokens | min: 4 tokens, mean: 24.98 tokens, max: 100 tokens | min: 5 tokens, mean: 19.06 tokens, max: 68 tokens |
  • Samples:

    | anchor | positive | negative |
    |:-------|:---------|:---------|
    | What high potential jobs are there other than computer science? | What high potential jobs are there other than computer science? | Why IT or Computer Science jobs are being over rated than other Engineering jobs? |
    | Would India ever be able to develop a missile system like S300 or S400 missile? | Would India ever be able to develop a missile system like S300 or S400 missile? | Should India buy the Russian S400 air defence missile system? |
    | water from the faucet is being drunk by a yellow dog | A yellow dog is drinking water from the faucet | Childlessness is low in Eastern European countries. |
  • Loss: `losses.ArcFaceInBatchLoss` with these parameters:

    ```json
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false
    }
    ```

## Training Hyperparameters

### Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 8192
  • per_device_eval_batch_size: 8192
  • gradient_accumulation_steps: 2
  • weight_decay: 0.001
  • adam_beta2: 0.98
  • adam_epsilon: 1e-06
  • num_train_epochs: 1
  • warmup_ratio: 0.05
  • bf16: True
  • dataloader_num_workers: 4
  • dataloader_prefetch_factor: 4
  • load_best_model_at_end: True
  • optim: stable_adamw
  • ddp_find_unused_parameters: False
  • dataloader_persistent_workers: True
  • push_to_hub: True
  • hub_model_id: redis/langcache-embed-v3
  • eval_on_start: True
  • batch_sampler: no_duplicates
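
As a sketch, the non-default values above map onto `SentenceTransformerTrainingArguments` roughly as follows; `output_dir` is a placeholder path, not taken from the card.

```python
from sentence_transformers import SentenceTransformerTrainingArguments
from sentence_transformers.training_args import BatchSamplers

# Mirrors the non-default hyperparameters listed above.
args = SentenceTransformerTrainingArguments(
    output_dir="langcache-embed-v3",  # placeholder
    eval_strategy="steps",
    per_device_train_batch_size=8192,
    per_device_eval_batch_size=8192,
    gradient_accumulation_steps=2,
    weight_decay=0.001,
    adam_beta2=0.98,
    adam_epsilon=1e-6,
    num_train_epochs=1,
    warmup_ratio=0.05,
    bf16=True,
    dataloader_num_workers=4,
    dataloader_prefetch_factor=4,
    load_best_model_at_end=True,
    optim="stable_adamw",
    ddp_find_unused_parameters=False,
    dataloader_persistent_workers=True,
    push_to_hub=True,  # requires Hub credentials
    hub_model_id="redis/langcache-embed-v3",
    eval_on_start=True,
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)
```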

### All Hyperparameters

<details><summary>Click to expand</summary>

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 8192
  • per_device_eval_batch_size: 8192
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 2
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.001
  • adam_beta1: 0.9
  • adam_beta2: 0.98
  • adam_epsilon: 1e-06
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.05
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 4
  • dataloader_prefetch_factor: 4
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: stable_adamw
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: False
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: True
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: True
  • resume_from_checkpoint: None
  • hub_model_id: redis/langcache-embed-v3
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: True
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

</details>

## Training Logs

| Epoch | Step | Validation Loss | test_cosine_ndcg@10 |
|:-----:|:----:|:---------------:|:-------------------:|
| 0     | 0    | 2.9916          | 0.7718              |

## Framework Versions

  • Python: 3.12.3
  • Sentence Transformers: 5.1.0
  • Transformers: 4.56.0
  • PyTorch: 2.8.0+cu128
  • Accelerate: 1.10.1
  • Datasets: 4.0.0
  • Tokenizers: 0.22.0
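
To approximate this environment, the listed versions can be pinned at install time; the CUDA-specific PyTorch index URL below is an assumption about how the `2.8.0+cu128` build was obtained.

```bash
# Assumed source for the cu128 build of PyTorch 2.8.0.
pip install "torch==2.8.0" --index-url https://download.pytorch.org/whl/cu128
pip install "sentence-transformers==5.1.0" "transformers==4.56.0" \
    "accelerate==1.10.1" "datasets==4.0.0" "tokenizers==0.22.0"
```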

## Citation

### BibTeX

#### Sentence Transformers

```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```