---
language:
- en
license: apache-2.0
tags:
- biencoder
- sentence-transformers
- text-classification
- sentence-pair-classification
- semantic-similarity
- semantic-search
- retrieval
- reranking
- generated_from_trainer
- dataset_size:9233417
- loss:ArcFaceInBatchLoss
base_model: Alibaba-NLP/gte-modernbert-base
widget:
- source_sentence: Hayley Vaughan portrayed Ripa on the ABC daytime soap opera , ``
    All My Children `` , between 1990 and 2002 .
  sentences:
  - Traxxpad is a music application for Sony 's PlayStation Portable published by
    Definitive Studios and developed by Eidos Interactive .
  - Between 1990 and 2002 , Hayley Vaughan Ripa portrayed in the ABC soap opera ``
    All My Children `` .
  - Between 1990 and 2002 , Ripa Hayley portrayed Vaughan in the ABC soap opera ``
    All My Children `` .
- source_sentence: Olivella monilifera is a species of dwarf sea snail , small gastropod
    mollusk in the family Olivellidae , the marine olives .
  sentences:
  - Olivella monilifera is a species of the dwarf - sea snail , small gastropod mollusk
    in the Olivellidae family , the marine olives .
  - He was cut by the Browns after being signed by the Bills in 2013 . He was later
    released .
  - Olivella monilifera is a kind of sea snail , marine gastropod mollusk in the Olivellidae
    family , the dwarf olives .
- source_sentence: Hayashi said that Mackey `` is a sort of `` of the original model
    for Tenchi .
  sentences:
  - In the summer of 2009 , Ellick shot a documentary about Malala Yousafzai .
  - Hayashi said that Mackey is `` sort of `` the original model for Tenchi .
  - Mackey said that Hayashi is `` sort of `` the original model for Tenchi .
- source_sentence: Much of the film was shot on location in Los Angeles and in nearby
    Burbank and Glendale .
  sentences:
  - Much of the film was shot on location in Los Angeles and in nearby Burbank and
    Glendale .
  - Much of the film was shot on site in Burbank and Glendale and in the nearby Los
    Angeles .
  - Traxxpad is a music application for the Sony PlayStation Portable developed by
    the Definitive Studios and published by Eidos Interactive .
- source_sentence: According to him , the earth is the carrier of his artistic work
    , which is only integrated into the creative process by minimal changes .
  sentences:
  - National players are Bold players .
  - According to him , earth is the carrier of his artistic work being integrated
    into the creative process only by minimal changes .
  - According to him , earth is the carrier of his creative work being integrated
    into the artistic process only by minimal changes .
datasets:
- redis/langcache-sentencepairs-v2
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_precision@1
- cosine_recall@1
- cosine_ndcg@10
- cosine_mrr@1
- cosine_map@100
model-index:
- name: Redis fine-tuned BiEncoder model for semantic caching on LangCache
  results:
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: test
      type: test
    metrics:
    - type: cosine_accuracy@1
      value: 0.5861241448475948
      name: Cosine Accuracy@1
    - type: cosine_precision@1
      value: 0.5861241448475948
      name: Cosine Precision@1
    - type: cosine_recall@1
      value: 0.5679885764966713
      name: Cosine Recall@1
    - type: cosine_ndcg@10
      value: 0.7729838064849864
      name: Cosine Ndcg@10
    - type: cosine_mrr@1
      value: 0.5861241448475948
      name: Cosine Mrr@1
    - type: cosine_map@100
      value: 0.7216697804426214
      name: Cosine Map@100
---
					
						
# Redis fine-tuned BiEncoder model for semantic caching on LangCache

This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned from [Alibaba-NLP/gte-modernbert-base](https://huggingface.co/Alibaba-NLP/gte-modernbert-base) on the [LangCache Sentence Pairs (all)](https://huggingface.co/datasets/redis/langcache-sentencepairs-v2) dataset. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used to score sentence-pair similarity, for example when deciding whether a new query matches an entry in a semantic cache.
					
						
## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [Alibaba-NLP/gte-modernbert-base](https://huggingface.co/Alibaba-NLP/gte-modernbert-base) <!-- at revision e7f32e3c00f91d699e8c43b53106206bcc72bb22 -->
- **Maximum Sequence Length:** 100 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
- **Training Dataset:**
    - [LangCache Sentence Pairs (all)](https://huggingface.co/datasets/redis/langcache-sentencepairs-v2)
- **Language:** en
- **License:** apache-2.0
					
						
### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
					
						
### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 100, 'do_lower_case': False, 'architecture': 'ModernBertModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```
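The `Pooling` module above enables only `pooling_mode_cls_token`, so the sentence embedding is the final hidden state of the first token of each sequence. A minimal NumPy sketch of that step follows; the random array stands in for the transformer output, and the function name `cls_pool` is an illustrative assumption rather than part of the library.

```python
import numpy as np

def cls_pool(token_embeddings: np.ndarray) -> np.ndarray:
    """Return the first-token vector of each sequence.

    token_embeddings: (batch, seq_len, hidden) array of per-token hidden states.
    """
    return token_embeddings[:, 0, :]

# Stand-in for transformer output: 2 sentences, 5 tokens each, 768 dimensions.
rng = np.random.default_rng(0)
hidden_states = rng.normal(size=(2, 5, 768))

sentence_embeddings = cls_pool(hidden_states)
print(sentence_embeddings.shape)  # (2, 768)
```

Unlike mean pooling, this discards every token vector except the first, so the quality of the embedding rests on how well the model has learned to summarize the sentence in that position.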
					
						
## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
					
						
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("redis/langcache-embed-v3")
# Run inference
sentences = [
    'According to him , the earth is the carrier of his artistic work , which is only integrated into the creative process by minimal changes .',
    'According to him , earth is the carrier of his artistic work being integrated into the creative process only by minimal changes .',
    'According to him , earth is the carrier of his creative work being integrated into the artistic process only by minimal changes .',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.9961, 0.9922],
#         [0.9961, 1.0000, 0.9961],
#         [0.9922, 0.9961, 0.9961]], dtype=torch.bfloat16)
```
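Since the model is described as being for semantic caching, the sketch below shows one way such embeddings might drive a cache lookup: compare the query embedding against the cached embeddings by cosine similarity and return the stored answer only when the best score clears a threshold. The function names, the threshold of 0.9, and the toy 3-dimensional vectors are all assumptions for illustration; in practice the vectors would come from `model.encode(...)` and the threshold would need to be tuned on your own data.

```python
import numpy as np

def cosine_similarity(query: np.ndarray, matrix: np.ndarray) -> np.ndarray:
    """Cosine similarity between one vector and each row of a matrix."""
    query = query / np.linalg.norm(query)
    matrix = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    return matrix @ query

def cache_lookup(query_emb, cached_embs, cached_answers, threshold=0.9):
    """Return the cached answer of the most similar entry, or None on a miss."""
    scores = cosine_similarity(query_emb, cached_embs)
    best = int(np.argmax(scores))
    if scores[best] >= threshold:
        return cached_answers[best]
    return None

# Toy 3-dimensional vectors standing in for model.encode(...) output.
cached_embs = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
cached_answers = ["answer A", "answer B"]

hit = cache_lookup(np.array([0.99, 0.05, 0.0]), cached_embs, cached_answers)
miss = cache_lookup(np.array([0.6, 0.6, 0.5]), cached_embs, cached_answers)
print(hit, miss)  # answer A None
```

Note that in the inference output above, all three sentences score above 0.99 against each other, so a fixed threshold may not separate close paraphrases from near-miss rewrites without calibration.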
					
						
					
						
## Evaluation

### Metrics

#### Information Retrieval

* Dataset: `test`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric             | Value     |
|:-------------------|:----------|
| cosine_accuracy@1  | 0.5861    |
| cosine_precision@1 | 0.5861    |
| cosine_recall@1    | 0.568     |
| **cosine_ndcg@10** | **0.773** |
| cosine_mrr@1       | 0.5861    |
| cosine_map@100     | 0.7217    |
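To make the metrics in the table concrete, here is a minimal sketch of accuracy@k and binary-relevance nDCG@k for a single query. The function names and the toy ranking are illustrative assumptions; the `InformationRetrievalEvaluator` implementation may differ in details such as tie handling and averaging.

```python
import math

def accuracy_at_k(ranked_ids, relevant_ids, k):
    """1.0 if any of the top-k results is relevant, else 0.0."""
    return 1.0 if any(doc in relevant_ids for doc in ranked_ids[:k]) else 0.0

def ndcg_at_k(ranked_ids, relevant_ids, k):
    """Binary-relevance nDCG: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, doc in enumerate(ranked_ids[:k])
        if doc in relevant_ids
    )
    ideal_hits = min(len(relevant_ids), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0

# One query: the only relevant document is ranked second.
ranked = ["d7", "d2", "d9"]
relevant = {"d2"}

print(accuracy_at_k(ranked, relevant, 1))  # 0.0
print(ndcg_at_k(ranked, relevant, 10))     # 1 / log2(3), about 0.631
```

At k=1 each query contributes either 0 or 1 to accuracy, precision, and MRR alike, which is why those three rows in the table share the value 0.5861, while nDCG@10 also credits relevant results that appear lower in the ranking.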
					
						
					
						
## Training Details

### Training Dataset

#### LangCache Sentence Pairs (all)

* Dataset: [LangCache Sentence Pairs (all)](https://huggingface.co/datasets/redis/langcache-sentencepairs-v2)
* Size: 126,938 training samples
* Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
* Approximate statistics based on the first 1000 samples:
  |         | anchor | positive | negative |
  |:--------|:-------|:---------|:---------|
  | type    | string | string | string |
  | details | <ul><li>min: 8 tokens</li><li>mean: 27.27 tokens</li><li>max: 49 tokens</li></ul> | <ul><li>min: 8 tokens</li><li>mean: 27.27 tokens</li><li>max: 48 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 26.54 tokens</li><li>max: 61 tokens</li></ul> |
* Samples:
  | anchor | positive | negative |
  |:-------|:---------|:---------|
  | <code>The newer Punts are still very much in existence today and race in the same fleets as the older boats .</code> | <code>The newer punts are still very much in existence today and run in the same fleets as the older boats .</code> | <code>how can I get financial freedom as soon as possible?</code> |
  | <code>The newer punts are still very much in existence today and run in the same fleets as the older boats .</code> | <code>The newer Punts are still very much in existence today and race in the same fleets as the older boats .</code> | <code>The older Punts are still very much in existence today and race in the same fleets as the newer boats .</code> |
  | <code>Turner Valley , was at the Turner Valley Bar N Ranch Airport , southwest of the Turner Valley Bar N Ranch , Alberta , Canada .</code> | <code>Turner Valley , , was located at Turner Valley Bar N Ranch Airport , southwest of Turner Valley Bar N Ranch , Alberta , Canada .</code> | <code>Turner Valley Bar N Ranch Airport , , was located at Turner Valley Bar N Ranch , southwest of Turner Valley , Alberta , Canada .</code> |
* Loss: <code>losses.ArcFaceInBatchLoss</code> with these parameters:
  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim",
      "gather_across_devices": false
  }
  ```
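The card does not describe `ArcFaceInBatchLoss` beyond the parameters listed above. One plausible reading, offered only as an assumption, is an in-batch softmax loss over scaled cosine similarities with an ArcFace-style additive angular margin applied to the true pair. The sketch below illustrates that idea in NumPy; the function name, the margin value of 0.1, and the choice to ignore the explicit `negative` column are hypothetical and not taken from the training code.

```python
import numpy as np

def arcface_in_batch_loss(anchors, positives, scale=20.0, margin=0.1):
    """Mean cross-entropy over in-batch candidates with an angular margin.

    anchors, positives: (batch, dim) arrays; row i of each forms a positive pair,
    and every other row of `positives` acts as an in-batch negative.
    `margin` is a hypothetical value; the card only lists `scale`.
    """
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    cos = np.clip(a @ p.T, -1.0, 1.0)  # (batch, batch) cosine matrix

    # Replace cos(theta) with cos(theta + margin) on the diagonal (true pairs).
    idx = np.arange(cos.shape[0])
    logits = cos.copy()
    logits[idx, idx] = np.cos(np.arccos(cos[idx, idx]) + margin)
    logits *= scale

    # Numerically stable log-softmax, then pick the diagonal targets.
    logits -= logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-log_probs[idx, idx].mean())

rng = np.random.default_rng(0)
anchors = rng.normal(size=(4, 8))
positives = anchors + 0.05 * rng.normal(size=(4, 8))  # near-duplicates of anchors

print(arcface_in_batch_loss(anchors, positives))
```

Under this reading, the margin lowers the logit of the correct pair before the softmax, so the model has to separate positives from in-batch negatives by more than it would with a plain scaled-cosine loss.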
					
						
### Evaluation Dataset

#### LangCache Sentence Pairs (all)

* Dataset: [LangCache Sentence Pairs (all)](https://huggingface.co/datasets/redis/langcache-sentencepairs-v2)
* Size: 126,938 evaluation samples
* Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
* Approximate statistics based on the first 1000 samples:
  |         | anchor | positive | negative |
  |:--------|:-------|:---------|:---------|
  | type    | string | string | string |
  | details | <ul><li>min: 8 tokens</li><li>mean: 27.27 tokens</li><li>max: 49 tokens</li></ul> | <ul><li>min: 8 tokens</li><li>mean: 27.27 tokens</li><li>max: 48 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 26.54 tokens</li><li>max: 61 tokens</li></ul> |
* Samples:
  | anchor | positive | negative |
  |:-------|:---------|:---------|
  | <code>The newer Punts are still very much in existence today and race in the same fleets as the older boats .</code> | <code>The newer punts are still very much in existence today and run in the same fleets as the older boats .</code> | <code>how can I get financial freedom as soon as possible?</code> |
  | <code>The newer punts are still very much in existence today and run in the same fleets as the older boats .</code> | <code>The newer Punts are still very much in existence today and race in the same fleets as the older boats .</code> | <code>The older Punts are still very much in existence today and race in the same fleets as the newer boats .</code> |
  | <code>Turner Valley , was at the Turner Valley Bar N Ranch Airport , southwest of the Turner Valley Bar N Ranch , Alberta , Canada .</code> | <code>Turner Valley , , was located at Turner Valley Bar N Ranch Airport , southwest of Turner Valley Bar N Ranch , Alberta , Canada .</code> | <code>Turner Valley Bar N Ranch Airport , , was located at Turner Valley Bar N Ranch , southwest of Turner Valley , Alberta , Canada .</code> |
* Loss: <code>losses.ArcFaceInBatchLoss</code> with these parameters:
  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim",
      "gather_across_devices": false
  }
  ```
					
						
### Training Logs
| Epoch | Step | test_cosine_ndcg@10 |
|:-----:|:----:|:-------------------:|
| -1    | -1   | 0.7730              |

### Framework Versions
- Python: 3.12.3
- Sentence Transformers: 5.1.0
- Transformers: 4.56.0
- PyTorch: 2.8.0+cu128
- Accelerate: 1.10.1
- Datasets: 4.0.0
- Tokenizers: 0.22.0
					
						
## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```
					
						