splade-co-condenser-marco trained on MS MARCO hard negatives with distillation

This is a SPLADE Sparse Encoder model finetuned from Luyu/co-condenser-marco on the msmarco dataset using the sentence-transformers library. It maps sentences & paragraphs to a 30522-dimensional sparse vector space and can be used for semantic search and sparse retrieval.

Model Details

Model Description

  • Model Type: SPLADE Sparse Encoder
  • Base model: Luyu/co-condenser-marco
  • Maximum Sequence Length: 256 tokens
  • Output Dimensionality: 30522 dimensions
  • Similarity Function: Dot Product
  • Training Dataset: msmarco
  • Language: en
  • License: apache-2.0

Model Sources

  • Documentation: Sentence Transformers Documentation
  • Repository: Sentence Transformers on GitHub
  • Hugging Face: Sentence Transformers on Hugging Face

Full Model Architecture

SparseEncoder(
  (0): MLMTransformer({'max_seq_length': 256, 'do_lower_case': False}) with MLMTransformer model: BertForMaskedLM 
  (1): SpladePooling({'pooling_strategy': 'max', 'activation_function': 'relu', 'word_embedding_dimension': 30522})
)
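
SpladePooling turns the per-token MLM logits into a single vocabulary-sized vector by max-pooling a log-saturated ReLU activation over the sequence, following the standard SPLADE formulation. A minimal sketch of that computation (the function name and mask handling here are illustrative, not the library's internals):

import torch

def splade_pool(logits: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    # logits: (batch, seq_len, 30522) output of BertForMaskedLM
    # SPLADE weight for vocab entry j: max over tokens of log(1 + ReLU(logit_j))
    activated = torch.log1p(torch.relu(logits))
    activated = activated * attention_mask.unsqueeze(-1)  # zero out padding positions
    return activated.max(dim=1).values  # (batch, 30522) sparse-ish term weights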

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SparseEncoder

# Download from the 🤗 Hub
model = SparseEncoder("arthurbresnu/co-condenser-marco-msmarco-hard-negatives")
# Run inference
queries = [
    "fastest super cars in the world",
]
documents = [
    'The McLaren F1 is amongst the fastest cars in the McLaren series and also the fastest car in the world. The McLaren F1 can clock a maximum speed of 240 miles per hour, or an equivalent of 386 km per hour.',
    'You heard about fastest cars, bikes and plans but today we have world fastest bird collection. In our collection we have top 10 fastest birds of the world. Bird’s flight speed is fundamentally changeable; a hunting bird speed will increase while diving-to-catch prey as compared to its gliding speeds. Here we have the top 10 fastest birds with their flight speed. 10. Teal 109 km/h (68mph) This bird can fly 109 km/ h (68mph); they are 53 to 59cm long. This bird always lives in group. 09.',
    'Where is Langley, BC? Location of Langley on a map. Langley is a city found in British Columbia, Canada. It is located 49.08 latitude and -122.59 longitude and it is situated at elevation 78 meters above sea level. Langley has a population of 93,726 making it the 13th biggest city in British Columbia.',
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# [1, 30522] [3, 30522]

# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[35.7080, 24.5349,  3.8619]])
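
Because the embeddings live in the vocabulary space, you can read off which tokens a text activates. A small follow-up sketch, assuming the encode output is a torch tensor (sparse or dense) and that the model exposes its tokenizer as model.tokenizer:

import torch

# Which vocabulary tokens carry the most weight for the first query?
weights = query_embeddings[0]
if weights.is_sparse:
    weights = weights.to_dense()
top = torch.topk(weights, k=10)
tokens = model.tokenizer.convert_ids_to_tokens(top.indices.tolist())
print([(tok, round(val, 2)) for tok, val in zip(tokens, top.values.tolist())])
# expect query terms such as 'fastest' and 'cars', plus related expansion terms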

Evaluation

Metrics

Sparse Information Retrieval

  • Datasets: NanoMSMARCO, NanoNFCorpus, NanoNQ, NanoClimateFEVER, NanoDBPedia, NanoFEVER, NanoFiQA2018, NanoHotpotQA, NanoQuoraRetrieval, NanoSCIDOCS, NanoArguAna, NanoSciFact and NanoTouche2020
  • Evaluated with SparseInformationRetrievalEvaluator
Metric NanoMSMARCO NanoNFCorpus NanoNQ NanoClimateFEVER NanoDBPedia NanoFEVER NanoFiQA2018 NanoHotpotQA NanoQuoraRetrieval NanoSCIDOCS NanoArguAna NanoSciFact NanoTouche2020
dot_accuracy@1 0.4 0.44 0.52 0.32 0.74 0.8 0.42 0.88 0.9 0.42 0.14 0.54 0.7347
dot_accuracy@3 0.62 0.6 0.74 0.52 0.86 0.92 0.52 0.94 1.0 0.56 0.42 0.66 0.9184
dot_accuracy@5 0.68 0.64 0.78 0.54 0.9 0.94 0.58 0.96 1.0 0.74 0.58 0.74 0.9592
dot_accuracy@10 0.84 0.68 0.84 0.62 0.94 0.96 0.68 0.96 1.0 0.78 0.7 0.82 0.9592
dot_precision@1 0.4 0.44 0.52 0.32 0.74 0.8 0.42 0.88 0.9 0.42 0.14 0.54 0.7347
dot_precision@3 0.2067 0.34 0.2533 0.2 0.6133 0.32 0.2133 0.5133 0.3867 0.28 0.14 0.24 0.6259
dot_precision@5 0.136 0.316 0.16 0.14 0.588 0.204 0.168 0.34 0.248 0.252 0.116 0.168 0.5673
dot_precision@10 0.084 0.27 0.09 0.082 0.508 0.106 0.11 0.172 0.13 0.154 0.07 0.092 0.4612
dot_recall@1 0.4 0.0631 0.48 0.165 0.0764 0.7567 0.2361 0.44 0.8073 0.0877 0.14 0.52 0.0527
dot_recall@3 0.62 0.099 0.69 0.26 0.18 0.8867 0.3181 0.77 0.938 0.1727 0.42 0.65 0.1338
dot_recall@5 0.68 0.1169 0.73 0.2873 0.2374 0.92 0.3795 0.85 0.9653 0.2577 0.58 0.74 0.1941
dot_recall@10 0.84 0.1468 0.8 0.3223 0.3398 0.95 0.4829 0.86 0.98 0.3157 0.7 0.81 0.3072
dot_ndcg@10 0.6077 0.3452 0.6595 0.3037 0.6228 0.8719 0.4125 0.826 0.9411 0.3183 0.4095 0.669 0.5361
dot_mrr@10 0.5353 0.5258 0.6369 0.4207 0.8137 0.8608 0.4934 0.9117 0.9467 0.5297 0.3175 0.624 0.8255
dot_map@100 0.5419 0.1699 0.6105 0.2558 0.483 0.8427 0.3564 0.7724 0.9183 0.2456 0.3293 0.6279 0.3996
query_active_dims 54.12 51.7 53.34 135.3 52.26 79.14 54.04 68.36 57.5 73.3 281.16 109.4 56.6122
query_sparsity_ratio 0.9982 0.9983 0.9983 0.9956 0.9983 0.9974 0.9982 0.9978 0.9981 0.9976 0.9908 0.9964 0.9981
corpus_active_dims 187.6754 336.3248 223.5909 270.1291 219.799 287.1962 213.8799 223.8652 58.3902 293.6072 268.115 348.518 224.871
corpus_sparsity_ratio 0.9939 0.989 0.9927 0.9911 0.9928 0.9906 0.993 0.9927 0.9981 0.9904 0.9912 0.9886 0.9926
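
The two sparsity rows are derived directly from the active-dims rows: sparsity_ratio = 1 - active_dims / 30522. A quick check against the NanoMSMARCO column:

vocab_size = 30522
query_active_dims = 54.12  # NanoMSMARCO column above
print(1 - query_active_dims / vocab_size)
# ≈ 0.9982, matching the reported query_sparsity_ratio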

Sparse Nano BEIR

  • Dataset: NanoBEIR_mean
  • Evaluated with SparseNanoBEIREvaluator with these parameters:
    {
        "dataset_names": [
            "msmarco",
            "nfcorpus",
            "nq"
        ]
    }
    
Metric Value
dot_accuracy@1 0.4533
dot_accuracy@3 0.6533
dot_accuracy@5 0.7
dot_accuracy@10 0.7867
dot_precision@1 0.4533
dot_precision@3 0.2667
dot_precision@5 0.204
dot_precision@10 0.148
dot_recall@1 0.3144
dot_recall@3 0.4697
dot_recall@5 0.509
dot_recall@10 0.5956
dot_ndcg@10 0.5375
dot_mrr@10 0.566
dot_map@100 0.4408
query_active_dims 53.0533
query_sparsity_ratio 0.9983
corpus_active_dims 235.2386
corpus_sparsity_ratio 0.9923
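
These numbers can be reproduced by running the evaluator against the published model. A hedged sketch, assuming the SparseNanoBEIREvaluator import path below (check it against your installed sentence-transformers version):

from sentence_transformers import SparseEncoder
from sentence_transformers.sparse_encoder.evaluation import SparseNanoBEIREvaluator

model = SparseEncoder("arthurbresnu/co-condenser-marco-msmarco-hard-negatives")
evaluator = SparseNanoBEIREvaluator(dataset_names=["msmarco", "nfcorpus", "nq"])
results = evaluator(model)
print(results[evaluator.primary_metric])  # mean dot_ndcg@10 over the three datasets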

Sparse Nano BEIR

  • Dataset: NanoBEIR_mean
  • Evaluated with SparseNanoBEIREvaluator with these parameters:
    {
        "dataset_names": [
            "climatefever",
            "dbpedia",
            "fever",
            "fiqa2018",
            "hotpotqa",
            "msmarco",
            "nfcorpus",
            "nq",
            "quoraretrieval",
            "scidocs",
            "arguana",
            "scifact",
            "touche2020"
        ]
    }
    
Metric Value
dot_accuracy@1 0.5581
dot_accuracy@3 0.7137
dot_accuracy@5 0.7722
dot_accuracy@10 0.8292
dot_precision@1 0.5581
dot_precision@3 0.3333
dot_precision@5 0.2618
dot_precision@10 0.1792
dot_recall@1 0.325
dot_recall@3 0.4722
dot_recall@5 0.5337
dot_recall@10 0.6042
dot_ndcg@10 0.5787
dot_mrr@10 0.6494
dot_map@100 0.5041
query_active_dims 86.6795
query_sparsity_ratio 0.9972
corpus_active_dims 230.5676
corpus_sparsity_ratio 0.9924

Training Details

Training Dataset

msmarco

  • Dataset: msmarco at 9e329ed
  • Size: 90,000 training samples
  • Columns: score, query, positive, and negative
  • Approximate statistics based on the first 1000 samples:
    score (float): min: -3.66, mean: 12.97, max: 22.48
    query (string): min: 4 tokens, mean: 8.89 tokens, max: 24 tokens
    positive (string): min: 16 tokens, mean: 80.61 tokens, max: 256 tokens
    negative (string): min: 18 tokens, mean: 78.92 tokens, max: 250 tokens
  • Samples:
    score: 2.1688317457834883
    query: what is ast test used for
    positive: The AST test is commonly used to check for liver diseases. It is usually measured together with alanine aminotransferase (ALT). The AST to ALT ratio can help your doctor diagnose liver disease. Symptoms of liver disease that may cause your doctor to order an AST test include: 1 fatigue. 2 weakness.3 loss of appetite.t is usually measured together with alanine aminotransferase (ALT). The AST to ALT ratio can help your doctor diagnose liver disease. Symptoms of liver disease that may cause your doctor to order an AST test include: 1 fatigue. 2 weakness. 3 loss of appetite.
    negative: An aspartate aminotransferase (AST) test measures the amount of this enzyme in the blood. AST is normally found in red blood cells, liver, heart, muscle tissue, pancreas, and kidneys. AST formerly was called serum glutamic oxaloacetic transaminase (SGOT).he amount of AST in the blood is directly related to the extent of the tissue damage. After severe damage, AST levels rise in 6 to 10 hours and remain high for about 4 days. The AST test may be done at the same time as a test for alanine aminotransferase, or ALT.

    score: 12.405409197012585
    query: what does the suspensory ligament do when the cillary muscles contract
    positive: Suspensory Ligaments of the Ciliary Body: The suspensory ligaments of the ciliary body are ligaments that attach the ciliary body to the lens of the eye. Suspensory ligaments enable the ciliary body to change the shape of the lens as needed to focus light reflected from objects at different distances from the eye.
    negative: Ossification of the posterior longitudinal ligament of the spine: Introduction. Ossification of the posterior longitudinal ligament of the spine: Abnormal calcification of a spinal ligament. The progressive calcification can starts within months of birth and affects the ability to move arms and legs.ssification of the posterior longitudinal ligament of the spine: Introduction. Ossification of the posterior longitudinal ligament of the spine: Abnormal calcification of a spinal ligament. The progressive calcification can starts within months of birth and affects the ability to move arms and legs.

    score: 19.407212177912392
    query: how many kids does trump have
    positive: Donald Trump has 5 children: Donald Jr., Eric, and Ivanka- mother Ivana Trump Tiffany -mother Marla Maples Barron-mother Malania Trump Donald Trump Jr. has 2 children: … Kai Madison Trump and Donald Trump III.
    negative: Copyright © 2018, Trump Make America Great Again Committee. Paid for by Trump Make America Great Again Committee, a joint fundraising committee authorized by and composed of Donald J. Trump for President, Inc. and the Republican National Committee. x Close
  • Loss: SpladeLoss with these parameters:
    {
        "loss": "SparseMarginMSELoss",
        "lambda_corpus": 0.08,
        "lambda_query": 0.1
    }
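
SpladeLoss wraps a distillation objective with FLOPS-style sparsity regularizers: SparseMarginMSELoss regresses the student margin (query·positive - query·negative) onto the teacher score column of the dataset, while lambda_corpus and lambda_query penalize dense activations on documents and queries. A hedged construction sketch, assuming model is the SparseEncoder being trained and using the parameter names exactly as serialized above (newer library releases may rename them):

from sentence_transformers.sparse_encoder.losses import SpladeLoss, SparseMarginMSELoss

loss = SpladeLoss(
    model=model,                      # the SparseEncoder being trained
    loss=SparseMarginMSELoss(model),  # distills teacher margins from the score column
    lambda_corpus=0.08,               # regularization weight on document activations
    lambda_query=0.1,                 # regularization weight on query activations
)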
    

Evaluation Dataset

msmarco

  • Dataset: msmarco at 9e329ed
  • Size: 10,000 evaluation samples
  • Columns: score, query, positive, and negative
  • Approximate statistics based on the first 1000 samples:
    score (float): min: -4.07, mean: 13.12, max: 22.25
    query (string): min: 4 tokens, mean: 8.96 tokens, max: 33 tokens
    positive (string): min: 13 tokens, mean: 80.54 tokens, max: 220 tokens
    negative (string): min: 17 tokens, mean: 78.41 tokens, max: 242 tokens
  • Samples:
    score: 11.227776050567627
    query: tabernacle definition
    positive: Wiktionary(0.00 / 0 votes)Rate this definition: tabernacle(Noun) any temporary dwelling, a hut, tent, booth. tabernacle(Noun) (Old Testament) The portable tent used before the construction of the temple, where the shekinah (presence of God) was believed to dwell. 1611 ... So Moses finished the work. Then a cloud covered the tent of the congregation, and the glory of the LORD filled the tabernacle.
    negative: Both the Annunciation tabernacle in Santa Croce and the Cantoria (the singer's pulpit) in the Duomo (now in the Museo dell'Opera del Duomo) show a vastly increased repertory of forms derived from ancient art, the harvest of Donatello's long stay in Rome (1430-33).

    score: 12.354041655858357
    query: what scientist discovered radiation
    positive: Becquerel used an apparatus similar to that displayed below to show that the radiation he discovered could not be x-rays. X-rays are neutral and cannot be bent in a magnetic field. The new radiation was bent by the magnetic field so that the radiation must be charged and different than x-rays.
    negative: 5a-Hydroxy Laxogenin. 5a-Hydroxy Laxogenin was discovered by a American scientist in 1996. It was shown to possess an anabolic/androgenic ratio similar to one of the most efficient anabolic substances, in particular Anavar but without the side effects of liver toxicity or testing positive for steroidal therapy.

    score: 11.721514344215393
    query: are horses primates
    positive: Primates still do, but many, if not most, mammals do not. Horses, deer, cows and many other mammals have a reduced number of digits on their forelimbs and hindlimbs. Primates also retain other generalized skeletal features like the clavicle or collar bone.
    negative: The only primates that live in Canada are humans. The species originated in east Africa and is unrelated to South American primates. Humans first arrived in large numbers to Canada around 15,000 years ago from North Asia, and surged in migration starting 400 years ago from around the world, especially from Europe.
  • Loss: SpladeLoss with these parameters:
    {
        "loss": "SparseMarginMSELoss",
        "lambda_corpus": 0.08,
        "lambda_query": 0.1
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • learning_rate: 2e-05
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • bf16: True
  • load_best_model_at_end: True
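
A hedged sketch of how these values map onto a training run, assuming the SparseEncoderTrainer API from the same library and that model, train_dataset, eval_dataset, and loss hold the objects described in this section (the output_dir is hypothetical):

from sentence_transformers.sparse_encoder import (
    SparseEncoderTrainer,
    SparseEncoderTrainingArguments,
)

args = SparseEncoderTrainingArguments(
    output_dir="models/splade-co-condenser-marco",  # hypothetical path
    eval_strategy="steps",
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    learning_rate=2e-5,
    num_train_epochs=1,
    warmup_ratio=0.1,
    bf16=True,
    load_best_model_at_end=True,
)
trainer = SparseEncoderTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,  # msmarco training split described above
    eval_dataset=eval_dataset,    # msmarco evaluation split described above
    loss=loss,
)
trainer.train()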

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss Validation Loss NanoMSMARCO_dot_ndcg@10 NanoNFCorpus_dot_ndcg@10 NanoNQ_dot_ndcg@10 NanoBEIR_mean_dot_ndcg@10 NanoClimateFEVER_dot_ndcg@10 NanoDBPedia_dot_ndcg@10 NanoFEVER_dot_ndcg@10 NanoFiQA2018_dot_ndcg@10 NanoHotpotQA_dot_ndcg@10 NanoQuoraRetrieval_dot_ndcg@10 NanoSCIDOCS_dot_ndcg@10 NanoArguAna_dot_ndcg@10 NanoSciFact_dot_ndcg@10 NanoTouche2020_dot_ndcg@10
0.0178 100 664548.88 - - - - - - - - - - - - - - -
0.0356 200 1912.7461 - - - - - - - - - - - - - - -
0.0533 300 89.4823 - - - - - - - - - - - - - - -
0.0711 400 57.4213 - - - - - - - - - - - - - - -
0.0889 500 43.5322 37.8169 0.5271 0.2411 0.5761 0.4481 - - - - - - - - - -
0.1067 600 38.8042 - - - - - - - - - - - - - - -
0.1244 700 34.1112 - - - - - - - - - - - - - - -
0.1422 800 30.3487 - - - - - - - - - - - - - - -
0.16 900 30.4368 - - - - - - - - - - - - - - -
0.1778 1000 30.9444 27.4550 0.5513 0.3375 0.6122 0.5003 - - - - - - - - - -
0.1956 1100 27.7082 - - - - - - - - - - - - - - -
0.2133 1200 28.6251 - - - - - - - - - - - - - - -
0.2311 1300 27.6298 - - - - - - - - - - - - - - -
0.2489 1400 24.1523 - - - - - - - - - - - - - - -
0.2667 1500 25.3053 23.4952 0.5898 0.3416 0.6296 0.5203 - - - - - - - - - -
0.2844 1600 24.8645 - - - - - - - - - - - - - - -
0.3022 1700 25.9037 - - - - - - - - - - - - - - -
0.32 1800 25.255 - - - - - - - - - - - - - - -
0.3378 1900 24.4475 - - - - - - - - - - - - - - -
0.3556 2000 22.8183 26.7798 0.5579 0.3407 0.6160 0.5049 - - - - - - - - - -
0.3733 2100 22.0948 - - - - - - - - - - - - - - -
0.3911 2200 22.9483 - - - - - - - - - - - - - - -
0.4089 2300 20.8408 - - - - - - - - - - - - - - -
0.4267 2400 19.5543 - - - - - - - - - - - - - - -
0.4444 2500 20.9379 18.6976 0.6327 0.3216 0.6255 0.5266 - - - - - - - - - -
0.4622 2600 20.2078 - - - - - - - - - - - - - - -
0.48 2700 20.6449 - - - - - - - - - - - - - - -
0.4978 2800 19.1764 - - - - - - - - - - - - - - -
0.5156 2900 19.4603 - - - - - - - - - - - - - - -
0.5333 3000 20.3068 18.4043 0.6081 0.3220 0.6515 0.5272 - - - - - - - - - -
0.5511 3100 19.1402 - - - - - - - - - - - - - - -
0.5689 3200 18.0542 - - - - - - - - - - - - - - -
0.5867 3300 17.9658 - - - - - - - - - - - - - - -
0.6044 3400 18.4345 - - - - - - - - - - - - - - -
0.6222 3500 19.4609 17.0769 0.6155 0.3219 0.6545 0.5306 - - - - - - - - - -
0.64 3600 17.4228 - - - - - - - - - - - - - - -
0.6578 3700 17.8939 - - - - - - - - - - - - - - -
0.6756 3800 16.2358 - - - - - - - - - - - - - - -
0.6933 3900 16.6908 - - - - - - - - - - - - - - -
0.7111 4000 15.9995 17.7298 0.6022 0.3555 0.6525 0.5367 - - - - - - - - - -
0.7289 4100 16.3495 - - - - - - - - - - - - - - -
0.7467 4200 15.559 - - - - - - - - - - - - - - -
0.7644 4300 17.4544 - - - - - - - - - - - - - - -
0.7822 4400 15.8666 - - - - - - - - - - - - - - -
0.8 4500 16.3616 18.8307 0.6036 0.3472 0.6112 0.5207 - - - - - - - - - -
0.8178 4600 15.276 - - - - - - - - - - - - - - -
0.8356 4700 15.2697 - - - - - - - - - - - - - - -
0.8533 4800 16.6727 - - - - - - - - - - - - - - -
0.8711 4900 15.2223 - - - - - - - - - - - - - - -
0.8889 5000 15.7583 16.2949 0.6177 0.3438 0.6505 0.5373 - - - - - - - - - -
0.9067 5100 15.3164 - - - - - - - - - - - - - - -
0.9244 5200 14.9429 - - - - - - - - - - - - - - -
0.9422 5300 15.5992 - - - - - - - - - - - - - - -
0.96 5400 14.8593 - - - - - - - - - - - - - - -
0.9778 5500 14.7565 16.423 0.6077 0.3452 0.6595 0.5375 - - - - - - - - - -
0.9956 5600 14.5115 - - - - - - - - - - - - - - -
-1 -1 - - 0.6077 0.3452 0.6595 0.5787 0.3037 0.6228 0.8719 0.4125 0.8260 0.9411 0.3183 0.4095 0.6690 0.5361
  • The saved checkpoint corresponds to the step 5500 row (epoch 0.9778), whose NanoMSMARCO, NanoNFCorpus, and NanoNQ nDCG@10 values match the final evaluation row.

Environmental Impact

Carbon emissions were measured using CodeCarbon.

  • Energy Consumed: 0.093 kWh
  • Carbon Emitted: 0.034 kg of CO2
  • Hours Used: 0.305 hours

Training Hardware

  • On Cloud: No
  • GPU Model: 1 x NVIDIA H100 80GB HBM3
  • CPU Model: AMD EPYC 7R13 Processor
  • RAM Size: 248.00 GB

Framework Versions

  • Python: 3.13.3
  • Sentence Transformers: 4.2.0.dev0
  • Transformers: 4.51.3
  • PyTorch: 2.7.1+cu126
  • Accelerate: 0.26.0
  • Datasets: 2.21.0
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

SpladeLoss

@misc{formal2022distillationhardnegativesampling,
      title={From Distillation to Hard Negative Sampling: Making Sparse Neural IR Models More Effective},
      author={Thibault Formal and Carlos Lassance and Benjamin Piwowarski and Stéphane Clinchant},
      year={2022},
      eprint={2205.04733},
      archivePrefix={arXiv},
      primaryClass={cs.IR},
      url={https://arxiv.org/abs/2205.04733},
}

SparseMarginMSELoss

@misc{hofstätter2021improving,
    title={Improving Efficient Neural Ranking Models with Cross-Architecture Knowledge Distillation},
    author={Sebastian Hofstätter and Sophia Althammer and Michael Schröder and Mete Sertkan and Allan Hanbury},
    year={2021},
    eprint={2010.02666},
    archivePrefix={arXiv},
    primaryClass={cs.IR}
}

FlopsLoss

@article{paria2020minimizing,
    title={Minimizing flops to learn efficient sparse representations},
    author={Paria, Biswajit and Yeh, Chih-Kuan and Yen, Ian EH and Xu, Ning and Ravikumar, Pradeep and P{\'o}czos, Barnab{\'a}s},
    journal={arXiv preprint arXiv:2004.05665},
    year={2020}
}