metadata
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:2437
  - loss:ContrastiveLoss
base_model: sentence-transformers/all-mpnet-base-v2
widget:
  - source_sentence: >-
      I am having troubles and confusing moments with my body and I am scared I
      may be pregnant by my research online and I really want some advice ?
    sentences:
      - 'Does Acyclovir cause ulcers when it is prescribed for genital herpes? '
      - >-
        The confusing symptoms and online research points towards me being
        pregnant. Can I get a professional advice?
      - >-
        Do bariatric surgeries like gastric sleeve or Roux-en-Y surgery actually
        work in the long term?
  - source_sentence: >-
      It started with a headache the next day came dizziness when I move my
      eyes, soreness behind my eyes, 102 fever, slight cough. Help!
    sentences:
      - >-
        I had a headache and this was followe by dizziness on moving the eyes,
        soreness behind my eyes, high grade fever (102) and slight cough. Can
        you help me?
      - What are the signs of ovulation?
      - >-
        Why does it hurt when I shave my face? Can I do something else for it
        besides shaving in the direction of the hair growth?
  - source_sentence: How low can hemoglobin go before you need a transfusion?
    sentences:
      - >-
        I heard banana is rich in potassium. I am having diarrhea and can I take
        banana. 
      - At what Hemoglobin levels, is a blood transfusion recommended?
      - What are the symptoms of eye cancer?
  - source_sentence: >-
      I'm 5 weeks pregnant and this morning had brownish spotting, my gyn said
      this is normal and ita was due to implantation, should I be worried?
    sentences:
      - >-
        I have abdominal cramps, spotting, nause and fatigue. I am on oral
        contraceptive pills. I take them regularly. My pregnancy test is
        negative. I dont believe it is implantation as I am not pregnant. Could
        it be withdrawal bleeding or do I have an STD?
      - 'What''s best for a 1 year old, breast milk or bottle milk? '
      - >-
        I am 40, and I've had a breast lump in my right breast for about 4 years
        now. Could it be cancer?
  - source_sentence: >-
      My bm aren't solid but not quite loose. Looks more like for lack of better
      word "shredded" the why is this?
    sentences:
      - >-
        I have been taking treatment for anxiety and depression. I was given a
        new medication and have experienced heart flutters, can this medication
        cause it?
      - >-
        You might think I'm a bit paranoid but could you please help me with the
        five most common emergency surgeries in american teen girls?
      - What causes stringy and shredded stools?
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
  - cosine_accuracy
  - cosine_accuracy_threshold
  - cosine_f1
  - cosine_f1_threshold
  - cosine_precision
  - cosine_recall
  - cosine_ap
model-index:
  - name: SentenceTransformer based on sentence-transformers/all-mpnet-base-v2
    results:
      - task:
          type: binary-classification
          name: Binary Classification
        dataset:
          name: all mqp test
          type: all-mqp-test
        metrics:
          - type: cosine_accuracy
            value: 0.8786885245901639
            name: Cosine Accuracy
          - type: cosine_accuracy_threshold
            value: 0.7678120136260986
            name: Cosine Accuracy Threshold
          - type: cosine_f1
            value: 0.8796147672552167
            name: Cosine F1
          - type: cosine_f1_threshold
            value: 0.7446306943893433
            name: Cosine F1 Threshold
          - type: cosine_precision
            value: 0.8810289389067524
            name: Cosine Precision
          - type: cosine_recall
            value: 0.8782051282051282
            name: Cosine Recall
          - type: cosine_ap
            value: 0.9474266832530879
            name: Cosine Ap

SentenceTransformer based on sentence-transformers/all-mpnet-base-v2

This is a sentence-transformers model fine-tuned from sentence-transformers/all-mpnet-base-v2. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-mpnet-base-v2
  • Maximum Sequence Length: 384 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 384, 'do_lower_case': False}) with Transformer model: MPNetModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
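
The Pooling and Normalize stages listed above can be illustrated on toy data. The following is a minimal pure-Python sketch, not the library's implementation: the helper names `mean_pool` and `l2_normalize` are hypothetical, and the 4-dimensional vectors stand in for the model's 768-dimensional token embeddings.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, ignoring padded positions (mask == 0)."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    count = max(count, 1)  # guard against an all-padding input
    return [value / count for value in summed]

def l2_normalize(vector, eps=1e-12):
    """Scale a vector to unit length, as a Normalize stage would."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / max(norm, eps) for v in vector]

# Toy input: three token vectors of dimension 4 (the real model uses 768).
# The last token is padding and must not influence the result.
tokens = [[1.0, 2.0, 0.0, 0.0], [3.0, 0.0, 0.0, 4.0], [9.0, 9.0, 9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)
sentence_embedding = l2_normalize(pooled)
print(pooled)
print(sentence_embedding)
```

Because the output is unit length, the dot product of two such embeddings equals their cosine similarity, which is consistent with the similarity function listed for this model.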

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("mpnet-base-all-mqp-binary")
# Run inference
sentences = [
    'My bm aren\'t solid but not quite loose. Looks more like for lack of better word "shredded" the why is this?',
    'What causes stringy and shredded stools?',
    'I have been taking treatment for anxiety and depression. I was given a new medication and have experienced heart flutters, can this medication cause it?',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
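
Since the model was evaluated as a binary classifier, one plausible way to use the similarity scores is to compare them against the cosine_accuracy_threshold reported for this model (about 0.7678). The sketch below shows that decision rule in plain Python on toy vectors. The function name `is_duplicate` and the toy embeddings are assumptions for illustration, and whether this threshold transfers to other data is not something the card establishes.

```python
import math

# Assumption: threshold taken from the reported cosine_accuracy_threshold.
DUPLICATE_THRESHOLD = 0.7678

def cosine_similarity(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def is_duplicate(embedding_a, embedding_b, threshold=DUPLICATE_THRESHOLD):
    """Hypothetical helper: label a pair as similar if it clears the threshold."""
    return cosine_similarity(embedding_a, embedding_b) >= threshold

# Toy 3-dimensional vectors standing in for 768-dimensional embeddings.
question = [0.9, 0.1, 0.2]
paraphrase = [0.8, 0.2, 0.1]
unrelated = [0.1, 0.9, 0.3]

print(is_duplicate(question, paraphrase))
print(is_duplicate(question, unrelated))
```

In practice the inputs would be rows of the `embeddings` array returned by `model.encode`, and the threshold would ideally be re-tuned on held-out pairs from the target domain.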

Evaluation

Metrics

Binary Classification

| Metric                    | Value  |
|:--------------------------|:-------|
| cosine_accuracy           | 0.8787 |
| cosine_accuracy_threshold | 0.7678 |
| cosine_f1                 | 0.8796 |
| cosine_f1_threshold       | 0.7446 |
| cosine_precision          | 0.881  |
| cosine_recall             | 0.8782 |
| cosine_ap                 | 0.9474 |
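
As a quick consistency check, the reported F1 should be the harmonic mean of the reported precision and recall, since all three are measured at the same cosine_f1_threshold. The sketch below uses the unrounded values from the model-index metadata. The helper name `f1_score` is only illustrative.

```python
# Unrounded values from the model-index metadata.
precision = 0.8810289389067524  # cosine_precision
recall = 0.8782051282051282     # cosine_recall

def f1_score(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)

f1 = f1_score(precision, recall)
print(round(f1, 4))
```

The result agrees with the cosine_f1 value in the table. Note that cosine_accuracy is computed at a different threshold, so it cannot be derived from these two numbers.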

Training Details

Training Dataset

Unnamed Dataset

  • Size: 2,437 training samples
  • Columns: text1, text2, and label
  • Approximate statistics based on the first 1000 samples:
    |         | text1 | text2 | label |
    |:--------|:------|:------|:------|
    | type    | string | string | int |
    | details | min: 7 tokens; mean: 26.53 tokens; max: 75 tokens | min: 7 tokens; mean: 28.18 tokens; max: 119 tokens | 0: ~49.00%; 1: ~51.00% |
  • Samples:
    | text1 | text2 | label |
    |:------|:------|:------|
    | I discovered I get this weakness in my hand whenever I try to snap my fingers, slight pain runs across elbow and wrist? | When I try to snap my fingers there is weakness and pain across elbow and wrist? May I know what are the causes? | 1 |
    | If a mother has celiac should the daughter be tested? | What is Celiac disease? | 0 |
    | Hi im 18 and I would like to know what I would use or take to get taller? | Can growth hormone taken in minimal quantities increase height after 21 years in a male? | 0 |
  • Loss: ContrastiveLoss with these parameters:
    {
        "distance_metric": "SiameseDistanceMetric.COSINE_DISTANCE",
        "margin": 0.5,
        "size_average": true
    }
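
To make these parameters concrete, here is a minimal pure-Python sketch of a contrastive loss over cosine distance with a margin of 0.5, averaged over the batch when `size_average` is true. It follows the commonly used formulation (similar pairs are penalized by squared distance, dissimilar pairs by squared margin violation, both scaled by 0.5) and is an assumption about how the loss behaves, not a copy of the library code. The function names are hypothetical.

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def contrastive_loss(pairs, labels, margin=0.5, size_average=True):
    """Contrastive loss for (embedding_a, embedding_b) pairs.

    label 1: pair should be close, penalize squared distance.
    label 0: pair should be at least `margin` apart, penalize the shortfall.
    """
    losses = []
    for (a, b), label in zip(pairs, labels):
        d = cosine_distance(a, b)
        similar_term = label * d ** 2
        dissimilar_term = (1 - label) * max(margin - d, 0.0) ** 2
        losses.append(0.5 * (similar_term + dissimilar_term))
    total = sum(losses)
    return total / len(losses) if size_average else total

# Toy 2-dimensional embeddings.
same = ([1.0, 0.0], [1.0, 0.0])        # cosine distance 0
orthogonal = ([1.0, 0.0], [0.0, 1.0])  # cosine distance 1

print(contrastive_loss([same], [1]))        # similar pair, already close
print(contrastive_loss([same], [0]))        # dissimilar pair, too close
print(contrastive_loss([orthogonal], [0]))  # dissimilar pair, beyond margin
```

With a margin of 0.5, dissimilar pairs stop contributing to the loss once their cosine distance exceeds 0.5, that is, once their cosine similarity drops below 0.5.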
    

Evaluation Dataset

Unnamed Dataset

  • Size: 610 evaluation samples
  • Columns: text1, text2, and label
  • Approximate statistics based on the first 610 samples:
    |         | text1 | text2 | label |
    |:--------|:------|:------|:------|
    | type    | string | string | int |
    | details | min: 8 tokens; mean: 27.56 tokens; max: 70 tokens | min: 8 tokens; mean: 27.88 tokens; max: 91 tokens | 0: ~48.85%; 1: ~51.15% |
  • Samples:
    | text1 | text2 | label |
    |:------|:------|:------|
    | Okay so i'm on bc and I have had sex (it hurts) i'm bleeding brown and my vagina hurts almost itchy but it hurts? | I noticed a brown discharge and itching in my vaginal area to the point that it hurts. I am also on birth control and have sexual intercourse. What do you think is causing this? | 1 |
    | I've had body aches, blocked stuffy nose, headaches, pressure in my face and throat tightness and it feels dry for 6 months is it a bad cold? | For the last 6 months, I've noticed symptoms like body aches, stuffy nose, headaches, pressure sensation in the face, throat tightness and feels dry. Can a cold last this long or should I be looking for something else? | 1 |
    | Is there any way to stop my period for a little while without a prescription? | Are there any natural ways to stop my period without having to visit a local doctor? | 1 |
  • Loss: ContrastiveLoss with these parameters:
    {
        "distance_metric": "SiameseDistanceMetric.COSINE_DISTANCE",
        "margin": 0.5,
        "size_average": true
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • fp16: True
  • push_to_hub: True
  • batch_sampler: no_duplicates

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: True
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
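
The combination of `learning_rate: 5e-05`, `lr_scheduler_type: linear`, and `warmup_ratio: 0.1` implies a learning rate that ramps up linearly and then decays linearly to zero. The sketch below reproduces that shape in plain Python. The function name `linear_schedule_lr` is hypothetical, the total of 153 steps is an assumption taken from the training log, and the warmup length is assumed to be the ceiling of `total_steps * warmup_ratio`.

```python
import math

def linear_schedule_lr(step, total_steps, base_lr=5e-05, warmup_ratio=0.1):
    """Linear warmup followed by linear decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: scale up from 0 to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: scale down from base_lr to 0 at the final step.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

TOTAL_STEPS = 153  # assumption: one epoch, as in the training log

for step in (0, 8, 16, 100, 153):
    print(step, linear_schedule_lr(step, TOTAL_STEPS))
```

Under these assumptions the warmup lasts 16 steps, so the peak learning rate is reached roughly a tenth of the way through the single training epoch.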

Training Logs

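| Epoch  | Step | Training Loss | Validation Loss | all-mqp-test_cosine_ap |
|:-------|:-----|:--------------|:----------------|:-----------------------|
| 0.6536 | 100  | 0.0137        | 0.0135          | -                      |
| 1.0    | 153  | -             | -               | 0.9474                 |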
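
The step and epoch columns can be reconciled with the dataset size and batch size reported earlier. The short calculation below assumes a single device, no gradient accumulation (`gradient_accumulation_steps: 1`), and that the last partial batch is kept (`dataloader_drop_last: False`).

```python
import math

train_samples = 2437  # size of the training dataset
batch_size = 16       # per_device_train_batch_size

# One optimizer step per batch; the final partial batch still counts.
steps_per_epoch = math.ceil(train_samples / batch_size)

# Fraction of the epoch completed when the step-100 log line was written.
epoch_at_step_100 = 100 / steps_per_epoch

print(steps_per_epoch, round(epoch_at_step_100, 4))
```

Both numbers line up with the log: 153 steps make up the full epoch, and step 100 falls at about 0.6536 of it.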

Framework Versions

  • Python: 3.11.11
  • Sentence Transformers: 3.3.1
  • Transformers: 4.47.1
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.2.1
  • Datasets: 3.2.0
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

ContrastiveLoss

@inproceedings{hadsell2006dimensionality,
    author={Hadsell, R. and Chopra, S. and LeCun, Y.},
    booktitle={2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06)},
    title={Dimensionality Reduction by Learning an Invariant Mapping},
    year={2006},
    volume={2},
    number={},
    pages={1735-1742},
    doi={10.1109/CVPR.2006.100}
}