metadata
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:2437
- loss:ContrastiveLoss
base_model: sentence-transformers/all-mpnet-base-v2
widget:
- source_sentence: >-
I am having troubles and confusing moments with my body and I am scared I
may be pregnant by my research online and I really want some advice ?
sentences:
- 'Does Acyclovir cause ulcers when it is prescribed for genital herpes? '
- >-
The confusing symptoms and online research points towards me being
pregnant. Can I get a professional advice?
- >-
Do bariatric surgeries like gastric sleeve or Roux-en-Y surgery actually
work in the long term?
- source_sentence: >-
It started with a headache the next day came dizziness when I move my
eyes, soreness behind my eyes, 102 fever, slight cough. Help!
sentences:
- >-
I had a headache and this was followe by dizziness on moving the eyes,
soreness behind my eyes, high grade fever (102) and slight cough. Can
you help me?
- What are the signs of ovulation?
- >-
Why does it hurt when I shave my face? Can I do something else for it
besides shaving in the direction of the hair growth?
- source_sentence: How low can hemoglobin go before you need a transfusion?
sentences:
- >-
I heard banana is rich in potassium. I am having diarrhea and can I take
banana.
- At what Hemoglobin levels, is a blood transfusion recommended?
- What are the symptoms of eye cancer?
- source_sentence: >-
I'm 5 weeks pregnant and this morning had brownish spotting, my gyn said
this is normal and ita was due to implantation, should I be worried?
sentences:
- >-
I have abdominal cramps, spotting, nause and fatigue. I am on oral
contraceptive pills. I take them regularly. My pregnancy test is
negative. I dont believe it is implantation as I am not pregnant. Could
it be withdrawal bleeding or do I have an STD?
- 'What''s best for a 1 year old, breast milk or bottle milk? '
- >-
I am 40, and I've had a breast lump in my right breast for about 4 years
now. Could it be cancer?
- source_sentence: >-
My bm aren't solid but not quite loose. Looks more like for lack of better
word "shredded" the why is this?
sentences:
- >-
I have been taking treatment for anxiety and depression. I was given a
new medication and have experienced heart flutters, can this medication
cause it?
- >-
You might think I'm a bit paranoid but could you please help me with the
five most common emergency surgeries in american teen girls?
- What causes stringy and shredded stools?
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy
- cosine_accuracy_threshold
- cosine_f1
- cosine_f1_threshold
- cosine_precision
- cosine_recall
- cosine_ap
model-index:
- name: SentenceTransformer based on sentence-transformers/all-mpnet-base-v2
results:
- task:
type: binary-classification
name: Binary Classification
dataset:
name: all mqp test
type: all-mqp-test
metrics:
- type: cosine_accuracy
value: 0.8786885245901639
name: Cosine Accuracy
- type: cosine_accuracy_threshold
value: 0.7678120136260986
name: Cosine Accuracy Threshold
- type: cosine_f1
value: 0.8796147672552167
name: Cosine F1
- type: cosine_f1_threshold
value: 0.7446306943893433
name: Cosine F1 Threshold
- type: cosine_precision
value: 0.8810289389067524
name: Cosine Precision
- type: cosine_recall
value: 0.8782051282051282
name: Cosine Recall
- type: cosine_ap
value: 0.9474266832530879
name: Cosine Ap
SentenceTransformer based on sentence-transformers/all-mpnet-base-v2
This is a sentence-transformers model fine-tuned from sentence-transformers/all-mpnet-base-v2. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: sentence-transformers/all-mpnet-base-v2
- Maximum Sequence Length: 384 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 384, 'do_lower_case': False}) with Transformer model: MPNetModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
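The Pooling and Normalize modules listed above can be illustrated with a small sketch. It assumes plain Python lists stand in for the token embeddings and attention mask; `mean_pool` and `l2_normalize` are illustrative names, not functions from the library.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, skipping positions where the mask is 0 (padding)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    count = max(count, 1)  # guard against an all-padding input
    return [value / count for value in total]

def l2_normalize(vector):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(v * v for v in vector))
    if norm == 0.0:
        return list(vector)
    return [v / norm for v in vector]

# Toy input: three 2-dimensional token vectors, the last one is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
embedding = l2_normalize(pooled)
print(pooled, embedding)
```

In the real model the vectors have 768 dimensions; because the output is normalized, cosine similarity and dot product give the same ranking.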
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("mpnet-base-all-mqp-binary")
# Run inference
sentences = [
'My bm aren\'t solid but not quite loose. Looks more like for lack of better word "shredded" the why is this?',
'What causes stringy and shredded stools?',
'I have been taking treatment for anxiety and depression. I was given a new medication and have experienced heart flutters, can this medication cause it?',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
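One way to turn similarity scores into a duplicate or non-duplicate decision is to compare them against the cosine_accuracy_threshold this card reports for all-mqp-test (0.7678). The following is a minimal sketch, assuming plain Python lists stand in for embeddings; `is_duplicate` and `ACCURACY_THRESHOLD` are illustrative names, and treating a score exactly at the threshold as a match is an assumption.

```python
import math

# cosine_accuracy_threshold reported for all-mqp-test in this card.
ACCURACY_THRESHOLD = 0.7678

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # define similarity with a zero vector as 0
    return dot / (norm_a * norm_b)

def is_duplicate(a, b, threshold=ACCURACY_THRESHOLD):
    """Hypothetical decision rule: pairs at or above the threshold are paraphrases."""
    return cosine_similarity(a, b) >= threshold

# Toy vectors, not real model output.
print(is_duplicate([1.0, 0.0], [0.9, 0.1]))
print(is_duplicate([1.0, 0.0], [0.0, 1.0]))
```

The threshold was tuned on this card's test split, so it may need re-tuning on other data.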
Evaluation
Metrics
Binary Classification
- Dataset: all-mqp-test
- Evaluated with BinaryClassificationEvaluator
Metric | Value |
---|---|
cosine_accuracy | 0.8787 |
cosine_accuracy_threshold | 0.7678 |
cosine_f1 | 0.8796 |
cosine_f1_threshold | 0.7446 |
cosine_precision | 0.8810 |
cosine_recall | 0.8782 |
cosine_ap | 0.9474 |
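As a quick consistency check, the reported F1 should be the harmonic mean of the reported precision and recall. The sketch below uses the unrounded values from this card's metadata; the variable names are illustrative.

```python
# Unrounded values from the model-index metadata of this card.
precision = 0.8810289389067524
recall = 0.8782051282051282
reported_f1 = 0.8796147672552167

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))
```

The computed value agrees with the reported cosine_f1 to within floating-point error.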
Training Details
Training Dataset
Unnamed Dataset
- Size: 2,437 training samples
- Columns: text1, text2, and label
- Approximate statistics based on the first 1000 samples:
 | text1 | text2 | label |
---|---|---|---|
type | string | string | int |
details | min: 7 tokens, mean: 26.53 tokens, max: 75 tokens | min: 7 tokens, mean: 28.18 tokens, max: 119 tokens | 0: ~49.00%, 1: ~51.00% |
- Samples:
text1 | text2 | label |
---|---|---|
I discovered I get this weakness in my hand whenever I try to snap my fingers, slight pain runs across elbow and wrist? | When I try to snap my fingers there is weakness and pain across elbow and wrist? May I know what are the causes? | 1 |
If a mother has celiac should the daughter be tested? | What is Celiac disease? | 0 |
Hi im 18 and I would like to know what I would use or take to get taller? | Can growth hormone taken in minimal quantities increase height after 21 years in a male? | 0 |
- Loss: ContrastiveLoss with these parameters: { "distance_metric": "SiameseDistanceMetric.COSINE_DISTANCE", "margin": 0.5, "size_average": true }
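To make the loss parameters concrete, here is a minimal sketch of a contrastive loss with cosine distance, assuming the standard Hadsell et al. formulation with distance d = 1 - cosine similarity. It is written in plain Python for illustration and is not the library's implementation; `contrastive_loss` is an illustrative name.

```python
def contrastive_loss(cos_sims, labels, margin=0.5, size_average=True):
    """Contrastive loss over pairs, given cosine similarities and 0/1 labels.

    Positive pairs (label 1) are penalized by squared distance; negative pairs
    (label 0) are penalized only while their distance is below the margin.
    """
    losses = []
    for sim, label in zip(cos_sims, labels):
        distance = 1.0 - sim  # cosine distance
        positive_term = label * distance ** 2
        negative_term = (1 - label) * max(margin - distance, 0.0) ** 2
        losses.append(0.5 * (positive_term + negative_term))
    total = sum(losses)
    return total / len(losses) if size_average else total

# Toy batch: one paraphrase pair (similarity 0.9), one non-paraphrase pair (0.8).
loss = contrastive_loss([0.9, 0.8], [1, 0])
print(loss)
```

With margin 0.5, a negative pair stops contributing once its cosine similarity drops to 0.5 or below.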
Evaluation Dataset
Unnamed Dataset
- Size: 610 evaluation samples
- Columns: text1, text2, and label
- Approximate statistics based on the first 610 samples:
 | text1 | text2 | label |
---|---|---|---|
type | string | string | int |
details | min: 8 tokens, mean: 27.56 tokens, max: 70 tokens | min: 8 tokens, mean: 27.88 tokens, max: 91 tokens | 0: ~48.85%, 1: ~51.15% |
- Samples:
text1 | text2 | label |
---|---|---|
Okay so i'm on bc and I have had sex (it hurts) i'm bleeding brown and my vagina hurts almost itchy but it hurts? | I noticed a brown discharge and itching in my vaginal area to the point that it hurts. I am also on birth control and have sexual intercourse. What do you think is causing this? | 1 |
I've had body aches, blocked stuffy nose, headaches, pressure in my face and throat tightness and it feels dry for 6 months is it a bad cold? | For the last 6 months, I've noticed symptoms like body aches, stuffy nose, headaches, pressure sensation in the face, throat tightness and feels dry. Can a cold last this long or should I be looking for something else? | 1 |
Is there any way to stop my period for a little while without a prescription? | Are there any natural ways to stop my period without having to visit a local doctor? | 1 |
- Loss: ContrastiveLoss with these parameters: { "distance_metric": "SiameseDistanceMetric.COSINE_DISTANCE", "margin": 0.5, "size_average": true }
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- num_train_epochs: 1
- warmup_ratio: 0.1
- fp16: True
- push_to_hub: True
- batch_sampler: no_duplicates
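The effect of warmup_ratio 0.1 with the linear scheduler can be sketched as follows. This assumes the total of 153 optimizer steps shown in the training logs, the learning rate 5e-05 from the full hyperparameter list, and that the warmup step count is the ratio times total steps rounded up; `linear_schedule_lr` is an illustrative helper, not a library function.

```python
import math

def linear_schedule_lr(step, total_steps=153, base_lr=5e-5, warmup_ratio=0.1):
    """Learning rate at a given step: linear warmup, then linear decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)  # 16 for 153 steps
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

for step in (0, 8, 16, 100, 153):
    print(step, linear_schedule_lr(step))
```

Under these assumptions the rate rises for the first 16 steps, peaks at 5e-05, and reaches zero at step 153.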
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: True
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional
Training Logs
Epoch | Step | Training Loss | Validation Loss | all-mqp-test_cosine_ap |
---|---|---|---|---|
0.6536 | 100 | 0.0137 | 0.0135 | - |
1.0 | 153 | - | - | 0.9474 |
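The step and epoch values in the log are consistent with the dataset size and batch size reported in this card. The arithmetic below assumes a single device, gradient_accumulation_steps of 1, and that the last partial batch is kept (dataloader_drop_last is False); the variable names are illustrative.

```python
import math

train_samples = 2437   # training set size from this card
batch_size = 16        # per_device_train_batch_size

# One optimizer step per batch; the final partial batch still counts as a step.
steps_per_epoch = math.ceil(train_samples / batch_size)

# Fraction of the epoch completed when the first log row was written.
epoch_at_step_100 = round(100 / steps_per_epoch, 4)

print(steps_per_epoch, epoch_at_step_100)
```

This reproduces the 153 total steps and the 0.6536 epoch value shown at step 100.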
Framework Versions
- Python: 3.11.11
- Sentence Transformers: 3.3.1
- Transformers: 4.47.1
- PyTorch: 2.6.0+cu124
- Accelerate: 1.2.1
- Datasets: 3.2.0
- Tokenizers: 0.21.0
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
ContrastiveLoss
@inproceedings{hadsell2006dimensionality,
author={Hadsell, R. and Chopra, S. and LeCun, Y.},
booktitle={2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06)},
title={Dimensionality Reduction by Learning an Invariant Mapping},
year={2006},
volume={2},
number={},
pages={1735-1742},
doi={10.1109/CVPR.2006.100}
}