tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:1602
  - loss:CosineSimilarityLoss
base_model: sentence-transformers/all-mpnet-base-v2
widget:
  - source_sentence: >-
      Has there been any recent discussion on the trend of women choosing to
      become mothers later in life?
    sentences:
      - >-
        British Columbia, Ontario, New Brunswick, Nova Scotia, and Prince Edward
        Island have fully implemented universal hearing screening programs.
      - >-
        If the first readings exceed the maximum allowable difference,
        measurements are taken for a second and, if necessary, a third time.
      - >-
        In recent years, practices have shifted and these professionals are now
        able to observe, assess, and consult on the child’s program at the
        centre rather than in an office visit.
  - source_sentence: >-
      Where can I find more information on facilitating extra-provincial ward
      adoptions in British Columbia?
    sentences:
      - >-
        You can refer to Practice Directive #2021-01 for more information on
        facilitating extra-provincial ward adoptions in British Columbia.
      - >-
        No, a Care Plan is not required if the child/youth has no special
        service needs.
      - >-
        Licensed ECEC programs may fall under the responsibility of one or more
        ministries and departments, including education, health, family, and/or
        social services.
  - source_sentence: What should be done if there are minor differences in openness requests?
    sentences:
      - >-
        If there are minor differences, it is advised to try to reach an
        acceptable compromise in a meeting.
      - >-
        Yes, the adoption can be completed in B.C. even if the child is from
        another province. However, the originating provincial or territorial
        child welfare authority is responsible for finalizing the adoption.
      - >-
        The new standards establish the breastfed child as the normative model
        for child growth and development.
  - source_sentence: >-
      Does the federal government in Canada manage Early Childhood Education and
      Care (ECEC)?
    sentences:
      - You can call a friend or relative to ask for help.
      - >-
        You can start introducing common food allergens to your baby as they
        begin eating solid foods. It's best to introduce them one at a time.
      - >-
        A search of the Parents' Registry should be requested at the time the
        child or youth is registered with the Adoption and Permanency Branch.
  - source_sentence: How can I order a birth certificate in British Columbia?
    sentences:
      - >-
        The Hague Convention is an international treaty that sets standards to
        ensure that the best interests of children and youth are protected.
      - >-
        Only the consents of the Director of Adoption and the child/youth aged
        12 or over are required.
      - >-
        The L value is -0.4488, the M value is 15.2759, and the S value is
        0.08380.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
  - pearson_cosine
  - spearman_cosine
model-index:
  - name: SentenceTransformer based on sentence-transformers/all-mpnet-base-v2
    results:
      - task:
          type: semantic-similarity
          name: Semantic Similarity
        dataset:
          name: pregnancy val
          type: pregnancy_val
        metrics:
          - type: pearson_cosine
            value: 0.9454219117248748
            name: Pearson Cosine
          - type: spearman_cosine
            value: 0.8647267521805166
            name: Spearman Cosine

SentenceTransformer based on sentence-transformers/all-mpnet-base-v2

This is a sentence-transformers model fine-tuned from sentence-transformers/all-mpnet-base-v2. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-mpnet-base-v2
  • Maximum Sequence Length: 384 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 384, 'do_lower_case': False}) with Transformer model: MPNetModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
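
The Pooling and Normalize stages listed above can be illustrated without the library. The following is a minimal pure-Python sketch, assuming token embeddings arrive as a list of vectors with a 0/1 attention mask. `mean_pool` and `l2_normalize` are hypothetical helper names, not part of Sentence Transformers, and the 2-dimensional toy vectors stand in for the model's 768-dimensional ones.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, skipping positions where the mask is 0 (padding)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, x in enumerate(vec):
                total[i] += x
    if count == 0:
        raise ValueError("attention_mask selects no tokens")
    return [t / count for t in total]

def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

# Toy input: three token vectors, the last one is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)            # mean over the two unmasked tokens
sentence_embedding = l2_normalize(pooled)   # unit-length sentence vector
```

Because the final stage normalizes to unit length, the dot product of two such embeddings equals their cosine similarity.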

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub ("sentence_transformers_model_id" is a placeholder for this model's Hub ID)
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    'How can I order a birth certificate in British Columbia?',
    'Only the consents of the Director of Adoption and the child/youth aged 12 or over are required.',
    'The L value is -0.4488, the M value is 15.2759, and the S value is 0.08380.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
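
In the snippet above, row 0 of `similarities` holds the scores of the first sentence against all three. A common next step is to rank candidates by those scores. The following is a rough sketch of that ranking in pure Python, using made-up 2-dimensional vectors instead of real embeddings. `cosine` and `rank_by_cosine` are hypothetical helpers introduced only for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_by_cosine(query_vec, candidate_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine(query_vec, c)) for i, c in enumerate(candidate_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Toy vectors standing in for model.encode(...) output.
query = [1.0, 0.0]
candidates = [[0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]

ranking = rank_by_cosine(query, candidates)
best_index = ranking[0][0]  # index of the candidate closest to the query
```

With real embeddings, `query` would be the encoded question and `candidates` the encoded answer passages.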

Evaluation

Metrics

Semantic Similarity

| Metric          | Value  |
|-----------------|--------|
| pearson_cosine  | 0.9454 |
| spearman_cosine | 0.8647 |
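
Both metrics compare the model's cosine scores with the gold labels: Pearson measures linear agreement, Spearman measures agreement of the rank order. The sketch below computes both in pure Python on invented toy data. The scores and labels are not taken from the validation set, and `pearson`, `average_ranks`, and `spearman` are hypothetical helpers, not the evaluator used to produce the numbers above.

```python
import math

def pearson(xs, ys):
    """Pearson correlation; assumes neither input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the tie-averaged ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))

# Invented example: predicted cosine scores and gold similarity labels.
predicted = [0.91, 0.12, 0.78, 0.05]
gold = [1.0, 0.0, 1.0, 0.0]

pearson_score = pearson(predicted, gold)
spearman_score = spearman(predicted, gold)
```

In this toy case the tied gold labels keep the Spearman value below 1 even though every positive pair outscores every negative pair.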

Training Details

Training Dataset

Unnamed Dataset

  • Size: 1,602 training samples
  • Columns: sentence_0, sentence_1, and label
  • Approximate statistics based on the first 1000 samples:

    |         | sentence_0 | sentence_1 | label |
    |---------|------------|------------|-------|
    | type    | string     | string     | float |
    | details | min: 7 tokens, mean: 16.16 tokens, max: 34 tokens | min: 9 tokens, mean: 28.61 tokens, max: 75 tokens | min: 0.0, mean: 0.51, max: 1.0 |

  • Samples:

    | sentence_0 | sentence_1 | label |
    |------------|------------|-------|
    | What kind of hearing screening programs do other provinces and territories in Canada have? | Unless an adoption placement has already been secured and a brief interim placement with a caregiver is required, the child should be 6 months of age or younger. | 0.0 |
    | Are there resources available for children with learning disabilities in early childhood programs? | Yes, most PTs dedicate resources, programs or staff to support children with learning disabilities and other special needs. | 1.0 |
    | What is parental leave? | Parental leave is a type of benefit that allows parents to take time off work after the birth or adoption of a child. The text mentions it but does not provide specific details about the duration or requirements in Canada. | 1.0 |

  • Loss: CosineSimilarityLoss with these parameters:
    {
        "loss_fct": "torch.nn.modules.loss.MSELoss"
    }
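
With `MSELoss` as `loss_fct`, this loss is the mean squared error between the cosine similarity of the two sentence embeddings and the float label. The following is a minimal pure-Python sketch of that computation on toy vectors. `cosine` and `cosine_similarity_mse` are hypothetical names for illustration, not the library's implementation.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def cosine_similarity_mse(pairs, labels):
    """Mean squared error between cos(u, v) and the target label over a batch."""
    errors = [(cosine(u, v) - y) ** 2 for (u, v), y in zip(pairs, labels)]
    return sum(errors) / len(errors)

# Toy batch: one identical pair and one orthogonal pair.
batch = [([1.0, 0.0], [1.0, 0.0]), ([1.0, 0.0], [0.0, 1.0])]

loss_matching = cosine_similarity_mse(batch, [1.0, 0.0])    # labels agree with the geometry
loss_mismatched = cosine_similarity_mse(batch, [0.0, 0.0])  # first label contradicts it
```

Training pushes embeddings of pairs labeled 1.0 toward the same direction and pairs labeled 0.0 toward orthogonality.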
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • multi_dataset_batch_sampler: round_robin

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
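
To show how `learning_rate: 5e-05`, `lr_scheduler_type: linear`, and `warmup_steps: 0` interact, here is a small sketch of a linear warmup-then-decay schedule. `linear_lr` is a hypothetical helper, and the total of 303 steps is an assumption derived from the training log (step 101 at epoch 1.0) multiplied by `num_train_epochs: 3`.

```python
def linear_lr(step, base_lr, total_steps, warmup_steps=0):
    """Learning rate at an optimizer step: linear warmup, then linear decay to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

BASE_LR = 5e-05
TOTAL_STEPS = 303  # assumption: 101 steps per epoch x 3 epochs

lr_start = linear_lr(0, BASE_LR, TOTAL_STEPS)                   # full rate, no warmup
lr_middle = linear_lr(TOTAL_STEPS // 2, BASE_LR, TOTAL_STEPS)   # roughly half the base rate
lr_end = linear_lr(TOTAL_STEPS, BASE_LR, TOTAL_STEPS)           # decayed to zero
```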

Training Logs

| Epoch | Step | pregnancy_val_spearman_cosine |
|-------|------|-------------------------------|
| 1.0   | 101  | 0.8647                        |
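
The step count in the log is consistent with the dataset size and batch size reported above. A short worked calculation, assuming training on a single device (the card does not state the device count) with `gradient_accumulation_steps: 1` and `dataloader_drop_last: False`:

```python
import math

train_samples = 1602   # from "Size: 1,602 training samples"
batch_size = 16        # per_device_train_batch_size
epochs = 3             # num_train_epochs
drop_last = False      # dataloader_drop_last

# With drop_last=False the final partial batch still counts as a step.
if drop_last:
    steps_per_epoch = train_samples // batch_size
else:
    steps_per_epoch = math.ceil(train_samples / batch_size)

total_steps = steps_per_epoch * epochs
last_batch_size = train_samples - (steps_per_epoch - 1) * batch_size

print(steps_per_epoch, total_steps, last_batch_size)
```

Under these assumptions one epoch takes 101 steps, matching the step at which epoch 1.0 is logged.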

Framework Versions

  • Python: 3.13.1
  • Sentence Transformers: 3.4.1
  • Transformers: 4.49.0
  • PyTorch: 2.6.0
  • Accelerate: 1.4.0
  • Datasets: 3.3.2
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}