tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:48
  - loss:MatryoshkaLoss
  - loss:MultipleNegativesRankingLoss
base_model: Snowflake/snowflake-arctic-embed-l
widget:
  - source_sentence: >-
      What types of training did the drivers complete in the past year to
      enhance their skills?
    sentences:
      - >-
        department. It provides guidelines to ensure safe, efficient, and
        customer-focused transportation 

        services. Please read this manual carefully and consult with your
        supervisor or the department 

        manager if you have any questions or need further clarification. 
         
        Department Overview 

        The Transportation Department plays a critical role in providing
        reliable transportation services to 

        our customers. Our department consists of 50 drivers, 10 dispatchers,
        and 5 maintenance 

        technicians. In the past year, we transported over 500,000 passengers
        across various routes, ensuring 

        their safety and satisfaction. 
         
        Safety and Vehicle Maintenance 

        Safety is our top priority. All vehicles undergo regular inspections and
        maintenance to ensure they
      - >-
        Compliance with local, state, and federal regulations is crucial. Our
        drivers are required to maintain 

        up-to-date knowledge of transportation laws and regulations. In the past
        year, we conducted 20 

        compliance audits to ensure adherence to regulatory requirements. 
         
        Training and Development 

        Continuous training and development are vital for our department's
        success. In the past year, our 

        drivers completed over 100 hours of professional development training,
        focusing on defensive 

        driving, customer service, and emergency preparedness. 
         
        Communication and Collaboration 

        Effective communication and collaboration are essential within the
        Transportation Department and
      - >-
        Customer Service 

        We prioritize exceptional customer service. Our drivers are trained to
        provide a friendly and 

        respectful experience to all passengers. In the past year, we received
        an average customer 

        satisfaction rating of 4.5 out of 5, demonstrating our commitment to
        meeting customer needs and 

        exceeding their expectations. 
         
        Incident Reporting and Investigation 

        Accidents or incidents may occur during transportation operations. In
        such cases, our drivers are 

        trained to promptly report incidents to their supervisor or the incident
        response team. In the past 

        year, we reported and investigated 10 incidents, implementing corrective
        actions to prevent future 

        occurrences. 
         
        Compliance with Regulations
  - source_sentence: >-
      Who should be contacted for questions or further information regarding the
      HR Policy Manual?
    sentences:
      - >-
        responsible for familiarizing themselves with the latest version of the
        manual. 
         
        Conclusion 

        Thank you for reviewing our HR Policy Manual. It serves as a guide to
        ensure a positive and inclusive 

        work environment. If you have any questions or need further information,
        please reach out to the HR 

        department. We value your contributions and commitment to our company's
        success.
      - >-
        for familiarizing themselves with the latest version of the manual. 
         
        Conclusion 

        Thank you for reviewing the Transportation Department Policy Manual.
        Your commitment to safety, 

        customer service, and compliance plays a crucial role in our
        department's success. If you have any 

        questions or need further information, please reach out to your
        supervisor or the department 

        manager. Your dedication and professionalism are appreciated.
      - >-
        Leaves of Absence 

        We provide various types of leaves of absence, including vacation leave,
        sick leave, parental leave, 

        and bereavement leave. Employees are entitled to 15 days of paid
        vacation leave per year. The 

        average sick leave utilization in 2022 was 4.2 days per employee. We
        offer flexible parental leave 

        policies, allowing employees to take up to 12 weeks of leave after the
        birth or adoption of a child. 
         
        Compensation and Benefits 

        Our employees receive competitive compensation packages. In 2022, the
        average annual salary 

        across all positions was $60,000. We offer a comprehensive benefits
        package, including health 

        insurance, dental coverage, retirement plans, and employee assistance
        programs. On average, our
  - source_sentence: >-
      How much did the average route duration decrease in the past year due to
      route planning and optimization?
    sentences:
      - >-
        Our drivers are responsible for operating vehicles safely, following
        traffic rules and regulations. They 

        are required to hold a valid driver's license and maintain a clean
        driving record. In the past year, our 

        drivers completed over 2,000 hours of driving training to enhance their
        skills and knowledge. 
         
        Route Planning and Optimization 

        Efficient route planning is essential for timely transportation
        services. Our department utilizes 

        advanced routing software to optimize routes and minimize travel time.
        In the past year, we reduced 

        our average route duration by 15% through effective route planning and
        optimization strategies. 
         
        Customer Service
      - >-
        Our fare collection system ensures fair and consistent fee collection
        from passengers. The current fee 

        structure is as follows: 
         
        Regular fare: $2.50 

        Senior citizens and students: $1.50 

        Children under 5 years old: Free 

        Fee collection is primarily done through electronic payment methods,
        such as smart cards and 

        mobile payment apps. Drivers are responsible for ensuring correct fare
        collection and providing 

        receipts upon request. 

        Route Information and Rules 

        Our transportation department operates multiple routes within the city.
        Route information, including 

        maps, schedules, and stops, is available on our website and at
        designated information centers.
      - >-
        manual carefully and contact the HR department if you have any questions
        or need further 

        clarification. 
         
        Equal Employment Opportunity 

        Our company is committed to providing equal employment opportunities to
        all individuals. We strive 

        to create a diverse and inclusive workplace. In 2022, our workforce
        comprised 55% male and 45% 

        female employees. We actively recruit and promote individuals from
        different backgrounds, including 

        racial and ethnic minorities. Our goal is to maintain a workforce that
        reflects the diverse 

        communities we serve. 
         
        Anti-Harassment and Anti-Discrimination 

        We maintain a zero-tolerance policy for harassment and discrimination.
        In the past year, we received
  - source_sentence: How many employees are served by the organization's email system?
    sentences:
      - >-
        only two reports of harassment, which were promptly investigated and
        resolved. We provide training 

        to all employees on recognizing and preventing harassment. We encourage
        employees to report any 

        incidents of harassment or discrimination and ensure confidentiality
        throughout the investigation 

        process.
      - >-
        Passengers are expected to follow the rules and regulations while
        utilizing our transportation 

        services, including: 
         
        Boarding and exiting the vehicle in an orderly manner. 

        Yielding seats to elderly, disabled, and pregnant passengers. 

        Keeping noise levels to a minimum. 

        Refraining from eating, drinking, or smoking onboard. 

        Using designated safety equipment, such as seat belts, if available. 

        Reporting any suspicious activity or unattended items to the driver. 

        Amendments to the Policy Manual 

        This policy manual is subject to periodic review and amendments. Any
        updates or changes will be 

        communicated to employees through email or departmental meetings.
        Employees are responsible
      - >-
        Network and Systems Access 

        Access to the organization's network and systems is granted based on job
        roles and responsibilities. 

        Employees must adhere to the network access policies and protect their
        login credentials. In the past 

        year, we reviewed and updated access privileges for 300 employees to
        align with their job functions. 
         
        Email and Communication 

        The organization's email system is to be used for official communication
        purposes. Employees are 

        expected to follow email etiquette and avoid the use of offensive or
        inappropriate language. The 

        email system is monitored for security purposes and to ensure compliance
        with policies. We manage 

        and maintain an email system that serves 500 employees. 
         
        Data Security and Confidentiality
  - source_sentence: >-
      How often were departmental meetings conducted to address information
      sharing and problem-solving?
    sentences:
      - >-
        Leaves of Absence 

        We provide various types of leaves of absence, including vacation leave,
        sick leave, parental leave, 

        and bereavement leave. Employees are entitled to 15 days of paid
        vacation leave per year. The 

        average sick leave utilization in 2022 was 4.2 days per employee. We
        offer flexible parental leave 

        policies, allowing employees to take up to 12 weeks of leave after the
        birth or adoption of a child. 
         
        Compensation and Benefits 

        Our employees receive competitive compensation packages. In 2022, the
        average annual salary 

        across all positions was $60,000. We offer a comprehensive benefits
        package, including health 

        insurance, dental coverage, retirement plans, and employee assistance
        programs. On average, our
      - >-
        responsible for familiarizing themselves with the latest version of the
        manual. 
         
        Conclusion 

        Thank you for reviewing our HR Policy Manual. It serves as a guide to
        ensure a positive and inclusive 

        work environment. If you have any questions or need further information,
        please reach out to the HR 

        department. We value your contributions and commitment to our company's
        success.
      - >-
        with other departments. In the past year, we conducted monthly
        departmental meetings and 

        established communication channels to facilitate information sharing and
        problem-solving. 
         
        Fare Collection and Fee Structure
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
  - cosine_accuracy@1
  - cosine_accuracy@3
  - cosine_accuracy@5
  - cosine_accuracy@10
  - cosine_precision@1
  - cosine_precision@3
  - cosine_precision@5
  - cosine_precision@10
  - cosine_recall@1
  - cosine_recall@3
  - cosine_recall@5
  - cosine_recall@10
  - cosine_ndcg@10
  - cosine_mrr@10
  - cosine_map@100
model-index:
  - name: SentenceTransformer based on Snowflake/snowflake-arctic-embed-l
    results:
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: Unknown
          type: unknown
        metrics:
          - type: cosine_accuracy@1
            value: 1
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 1
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 1
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 1
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 1
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.33333333333333337
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.2
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.1
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 1
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 1
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 1
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 1
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 1
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 1
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 1
            name: Cosine Map@100

SentenceTransformer based on Snowflake/snowflake-arctic-embed-l

This is a sentence-transformers model fine-tuned from Snowflake/snowflake-arctic-embed-l. It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: Snowflake/snowflake-arctic-embed-l
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
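
The three modules listed above can be read as a pipeline: token vectors come out of the transformer, the pooling layer keeps only the first ([CLS]) token, and the result is scaled to unit length. A minimal NumPy sketch of the last two steps follows. The random array is a stand-in for real BertModel output, and `cls_pool_and_normalize` is a hypothetical helper name, not part of the library.

```python
import numpy as np

def cls_pool_and_normalize(token_embeddings: np.ndarray) -> np.ndarray:
    """Keep the first ([CLS]) token vector of each sequence and L2-normalize it.

    token_embeddings has shape (batch, seq_len, hidden).
    """
    # Mirrors Pooling with pooling_mode_cls_token=True.
    cls_vectors = token_embeddings[:, 0, :]
    # Mirrors Normalize(): scale every vector to unit length.
    norms = np.linalg.norm(cls_vectors, axis=1, keepdims=True)
    return cls_vectors / np.clip(norms, 1e-12, None)

# Toy stand-in for the transformer output: 2 sequences, 4 tokens, hidden size 1024.
rng = np.random.default_rng(0)
fake_token_embeddings = rng.normal(size=(2, 4, 1024))
sentence_embeddings = cls_pool_and_normalize(fake_token_embeddings)

# For unit-length vectors the dot product equals the cosine similarity.
dot = sentence_embeddings @ sentence_embeddings.T
print(sentence_embeddings.shape)
```

Because the embeddings are already normalized, a plain dot product and cosine similarity give the same ranking.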

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("deepali1021/finetuned_arctic_ft-v2")
# Run inference
sentences = [
    'How often were departmental meetings conducted to address information sharing and problem-solving?',
    'with other departments. In the past year, we conducted monthly departmental meetings and \nestablished communication channels to facilitate information sharing and problem-solving. \n \nFare Collection and Fee Structure',
    "responsible for familiarizing themselves with the latest version of the manual. \n \nConclusion \nThank you for reviewing our HR Policy Manual. It serves as a guide to ensure a positive and inclusive \nwork environment. If you have any questions or need further information, please reach out to the HR \ndepartment. We value your contributions and commitment to our company's success.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
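
For retrieval, the usual pattern is to treat one embedding as the query and rank the remaining embeddings by cosine similarity. The sketch below illustrates that ranking step with small hand-made vectors so it runs without downloading the model. `rank_passages` is a hypothetical helper, and in practice the inputs would come from `model.encode(...)`.

```python
import numpy as np

def rank_passages(query_embedding, passage_embeddings, top_k=3):
    """Return (passage index, cosine similarity) pairs, best match first."""
    q = query_embedding / np.linalg.norm(query_embedding)
    p = passage_embeddings / np.linalg.norm(passage_embeddings, axis=1, keepdims=True)
    scores = p @ q
    order = np.argsort(-scores)[:top_k]
    return [(int(i), float(scores[i])) for i in order]

# Toy 4-dimensional vectors; real embeddings from this model are 1024-dimensional.
query = np.array([1.0, 0.0, 0.0, 0.0])
passages = np.array([
    [0.0, 1.0, 0.0, 0.0],   # orthogonal to the query
    [0.9, 0.1, 0.0, 0.0],   # close to the query
    [0.5, 0.5, 0.5, 0.5],   # partially aligned
])
ranking = rank_passages(query, passages)
print(ranking)
```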

Evaluation

Metrics

Information Retrieval

Metric               Value
cosine_accuracy@1    1.0
cosine_accuracy@3    1.0
cosine_accuracy@5    1.0
cosine_accuracy@10   1.0
cosine_precision@1   1.0
cosine_precision@3   0.3333
cosine_precision@5   0.2
cosine_precision@10  0.1
cosine_recall@1      1.0
cosine_recall@3      1.0
cosine_recall@5      1.0
cosine_recall@10     1.0
cosine_ndcg@10       1.0
cosine_mrr@10        1.0
cosine_map@100       1.0
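
The precision values of 0.3333, 0.2, and 0.1 may look low next to perfect accuracy and recall, but they are what one would expect if every evaluation query has exactly one relevant passage and that passage is ranked first: precision@k is then 1/k. That reading is an inference from the numbers, not something stated in this card. The sketch below works through it with a hypothetical ranked list; the passage ids and helper names are made up for illustration.

```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k results that are relevant."""
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / k

def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of all relevant passages found in the top-k results."""
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / len(relevant_ids)

def reciprocal_rank(ranked_ids, relevant_ids, k=10):
    """1 / rank of the first relevant passage within the top-k, else 0."""
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0

# Hypothetical query: ten retrieved passage ids, one relevant passage ranked first.
ranked = ["p7", "p2", "p9", "p1", "p4", "p3", "p8", "p5", "p6", "p0"]
relevant = {"p7"}

for k in (1, 3, 5, 10):
    print(k, precision_at_k(ranked, relevant, k), recall_at_k(ranked, relevant, k))
print("reciprocal rank@10:", reciprocal_rank(ranked, relevant))
```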

Training Details

Training Dataset

Unnamed Dataset

  • Size: 48 training samples
  • Columns: sentence_0 and sentence_1
  • Approximate statistics based on the first 48 samples:
              sentence_0      sentence_1
    type      string          string
    min       11 tokens       31 tokens
    mean      16.25 tokens    99.96 tokens
    max       27 tokens       143 tokens
  • Samples:
    Sample 1
      sentence_0: What topics are covered in the Transportation Department Policy Manual?
      sentence_1:
        Transportation Department Policy Manual
        Table of Contents:
        Introduction
        Department Overview
        Safety and Vehicle Maintenance
        Driver Responsibilities
        Route Planning and Optimization
        Customer Service
        Incident Reporting and Investigation
        Compliance with Regulations
        Training and Development
        Communication and Collaboration
        Fare Collection and Fee Structure
        Route Information and Rules
        Amendments to the Policy Manual
        Conclusion
        Introduction
        Welcome to the Transportation Department Policy Manual! This manual serves as a comprehensive
        guide to the policies, procedures, and expectations for employees working in the transportation

    Sample 2
      sentence_0: What is the purpose of the Transportation Department Policy Manual?
      sentence_1:
        Transportation Department Policy Manual
        Table of Contents:
        Introduction
        Department Overview
        Safety and Vehicle Maintenance
        Driver Responsibilities
        Route Planning and Optimization
        Customer Service
        Incident Reporting and Investigation
        Compliance with Regulations
        Training and Development
        Communication and Collaboration
        Fare Collection and Fee Structure
        Route Information and Rules
        Amendments to the Policy Manual
        Conclusion
        Introduction
        Welcome to the Transportation Department Policy Manual! This manual serves as a comprehensive
        guide to the policies, procedures, and expectations for employees working in the transportation

    Sample 3
      sentence_0: What is the primary focus of the Transportation Department as outlined in the manual?
      sentence_1:
        department. It provides guidelines to ensure safe, efficient, and customer-focused transportation
        services. Please read this manual carefully and consult with your supervisor or the department
        manager if you have any questions or need further clarification.

        Department Overview
        The Transportation Department plays a critical role in providing reliable transportation services to
        our customers. Our department consists of 50 drivers, 10 dispatchers, and 5 maintenance
        technicians. In the past year, we transported over 500,000 passengers across various routes, ensuring
        their safety and satisfaction.

        Safety and Vehicle Maintenance
        Safety is our top priority. All vehicles undergo regular inspections and maintenance to ensure they
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
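
To make the loss configuration above more concrete, here is a rough NumPy sketch of the idea: an in-batch ranking loss (each question's paired passage is the positive, every other passage in the batch is a negative) is computed on the first 768, 512, 256, 128, and 64 dimensions of the embeddings and the results are summed with equal weights. This is a conceptual illustration, not the library's implementation. The similarity scale of 20.0 is an assumption that is not listed in this card, the random arrays stand in for real embeddings, and all function names are hypothetical.

```python
import numpy as np

MATRYOSHKA_DIMS = [768, 512, 256, 128, 64]   # from "matryoshka_dims"
MATRYOSHKA_WEIGHTS = [1, 1, 1, 1, 1]         # from "matryoshka_weights"
SCALE = 20.0  # assumption: similarity scale, not stated in the card

def l2_normalize(x):
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def in_batch_ranking_loss(anchors, positives, scale=SCALE):
    """Cross-entropy where positives[i] is the match for anchors[i] and
    every other positive in the batch acts as a negative."""
    logits = scale * (l2_normalize(anchors) @ l2_normalize(positives).T)
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

def matryoshka_loss(anchors, positives):
    """Weighted sum of the ranking loss over truncated embedding prefixes."""
    total = 0.0
    for dim, weight in zip(MATRYOSHKA_DIMS, MATRYOSHKA_WEIGHTS):
        total += weight * in_batch_ranking_loss(anchors[:, :dim], positives[:, :dim])
    return total

rng = np.random.default_rng(0)
anchors = rng.normal(size=(10, 1024))                    # batch of 10 question embeddings
aligned = anchors + 0.01 * rng.normal(size=(10, 1024))   # passages close to their questions
unrelated = rng.normal(size=(10, 1024))                  # passages unrelated to the questions

loss_aligned = matryoshka_loss(anchors, aligned)
loss_unrelated = matryoshka_loss(anchors, unrelated)
print(loss_aligned, loss_unrelated)
```

Training every prefix this way is what allows an embedding to be cut down to one of the listed sizes and re-normalized at inference time with a limited loss in quality.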
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 10
  • per_device_eval_batch_size: 10
  • num_train_epochs: 10
  • multi_dataset_batch_sampler: round_robin

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 10
  • per_device_eval_batch_size: 10
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 10
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin

Training Logs

Epoch  Step  cosine_ndcg@10
1.0    5     0.9431
2.0    10    1.0
3.0    15    1.0
4.0    20    1.0
5.0    25    1.0
6.0    30    1.0
7.0    35    1.0
8.0    40    1.0
9.0    45    1.0
10.0   50    1.0
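
The step counts in this log follow from the dataset size and hyperparameters reported earlier: 48 samples in batches of 10 give 5 steps per epoch (the last batch holds the remaining 8 samples, since dataloader_drop_last is False), and 10 epochs give 50 steps. The short calculation below reproduces this and sketches the linear learning-rate decay, assuming training ran on a single device. `linear_lr` is a hypothetical helper written for illustration.

```python
import math

train_samples = 48
batch_size = 10          # per_device_train_batch_size
epochs = 10              # num_train_epochs
learning_rate = 5e-05
# warmup_steps is 0 and gradient_accumulation_steps is 1, so neither changes the count.

# The final partial batch still counts as an optimizer step.
steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * epochs

def linear_lr(step):
    """Linear decay from learning_rate to 0 with no warmup."""
    return learning_rate * max(0.0, (total_steps - step) / total_steps)

print(steps_per_epoch, total_steps)   # 5 50
print(linear_lr(0), linear_lr(25), linear_lr(50))
```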

Framework Versions

  • Python: 3.11.11
  • Sentence Transformers: 3.4.1
  • Transformers: 4.48.3
  • PyTorch: 2.5.1+cu124
  • Accelerate: 1.3.0
  • Datasets: 3.3.2
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}