metadata
base_model: sentence-transformers/all-MiniLM-L12-v2
datasets: []
language: []
library_name: sentence-transformers
pipeline_tag: sentence-similarity
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:2144
  - loss:MultipleNegativesRankingLoss
widget:
  - source_sentence: How do I find out when I should write my examinations?
    sentences:
      - >-
        Information relating to examination timetables is available from the
        Examination Office and will be published on the official Institute
        Notice Board and the website.
      - >-
        If you find an error on your academic record, you should contact the
        Registration and Student Records Management Office immediately.
      - >-
        To request accommodations for a disability, you must submit
        documentation of the disability to the disability services office and
        meet with a disability services coordinator.
  - source_sentence: What is the language of instruction at the Harare Institute of Technology?
    sentences:
      - English is the language of instruction.
      - >-
        Tracking international events and conference and strategically link them
        to HIT, internationalizing HIT programmes and activities, developing
        bouquet of events and activities for international visitors, helping
        affiliate, accredit HIT, staff and students to international bodies and
        associations, liaising with national bodies and promote Zimbabwean
        culture and symbols, serving as a point of contact for exchange
        students, staff and visitors, ensuring international programmes align to
        national programmes and symbols, helping affiliate HIT ethos to national
        art and culture, monitoring implementation of MoUs and MoAs,
        facilitation of international travel and visits, providing Institute
        departments with consular advice, ensuring HIT members get oriented to
        particular countries’ culture and services before departure, driving
        recruitment of foreign students and exchange programmes.
      - >-
        BFA 7206 is the course code for Financial Institutions Fraud, which is
        an elective course in the second semester of the program.
  - source_sentence: What is the process for collecting a certificate?
    sentences:
      - >-
        The programme is designed such that on completion, graduates should be
        able to innovatively execute their professional role within prescribed
        and legislative parameters, demonstrate a critical understanding and
        application of quality assurance and radiation protection in
        Radiography, apply scientific knowledge and technical skills to perform
        Radiography procedures, plan, develop and apply total quality management
        appropriate to the Radiography context, apply management,
        entrepreneurial, education and research skills independently and
        function in a supervisory clinical governance and quality assurance
        capacity within the professional sector, demonstrate the ability to
        reflect in clinical practice, critically evaluate and adjust to current
        and new trends in Radiography, demonstrate capability to implement new
        knowledge and solve problems in varying contexts, and engage life-long
        learning and development in their profession.
      - >-
        The process involves clearing any dues to the Institute and providing
        valid identification documents.
      - >-
        A student can apply for change of programme within two weeks after
        commencement of lectures.
  - source_sentence: How do I change my address or contact information?
    sentences:
      - >-
        Information Security & Assurance is a field that deals with the
        protection of information and information systems from unauthorized
        access, use, disclosure, disruption, modification, or destruction.
      - >-
        The Information and Communications Technology Services (ICTS) Department
        at HIT is responsible for providing and maintaining the Institute's IT
        infrastructure and services.
      - >-
        You can update your address or contact information through the online
        student portal or by contacting the Academic Registry.
  - source_sentence: >-
      What is the difference between Cloud Computing and Information Security &
      Assurance?
    sentences:
      - >-
        The fourth semester focuses on courses such as Research Project,
        Clinical Practice IV, and Seminar.
      - >-
        Cloud Computing is focused on the design, implementation, and management
        of cloud services, while Information Security & Assurance is focused on
        the protection of information by mitigating information risks and
        ensuring availability, privacy, and integrity of data.
      - >-
        The Applied Research Methods course is designed to equip students with
        the skills and knowledge necessary to conduct research in chemical
        engineering process and plant design.

SentenceTransformer based on sentence-transformers/all-MiniLM-L12-v2

This is a sentence-transformers model fine-tuned from sentence-transformers/all-MiniLM-L12-v2. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-MiniLM-L12-v2
  • Maximum Sequence Length: 128 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
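
What the last two modules do (Pooling with pooling_mode_mean_tokens set to True, followed by Normalize()) can be illustrated with a small pure-Python sketch. The helper names `mean_pool` and `l2_normalize` and the toy 4-dimensional vectors are illustrative assumptions; this is not the library's implementation, only the arithmetic it is understood to perform.

```python
import math


def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors, skipping padded positions (mask value 0)."""
    dim = len(token_embeddings[0])
    totals = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                totals[i] += value
    count = max(count, 1)  # guard against an all-padding input
    return [total / count for total in totals]


def l2_normalize(vector):
    """Scale a vector to unit length so cosine similarity reduces to a dot product."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else list(vector)


# Toy input: three token vectors of dimension 4 (the real model uses 384).
# The last token stands in for padding and is masked out.
tokens = [[1.0, 2.0, 0.0, 0.0], [3.0, 0.0, 0.0, 0.0], [9.0, 9.0, 9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)
sentence_embedding = l2_normalize(pooled)
```

Because the output is unit length, comparing two such embeddings by dot product gives the same ranking as cosine similarity.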

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Dex-X/finehit")
# Run inference
sentences = [
    'What is the difference between Cloud Computing and Information Security & Assurance?',
    'Cloud Computing is focused on the design, implementation, and management of cloud services, while Information Security & Assurance is focused on the protection of information by mitigating information risks and ensuring availability, privacy, and integrity of data.',
    'The Applied Research Methods course is designed to equip students with the skills and knowledge necessary to conduct research in chemical engineering process and plant design.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
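
For a question-answering use case like the one in the example sentences, the similarity scores would typically be used to rank candidate answers against a question. The following is a minimal sketch of that ranking step using plain Python lists in place of real embeddings; `rank_candidates` and the 3-dimensional toy vectors are hypothetical and only stand in for the 384-dimensional output of `model.encode`.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_candidates(query_vec, candidate_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    scored = [
        (index, cosine_similarity(query_vec, vec))
        for index, vec in enumerate(candidate_vecs)
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy stand-ins for a question embedding and two answer embeddings.
query = [0.9, 0.1, 0.0]
candidates = [
    [0.8, 0.2, 0.1],  # points in nearly the same direction as the query
    [0.0, 0.1, 0.9],  # points in a very different direction
]

ranking = rank_candidates(query, candidates)
best_index, best_score = ranking[0]
```

With real embeddings, the row of `similarities` that belongs to the question could be sorted the same way to pick the best-matching answer.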

Training Details

Training Dataset

Unnamed Dataset

  • Size: 2,144 training samples
  • Columns: question and answer
  • Approximate statistics based on the first 1000 samples:
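
    |         | question                                          | answer                                            |
    |---------|---------------------------------------------------|---------------------------------------------------|
    | type    | string                                            | string                                            |
    | details | min: 6 tokens, mean: 13.94 tokens, max: 31 tokens | min: 3 tokens, mean: 30.7 tokens, max: 128 tokens |
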
  • Samples:

    | question | answer |
    |----------|--------|
    | What is the role of the Dean of Students? | The Dean of Students oversees various aspects of student life, including student affairs, campus life and development, accommodation, wellness, and more. |
    | What does the Student Affairs department do? | The Student Affairs department handles matters related to student life, conduct, and welfare. |
    | What is the role of Campus Life and Student Development? | Campus Life and Student Development is responsible for fostering a positive campus environment and promoting student growth and development. |

  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
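
As a rough illustration of what this loss computes with the parameters above: for each question in a batch, its own answer is treated as the positive and every other answer in the batch as a negative, and a cross-entropy is taken over the scaled cosine similarities. The pure-Python sketch below follows that description; the function names and toy 2-dimensional vectors are assumptions for illustration, not the library's code.

```python
import math


def cos_sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def multiple_negatives_ranking_loss(question_vecs, answer_vecs, scale=20.0):
    """Mean cross-entropy where answer i is the positive for question i
    and all other answers in the batch act as in-batch negatives."""
    losses = []
    for i, question in enumerate(question_vecs):
        logits = [scale * cos_sim(question, answer) for answer in answer_vecs]
        # Numerically stable log-sum-exp over the row of logits.
        top = max(logits)
        log_sum_exp = top + math.log(sum(math.exp(l - top) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


questions = [[1.0, 0.0], [0.0, 1.0]]

# Each answer lines up with its own question: the loss is close to zero.
aligned = multiple_negatives_ranking_loss(questions, [[1.0, 0.0], [0.0, 1.0]])

# Answers swapped: each question is closest to the wrong answer, so the loss is large.
swapped = multiple_negatives_ranking_loss(questions, [[0.0, 1.0], [1.0, 0.0]])
```

This also suggests why the no_duplicates batch sampler is a sensible companion to this loss: a duplicate answer inside a batch would be counted as a negative for its own question.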
    

Evaluation Dataset

Unnamed Dataset

  • Size: 214 evaluation samples
  • Columns: question and answer
  • Approximate statistics based on the first 1000 samples (this dataset has only 214, so all samples are covered):
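
    |         | question                                          | answer                                             |
    |---------|---------------------------------------------------|----------------------------------------------------|
    | type    | string                                            | string                                             |
    | details | min: 7 tokens, mean: 15.12 tokens, max: 31 tokens | min: 3 tokens, mean: 31.14 tokens, max: 128 tokens |
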
  • Samples:

    | question | answer |
    |----------|--------|
    | What is Student Accommodation and Catering? | Student Accommodation and Catering is a department that manages student housing and dining services. |
    | What certification does Mr. Njonga have from the National Social Security Authority? | Safety and Health Advisor Certification |
    | What is the duration of the B Tech (Hons) Computer Science programme? | The B Tech (Hons) Computer Science programme is a four-year full-time regular programme. |

  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • fp16: True
  • batch_sampler: no_duplicates
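
To make the effect of warmup_ratio concrete, the sketch below works through a linear schedule with warmup, combining it with the learning_rate of 5e-05 and the linear lr_scheduler_type from the full hyperparameter list. The step count is an assumption derived from 2,144 samples at batch size 16 on a single device; rounding the warmup up to a whole step is also an assumption, and `linear_schedule_with_warmup` is a hypothetical helper rather than the trainer's own code.

```python
import math


def linear_schedule_with_warmup(step, total_steps, warmup_steps, peak_lr):
    """Learning rate at a given optimizer step: linear ramp up, then linear decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Assumed: one epoch, single device, no gradient accumulation.
total_steps = math.ceil(2144 / 16)           # 134 optimizer steps
warmup_steps = math.ceil(0.1 * total_steps)  # warmup_ratio 0.1, rounded up
peak_lr = 5e-05

lr_start = linear_schedule_with_warmup(0, total_steps, warmup_steps, peak_lr)
lr_peak = linear_schedule_with_warmup(warmup_steps, total_steps, warmup_steps, peak_lr)
lr_end = linear_schedule_with_warmup(total_steps, total_steps, warmup_steps, peak_lr)
```

Under these assumptions the rate climbs from zero to 5e-05 over the first 14 steps and then decays linearly back to zero by step 134.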

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

| Epoch  | Step | Training Loss | Evaluation Loss |
|--------|------|---------------|-----------------|
| 0.7463 | 100  | 0.5551        | 0.0665          |
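
The fractional epoch in the log can be cross-checked against the dataset size and batch size. The short sketch below assumes a single device, gradient_accumulation_steps of 1 (as listed above), and that the batch sampler does not drop any batches; `epoch_at_step` is a hypothetical helper.

```python
import math


def epoch_at_step(step, num_samples, batch_size):
    """Fraction of an epoch completed after a given number of optimizer steps."""
    steps_per_epoch = math.ceil(num_samples / batch_size)
    return step / steps_per_epoch


# 2,144 training samples at batch size 16 give 134 steps per epoch.
logged_epoch = round(epoch_at_step(100, 2144, 16), 4)
```

Under those assumptions, step 100 corresponds to about 0.7463 of the single training epoch, which agrees with the logged value.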

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.0.1
  • Transformers: 4.41.2
  • PyTorch: 2.3.0+cu121
  • Accelerate: 0.32.1
  • Datasets: 2.20.0
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply}, 
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}