metadata
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:182
- loss:CosineSimilarityLoss
base_model: sentence-transformers/all-MiniLM-L6-v2
widget:
- source_sentence: What documents must contractors/vendors provide?
sentences:
- >-
1. ESH representatives will carry out the training when new employees
need to be trained, or on an annual basis.
- >-
1. Safe Operating Procedure (SOP).
2. Risk Assessment ( Hazard Identification, Risk Assessment, & Risk
control / HIRARC) / JSA / Job Safety Analysis.
3. Valid licenses (If applicable).
4. Certification of Fitness-CF (For all types of cranes).
5. Crane Operator Competency License. (If applicable).
6. All scaffolding must be erected as per the statutory regulations.
7. Lifting Supervisor Competency Certificate. (If applicable).
8. Signal Man Competency Certificate. (If applicable.
9. Rigger Competency Certificate. (If applicable).
10. Lifting plan (If applicable).
11. Scaffolder Level 1/2/3 Certificate. (If applicable).
- >-
1. To ensure the specific employees are aware of the correct procedures
associated with chemical handling and waste management.
- source_sentence: What is the guideline for shirts and blouses?
sentences:
- >-
1. ESH representatives will carry out the training when new employees
need to be trained, or on an annual basis.
- 1. Employees in CLEAN ROOM are NOT ALLOWED to use/wear makeup/bangles.
- |-
1. 1. Formal or casual shirts with sleeves.
2. 2. Collared T-shirts and blouses/sleeveless tops (for ladies).
3. 3. Round-neck T-shirts are allowed for non-office personnel.
4. 4. Clothing with the company logo is encouraged.
5. 5. Sport Team.
6. 6. University.
7. 7. Fashion brands on clothing are generally acceptable.
- source_sentence: >-
What is the lunch schedule for the 1st shift in the normal schedule in
M-site?
sentences:
- 12 days.
- >-
1. Categorization of Machine: Identify the location of the machine, its
function, and all necessary items needed for it to run (e.g.,
lubricants, saw blades, etc).
2. Authorization: Ensure that all personnel operating the machine have
received the appropriate training.
3. Hazard & Risks associated with
equipment/machinery/techniques/process: Identify all hazards and risks
associated, and implement sufficient controls according to the hierarchy
of controls (e.g., warning labels and symbols).
4. Pre-work procedure: Ensure that the machine is in proper, running
condition before starting work.
5. During work procedure: Follow the correct standard operating
procedure for carrying out that work activity.
6. After work procedure: Ensure that the machine remains in a neat and
tidy condition at all times.
7. Work Area: Identify the area where the work is being done.
8. PPE: Ensure that appropriate PPE is available for all personnel
handling the machine.
9. Emergency Procedure: Ensure sufficient emergency features are
available on the machine (e.g., emergency stop button).
10. After work hour: Ensure the machine system is in shutdown/standby
mode when the machine is not running.
11. Housekeeping: Ensure basic housekeeping is done at the work area.
12. Scheduled waste: Any scheduled waste generated by the process should
be disposed of according to Carsem waste management procedure.
- >-
1. Lunch (Tengah Hari) for the 1st shift is from 12:00 PM to 1:00 PM,
lasting 60 minutes.
- source_sentence: What is the meal schedule for M-site?
sentences:
- 2 days.
- >-
1. 1st Shift: -Dinner (Malam): 8:00PM - 8:40PM, -Supper(Lewat Malam):
1:00AM - 1:30 AM -Breakfast(Pagi): 8:00AM - 8:30AM -Lunch(Tengah Hari):
12:50PM - 1:30PM.
2. 2nd Shift: -Dinner(Malam): 8:50PM - 9:30PM -Supper(Lewat Malam):
1:40AM - 2:10AM -Breakfast(Pagi): 8:40AM - 9:10AM -Lunch(Tengah Hari):
1:40PM - 2:20PM.
3. 3rd Shift: -Dinner(Malam): 9:40PM - 10:20PM -Supper(Lewat Malam):
2:20AM - 2:50AM -Breakfast(Pagi): 9:20AM - 9:50AM -Lunch(Tengah Hari):
2:30PM - 3:10PM.
4. 4th Shift: -Dinner(Malam): 10:30PM - 11:10PM -Supper(Lewat Malam):
3:00AM - 3:30AM -Breakfast(Pagi): 10:00AM - 10:30AM -Lunch(Tengah Hari):
3:20PM - 4:00PM.
- >-
1. The mechanical safety guidelines include:
2. 1. Lock-Out Tag-Out (LOTO): Always practice LOTO procedures when
performing maintenance or repairs on machines.
3. 2. Preventive Maintenance: Conduct regular preventive maintenance on
all machinery to ensure proper functioning.
4. 3. Pinch Points Awareness: Identify all possible pinch points on
machinery, and ensure they are properly labeled.
5. 4. Production Area Organization: Keep the production area neat and
organized at all times.
6. 5. Operator Training: Provide adequate training to operators before
allowing them to handle machines.
7. 6. Machine Guarding: Ensure all safety guards are in place before
starting machine operations.
- source_sentence: Can employees wear traditional attire?
sentences:
- >-
1. N03 : Monday to Friday, 8am to 5:30pm.
2. N04 : Tuesday to Saturday, 8am to 5:30pm.
3. N05 : Monday to Friday, 8:30am to 6pm.
4. N06 : Monday to Friday, 9am to 6:30pm.
5. N07 : Tuesday to Saturday, 8:30am to 6pm.
6. N08 : Tuesday to Saturday, 9am to 6.30pm.
7. N6 : Tuesday to Saturday, 8:30pm to 6:15pm.
8. N9: 5 working days 2 days off, 7:30am to 5:15pm , 10:30am to 8:15pm.
9. N10: 5 working days 2 days off, 10:30am to 8:15pm , 7:30am to 5:15pm.
10. AA/BB/CC/A/B/C : 4 working days 2 days off, 6:30am to 6:30pm ,
6:30pm to 6:30am.
11. AA1/BB1/CC1/A1/B1/C1 : 4 working days 2 days off, 6:30am to 6:30pm ,
6:30pm to 6:30am.
12. GG/HH/II/GG1/HH1/II1 : 4 working days 2 days off, 7:30am to 7:30pm ,
7:30pm to 7:30am.
13. P1 : Monday to Thursday (4 working days 2 days off), 6:30am to
6:30pm , 6:30pm to 6:30am.
14. P2 : Tuesday to Friday (4 working days 2 days off), 6:30am to 6:30pm
, 6:30pm to 6:30am.
15. U1/U2/U3/UU1/UU2/UU3 : 4 working days 2 days off, 7:30am to 7.30pm.
16. V1/V2/V3/VV1/VV2/VV3 : 4 working days 2 days off, 8.30am to 8.30pm.
17. W1/W2/W3/WW1/WW2/WW3 : 4 working days 2 days off, 6.30am to 6.30pm.
18. H1 : Monday to Thursday (4 working days 2 days off), 6.30am to
6.30pm.
19. H2 : Tuesday to Friday (4 working days 2 days off), 6.30am to
6.30pm.
20. H3 : Wednesday to Saturday (4 working days 2 days off), 6.30am to
6.30pm.
21. H6(applicable in S only) : Monday to Thursday (4 working days 2 days
off), 7.30am to 7.30pm.
22. H6(applicable in M only) : Monday to Thursday (4 working days 2 days
off), 7.30am to 7.30pm.
- >-
1. 1st Shift: -Dinner (Malam): 8:00PM - 8:40PM, -Supper(Lewat Malam):
1:00AM - 1:30 AM -Breakfast(Pagi): 8:30AM - 9:00AM -Lunch(Tengah Hari):
1:40PM - 2:20PM.
2. 2nd Shift: -Dinner(Malam): 8:50PM - 9:30PM -Supper(Lewat Malam):
1:40AM - 2:10AM -Breakfast(Pagi): 9:10AM - 9:40AM -Lunch(Tengah Hari):
2:30PM - 3:10PM.
3. 3rd Shift: -Dinner(Malam): 9:40PM - 10:20PM -Supper(Lewat Malam):
2:20AM - 2:50AM -Breakfast(Pagi): 9:50AM - 10:20AM -Lunch(Tengah Hari):
3:20PM - 4:00PM.
- |-
1. 1. Yes, acceptable traditional attire includes:
2. 1. Malaysian Traditional Attire.
3. 2.Malay Baju Kurung.
4. 3. Baju Melayu for Muslim men.
5. 4.Indian Saree.
6. 5. Punjabi Suit.
7. Chinese Cheongsam are acceptable.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2
This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: sentence-transformers/all-MiniLM-L6-v2
- Maximum Sequence Length: 256 tokens
- Output Dimensionality: 384 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
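The architecture above chains three stages: token embeddings from the Transformer, mean pooling (`pooling_mode_mean_tokens: True`), and L2 normalization. The following is a minimal pure-Python sketch of the last two stages only, using tiny made-up vectors instead of real 384-dimensional BERT outputs. The helper names `mean_pool` and `l2_normalize` are illustrative and are not part of the library.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is ignored)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    return [value / max(count, 1) for value in total]

def l2_normalize(vector, eps=1e-12):
    """Scale a vector to unit length, as the Normalize() stage does."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / max(norm, eps) for v in vector]

# Toy input: two real tokens and one padding token (mask 0).
tokens = [[1.0, 2.0], [3.0, 6.0], [100.0, 100.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)
sentence_embedding = l2_normalize(pooled)
```

Because the final vectors have unit length, the cosine similarity between two embeddings equals their dot product.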
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("PeYing/model1_v2")
# Run inference
sentences = [
'Can employees wear traditional attire?',
'1. 1. Yes, acceptable traditional attire includes: \n2. 1. Malaysian Traditional Attire. \n3. 2.Malay Baju Kurung. \n4. 3. Baju Melayu for Muslim men. \n5. 4.Indian Saree. \n6. 5. Punjabi Suit. \n7. Chinese Cheongsam are acceptable.',
'1. N03 : Monday to Friday, 8am to 5:30pm.\n2. N04 : Tuesday to Saturday, 8am to 5:30pm.\n3. N05 : Monday to Friday, 8:30am to 6pm.\n4. N06 : Monday to Friday, 9am to 6:30pm.\n5. N07 : Tuesday to Saturday, 8:30am to 6pm.\n6. N08 : Tuesday to Saturday, 9am to 6.30pm.\n7. N6 : Tuesday to Saturday, 8:30pm to 6:15pm.\n8. N9: 5 working days 2 days off, 7:30am to 5:15pm , 10:30am to 8:15pm.\n9. N10: 5 working days 2 days off, 10:30am to 8:15pm , 7:30am to 5:15pm.\n10. AA/BB/CC/A/B/C : 4 working days 2 days off, 6:30am to 6:30pm , 6:30pm to 6:30am.\n11. AA1/BB1/CC1/A1/B1/C1 : 4 working days 2 days off, 6:30am to 6:30pm , 6:30pm to 6:30am.\n12. GG/HH/II/GG1/HH1/II1 : 4 working days 2 days off, 7:30am to 7:30pm , 7:30pm to 7:30am.\n13. P1 : Monday to Thursday (4 working days 2 days off), 6:30am to 6:30pm , 6:30pm to 6:30am.\n14. P2 : Tuesday to Friday (4 working days 2 days off), 6:30am to 6:30pm , 6:30pm to 6:30am. \n15. U1/U2/U3/UU1/UU2/UU3 : 4 working days 2 days off, 7:30am to 7.30pm. \n16. V1/V2/V3/VV1/VV2/VV3 : 4 working days 2 days off, 8.30am to 8.30pm. \n17. W1/W2/W3/WW1/WW2/WW3 : 4 working days 2 days off, 6.30am to 6.30pm. \n18. H1 : Monday to Thursday (4 working days 2 days off), 6.30am to 6.30pm. \n19. H2 : Tuesday to Friday (4 working days 2 days off), 6.30am to 6.30pm. \n20. H3 : Wednesday to Saturday (4 working days 2 days off), 6.30am to 6.30pm. \n21. H6(applicable in S only) : Monday to Thursday (4 working days 2 days off), 7.30am to 7.30pm. \n22. H6(applicable in M only) : Monday to Thursday (4 working days 2 days off), 7.30am to 7.30pm.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
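For question answering over policy text, the similarity scores are typically used to rank candidate answers against a query. The sketch below shows that ranking step with small hand-written vectors standing in for embeddings, so it runs without downloading the model. The function `rank_by_cosine` is a hypothetical helper, not a library call. With the real model, the query and candidate vectors would be rows of `model.encode(...)`.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_by_cosine(query_vec, candidate_vecs):
    """Return (candidate index, score) pairs, best match first."""
    scores = [(index, cosine(query_vec, vec)) for index, vec in enumerate(candidate_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Toy stand-ins for embeddings: candidate 0 points roughly the same way as
# the query, candidate 1 points away from it.
query = [1.0, 0.0, 1.0]
candidates = [[0.9, 0.1, 1.1], [-1.0, 0.5, 0.0]]

ranking = rank_by_cosine(query, candidates)
best_index = ranking[0][0]
```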
Training Details
Training Dataset
Unnamed Dataset
- Size: 182 training samples
- Columns: sentence_0, sentence_1, and label
- Approximate statistics based on the first 182 samples:

| | sentence_0 | sentence_1 | label |
|---|---|---|---|
| type | string | string | int |
| details | min: 7 tokens; mean: 14.43 tokens; max: 36 tokens | min: 5 tokens; mean: 53.8 tokens; max: 256 tokens | 1: 100.00% |
- Samples:
Sample 1
sentence_0: List out all the work schedule for Carsem.
sentence_1:
1. N03 : Monday to Friday, 8am to 5:30pm.
2. N04 : Tuesday to Saturday, 8am to 5:30pm.
3. N05 : Monday to Friday, 8:30am to 6pm.
4. N06 : Monday to Friday, 9am to 6:30pm.
5. N07 : Tuesday to Saturday, 8:30am to 6pm.
6. N08 : Tuesday to Saturday, 9am to 6.30pm.
7. N6 : Tuesday to Saturday, 8:30pm to 6:15pm.
8. N9: 5 working days 2 days off, 7:30am to 5:15pm , 10:30am to 8:15pm.
9. N10: 5 working days 2 days off, 10:30am to 8:15pm , 7:30am to 5:15pm.
10. AA/BB/CC/A/B/C : 4 working days 2 days off, 6:30am to 6:30pm , 6:30pm to 6:30am.
11. AA1/BB1/CC1/A1/B1/C1 : 4 working days 2 days off, 6:30am to 6:30pm , 6:30pm to 6:30am.
12. GG/HH/II/GG1/HH1/II1 : 4 working days 2 days off, 7:30am to 7:30pm , 7:30pm to 7:30am.
13. P1 : Monday to Thursday (4 working days 2 days off), 6:30am to 6:30pm , 6:30pm to 6:30am.
14. P2 : Tuesday to Friday (4 working days 2 days off), 6:30am to 6:30pm , 6:30pm to 6:30am.
15. U1/U2/U3/UU1/UU2/UU3 : 4 working days 2 days off, 7:30am to 7.30pm.
16. V1/V2/V3/VV1/VV...
label: 1

Sample 2
sentence_0: What is the maximum allowed working hours in a week?
sentence_1:
1. Employees are not allowed to work more than 60 hours in a week inclusive of overtime and 1 rest day per week. Company will monitor overtime and rest day utilization and take appropriate action to address instances deemed excessive.
label: 1

Sample 3
sentence_0: Why the company is not allowed working hours in a week more than 60 hours?
sentence_1:
1. Continuous overtime causes worker strain that may lead to reduced productivity, increased turnover and increased injury and illnesses.
label: 1
- Loss: CosineSimilarityLoss with these parameters: { "loss_fct": "torch.nn.modules.loss.MSELoss" }
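With `loss_fct` set to `MSELoss`, CosineSimilarityLoss compares the cosine similarity of each sentence pair against its label and penalizes the squared difference. The sketch below reproduces that arithmetic in plain Python on toy two-dimensional vectors. The function names are illustrative, and the real loss operates on batched tensors of model embeddings.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def cosine_similarity_loss(pairs, labels):
    """Mean squared error between cosine(u, v) and the gold label over a batch."""
    errors = [(cosine(u, v) - label) ** 2 for (u, v), label in zip(pairs, labels)]
    return sum(errors) / len(errors)

# Two toy pairs, both labeled 1.0 as in this dataset:
# identical vectors (cosine 1) and orthogonal vectors (cosine 0).
batch = [([1.0, 0.0], [1.0, 0.0]), ([1.0, 0.0], [0.0, 1.0])]
labels = [1.0, 1.0]

loss = cosine_similarity_loss(batch, labels)  # (0**2 + 1**2) / 2
```

Since every label in the training set is 1, this objective only rewards pulling each question toward its paired answer. The data contains no pairs with lower target similarity.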
Training Hyperparameters
Non-Default Hyperparameters
- per_device_train_batch_size: 1
- per_device_eval_batch_size: 1
- num_train_epochs: 1
- multi_dataset_batch_sampler: round_robin
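These settings imply a short run. As a rough sketch, the number of optimizer steps can be estimated from the dataset size (182 samples), the batch size, and the epoch count. The helper below is hypothetical. It assumes a single device and a gradient accumulation of 1, and the rounding convention is an assumption that has no effect at batch size 1.

```python
import math

def total_optimizer_steps(num_samples, per_device_batch_size, num_epochs,
                          gradient_accumulation_steps=1, num_devices=1):
    """Estimate optimizer steps for a run (single-device default is an assumption)."""
    batches_per_epoch = math.ceil(num_samples / (per_device_batch_size * num_devices))
    steps_per_epoch = math.ceil(batches_per_epoch / gradient_accumulation_steps)
    return steps_per_epoch * num_epochs

# 182 training samples, batch size 1, one epoch.
steps = total_optimizer_steps(num_samples=182, per_device_batch_size=1, num_epochs=1)
```

Under these assumptions each training pair is seen exactly once and produces one weight update.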
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: no
- prediction_loss_only: True
- per_device_train_batch_size: 1
- per_device_eval_batch_size: 1
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
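Among the values above, `learning_rate: 5e-05`, `lr_scheduler_type: linear`, and `warmup_steps: 0` together describe a learning rate that starts at its peak and decays linearly to zero. The sketch below evaluates that schedule in plain Python. The function `linear_lr` is illustrative, and the total of 182 steps is an assumption (182 samples, batch size 1, one epoch, single device).

```python
def linear_lr(step, total_steps, base_lr=5e-05, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# Learning rate at the start, midpoint, and end of an assumed 182-step run.
total_steps = 182
schedule = [linear_lr(step, total_steps) for step in (0, 91, 182)]
```

With no warmup, the very first update already uses the full learning rate, and the rate halves by the midpoint of the run.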
Framework Versions
- Python: 3.11.11
- Sentence Transformers: 3.4.1
- Transformers: 4.48.2
- PyTorch: 2.5.1+cu124
- Accelerate: 1.2.1
- Datasets: 3.2.0
- Tokenizers: 0.21.0
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}