SentenceTransformer based on sentence-transformers/all-mpnet-base-v2

This is a sentence-transformers model finetuned from sentence-transformers/all-mpnet-base-v2. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-mpnet-base-v2
  • Maximum Sequence Length: 384 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 384, 'do_lower_case': False}) with Transformer model: MPNetModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
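The three stages above (transformer encoding, mean pooling, L2 normalization) can be sketched in plain NumPy. The token embeddings and attention mask below are made-up stand-ins for real MPNet output, just to show what the Pooling and Normalize modules compute:

```python
import numpy as np

# Hypothetical output of the Transformer stage: 3 tokens, embedding dim 2
# (the real model produces up to 384 tokens of dimension 768).
token_embeddings = np.array([[1.0, 2.0],
                             [3.0, 4.0],
                             [0.0, 0.0]])
attention_mask = np.array([1, 1, 0])  # the last token is padding

# (1) Pooling: mean over non-padding tokens only
mask = attention_mask[:, None]
sentence_embedding = (token_embeddings * mask).sum(axis=0) / mask.sum()

# (2) Normalize: scale to unit L2 norm, so a dot product equals cosine similarity
sentence_embedding = sentence_embedding / np.linalg.norm(sentence_embedding)
print(sentence_embedding)
```

Because of the Normalize stage, every embedding the model returns has length 1.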

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("maashimho/tuned_for_project")
# Run inference
sentences = [
    "TechnicalProficiencies DB: Oracle 11g Domains: Investment Banking, Advertising, Insurance. Programming Skills: SQL, PLSQL BI Tools: Informatica 9.1 OS: Windows, Unix Professional Development Trainings â\x80¢ Concepts in Data Warehousing, Business Intelligence, ETL. â\x80¢ BI Tools -Informatica 9X Education Details \r\n BCA  Nanded, Maharashtra Nanded University\r\nETL Developer \r\n\r\nETL Developer - Sun Trust Bank NY\r\nSkill Details \r\nETL- Exprience - 39 months\r\nEXTRACT, TRANSFORM, AND LOAD- Exprience - 39 months\r\nINFORMATICA- Exprience - 39 months\r\nORACLE- Exprience - 39 months\r\nUNIX- Exprience - 39 monthsCompany Details \r\ncompany - Sun Trust Bank NY\r\ndescription - Sun Trust Bank, NY JAN 2018 to present\r\nClient: Sun Trust Bank NY\r\nEnvironment: Informatica Power Center 9.1, Oracle 11g, unix.\r\n\r\nRole: ETL Developer\r\n\r\nProject Profile:\r\nSun Trust Bank is a US based multinational financial services holding company, headquarters in NY that operates the Bank in New York and other financial services investments. 
The company is organized as a stock corporation with four divisions: investment banking, private banking, Retail banking and a shared services group that provides\r\nFinancial services and support to the other divisions.\r\nThe objective of the first module was to create a DR system for the bank with a central point of communication and storage for Listed, Cash securities, Loans, Bonds, Notes, Equities, Rates, Commodities, and\r\nFX asset classes.\r\nContribution / Highlights:\r\n\r\nâ\x80¢ Liaising closely with Project Manager, Business Analysts, Product Architects, and Requirements Modelers (CFOC) to define Technical requirements and create project documentation.\r\nâ\x80¢ Development using Infa 9.1, 11g/Oracle, UNIX.\r\nâ\x80¢ Use Informatica PowerCenter for extraction, transformation and loading (ETL) of data in the Database.\r\nâ\x80¢ Created and configured Sessions in Informatica workflow Manager for loading data into Data base tables from various heterogeneous database sources like Flat Files, Oracle etc.\r\nâ\x80¢ Unit testing and system integration testing of the developed mappings.\r\nâ\x80¢ Providing production Support of the deployed code.\r\nâ\x80¢ Providing solutions to the business for the Production issues.\r\nâ\x80¢ Had one to One interaction with the client throughout the project and in daily meetings.\r\n\r\nProject #2\r\ncompany - Marshall Multimedia\r\ndescription - JUN 2016 to DEC 2017\r\n\r\nClient: Marshall Multimedia\r\nEnvironment: Informatica Power Center 9.1, Oracle 11g, unix.\r\n\r\nRole: ETL Developer\r\n\r\nProject Profile:\r\nMarshall Multimedia is a US based multimedia advertisement services based organization which has\r\nhead courter in New York. 
EGC interface systems are advert management, Customer Management, Billing and\r\nProvisioning Systems for Consumer& Enterprise Customers.\r\nThe main aim of the project was to create an enterprise data warehouse which would suffice the need of reports belonging to the following categories: Financial reports, management reports and\r\nrejection reports. The professional reports were created by Cognos and ETL work was performed by\r\nInformatica. This project is to load the advert details and magazine details coming in Relational tables into data warehouse and calculate the compensation and incentive amount monthly twice as per business\r\nrules.\r\n\r\nContribution / Highlights:\r\nâ\x80¢ Developed mappings using different sources by using Informatica transformations.\r\nâ\x80¢ Created and configured Sessions in Informatica workflow Manager for loading data into Data Mart tables from various heterogeneous database sources like Flat Files, Oracle etc.\r\n\r\n2\r\nâ\x80¢ Unit testing and system integration testing of the developed mappings.\r\nâ\x80¢ Providing solutions to the business for the Production issues.\r\n\r\nProject #3\r\ncompany - Assurant healthcare/Insurance Miami USA\r\ndescription - Assurant, USA                                                                                                    NOV 2015 to MAY 2016\r\n\r\nProject: ACT BI - State Datamart\r\nClient: Assurant healthcare/Insurance Miami USA\r\nEnvironment: Informatica Power Center 9.1, Oracle 11g, unix.\r\n\r\nRole: ETL Developer\r\n\r\nProject Profile:\r\nAssurant, Inc. is a holding company with businesses that provide a diverse set of specialty, niche-market insurance\r\nproducts in the property, casualty, life and health insurance sectors. The company's four operating segments are Assurant\r\nEmployee Benefits, Assurant Health, Assurant Solutions and Assurant Specialty Property.\r\nThe project aim at building State Datamart for enterprise solution. 
I am part of team which is responsible for ETL\r\nDesign & development along with testing.\r\n\r\nContribution / Highlights:\r\nâ\x80¢   Performed small enhancement\r\nâ\x80¢   Daily load monitoring\r\nâ\x80¢   Attend to Informatica job failures by analyzing the root cause, resolving the failure using standard\r\ndocumented process.\r\nâ\x80¢   Experience in writing SQL statements.\r\nâ\x80¢   Strong Problem Analysis & Resolution skills and ability to work in Multi Platform Environments\r\nâ\x80¢   Scheduled the Informatica jobs using Informatica scheduler\r\nâ\x80¢   Extensively used ETL methodology for developing and supporting data extraction, transformations and loading process, in a corporate-wide-ETL Solution using Informatica.\r\nâ\x80¢   Involved in creating the Unit cases and uploaded in to Quality Center for Unit Testing and UTR\r\nâ\x80¢   Ensure that daily support tasks are done in accordance with the defined SLA.",
    'I am looking for an opportunity that would provide me with a chance to learn and enhance my skills in the Oracle Financials domain. I have 4+ years of experience in the domain and have worked with various clients. I have been working in the finance domain for 9+ years. I have worked in Oracle Apps Financials and have experience in Oracle Financials 11i, R12. I am also proficient in Financial Services.',
    "The incumbent would be responsible for testing and maintenance of the Transformers, BPCB's, Transformer, PCC, MCC, HV cables, LV cables with respect to the electrical and mechanical aspects.\n\nJob Requirements:\n- B.E. / B.Tech. (Electrical/Mechanical) with minimum 60% aggregate.\n- Minimum 2 years of experience in testing and maintenance of transformers, BPCB's, Transformer, PCC, HV cables, LV cables.\n- Knowledge of transformer ratio test, transformer vector group test, transformer magnetic balance test, transformer tripping protection command, etc.\n- Knowledge of working of electrical/mechanical systems and related components (like motors, starters, etc.)\n- Knowledge of electrical/mechanical maintenance of transformers etc.\n- Ability to check transformer/MCC/PCC/HV cables/LV cables for defects and to work on them to fix",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
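Because of the final Normalize module, every embedding has unit length, so cosine similarity reduces to a plain dot product. A minimal semantic-search sketch over precomputed embeddings; the random unit vectors below are stand-ins for real `model.encode` output:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for model.encode(corpus): five unit-normalized 768-d vectors
corpus = rng.normal(size=(5, 768))
corpus /= np.linalg.norm(corpus, axis=1, keepdims=True)

# A query embedding that is a slightly perturbed copy of document 2
query = corpus[2] + 0.01 * rng.normal(size=768)
query /= np.linalg.norm(query)

scores = corpus @ query        # dot product == cosine similarity for unit vectors
ranking = np.argsort(-scores)  # best match first
print(ranking[:3])             # document 2 should rank first
```

With real inputs, `corpus` would hold job descriptions (or resumes) and `query` the text to match against them.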

Evaluation

Metrics

Semantic Similarity

  • pearson_cosine: 0.8837
  • spearman_cosine: 0.8724
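pearson_cosine and spearman_cosine are the Pearson and Spearman rank correlations between the model's cosine scores and the gold similarity labels. A self-contained sketch of the computation with made-up numbers (not the actual evaluation set):

```python
import numpy as np

def spearman(a, b):
    # Spearman correlation = Pearson correlation of the rank-transformed scores
    ra = np.argsort(np.argsort(a)).astype(float)
    rb = np.argsort(np.argsort(b)).astype(float)
    return np.corrcoef(ra, rb)[0, 1]

gold = np.array([0.1, 0.9, 0.5, 0.7])   # hypothetical similarity labels
pred = np.array([0.2, 0.8, 0.4, 0.75])  # hypothetical cosine scores

print(np.corrcoef(gold, pred)[0, 1])  # pearson_cosine analogue
print(spearman(gold, pred))           # identical orderings give 1.0
```

Spearman only cares about rank order, which is why it is the metric tracked in the training logs below.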

Training Details

Training Dataset

Unnamed Dataset

  • Size: 864 training samples
  • Columns: sentence_0, sentence_1, and label
  • Approximate statistics based on the first 864 samples:
    • sentence_0 (string): min 24 tokens, mean 316.25 tokens, max 384 tokens
    • sentence_1 (string): min 3 tokens, mean 164.37 tokens, max 218 tokens
    • label (float): min 0.0, mean 0.56, max 1.0
  • Samples:
    Sample 1
    • sentence_0: KEY SKILLS: • Computerized accounting with tally • Sincere & hard working • Management accounting & income tax • Good communication & leadership • Two and four wheeler driving license • Internet & Ecommerce management COMPUTER SKILLS: • C Language • Web programing • Tally • Dbms Education Details

    June 2017 to June 2019 Mba Finance/hr India Mlrit

    June 2014 to June 2017 Bcom Computer Hyderabad, Telangana Osmania university

    June 2012 to April 2014 Inter MEC India Srimedhav

    Hr


    Nani

    Skill Details

    accounting- Exprience - 6 months

    DATABASE MANAGEMENT SYSTEM- Exprience - 6 months

    Dbms- Exprience - 6 months

    Management accounting- Exprience - 6 months

    Ecommerce- Exprience - 6 monthsCompany Details

    company - Valuelabs

    description - They will give the RRF form the required DLT then the hand over to RLT then scrum master will take the form from the RLT then scrum master will give the forms to trainee which we can work on the requirement till the candidate rece...
    • sentence_1: We are looking for a hardworking and self-motivated candidate who can implement strategies to maximize sales. Key responsibilities will include:

    1. Sales and Customer Service:
    Identify and develop new customers and maintain a successful relationship with them. Develop sales strategies and objectives and work with the marketing team to ensure that sales are achieved. Coordinate sales efforts with the customer service team.
    2. Sales Administration:
    Coordinate sales with administrative functions and maintain records. Conducting market research and analyzing data. Prepare sales forecasts and reports.
    3. Business Management:
    Manage customer service team, sales team and marketing team to ensure sales and customer satisfaction are met. Develop a business strategy to achieve a competitive advantage in the marketplace.
    4. Sales Promotion:
    Develop, maintain and execute sales promotion plans.
    5. Sales Analysis:
    Analyze sales performance and develop sales strategies and objectives.

    Key Ski...
    • label: 0.5287528648371803
    Sample 2
    • sentence_0: IT SKILLS • Well versed with MS Office and Internet Applications and various ERP systems implemented in the company ie.SAGE, Flotilla, LM ERP, Tally 9, WMS, Exceed 4000 etc PERSONAL DOSSIER Permanent Address: Bandra West, Mumbai 400 050Education Details

    B.Com commerce Mumbai, Maharashtra Bombay University

    Mumbai, Maharashtra St. Andrews College

    DIM Business Management IGNOU

    Operations Manager


    Operations Manager - Landmark Insurance Brokers Pvt Ltd

    Skill Details

    EMPLOYEE RESOURCE GROUP- Exprience - 6 months

    ENTERPRISE RESOURCE PLANNING- Exprience - 6 months

    ERP- Exprience - 6 months

    MS OFFICE- Exprience - 6 months

    Tally- Exprience - 6 monthsCompany Details

    company - Landmark Insurance Brokers Pvt Ltd

    description - Jan 2019 till Date

    About the Company

    One of India Largest Insurance Brokerage firms with offices across 24 states PAN India and a part of the LandmarkGroup with an annual turnover of 2200 cr


    Position: Operations Manager

    Leading and overseeing a...
    • sentence_1:
    • A company with a very strong reputation for a high performance culture and strong customer focus is looking to recruit talented and motivated individuals to work within the Customer Service Team.
    • You will be responsible for handling customer enquiries and queries from a wide range of customers. You will be working with other teams within the company to ensure that customers have a seamless experience.
    • Your role will be to ensure that all customers are satisfied with the service they receive from the business.
    • You will be responsible for ensuring that all customer queries are handled in a timely manner to ensure that customers have a seamless experience with the business.
    • This role will require you to handle a high volume of calls and emails daily.
    • You will need to have a strong customer focus and be able to work in a fast paced environment.
    • You will need to be able
    • label: 0.3646167498289064
    Sample 3
    • sentence_0: TECHNICAL STRENGTHS Computer Language Java/J2EE, Swift, HTML, Shell script, MySQL Databases MySQL Tools SVN, Jenkins, Hudson, Weblogic12c Software Android Studio, Eclipse, Oracle, Xcode Operating Systems Win 10, Mac (High Sierra) Education Details

    June 2016 B.E. Information Technology Goregaon, MAHARASHTRA, IN Vidyalankar Institute of Technology

    May 2013 Mumbai, Maharashtra Thakur Polytechnic

    May 2010 Mumbai, Maharashtra St. John's Universal School

    Java developer


    Java developer - Tech Mahindra

    Skill Details

    JAVA- Exprience - 21 months

    MYSQL- Exprience - 21 months

    DATABASES- Exprience - 17 months

    J2EE- Exprience - 17 months

    ANDROID- Exprience - 6 monthsCompany Details

    company - Tech Mahindra

    description - Team Size: 5

    Environment: Java, Mysql, Shell script.

    Webserver: Jenkins.

    Description: OR-Formatter is an application which takes the input file as Geneva Modified File GMF from Geneva server and reads the data to generate Bill backup and Bill Invoices for Clie...
    • sentence_1: We are looking for a Java Developer to join our growing team. We will be looking for a highly skilled developer with experience in Java/J2EE, Shell script, HTML, MYSQL, Databases, Java Tools, Android, and iOS.

    TECHNICAL SKILL
    • label: 0.5360567140232494
  • Loss: CosineSimilarityLoss with these parameters:
    {
        "loss_fct": "torch.nn.modules.loss.MSELoss"
    }
    
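CosineSimilarityLoss computes the cosine similarity between the two sentence embeddings and compares it to the gold label using the MSELoss shown above. A minimal NumPy sketch of that objective (toy vectors, not real embeddings):

```python
import numpy as np

def cosine_similarity_loss(u, v, label):
    # CosineSimilarityLoss: squared error between cos(u, v) and the gold label
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return (cos - label) ** 2

u = np.array([1.0, 0.0])
v = np.array([1.0, 0.0])
print(cosine_similarity_loss(u, v, 1.0))  # identical vectors, label 1.0 → 0.0
```

During training, gradients of this loss pull the embeddings of each pair toward a cosine similarity matching its label.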

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • multi_dataset_batch_sampler: round_robin

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
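As a sketch, the non-default values above would be reproduced through `SentenceTransformerTrainingArguments` roughly as follows (a config fragment assuming sentence-transformers v3+; `output_dir` is a placeholder):

```python
from sentence_transformers.training_args import (
    SentenceTransformerTrainingArguments,
    MultiDatasetBatchSamplers,
)

# Hypothetical reconstruction of the non-default hyperparameters listed above;
# all other values are left at their framework defaults.
args = SentenceTransformerTrainingArguments(
    output_dir="output",  # placeholder path
    num_train_epochs=3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    eval_strategy="steps",
    multi_dataset_batch_sampler=MultiDatasetBatchSamplers.ROUND_ROBIN,
)
```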

Training Logs

Epoch Step validation_spearman_cosine
1.0 54 0.8040
1.8519 100 0.8637
2.0 108 0.8596
3.0 162 0.8724

Framework Versions

  • Python: 3.11.11
  • Sentence Transformers: 3.4.1
  • Transformers: 4.48.2
  • PyTorch: 2.5.1+cu124
  • Accelerate: 1.3.0
  • Datasets: 3.2.0
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
Model Size

  • maashimho/tuned_for_project: 109M parameters, F32 (Safetensors)