SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2

This is a sentence-transformers model fine-tuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-MiniLM-L6-v2
  • Maximum Sequence Length: 256 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity
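
For reference, the similarity function listed above can be sketched in plain Python. This is only an illustration of the formula, not the library's implementation; the helper name cosine_similarity and the toy vectors are assumptions made for this example.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy 3-dimensional vectors standing in for 384-dimensional embeddings.
print(cosine_similarity([1.0, 0.0, 0.0], [1.0, 0.0, 0.0]))  # 1.0
print(cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))  # 0.0
```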

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
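
The Pooling and Normalize modules above (mean pooling over tokens, then scaling to unit length) can be illustrated with a small pure-Python sketch. It assumes that padding positions are excluded through an attention mask; the function names and the 2-dimensional toy vectors are hypothetical and stand in for the 384-dimensional token embeddings.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors, skipping padding positions (mask == 0)."""
    dim = len(token_embeddings[0])
    kept = [vec for vec, m in zip(token_embeddings, attention_mask) if m == 1]
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]

def l2_normalize(vector):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

# Two real tokens and one padding token (masked out).
tokens = [[3.0, 0.0], [3.0, 8.0], [100.0, 100.0]]
mask = [1, 1, 0]
sentence = l2_normalize(mean_pool(tokens, mask))
print(sentence)  # [0.6, 0.8]
```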

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Nashhz/FLanceBERT-all-MiniLM-L6-v2")
# Run inference
sentences = [
    "I'm here to provide comprehensive support across targeted email collection, web research, market research, data mining, data scraping, and lead generation, SEO & WordPress Web Development. My Expertise Lead Generation B2B & B2C List Building LinkedIn Lead Generation Prospect Lists LinkedIn Data Entry & Data Mining Data Extraction & Scraping Data Collection Tools for Lead Generation LinkedIn Sales Navigator Premium Apollo Premium SalesQL Premium CrunchBase Pro Premium",
    "As a chemical manufacturing company, we're in need of a digital marketing expert who can help us generate leads and extend our reach to our target B2B customers. This project will primarily focus on LinkedIn, with additional SEO optimization for our website. Your tasks will include - Optimizing our LinkedIn profile for maximum visibility and engagement - Creating a variety of content for LinkedIn, including - Informative articles - Case studies - Promotional videos - Festival themed content - Implementing SEO strategies to improve our website's reach and lead generation potential Ideal skills and experience for the job include - Proven experience in B2B digital marketing, particularly on LinkedIn - Strong content creation skills - Expertise in SEO optimization - Familiarity with the chemical manufacturing industry is a plus",
    "I'm in need of an Excel expert with proficiency in VBA and macros. The primary tasks you'll be tackling include data analysis, reporting, and data manipulation on sales and inventory data. Key functions that the workbook should effectively perform includes - Effective data analysis and reporting. Your prowess in Excel should ensure seamless interpretation and presentation of data. - Automation of data manipulation. Your skills should ease the process of handling large volumes of data, automatically organizing and adjusting it as necessary. - Specific calculations to provide inventory tracking and forecasting insights. Your expertise will help me make informed business decisions based on precise and timely data analysis. Proven experience handling similar projects would be advantageous.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
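
A common next step is to rank candidate texts against a query, for example job postings against a freelancer profile. The sketch below is a minimal pure-Python illustration and not part of the library. It assumes unit-length vectors (the architecture ends with a Normalize module), so the dot product equals the cosine similarity. The name rank_by_similarity and the toy 2-dimensional vectors are hypothetical; in practice the rows of the embeddings array would take their place.

```python
def rank_by_similarity(query_vec, candidate_vecs):
    """Return (index, score) pairs sorted from most to least similar.

    Assumes every vector has unit length, so the dot product is the
    cosine similarity.
    """
    scores = [sum(q * c for q, c in zip(query_vec, vec)) for vec in candidate_vecs]
    return sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)

# Toy unit vectors standing in for one profile and three job postings.
profile = [1.0, 0.0]
jobs = [[0.0, 1.0], [0.6, 0.8], [1.0, 0.0]]
print(rank_by_similarity(profile, jobs))  # [(2, 1.0), (1, 0.6), (0, 0.0)]
```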

Training Details

Training Dataset

Unnamed Dataset

  • Size: 16,682 training samples
  • Columns: sentence_0, sentence_1, and label
  • Approximate statistics based on the first 1000 samples:
    • sentence_0 (string): min 4 tokens, mean 166.61 tokens, max 256 tokens
    • sentence_1 (string): min 5 tokens, mean 167.91 tokens, max 256 tokens
    • label (float): min 0.32, mean 0.72, max 1.0
  • Samples:
    • Sample 1
      sentence_0: I have been employed in this field for almost seven years, and I have knowledge of Graphic Design- - Adobe Photoshop - Adobe Illustrator - Blender - Live2d - Adobe After Effects - 2D Animation Explainer Video
      sentence_1: I'm in need of a skilled video editor specializing in 2D animation. The primary purpose of this video is entertainment, with the style being animated. The ideal freelancer for this project should have - Extensive experience in editing 2D animated videos - A strong understanding of timing and pacing for comedic effect - The ability to help elevate the quality of the footage If you have a keen eye for detail and a passion for animation, I'd love to see your portfolio and discuss how we can bring this project to life.
      label: 0.7088025808334351
    • Sample 2
      sentence_0: Hi, I am Anis. I'm a professional Graphic Designer and Social Media Expert with more than 5 years experience. I will design T-shirt, Logo, Facebook Page, Facebook cover,poster,Banner for your Business or fan page, Facebook Shop, Social Media Marketing. I will bring life to your expectations. My Services Logo Design Business Card Design Blog Design Poster Design Banner Design T-shirt Design Youtube ThumbnailChannel Art Facebook coverfan pageBusiness page Instagram storypost more Hi, I am Anis. I'm a professional Graphic Designer and Social Media Expert with more than 5 years experience. I will design T-shirt, Logo, Facebook Page, Facebook cover,poster,Banner for your Business or fan page, Facebook Shop, Social Media Marketing. I will bring life to your expectations. My Services Logo Design Business Card Design Blog Design Poster Design Banner Design T-shirt Design Youtube ThumbnailChannel Art Facebook coverfan pageBusiness page Instagram storypost Flyer Design Brochure Design Any kind of Invitation cardbirthday,anniversary etc If you have a specific requirement which is NOT listed above, write me and I'll most probably be able to help you I will bring life to your expectations
      sentence_1: I'm seeking a graphic designer to create clean, modern designs for my photography business. This will start with business cards and a flyer based on my existing branding. Key Responsibilities - Design of business cards and flyer - Ongoing design tasks The objective of these designs is primarily to generate leads. I have some ideas about my brand but I need your expertise to finalize everything. The business cards will include my logo, contact information, tagline, and social media handles. Ideal Skills and Experience - Proficient in graphic design software - Experience in creating modern business promotional materials - Strong understanding of lead generation through design - Ability to work with and refine existing brand guidelines - Excellent communication skills for collaborative brainstorming This role will be paid at an hourly rate, as there are likely to be ongoing small and larger tasks.
      label: 0.7025933265686035
    • Sample 3
      sentence_0: I'm a Full Stack Web Developer with 4 years of experience in building responsive and user-friendly web applications. I specialize in both front-end and back-end development, using technologies like HTML, CSS, JavaScript, Taillwind css, Bootstrap and Vue.js. I'm passionate about solving complex problems and creating seamless digital experiences. I thrive in collaborative environments and am always eager to learn and take on new challenges.
      sentence_1: I'm in need of a skilled Full Stack Developer for an urgent task involving the development of a based website. Key Requirements - Proficient in both front-end and back-end web development - Experienced in creating user-friendly, responsive and interactive websites - Knowledgeable in implementing SEO best practices - Able to ensure high performance and responsiveness of the website Ideal Skills - Proficiency in HTML, CSS, JavaScript, PHP, Python, or Ruby - Experience with frameworks like React, Angular, or Vue.js - Familiarity with database management systems like MySQL or MongoDB - Previous experience in developing a blog or content-based website is a plus Looking forward to your bids.
      label: 0.7718963623046875
  • Loss: CosineSimilarityLoss with these parameters:
    {
        "loss_fct": "torch.nn.modules.loss.MSELoss"
    }
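
As a rough illustration of what this loss measures: the cosine similarity of each (sentence_0, sentence_1) embedding pair is compared with its label, and the squared errors are averaged (MSELoss). The pure-Python sketch below shows only that arithmetic, not the library's implementation; the function names and toy vectors are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def cosine_similarity_mse_loss(pairs, labels):
    """Mean squared error between pairwise cosine similarities and labels."""
    predicted = [cosine(u, v) for u, v in pairs]
    return sum((p - y) ** 2 for p, y in zip(predicted, labels)) / len(labels)

# Two toy embedding pairs with target similarity labels.
pairs = [([1.0, 0.0], [1.0, 0.0]), ([1.0, 0.0], [0.0, 1.0])]
labels = [1.0, 0.5]
print(cosine_similarity_mse_loss(pairs, labels))  # 0.125
```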
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • num_train_epochs: 4
  • multi_dataset_batch_sampler: round_robin

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 4
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
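
With lr_scheduler_type: linear, learning_rate: 5e-05, and no warmup, the learning rate is expected to decay linearly from 5e-05 to 0 over training. The sketch below shows that arithmetic in plain Python; it is not the Transformers scheduler itself. The function name linear_lr is hypothetical, and the total of 4,172 steps is an assumption derived from 16,682 samples, batch size 16, and 4 epochs on a single device.

```python
def linear_lr(step, total_steps, base_lr=5e-05, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

TOTAL_STEPS = 4172  # assumption: ceil(16682 / 16) * 4 on one device

for step in (0, 1043, 2086, 4172):
    print(step, linear_lr(step, TOTAL_STEPS))
```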

Training Logs

Epoch Step Training Loss
0.4794 500 0.001
0.9588 1000 0.0004
1.4382 1500 0.0003
1.9175 2000 0.0003
2.3969 2500 0.0003
2.8763 3000 0.0003
3.3557 3500 0.0002
3.8351 4000 0.0002
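
The Epoch column can be reproduced from the dataset size and batch size reported above. The short calculation below assumes a single training device; gradient_accumulation_steps: 1 and dataloader_drop_last: False are taken from the hyperparameter list.

```python
import math

num_samples = 16_682   # training samples
batch_size = 16        # per_device_train_batch_size
num_epochs = 4         # num_train_epochs

# With drop_last disabled, the final partial batch still counts as a step.
steps_per_epoch = math.ceil(num_samples / batch_size)
total_steps = steps_per_epoch * num_epochs
print(steps_per_epoch, total_steps)  # 1043 4172

# Fractional epoch reached at a few of the logged steps.
for step in (500, 1000, 4000):
    print(step, round(step / steps_per_epoch, 4))
# 500 0.4794
# 1000 0.9588
# 4000 3.8351
```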

Framework Versions

  • Python: 3.12.6
  • Sentence Transformers: 3.2.0
  • Transformers: 4.45.2
  • PyTorch: 2.4.1+cpu
  • Accelerate: 1.0.1
  • Datasets: 3.0.1
  • Tokenizers: 0.20.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

Model size: 22.7M params (Safetensors, tensor type F32)