---
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:449904
  - loss:CosineSimilarityLoss
base_model: x2bee/ModernBERT-SimCSE-multitask_v03
widget:
  - source_sentence: 우리는 움직이는 동행 우주 정지 좌표계에 비례하여 이동하고 있습니다 ...  371km / s에서 별자리 leo 쪽으로. "
    sentences:
      - 두 마리의 독수리가 가지에 앉는다.
      - 다른 물체와는 관련이 없는 '정지'는 없다.
      - 소녀는 버스의 열린 문 앞에 서 있다.
  - source_sentence: 숲에는 개들이 있다.
    sentences:
      - 양을 보는 아이들.
      - 여왕의 배우자를 "왕"이라고 부르지 않는 것은 아주 좋은 이유가 있다. 왜냐하면 그들은 왕이 아니기 때문이다.
      - 개들은 숲속에 혼자 있다.
  - source_sentence: '첫째, 두 가지 다른 종류의 대시가 있다는 것을 알아야 합니다 : en 대시와 em 대시.'
    sentences:
      - 그들은  물건들을  주변에 두고 가거나 집의 정리를 해칠 의도가 없다.
      - 세미콜론은 혼자 있을 수 있는 문장에 참여하는데 사용되지만, 그들의 관계를 강조하기 위해 결합됩니다.
      - 그의 남동생이 지켜보는 동안  앞에서 트럼펫을 연주하는 금발의 아이.
  - source_sentence: 한 여성이 생선 껍질을 벗기고 있다.
    sentences:
      - 한 남자가 수영장으로 뛰어들었다.
      - 한 여성이 프라이팬에 노란 혼합물을 부어 넣고 있다.
      -  마리의 갈색 개가  속에서 서로 놀고 있다.
  - source_sentence: 버스가 바쁜 길을 따라 운전한다.
    sentences:
      - 우리와 같은 태양계가 은하계 밖에서 존재할 수도 있을 것입니다.
      - 그 여자는 데이트하러 가는 중이다.
      - 녹색 버스가 도로를 따라 내려간다.
datasets:
  - x2bee/misc_sts_pairs_v2_kor_kosimcse
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
  - pearson_cosine
  - spearman_cosine
  - pearson_euclidean
  - spearman_euclidean
  - pearson_manhattan
  - spearman_manhattan
  - pearson_dot
  - spearman_dot
  - pearson_max
  - spearman_max
model-index:
  - name: SentenceTransformer based on x2bee/ModernBERT-SimCSE-multitask_v03
    results:
      - task:
          type: semantic-similarity
          name: Semantic Similarity
        dataset:
          name: sts dev
          type: sts_dev
        metrics:
          - type: pearson_cosine
            value: 0.8319192467999278
            name: Pearson Cosine
          - type: spearman_cosine
            value: 0.8396159085327265
            name: Spearman Cosine
          - type: pearson_euclidean
            value: 0.8198226408074469
            name: Pearson Euclidean
          - type: spearman_euclidean
            value: 0.8285927601564604
            name: Spearman Euclidean
          - type: pearson_manhattan
            value: 0.8199114649719743
            name: Pearson Manhattan
          - type: spearman_manhattan
            value: 0.8295556212626334
            name: Spearman Manhattan
          - type: pearson_dot
            value: 0.7234705763545461
            name: Pearson Dot
          - type: spearman_dot
            value: 0.7094397491074207
            name: Spearman Dot
          - type: pearson_max
            value: 0.8319192467999278
            name: Pearson Max
          - type: spearman_max
            value: 0.8396159085327265
            name: Spearman Max
---

SentenceTransformer based on x2bee/ModernBERT-SimCSE-multitask_v03

This is a sentence-transformers model finetuned from x2bee/ModernBERT-SimCSE-multitask_v03 on the misc_sts_pairs_v2_kor_kosimcse dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: x2bee/ModernBERT-SimCSE-multitask_v03
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset: x2bee/misc_sts_pairs_v2_kor_kosimcse
  • Language: Korean

Model Sources

  • Documentation: Sentence Transformers Documentation (https://www.sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: ModernBertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Dense({'in_features': 768, 'out_features': 768, 'bias': True, 'activation_function': 'torch.nn.modules.activation.Tanh'})
)
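
The three modules above can also be assembled by hand from the base checkpoint. This is a minimal sketch for reference only; loading the published model as shown under Usage below is the normal path.

from torch import nn
from sentence_transformers import SentenceTransformer, models

# (0) ModernBERT encoder, truncating inputs at 512 tokens
transformer = models.Transformer("x2bee/ModernBERT-SimCSE-multitask_v03", max_seq_length=512)
# (1) mean pooling over token embeddings -> one 768-dimensional sentence vector
pooling = models.Pooling(transformer.get_word_embedding_dimension(), pooling_mode="mean")
# (2) dense projection 768 -> 768 with a Tanh activation
dense = models.Dense(in_features=768, out_features=768, activation_function=nn.Tanh())

rebuilt_model = SentenceTransformer(modules=[transformer, pooling, dense])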

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("x2bee/ModernBERT-SimCSE-multitask_v03-beta")
# Run inference
sentences = [
    '버스가 바쁜 길을 따라 운전한다.',
    '녹색 버스가 도로를 따라 내려간다.',
    '그 여자는 데이트하러 가는 중이다.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
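
The similarity matrix can also be used for simple ranking. A minimal sketch reusing the model loaded above, with the same example sentences treated as one query and two candidates:

# Rank the candidate sentences against the query by cosine similarity
query_embedding = model.encode(["버스가 바쁜 길을 따라 운전한다."])
candidate_embeddings = model.encode([
    "녹색 버스가 도로를 따라 내려간다.",
    "그 여자는 데이트하러 가는 중이다.",
])
scores = model.similarity(query_embedding, candidate_embeddings)  # tensor of shape [1, 2]
best_index = scores.argmax(dim=1).item()  # index of the highest-scoring candidate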

Evaluation

Metrics

Semantic Similarity

Metric Value
pearson_cosine 0.8319
spearman_cosine 0.8396
pearson_euclidean 0.8198
spearman_euclidean 0.8286
pearson_manhattan 0.8199
spearman_manhattan 0.8296
pearson_dot 0.7235
spearman_dot 0.7094
pearson_max 0.8319
spearman_max 0.8396
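
The sts_dev figures above come from an embedding-similarity evaluation over (sentence1, sentence2, score) triples. A minimal sketch of running such an evaluation with the library's EmbeddingSimilarityEvaluator; the two toy pairs are taken from the evaluation samples listed below and stand in for the full 1,500-pair dev set:

from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

# Gold scores are on a 0..1 scale, matching the dataset's score column
dev_evaluator = EmbeddingSimilarityEvaluator(
    sentences1=["안전모를 가진 한 남자가 춤을 추고 있다.", "어린아이가 말을 타고 있다."],
    sentences2=["안전모를 쓴 한 남자가 춤을 추고 있다.", "아이가 말을 타고 있다."],
    scores=[1.0, 0.95],
    name="sts_dev",
)
results = dev_evaluator(model)  # Pearson/Spearman for cosine, Euclidean, Manhattan, and dot similarities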

Training Details

Training Dataset

misc_sts_pairs_v2_kor_kosimcse

  • Dataset: misc_sts_pairs_v2_kor_kosimcse at e747415
  • Size: 449,904 training samples
  • Columns: sentence1, sentence2, and score
  • Approximate statistics based on the first 1000 samples:
    • sentence1 (string): min 6 tokens, mean 18.3 tokens, max 69 tokens
    • sentence2 (string): min 6 tokens, mean 18.69 tokens, max 66 tokens
    • score (float): min 0.11, mean 0.77, max 1.0
  • Samples (sentence1 | sentence2 | score):
    • 주홍글씨는 언제 출판되었습니까? | 《주홍글씨》는 몇 년에 출판되었습니까? | 0.8638778924942017
    • 폴란드에서 빨간색과 흰색은 무엇을 의미합니까? | 폴란드 국기의 색상은 무엇입니까? | 0.6773715019226074
    • 노르만인들은 방어를 위해 모트와 베일리 성을 어떻게 사용했는가? | 11세기에는 어떻게 모트와 베일리 성을 만들었습니까? | 0.7460665702819824
  • Loss: CosineSimilarityLoss with these parameters:
    {
        "loss_fct": "torch.nn.modules.loss.MSELoss"
    }
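
CosineSimilarityLoss embeds both sentences, computes the cosine similarity of the two vectors, and regresses it onto the gold score with the configured loss_fct (here MSELoss). A minimal sketch of loading the dataset and instantiating this loss, assuming the default train split and the column layout listed above; model refers to the SentenceTransformer loaded under Usage.

import torch
from datasets import load_dataset
from sentence_transformers.losses import CosineSimilarityLoss

# Columns: sentence1, sentence2, score (449,904 rows)
train_dataset = load_dataset("x2bee/misc_sts_pairs_v2_kor_kosimcse", split="train")
# MSE between cosine(embed(sentence1), embed(sentence2)) and the gold score
train_loss = CosineSimilarityLoss(model, loss_fct=torch.nn.MSELoss())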
    

Evaluation Dataset

Unnamed Dataset

  • Size: 1,500 evaluation samples
  • Columns: sentence1, sentence2, and score
  • Approximate statistics based on the first 1000 samples:
    • sentence1 (string): min 7 tokens, mean 20.38 tokens, max 52 tokens
    • sentence2 (string): min 6 tokens, mean 20.52 tokens, max 54 tokens
    • score (float): min 0.0, mean 0.42, max 1.0
  • Samples (sentence1 | sentence2 | score):
    • 안전모를 가진 한 남자가 춤을 추고 있다. | 안전모를 쓴 한 남자가 춤을 추고 있다. | 1.0
    • 어린아이가 말을 타고 있다. | 아이가 말을 타고 있다. | 0.95
    • 한 남자가 뱀에게 쥐를 먹이고 있다. | 남자가 뱀에게 쥐를 먹이고 있다. | 1.0
  • Loss: CosineSimilarityLoss with these parameters:
    {
        "loss_fct": "torch.nn.modules.loss.MSELoss"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • overwrite_output_dir: True
  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • gradient_accumulation_steps: 8
  • learning_rate: 8e-05
  • num_train_epochs: 2.0
  • warmup_ratio: 0.2
  • push_to_hub: True
  • hub_model_id: x2bee/ModernBERT-SimCSE-multitask_v03-beta
  • hub_strategy: checkpoint
  • batch_sampler: no_duplicates
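
Translated into code, the non-default hyperparameters listed above correspond roughly to the following setup with the Sentence Transformers 3.x trainer API. A minimal sketch: output_dir is a placeholder, and model, train_dataset, train_loss, dev_evaluator, and eval_dataset refer to the earlier sketches and the evaluation set described above.

from sentence_transformers import SentenceTransformerTrainer, SentenceTransformerTrainingArguments
from sentence_transformers.training_args import BatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="output/ModernBERT-SimCSE-multitask_v03-beta",  # placeholder path
    overwrite_output_dir=True,
    eval_strategy="steps",
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=8,
    learning_rate=8e-5,
    num_train_epochs=2.0,
    warmup_ratio=0.2,
    push_to_hub=True,
    hub_model_id="x2bee/ModernBERT-SimCSE-multitask_v03-beta",
    hub_strategy="checkpoint",
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,  # the 1,500-pair evaluation set described above
    loss=train_loss,
    evaluator=dev_evaluator,
)
trainer.train()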

All Hyperparameters

  • overwrite_output_dir: True
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 8
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 8e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 2.0
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.2
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: True
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: True
  • resume_from_checkpoint: None
  • hub_model_id: x2bee/ModernBERT-SimCSE-multitask_v03-beta
  • hub_strategy: checkpoint
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss sts_dev_spearman_max
0.0028 10 0.0216 - -
0.0057 20 0.0204 - -
0.0085 30 0.0194 - -
0.0114 40 0.0195 - -
0.0142 50 0.0182 - -
0.0171 60 0.0161 - -
0.0199 70 0.015 - -
0.0228 80 0.0153 - -
0.0256 90 0.0137 - -
0.0285 100 0.014 - -
0.0313 110 0.0122 - -
0.0341 120 0.0114 - -
0.0370 130 0.0109 - -
0.0398 140 0.0097 - -
0.0427 150 0.0085 - -
0.0455 160 0.0084 - -
0.0484 170 0.0083 - -
0.0512 180 0.0078 - -
0.0541 190 0.008 - -
0.0569 200 0.0073 - -
0.0597 210 0.0079 - -
0.0626 220 0.0073 - -
0.0654 230 0.0079 - -
0.0683 240 0.0068 - -
0.0711 250 0.0068 0.0333 0.8229
0.0740 260 0.0073 - -
0.0768 270 0.0077 - -
0.0797 280 0.0067 - -
0.0825 290 0.007 - -
0.0854 300 0.0065 - -
0.0882 310 0.0072 - -
0.0910 320 0.0068 - -
0.0939 330 0.0064 - -
0.0967 340 0.0074 - -
0.0996 350 0.0071 - -
0.1024 360 0.0065 - -
0.1053 370 0.0067 - -
0.1081 380 0.0063 - -
0.1110 390 0.0062 - -
0.1138 400 0.0068 - -
0.1166 410 0.0064 - -
0.1195 420 0.0064 - -
0.1223 430 0.0064 - -
0.1252 440 0.0074 - -
0.1280 450 0.0069 - -
0.1309 460 0.0065 - -
0.1337 470 0.0067 - -
0.1366 480 0.0068 - -
0.1394 490 0.0057 - -
0.1423 500 0.0065 0.0343 0.8284
0.1451 510 0.0069 - -
0.1479 520 0.0068 - -
0.1508 530 0.0065 - -
0.1536 540 0.0065 - -
0.1565 550 0.0063 - -
0.1593 560 0.0058 - -
0.1622 570 0.0064 - -
0.1650 580 0.0062 - -
0.1679 590 0.0061 - -
0.1707 600 0.0062 - -
0.1735 610 0.0057 - -
0.1764 620 0.0066 - -
0.1792 630 0.0061 - -
0.1821 640 0.0054 - -
0.1849 650 0.0066 - -
0.1878 660 0.0059 - -
0.1906 670 0.0063 - -
0.1935 680 0.0065 - -
0.1963 690 0.0065 - -
0.1992 700 0.0058 - -
0.2020 710 0.006 - -
0.2048 720 0.0062 - -
0.2077 730 0.0058 - -
0.2105 740 0.0058 - -
0.2134 750 0.0056 0.0356 0.8302
0.2162 760 0.0067 - -
0.2191 770 0.0063 - -
0.2219 780 0.0063 - -
0.2248 790 0.0063 - -
0.2276 800 0.0056 - -
0.2304 810 0.0058 - -
0.2333 820 0.0053 - -
0.2361 830 0.0057 - -
0.2390 840 0.0055 - -
0.2418 850 0.0054 - -
0.2447 860 0.0065 - -
0.2475 870 0.0054 - -
0.2504 880 0.0051 - -
0.2532 890 0.0057 - -
0.2561 900 0.0056 - -
0.2589 910 0.0055 - -
0.2617 920 0.0051 - -
0.2646 930 0.0055 - -
0.2674 940 0.0059 - -
0.2703 950 0.005 - -
0.2731 960 0.0058 - -
0.2760 970 0.005 - -
0.2788 980 0.0055 - -
0.2817 990 0.0054 - -
0.2845 1000 0.0055 0.0360 0.8319
0.2874 1010 0.0059 - -
0.2902 1020 0.0049 - -
0.2930 1030 0.0052 - -
0.2959 1040 0.0051 - -
0.2987 1050 0.006 - -
0.3016 1060 0.0048 - -
0.3044 1070 0.0055 - -
0.3073 1080 0.0052 - -
0.3101 1090 0.0051 - -
0.3130 1100 0.0051 - -
0.3158 1110 0.005 - -
0.3186 1120 0.0054 - -
0.3215 1130 0.0051 - -
0.3243 1140 0.0054 - -
0.3272 1150 0.0056 - -
0.3300 1160 0.0053 - -
0.3329 1170 0.0052 - -
0.3357 1180 0.0051 - -
0.3386 1190 0.0051 - -
0.3414 1200 0.0048 - -
0.3443 1210 0.005 - -
0.3471 1220 0.0055 - -
0.3499 1230 0.0049 - -
0.3528 1240 0.0053 - -
0.3556 1250 0.0052 0.0364 0.8330
0.3585 1260 0.0051 - -
0.3613 1270 0.005 - -
0.3642 1280 0.005 - -
0.3670 1290 0.0045 - -
0.3699 1300 0.0055 - -
0.3727 1310 0.0049 - -
0.3755 1320 0.0049 - -
0.3784 1330 0.0053 - -
0.3812 1340 0.005 - -
0.3841 1350 0.0048 - -
0.3869 1360 0.0049 - -
0.3898 1370 0.0046 - -
0.3926 1380 0.0049 - -
0.3955 1390 0.0052 - -
0.3983 1400 0.005 - -
0.4012 1410 0.0052 - -
0.4040 1420 0.0052 - -
0.4068 1430 0.0045 - -
0.4097 1440 0.0046 - -
0.4125 1450 0.0056 - -
0.4154 1460 0.0056 - -
0.4182 1470 0.005 - -
0.4211 1480 0.0051 - -
0.4239 1490 0.0049 - -
0.4268 1500 0.0048 0.0374 0.8334
0.4296 1510 0.0053 - -
0.4324 1520 0.0054 - -
0.4353 1530 0.0048 - -
0.4381 1540 0.005 - -
0.4410 1550 0.0045 - -
0.4438 1560 0.0046 - -
0.4467 1570 0.0045 - -
0.4495 1580 0.0049 - -
0.4524 1590 0.0048 - -
0.4552 1600 0.005 - -
0.4581 1610 0.0045 - -
0.4609 1620 0.0049 - -
0.4637 1630 0.0044 - -
0.4666 1640 0.0048 - -
0.4694 1650 0.0049 - -
0.4723 1660 0.0048 - -
0.4751 1670 0.0051 - -
0.4780 1680 0.0047 - -
0.4808 1690 0.0048 - -
0.4837 1700 0.0047 - -
0.4865 1710 0.0044 - -
0.4893 1720 0.0049 - -
0.4922 1730 0.0049 - -
0.4950 1740 0.0051 - -
0.4979 1750 0.0043 0.0392 0.8352
0.5007 1760 0.0043 - -
0.5036 1770 0.0045 - -
0.5064 1780 0.0046 - -
0.5093 1790 0.0042 - -
0.5121 1800 0.0047 - -
0.5150 1810 0.0047 - -
0.5178 1820 0.0046 - -
0.5206 1830 0.0044 - -
0.5235 1840 0.0046 - -
0.5263 1850 0.0047 - -
0.5292 1860 0.0044 - -
0.5320 1870 0.0047 - -
0.5349 1880 0.0049 - -
0.5377 1890 0.0049 - -
0.5406 1900 0.0047 - -
0.5434 1910 0.0045 - -
0.5462 1920 0.0044 - -
0.5491 1930 0.0048 - -
0.5519 1940 0.0041 - -
0.5548 1950 0.004 - -
0.5576 1960 0.0048 - -
0.5605 1970 0.0042 - -
0.5633 1980 0.0048 - -
0.5662 1990 0.0045 - -
0.5690 2000 0.0043 0.0375 0.8359
0.5719 2010 0.005 - -
0.5747 2020 0.0049 - -
0.5775 2030 0.0044 - -
0.5804 2040 0.0045 - -
0.5832 2050 0.0043 - -
0.5861 2060 0.0045 - -
0.5889 2070 0.004 - -
0.5918 2080 0.0042 - -
0.5946 2090 0.0044 - -
0.5975 2100 0.0043 - -
0.6003 2110 0.0041 - -
0.6032 2120 0.0046 - -
0.6060 2130 0.0048 - -
0.6088 2140 0.0048 - -
0.6117 2150 0.0041 - -
0.6145 2160 0.0044 - -
0.6174 2170 0.0045 - -
0.6202 2180 0.0044 - -
0.6231 2190 0.0044 - -
0.6259 2200 0.0046 - -
0.6288 2210 0.0048 - -
0.6316 2220 0.0045 - -
0.6344 2230 0.004 - -
0.6373 2240 0.0041 - -
0.6401 2250 0.0044 0.0391 0.8369
0.6430 2260 0.0044 - -
0.6458 2270 0.0045 - -
0.6487 2280 0.0041 - -
0.6515 2290 0.0042 - -
0.6544 2300 0.0043 - -
0.6572 2310 0.004 - -
0.6601 2320 0.0042 - -
0.6629 2330 0.0041 - -
0.6657 2340 0.0045 - -
0.6686 2350 0.0045 - -
0.6714 2360 0.0042 - -
0.6743 2370 0.0045 - -
0.6771 2380 0.0044 - -
0.6800 2390 0.0044 - -
0.6828 2400 0.0041 - -
0.6857 2410 0.0045 - -
0.6885 2420 0.0046 - -
0.6913 2430 0.0041 - -
0.6942 2440 0.0048 - -
0.6970 2450 0.0041 - -
0.6999 2460 0.0043 - -
0.7027 2470 0.0043 - -
0.7056 2480 0.0037 - -
0.7084 2490 0.0042 - -
0.7113 2500 0.0043 0.0405 0.8365
0.7141 2510 0.0045 - -
0.7170 2520 0.0044 - -
0.7198 2530 0.0042 - -
0.7226 2540 0.0042 - -
0.7255 2550 0.0041 - -
0.7283 2560 0.0042 - -
0.7312 2570 0.0041 - -
0.7340 2580 0.0042 - -
0.7369 2590 0.0041 - -
0.7397 2600 0.0047 - -
0.7426 2610 0.0038 - -
0.7454 2620 0.0041 - -
0.7482 2630 0.0042 - -
0.7511 2640 0.0042 - -
0.7539 2650 0.0042 - -
0.7568 2660 0.0041 - -
0.7596 2670 0.0042 - -
0.7625 2680 0.0044 - -
0.7653 2690 0.0039 - -
0.7682 2700 0.0037 - -
0.7710 2710 0.0044 - -
0.7739 2720 0.0043 - -
0.7767 2730 0.0042 - -
0.7795 2740 0.0041 - -
0.7824 2750 0.0039 0.0387 0.8376
0.7852 2760 0.0047 - -
0.7881 2770 0.004 - -
0.7909 2780 0.0039 - -
0.7938 2790 0.0039 - -
0.7966 2800 0.0039 - -
0.7995 2810 0.0039 - -
0.8023 2820 0.0039 - -
0.8051 2830 0.0041 - -
0.8080 2840 0.0037 - -
0.8108 2850 0.0044 - -
0.8137 2860 0.0043 - -
0.8165 2870 0.0041 - -
0.8194 2880 0.0043 - -
0.8222 2890 0.0039 - -
0.8251 2900 0.0041 - -
0.8279 2910 0.0044 - -
0.8308 2920 0.004 - -
0.8336 2930 0.0042 - -
0.8364 2940 0.0039 - -
0.8393 2950 0.004 - -
0.8421 2960 0.0042 - -
0.8450 2970 0.004 - -
0.8478 2980 0.0039 - -
0.8507 2990 0.0037 - -
0.8535 3000 0.0039 0.0386 0.8386
0.8564 3010 0.0041 - -
0.8592 3020 0.0043 - -
0.8621 3030 0.0041 - -
0.8649 3040 0.0041 - -
0.8677 3050 0.0043 - -
0.8706 3060 0.0042 - -
0.8734 3070 0.0039 - -
0.8763 3080 0.004 - -
0.8791 3090 0.0039 - -
0.8820 3100 0.0039 - -
0.8848 3110 0.004 - -
0.8877 3120 0.0039 - -
0.8905 3130 0.0038 - -
0.8933 3140 0.0036 - -
0.8962 3150 0.0039 - -
0.8990 3160 0.0039 - -
0.9019 3170 0.0038 - -
0.9047 3180 0.0039 - -
0.9076 3190 0.0041 - -
0.9104 3200 0.004 - -
0.9133 3210 0.0041 - -
0.9161 3220 0.0042 - -
0.9190 3230 0.004 - -
0.9218 3240 0.0041 - -
0.9246 3250 0.0041 0.0420 0.8408
0.9275 3260 0.0041 - -
0.9303 3270 0.004 - -
0.9332 3280 0.0042 - -
0.9360 3290 0.004 - -
0.9389 3300 0.0037 - -
0.9417 3310 0.0038 - -
0.9446 3320 0.0039 - -
0.9474 3330 0.004 - -
0.9502 3340 0.0037 - -
0.9531 3350 0.0038 - -
0.9559 3360 0.0037 - -
0.9588 3370 0.0042 - -
0.9616 3380 0.0042 - -
0.9645 3390 0.0042 - -
0.9673 3400 0.0037 - -
0.9702 3410 0.0038 - -
0.9730 3420 0.0039 - -
0.9759 3430 0.0038 - -
0.9787 3440 0.0041 - -
0.9815 3450 0.004 - -
0.9844 3460 0.0039 - -
0.9872 3470 0.0036 - -
0.9901 3480 0.0037 - -
0.9929 3490 0.0039 - -
0.9958 3500 0.0036 0.0403 0.8396

Framework Versions

  • Python: 3.11.10
  • Sentence Transformers: 3.3.1
  • Transformers: 4.48.0.dev0
  • PyTorch: 2.5.1+cu121
  • Accelerate: 1.1.0
  • Datasets: 3.1.0
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}