radoslavralev committed on
Commit 0120d04 · verified · 1 Parent(s): 78c39f2

Add new SentenceTransformer model

Files changed (1)
  1. README.md +166 -19
README.md CHANGED
@@ -83,28 +83,28 @@ model-index:
  type: test
  metrics:
  - type: cosine_accuracy@1
- value: 0.5955802603036876
  name: Cosine Accuracy@1
  - type: cosine_precision@1
- value: 0.5955802603036876
  name: Cosine Precision@1
  - type: cosine_recall@1
- value: 0.5780913232288468
  name: Cosine Recall@1
  - type: cosine_ndcg@10
- value: 0.777639866271746
  name: Cosine Ndcg@10
  - type: cosine_mrr@1
- value: 0.5955802603036876
  name: Cosine Mrr@1
  - type: cosine_map@100
- value: 0.7275779687157514
  name: Cosine Map@100
  - type: cosine_auc_precision_cache_hit_ratio
- value: 0.3639683124583609
  name: Cosine Auc Precision Cache Hit Ratio
  - type: cosine_auc_similarity_distribution
- value: 0.15401896350374616
  name: Cosine Auc Similarity Distribution
  ---
 
@@ -169,9 +169,9 @@ print(embeddings.shape)
  # Get the similarity scores for the embeddings
  similarities = model.similarity(embeddings, embeddings)
  print(similarities)
- # tensor([[1.0000, 1.0000, 0.8359],
- # [1.0000, 1.0000, 0.8359],
- # [0.8359, 0.8359, 0.9961]], dtype=torch.bfloat16)
  ```
 
  <!--
  <!--
@@ -209,13 +209,13 @@ You can finetune this model on your own dataset.
 
  | Metric | Value |
  |:-------------------------------------|:-----------|
- | cosine_accuracy@1 | 0.5956 |
- | cosine_precision@1 | 0.5956 |
- | cosine_recall@1 | 0.5781 |
- | **cosine_ndcg@10** | **0.7776** |
- | cosine_mrr@1 | 0.5956 |
- | cosine_map@100 | 0.7276 |
- | cosine_auc_precision_cache_hit_ratio | 0.364 |
  | cosine_auc_similarity_distribution | 0.154 |
 
  <!--
@@ -286,10 +286,157 @@ You can finetune this model on your own dataset.
  }
  ```
 
  ### Training Logs
  | Epoch | Step | test_cosine_ndcg@10 |
  |:-----:|:----:|:-------------------:|
- | -1 | -1 | 0.7776 |
 
 
  ### Framework Versions
 
  type: test
  metrics:
  - type: cosine_accuracy@1
+ value: 0.5953768980477223
  name: Cosine Accuracy@1
  - type: cosine_precision@1
+ value: 0.5953768980477223
  name: Cosine Precision@1
  - type: cosine_recall@1
+ value: 0.5778879609728815
  name: Cosine Recall@1
  - type: cosine_ndcg@10
+ value: 0.7775436499957671
  name: Cosine Ndcg@10
  - type: cosine_mrr@1
+ value: 0.5953768980477223
  name: Cosine Mrr@1
  - type: cosine_map@100
+ value: 0.7274666565910912
  name: Cosine Map@100
  - type: cosine_auc_precision_cache_hit_ratio
+ value: 0.36387321267916206
  name: Cosine Auc Precision Cache Hit Ratio
  - type: cosine_auc_similarity_distribution
+ value: 0.15403918371209657
  name: Cosine Auc Similarity Distribution
  ---
 
 
  # Get the similarity scores for the embeddings
  similarities = model.similarity(embeddings, embeddings)
  print(similarities)
+ # tensor([[1.0000, 1.0000, 0.8251],
+ # [1.0000, 1.0000, 0.8251],
+ # [0.8251, 0.8251, 1.0000]])
  ```
 
  <!--
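
Note on the hunk above: the removed output is tagged `dtype=torch.bfloat16`, while the added output carries no dtype tag. The old values `0.9961` and `0.8359` both sit exactly on bfloat16's coarse grid (steps of 1/256 between 0.5 and 1.0), and `0.9961` is the largest bfloat16 value below 1.0. The sketch below is a minimal, standard-library illustration of that rounding; `round_to_bfloat16` is a hypothetical helper written for this note, not part of sentence-transformers or PyTorch, and it does not claim to reproduce how the model card output was generated.

```python
import struct


def round_to_bfloat16(x: float) -> float:
    """Round a Python float to the nearest bfloat16 value (round-half-to-even).

    bfloat16 keeps the top 16 bits of an IEEE-754 float32, so we round the
    float32 bit pattern to 16 bits. NaN and infinity are not handled here.
    """
    bits32 = struct.unpack("<I", struct.pack("<f", x))[0]
    # Bias for round-half-to-even on the 16 bits that will be dropped.
    bias = 0x7FFF + ((bits32 >> 16) & 1)
    bits16 = ((bits32 + bias) >> 16) & 0xFFFF
    return struct.unpack("<f", struct.pack("<I", bits16 << 16))[0]


# Values printed in the removed lines land exactly on the bfloat16 grid.
old_diagonal = round_to_bfloat16(0.9961)      # 255/256
old_off_diagonal = round_to_bfloat16(0.8359)  # 214/256
print(old_diagonal, old_off_diagonal)
```

With only 8 significant bits, a self-similarity that is slightly below 1 in higher precision can print as `0.9961` rather than `1.0000`, which would be consistent with the diagonal entry in the removed output.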
 
 
  | Metric | Value |
  |:-------------------------------------|:-----------|
+ | cosine_accuracy@1 | 0.5954 |
+ | cosine_precision@1 | 0.5954 |
+ | cosine_recall@1 | 0.5779 |
+ | **cosine_ndcg@10** | **0.7775** |
+ | cosine_mrr@1 | 0.5954 |
+ | cosine_map@100 | 0.7275 |
+ | cosine_auc_precision_cache_hit_ratio | 0.3639 |
  | cosine_auc_similarity_distribution | 0.154 |
 
  <!--
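
Note on the table above: accuracy@1, precision@1 and MRR@1 share one value (0.5954), while recall@1 is lower (0.5779). Under the usual information-retrieval definitions this is expected: at k=1 the first three all reduce to "is the top hit relevant?", whereas recall divides by the number of relevant documents per query. The sketch below illustrates this on made-up toy data; the function name and data are hypothetical, and it assumes the evaluator follows the standard definitions rather than quoting its implementation.

```python
def metrics_at_1(ranked_results, relevant_sets):
    """Compute accuracy@1, precision@1, recall@1 and MRR@1 over a set of queries.

    ranked_results: one ranked list of document ids per query.
    relevant_sets:  one non-empty set of relevant document ids per query.
    """
    n = len(ranked_results)
    # 1.0 if the top-ranked document is relevant, else 0.0.
    hits = [
        1.0 if ranked[0] in relevant else 0.0
        for ranked, relevant in zip(ranked_results, relevant_sets)
    ]
    accuracy = sum(hits) / n
    precision = sum(h / 1 for h in hits) / n  # relevant in top-1, divided by k=1
    recall = sum(h / len(rel) for h, rel in zip(hits, relevant_sets)) / n
    mrr = sum(hits) / n  # reciprocal rank is 1 at rank 1, 0 beyond the cutoff
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "mrr": mrr}


# Toy data: the third query has two relevant documents.
toy_ranked = [["a", "b"], ["c", "d"], ["e", "f"]]
toy_relevant = [{"a"}, {"d"}, {"e", "f"}]
toy_scores = metrics_at_1(toy_ranked, toy_relevant)
print(toy_scores)
```

In the toy data, two of three queries have a relevant top hit, so accuracy, precision and MRR are all 2/3, while recall drops to 0.5 because the third query retrieves only one of its two relevant documents at rank 1.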
 
  }
  ```
 
+ ### Training Hyperparameters
+ #### Non-Default Hyperparameters
+
+ - `eval_strategy`: steps
+ - `per_device_train_batch_size`: 300
+ - `per_device_eval_batch_size`: 300
+ - `gradient_accumulation_steps`: 2
+ - `weight_decay`: 0.001
+ - `adam_beta2`: 0.98
+ - `adam_epsilon`: 1e-06
+ - `num_train_epochs`: 1
+ - `warmup_ratio`: 0.05
+ - `bf16`: True
+ - `dataloader_num_workers`: 4
+ - `dataloader_prefetch_factor`: 4
+ - `load_best_model_at_end`: True
+ - `optim`: stable_adamw
+ - `ddp_find_unused_parameters`: False
+ - `dataloader_persistent_workers`: True
+ - `push_to_hub`: True
+ - `hub_model_id`: redis/langcache-embed-v3
+ - `batch_sampler`: no_duplicates
+
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `eval_strategy`: steps
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 300
+ - `per_device_eval_batch_size`: 300
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 2
+ - `eval_accumulation_steps`: None
+ - `torch_empty_cache_steps`: None
+ - `learning_rate`: 5e-05
+ - `weight_decay`: 0.001
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.98
+ - `adam_epsilon`: 1e-06
+ - `max_grad_norm`: 1.0
+ - `num_train_epochs`: 1
+ - `max_steps`: -1
+ - `lr_scheduler_type`: linear
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.05
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `use_ipex`: False
+ - `bf16`: True
+ - `fp16`: False
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: None
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 4
+ - `dataloader_prefetch_factor`: 4
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: True
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `parallelism_config`: None
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: stable_adamw
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `ddp_find_unused_parameters`: False
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: True
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: True
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: redis/langcache-embed-v3
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: None
+ - `hub_always_push`: False
+ - `hub_revision`: None
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `include_for_metrics`: []
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`:
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: False
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `eval_on_start`: False
+ - `use_liger_kernel`: False
+ - `liger_kernel_config`: None
+ - `eval_use_gather_object`: False
+ - `average_tokens_across_devices`: False
+ - `prompts`: None
+ - `batch_sampler`: no_duplicates
+ - `multi_dataset_batch_sampler`: proportional
+ - `router_mapping`: {}
+ - `learning_rate_mapping`: {}
+
+ </details>
+
  ### Training Logs
  | Epoch | Step | test_cosine_ndcg@10 |
  |:-----:|:----:|:-------------------:|
+ | -1 | -1 | 0.7775 |
 
 
  ### Framework Versions
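
Note on the added hyperparameters: `per_device_train_batch_size: 300` combined with `gradient_accumulation_steps: 2` means each optimizer step sees 600 examples per device, and `warmup_ratio: 0.05` with `warmup_steps: 0` means the warmup length is derived from the total step count. The sketch below works through that arithmetic. The device count and dataset size are hypothetical placeholders (neither appears in this commit), and rounding the warmup length up with `ceil` is an assumption about how the trainer resolves the ratio.

```python
import math


def effective_batch_size(per_device_batch, grad_accum_steps, num_devices=1):
    """Examples contributing to one optimizer step across all devices."""
    return per_device_batch * grad_accum_steps * num_devices


def warmup_step_count(total_optimizer_steps, warmup_ratio):
    """Warmup length derived from a ratio, rounded up (assumed behaviour)."""
    return math.ceil(total_optimizer_steps * warmup_ratio)


# Values taken from the hyperparameter list in the diff.
PER_DEVICE_TRAIN_BATCH_SIZE = 300
GRADIENT_ACCUMULATION_STEPS = 2
WARMUP_RATIO = 0.05
NUM_TRAIN_EPOCHS = 1

# Hypothetical values, NOT stated anywhere in the commit.
NUM_DEVICES = 1
NUM_TRAIN_EXAMPLES = 1_200_000

batch = effective_batch_size(
    PER_DEVICE_TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS, NUM_DEVICES
)
total_steps = math.ceil(NUM_TRAIN_EXAMPLES / batch) * NUM_TRAIN_EPOCHS
warmup_steps = warmup_step_count(total_steps, WARMUP_RATIO)

print(f"effective batch size: {batch}")
print(f"optimizer steps:      {total_steps}")
print(f"warmup steps:         {warmup_steps}")
```

Swapping in the real device count and dataset size gives the actual schedule length; the 600-per-device figure depends only on values listed in the diff.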