Commit 4ca0104 (verified) by marroyo777 · Parent: 5cd31b8

Add new SentenceTransformer model.
1_Pooling/config.json ADDED
```json
{
  "word_embedding_dimension": 384,
  "pooling_mode_cls_token": true,
  "pooling_mode_mean_tokens": false,
  "pooling_mode_max_tokens": false,
  "pooling_mode_mean_sqrt_len_tokens": false,
  "pooling_mode_weightedmean_tokens": false,
  "pooling_mode_lasttoken": false,
  "include_prompt": true
}
```
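
These flags select CLS-token pooling: the `[CLS]` token embedding is used as the sentence vector, with mean, max, weighted-mean, and last-token pooling all disabled. A minimal sketch of the equivalent module construction:

```python
# Hedged sketch: the config above amounts to CLS-token pooling.
from sentence_transformers.models import Pooling

pooling = Pooling(word_embedding_dimension=384, pooling_mode="cls")
print(pooling.get_pooling_mode_str())  # "cls"
```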
README.md ADDED
---
base_model: BAAI/bge-small-en-v1.5
library_name: sentence-transformers
metrics:
- cosine_accuracy
- dot_accuracy
- manhattan_accuracy
- euclidean_accuracy
- max_accuracy
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:60341
- loss:MultipleNegativesRankingLoss
widget:
- source_sentence: What is the focus of the research conducted by the MHCI x 99P Labs
    Capstone Team?
  sentences:
  - To determine the destination of a given car based on an initial start position
    in time, we developed a Markov Model. We then creatively combined DBScan, K-NN,
    and XGboost algorithms to achieve accurate dwell time forecasts.
  - Transportation networks touch all three pillars of sustainability. They shape
    our daily lives by connecting us to work, retail, and recreation; however, a system
    that does not connect everyone equitably reproduces social disparities.
  - 'Two weeks of digging deep into exploratory, generative research

    Written by the MHCI x 99P Labs Capstone TeamEdited by 99P Labs

    The MHCI x 99P Labs Capstone Team is part of the Master of Human-Computer Interaction
    (MHCI) program at Carnegie Mellon University.'
- source_sentence: What limits are being considered for data quality checks?
  sentences:
  - Unlike many other Agile teams, we don't do a Retro every sprint, mostly because
    we do one-week sprints.
  - Our team has been exploring implementing data quality checks into our data platform.
    We've been trying to establish our goals, limits, and expectations, some of which
    were discussed in Part 1 of our Data Quality blog posts.
  - Literature and Topical ReviewEach team member performed a literature review on
    telematics research, identifying its applications, methodologies, and critical
    insights.
- source_sentence: What are the potential consequences of not researching before coding?
  sentences:
  - This indicates a degree of variance in the model's accuracy across different times
    and conditions.
  - In order to objectively test ourselves on the knowledge we've gained, we decide
    to take a quiz. The quiz contains 50 images of either dogs or cats and we have
    to determine which animal the image most closely resembles.
  - To reiterate, before even writing any code, it's important to do proper research
    into your team's documentation and online resources. A lot of time can be saved
    by reusing code that can adapt to your use case instead of starting from scratch
    every time.
- source_sentence: What might be the implications of having a performance of 3%?
  sentences:
  - Then, I will highlight the top three winning projects from each track.
  - Channels can be used only by organizations that are invited to the channel and
    are invisible to other members of the network. Each channel has a separate blockchain
    ledger.
  - 3%, only slightly better than the worst-performing model, K Nearest Neighbors.
- source_sentence: In what context is traffic flow theory typically discussed?
  sentences:
  - As a result, I was familiar with many terms discussed conceptually but I discovered
    some of the more official terminology used when discussing traffic flow theory
    and network control.
  - We called it plus-deltas (+/Δ). Seeing the output and outcomes we accomplished
    in each session was encouraging and allowed us to acknowledge things we did that
    made us successful so we could carry it on to the next session.
  - There are different types of projects within C.
model-index:
- name: SentenceTransformer based on BAAI/bge-small-en-v1.5
  results:
  - task:
      type: triplet
      name: Triplet
    dataset:
      name: 99GPT Finetuning Embedding test 01
      type: 99GPT-Finetuning-Embedding-test-01
    metrics:
    - type: cosine_accuracy
      value: 0.9987405541561712
      name: Cosine Accuracy
    - type: dot_accuracy
      value: 0.0011931592204693093
      name: Dot Accuracy
    - type: manhattan_accuracy
      value: 0.9987405541561712
      name: Manhattan Accuracy
    - type: euclidean_accuracy
      value: 0.9987405541561712
      name: Euclidean Accuracy
    - type: max_accuracy
      value: 0.9987405541561712
      name: Max Accuracy
---

# SentenceTransformer based on BAAI/bge-small-en-v1.5

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5). It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5) <!-- at revision 5c38ec7c405ec4b44b94cc5a9bb96e735b38267a -->
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 384 dimensions
- **Similarity Function:** Cosine Similarity
<!-- - **Training Dataset:** Unknown -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```
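
For intuition, the three modules above are roughly equivalent to the following sketch in plain `transformers` (the sample sentence is illustrative):

```python
# Hedged sketch of the pipeline: BERT encoder -> CLS pooling -> L2 normalize.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("marroyo777/bge-99GPT-v1")
model = AutoModel.from_pretrained("marroyo777/bge-99GPT-v1")

batch = tokenizer(["An example sentence."], padding=True, truncation=True,
                  max_length=512, return_tensors="pt")
with torch.no_grad():
    token_embeddings = model(**batch).last_hidden_state  # (1, seq_len, 384)
cls_embedding = token_embeddings[:, 0]         # module (1): CLS-token pooling
embedding = F.normalize(cls_embedding, dim=1)  # module (2): Normalize()
```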

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("marroyo777/bge-99GPT-v1")
# Run inference
sentences = [
    'In what context is traffic flow theory typically discussed?',
    'As a result, I was familiar with many terms discussed conceptually but I discovered some of the more official terminology used when discussing traffic flow theory and network control.',
    'There are different types of projects within C.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
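
The same embeddings also support lightweight semantic search. A minimal sketch, with a query and corpus invented for illustration:

```python
# Hedged sketch: rank a tiny corpus against a query with this model.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("marroyo777/bge-99GPT-v1")
query_embedding = model.encode(["How were dwell times forecast?"])
corpus_embeddings = model.encode([
    "We combined DBScan, K-NN, and XGboost to forecast dwell times.",
    "Each channel has a separate blockchain ledger.",
])
scores = model.similarity(query_embedding, corpus_embeddings)  # shape (1, 2)
print(scores.argmax().item())  # expected: 0, the dwell-time sentence
```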

<!--
### Direct Usage (Transformers)

<details><summary>Click to see the direct usage in Transformers</summary>

</details>
-->

<!--
### Downstream Usage (Sentence Transformers)

You can finetune this model on your own dataset.

<details><summary>Click to expand</summary>

</details>
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

## Evaluation

### Metrics

#### Triplet
* Dataset: `99GPT-Finetuning-Embedding-test-01`
* Evaluated with [<code>TripletEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.TripletEvaluator)

| Metric             | Value      |
|:-------------------|:-----------|
| cosine_accuracy    | 0.9987     |
| dot_accuracy       | 0.0012     |
| manhattan_accuracy | 0.9987     |
| euclidean_accuracy | 0.9987     |
| **max_accuracy**   | **0.9987** |
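
These numbers should be reproducible with the evaluator named above. A hedged sketch, with placeholder triplets standing in for the actual held-out split:

```python
# Hedged sketch: re-run TripletEvaluator on example (anchor, positive, negative) rows.
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import TripletEvaluator

model = SentenceTransformer("marroyo777/bge-99GPT-v1")
evaluator = TripletEvaluator(
    anchors=["Who is being invited to join the initiative?"],
    positives=["We are inviting the research community to join us."],
    negatives=["Burning it destroys the oil."],
    name="99GPT-Finetuning-Embedding-test-01",
)
print(evaluator(model))  # accuracy per distance function (cosine, dot, ...)
```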

<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 60,341 training samples
* Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
* Approximate statistics based on the first 1000 samples:
  |         | anchor | positive | negative |
  |:--------|:-------|:---------|:---------|
  | type    | string | string   | string   |
  | details | <ul><li>min: 7 tokens</li><li>mean: 13.77 tokens</li><li>max: 24 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 40.26 tokens</li><li>max: 123 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 39.24 tokens</li><li>max: 139 tokens</li></ul> |
* Samples:
  | anchor | positive | negative |
  |:-------|:---------|:---------|
  | <code>Who is being invited to join the initiative?</code> | <code>Our belief is that the research community will be able to gain access to diverse and real-time data with minimal friction, build exciting innovations and make an impact to Data and AI technologies as well. This is just the first release and we are inviting the research community to join us to build exciting data-driven mobility & energy solutions together.</code> | <code>Burning it destroys the oil. Once you burn the oil, that particular oil ceases to exist.</code> |
  | <code>What is the main focus of the research conducted for Orbit?</code> | <code>Orbit holds the culmination of almost a year of research with participants from a wide variety of backgrounds, needs, and jobs to be done.</code> | <code>So how do you win a hackathon mobility challenge? The SmartRoute team showed two of them.</code> |
  | <code>What role do LLMs play in HRI's strategy?</code> | <code>We are excited about the potential of JournAI to transform mobility. By harnessing the power of LLMs and other AI technologies, HRI is driving towards a more connected, efficient, and sustainable future.</code> | <code>This simplified the process for users, who only had to pull and run the docker image to spawn a Jupyterlab app on their machine, open it in their browser, and create a new Pyspark notebook that automatically connected to our spark cluster. Our new workflow allows data science teams to configure their spark jobs and compute resources with options to request memory and CPU from the cluster and customize spark settings.</code> |
* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim"
  }
  ```
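
The training script itself is not included in this card, but a fine-tuning run with this loss would look roughly like the sketch below; the tiny in-memory dataset is a stand-in for the real 60,341 triplets:

```python
# Hedged sketch: MultipleNegativesRankingLoss fine-tuning from the base model.
from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    losses,
)

model = SentenceTransformer("BAAI/bge-small-en-v1.5")
train_dataset = Dataset.from_dict({
    "anchor": ["Who is being invited to join the initiative?"],
    "positive": ["We are inviting the research community to join us."],
    "negative": ["Burning it destroys the oil."],
})
loss = losses.MultipleNegativesRankingLoss(model, scale=20.0)  # cos_sim is the default
trainer = SentenceTransformerTrainer(model=model, train_dataset=train_dataset, loss=loss)
trainer.train()
```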

### Evaluation Dataset

#### Unnamed Dataset

* Size: 15,086 evaluation samples
* Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
* Approximate statistics based on the first 1000 samples:
  |         | anchor | positive | negative |
  |:--------|:-------|:---------|:---------|
  | type    | string | string   | string   |
  | details | <ul><li>min: 6 tokens</li><li>mean: 13.73 tokens</li><li>max: 24 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 39.51 tokens</li><li>max: 131 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 36.9 tokens</li><li>max: 153 tokens</li></ul> |
* Samples:
  | anchor | positive | negative |
  |:-------|:---------|:---------|
  | <code>What does the text suggest about the balance between creating tools and their practical application?</code> | <code>From technology to healthcare, these examples underline the importance of the interplay between theory and practice, between creating advanced tools and applying them effectively.</code> | <code>We found success when leaving the later panels empty as opposed to earlier ones. If we established a clear context and pain point for participants, they were often able to fill in a solution and resolution themselves.</code> |
  | <code>Who are the personas mentioned in the text?</code> | <code>Our derived data sets are created based on personas that we have identified and their data access needs.</code> | <code>However there still exists a need to connect the map matched nodes that are outputted from the libraries to specific data points from the V2X data, in order to get the rest of the V2X features in a specific time frame.</code> |
  | <code>Is this the first or second hackathon mentioned?</code> | <code>Up next is the first of two hackathons we participated in at Ohio State University.</code> | <code>The team did a great job by targeting a pervasive issue in such an intuitive way.</code> |
* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim"
  }
  ```

### Training Hyperparameters
#### Non-Default Hyperparameters

- `eval_strategy`: steps
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `warmup_ratio`: 0.1
- `fp16`: True
- `batch_sampler`: no_duplicates

#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 3
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: True
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `eval_use_gather_object`: False
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional

</details>

### Training Logs
<details><summary>Click to expand</summary>

| Epoch  | Step  | Training Loss | loss   | 99GPT-Finetuning-Embedding-test-01_max_accuracy |
|:------:|:-----:|:-------------:|:------:|:-----------------------------------------------:|
| 0.0265 | 100   | 0.7653        | 0.4309 | -      |
| 0.0530 | 200   | 0.4795        | 0.2525 | -      |
| 0.0795 | 300   | 0.3416        | 0.1996 | -      |
| 0.1060 | 400   | 0.2713        | 0.1699 | -      |
| 0.1326 | 500   | 0.2271        | 0.1558 | -      |
| 0.1591 | 600   | 0.2427        | 0.1510 | -      |
| 0.1856 | 700   | 0.2188        | 0.1414 | -      |
| 0.2121 | 800   | 0.1936        | 0.1350 | -      |
| 0.2386 | 900   | 0.2174        | 0.1370 | -      |
| 0.2651 | 1000  | 0.2104        | 0.1265 | -      |
| 0.2916 | 1100  | 0.2142        | 0.1324 | -      |
| 0.3181 | 1200  | 0.2088        | 0.1297 | -      |
| 0.3446 | 1300  | 0.1865        | 0.1240 | -      |
| 0.3712 | 1400  | 0.177         | 0.1221 | -      |
| 0.3977 | 1500  | 0.1735        | 0.1296 | -      |
| 0.4242 | 1600  | 0.1746        | 0.1188 | -      |
| 0.4507 | 1700  | 0.1639        | 0.1178 | -      |
| 0.4772 | 1800  | 0.1958        | 0.1105 | -      |
| 0.5037 | 1900  | 0.1874        | 0.1152 | -      |
| 0.5302 | 2000  | 0.1676        | 0.1143 | -      |
| 0.5567 | 2100  | 0.1671        | 0.1067 | -      |
| 0.5832 | 2200  | 0.142         | 0.1154 | -      |
| 0.6098 | 2300  | 0.1668        | 0.1150 | -      |
| 0.6363 | 2400  | 0.1605        | 0.1091 | -      |
| 0.6628 | 2500  | 0.1475        | 0.1096 | -      |
| 0.6893 | 2600  | 0.1668        | 0.1066 | -      |
| 0.7158 | 2700  | 0.166         | 0.1067 | -      |
| 0.7423 | 2800  | 0.1611        | 0.0999 | -      |
| 0.7688 | 2900  | 0.1747        | 0.1001 | -      |
| 0.7953 | 3000  | 0.1436        | 0.1065 | -      |
| 0.8218 | 3100  | 0.1579        | 0.0992 | -      |
| 0.8484 | 3200  | 0.1718        | 0.1006 | -      |
| 0.8749 | 3300  | 0.1567        | 0.0995 | -      |
| 0.9014 | 3400  | 0.1634        | 0.0954 | -      |
| 0.9279 | 3500  | 0.1441        | 0.0956 | -      |
| 0.9544 | 3600  | 0.1433        | 0.0991 | -      |
| 0.9809 | 3700  | 0.1562        | 0.0931 | -      |
| 1.0074 | 3800  | 0.1421        | 0.0931 | -      |
| 1.0339 | 3900  | 0.1424        | 0.0956 | -      |
| 1.0604 | 4000  | 0.128         | 0.0900 | -      |
| 1.0870 | 4100  | 0.1265        | 0.0921 | -      |
| 1.1135 | 4200  | 0.1062        | 0.0944 | -      |
| 1.1400 | 4300  | 0.1221        | 0.0900 | -      |
| 1.1665 | 4400  | 0.1091        | 0.0944 | -      |
| 1.1930 | 4500  | 0.091         | 0.0913 | -      |
| 1.2195 | 4600  | 0.0823        | 0.0935 | -      |
| 1.2460 | 4700  | 0.0946        | 0.0949 | -      |
| 1.2725 | 4800  | 0.0803        | 0.0890 | -      |
| 1.2990 | 4900  | 0.0796        | 0.0885 | -      |
| 1.3256 | 5000  | 0.0699        | 0.0921 | -      |
| 1.3521 | 5100  | 0.073         | 0.0909 | -      |
| 1.3786 | 5200  | 0.0608        | 0.0934 | -      |
| 1.4051 | 5300  | 0.07          | 0.0941 | -      |
| 1.4316 | 5400  | 0.0732        | 0.0896 | -      |
| 1.4581 | 5500  | 0.0639        | 0.0910 | -      |
| 1.4846 | 5600  | 0.0722        | 0.0874 | -      |
| 1.5111 | 5700  | 0.0635        | 0.0925 | -      |
| 1.5376 | 5800  | 0.0631        | 0.0887 | -      |
| 1.5642 | 5900  | 0.0589        | 0.0896 | -      |
| 1.5907 | 6000  | 0.0636        | 0.0925 | -      |
| 1.6172 | 6100  | 0.0702        | 0.0938 | -      |
| 1.6437 | 6200  | 0.0572        | 0.0921 | -      |
| 1.6702 | 6300  | 0.0516        | 0.0946 | -      |
| 1.6967 | 6400  | 0.0695        | 0.0902 | -      |
| 1.7232 | 6500  | 0.0632        | 0.0917 | -      |
| 1.7497 | 6600  | 0.0697        | 0.0832 | -      |
| 1.7762 | 6700  | 0.0747        | 0.0853 | -      |
| 1.8028 | 6800  | 0.0615        | 0.0892 | -      |
| 1.8293 | 6900  | 0.0747        | 0.0855 | -      |
| 1.8558 | 7000  | 0.0668        | 0.0848 | -      |
| 1.8823 | 7100  | 0.0747        | 0.0853 | -      |
| 1.9088 | 7200  | 0.0774        | 0.0847 | -      |
| 1.9353 | 7300  | 0.0546        | 0.0874 | -      |
| 1.9618 | 7400  | 0.0708        | 0.0879 | -      |
| 1.9883 | 7500  | 0.0632        | 0.0863 | -      |
| 2.0148 | 7600  | 0.0601        | 0.0873 | -      |
| 2.0414 | 7700  | 0.063         | 0.0870 | -      |
| 2.0679 | 7800  | 0.0646        | 0.0819 | -      |
| 2.0944 | 7900  | 0.0557        | 0.0825 | -      |
| 2.1209 | 8000  | 0.0444        | 0.0841 | -      |
| 2.1474 | 8100  | 0.049         | 0.0825 | -      |
| 2.1739 | 8200  | 0.0441        | 0.0845 | -      |
| 2.2004 | 8300  | 0.0451        | 0.0844 | -      |
| 2.2269 | 8400  | 0.0346        | 0.0851 | -      |
| 2.2534 | 8500  | 0.0398        | 0.0847 | -      |
| 2.2800 | 8600  | 0.033         | 0.0855 | -      |
| 2.3065 | 8700  | 0.0355        | 0.0851 | -      |
| 2.3330 | 8800  | 0.0313        | 0.0867 | -      |
| 2.3595 | 8900  | 0.0358        | 0.0870 | -      |
| 2.3860 | 9000  | 0.0251        | 0.0867 | -      |
| 2.4125 | 9100  | 0.0395        | 0.0854 | -      |
| 2.4390 | 9200  | 0.0322        | 0.0838 | -      |
| 2.4655 | 9300  | 0.0355        | 0.0847 | -      |
| 2.4920 | 9400  | 0.034         | 0.0834 | -      |
| 2.5186 | 9500  | 0.0345        | 0.0862 | -      |
| 2.5451 | 9600  | 0.0272        | 0.0830 | -      |
| 2.5716 | 9700  | 0.0275        | 0.0831 | -      |
| 2.5981 | 9800  | 0.0345        | 0.0849 | -      |
| 2.6246 | 9900  | 0.0289        | 0.0849 | -      |
| 2.6511 | 10000 | 0.0282        | 0.0860 | -      |
| 2.6776 | 10100 | 0.0279        | 0.0885 | -      |
| 2.7041 | 10200 | 0.0344        | 0.0865 | -      |
| 2.7306 | 10300 | 0.0326        | 0.0863 | -      |
| 2.7572 | 10400 | 0.0383        | 0.0840 | -      |
| 2.7837 | 10500 | 0.0338        | 0.0833 | -      |
| 2.8102 | 10600 | 0.0298        | 0.0836 | -      |
| 2.8367 | 10700 | 0.0402        | 0.0825 | -      |
| 2.8632 | 10800 | 0.0361        | 0.0822 | -      |
| 2.8897 | 10900 | 0.0388        | 0.0818 | -      |
| 2.9162 | 11000 | 0.0347        | 0.0821 | -      |
| 2.9427 | 11100 | 0.0341        | 0.0826 | -      |
| 2.9692 | 11200 | 0.0373        | 0.0825 | -      |
| 2.9958 | 11300 | 0.0354        | 0.0824 | -      |
| 3.0    | 11316 | -             | -      | 0.9987 |

</details>

### Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.1.1
- Transformers: 4.44.2
- PyTorch: 2.4.1+cu121
- Accelerate: 0.34.2
- Datasets: 3.0.1
- Tokenizers: 0.19.1

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```

<!--
## Glossary

*Clearly define terms in order to be accessible across audiences.*
-->

<!--
## Model Card Authors

*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->

<!--
## Model Card Contact

*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->
config.json ADDED
```json
{
  "_name_or_path": "BAAI/bge-small-en-v1.5",
  "architectures": [
    "BertModel"
  ],
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 384,
  "id2label": {
    "0": "LABEL_0"
  },
  "initializer_range": 0.02,
  "intermediate_size": 1536,
  "label2id": {
    "LABEL_0": 0
  },
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "torch_dtype": "float32",
  "transformers_version": "4.44.2",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 30522
}
```
config_sentence_transformers.json ADDED
```json
{
  "__version__": {
    "sentence_transformers": "3.1.1",
    "transformers": "4.44.2",
    "pytorch": "2.4.1+cu121"
  },
  "prompts": {},
  "default_prompt_name": null,
  "similarity_fn_name": null
}
```
model.safetensors ADDED
```
version https://git-lfs.github.com/spec/v1
oid sha256:ed9d49aac920e2fde08358340498be2341c504c2e22a3471fb71b49d68f50c78
size 133462128
```
modules.json ADDED
```json
[
  {
    "idx": 0,
    "name": "0",
    "path": "",
    "type": "sentence_transformers.models.Transformer"
  },
  {
    "idx": 1,
    "name": "1",
    "path": "1_Pooling",
    "type": "sentence_transformers.models.Pooling"
  },
  {
    "idx": 2,
    "name": "2",
    "path": "2_Normalize",
    "type": "sentence_transformers.models.Normalize"
  }
]
```
sentence_bert_config.json ADDED
```json
{
  "max_seq_length": 512,
  "do_lower_case": true
}
```
special_tokens_map.json ADDED
```json
{
  "cls_token": {
    "content": "[CLS]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "mask_token": {
    "content": "[MASK]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "[PAD]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "sep_token": {
    "content": "[SEP]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "[UNK]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
```
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
```json
{
  "added_tokens_decoder": {
    "0": {
      "content": "[PAD]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "100": {
      "content": "[UNK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "101": {
      "content": "[CLS]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "102": {
      "content": "[SEP]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "103": {
      "content": "[MASK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "clean_up_tokenization_spaces": true,
  "cls_token": "[CLS]",
  "do_basic_tokenize": true,
  "do_lower_case": true,
  "mask_token": "[MASK]",
  "model_max_length": 512,
  "never_split": null,
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "strip_accents": null,
  "tokenize_chinese_chars": true,
  "tokenizer_class": "BertTokenizer",
  "unk_token": "[UNK]"
}
```
vocab.txt ADDED
The diff for this file is too large to render. See raw diff