---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:29545
- loss:MultipleNegativesSymmetricRankingLoss
base_model: jxm/cde-small-v2
widget:
- source_sentence: In the context of the risk-based assessment of customers and business
    relationships, how should the overlap between customer risk assessment and CDD
    be managed to ensure both are completed effectively and in compliance with ADGM
    regulations?
  sentences:
  - 'DocumentID: 36 | PassageID: D.7. | Passage: Principle 7 – Scenario analysis of
    climate-related financial risks. Where appropriate, relevant financial firms should
    develop and implement climate-related scenario analysis frameworks, including
    stress testing, in a manner commensurate with their size, complexity, risk profile
    and nature of activities.

    '
  - 'DocumentID: 1 | PassageID: 7.Guidance.4. | Passage: The risk-based assessment
    of the customer and the proposed business relationship, Transaction or product
    required under this Chapter is required to be undertaken prior to the establishment
    of a business relationship with a customer. Because the risk rating assigned to
    a customer resulting from this assessment determines the level of CDD that must
    be undertaken for that customer, this process must be completed before the CDD
    is completed for the customer. The Regulator is aware that in practice there will
    often be some degree of overlap between the customer risk assessment and CDD.
    For example, a Relevant Person may undertake some aspects of CDD, such as identifying
    Beneficial Owners, when it performs a risk assessment of the customer. Conversely,
    a Relevant Person may also obtain relevant information as part of CDD which has
    an impact on its customer risk assessment. Where information obtained as part
    of CDD of a customer affects the risk rating of a customer, the change in risk
    rating should be reflected in the degree of CDD undertaken.'
  - 'DocumentID: 1 | PassageID: 9.1.2.Guidance.4. | Passage: Where the legislative
    framework of a jurisdiction (such as secrecy or data protection legislation) prevents
    a Relevant Person from having access to CDD information upon request without delay
    as referred to in Rule ‎9.1.1(3)(b), the Relevant Person should undertake the
    relevant CDD itself and should not seek to rely on the relevant third party.'
- source_sentence: Can you clarify the responsibilities of the Governing Body of a
    Relevant Person in establishing and maintaining AML/TFS policies and procedures,
    and how these should be documented and reviewed?
  sentences:
  - 'DocumentID: 28 | PassageID: 193) | Passage: SUPERVISION BY LISTING AUTHORITY

    Complaints or allegations of non-compliance by Reporting Entities

    If, as a result of the enquiry, the Listing Authority forms the view that the
    information is accurate, is Inside Information, and is not within exemption from
    Disclosure provided by Rule 7.2.2, the Listing Authority will ask the Reporting
    Entity to make a Disclosure about the matter under Rule 7.2.1.  If the information
    should have been Disclosed earlier, the Listing Authority may issue an ‘aware
    letter’ (see paragraphs 187 to 189 above), or take other relevant action.


    '
  - "DocumentID: 17 | PassageID: Part 13.165.(2) | Passage: The Regulator shall not\
    \ approve a Non Abu Dhabi Global Market Clearing House unless it is satisfied—\n\
    (a)\tthat the rules and practices of the body, together with the law of the country\
    \ in which the body's head office is situated, provide adequate procedures for\
    \ dealing with the default of persons party to contracts connected with the body;\
    \ and\n(b)\tthat it is otherwise appropriate to approve the body;\ntogether being\
    \ the “Relevant Requirements” for this Part."
  - "DocumentID: 1 | PassageID: 4.3.1 | Passage: A Relevant Person which is part of\
    \ a Group must ensure that it:\n(a)\thas developed and implemented policies and\
    \ procedures for the sharing of information between Group entities, including\
    \ the sharing of information relating to CDD and money laundering risks;\n(b)\t\
    has in place adequate safeguards on the confidentiality and use of information\
    \ exchanged between Group entities, including consideration of relevant data protection\
    \ legislation;\n(c)\tremains aware of the money laundering risks of the Group\
    \ as a whole and of its exposure to the Group and takes active steps to mitigate\
    \ such risks;\n(d)\tcontributes to a Group-wide risk assessment to identify and\
    \ assess money laundering risks for the Group; and\n(e)\tprovides its Group-wide\
    \ compliance, audit and AML/TFS functions with customer account and Transaction\
    \ information from its Branches and Subsidiaries when necessary for AML/TFS purposes."
- source_sentence: What specific accounting standards and practices are we required
    to follow when valuing positions in our Trading and Non-Trading Books to ensure
    compliance with ADGM regulations?
  sentences:
  - 'DocumentID: 7 | PassageID: 8.10.1.(2).Guidance.3. | Passage: Each Authorised
    Person, Recognised Body and its Auditors is also required under Part 16 and section
    193 of the FSMR respectively, to disclose to the Regulator any matter which may
    indicate a breach or likely breach of, or a failure or likely failure to comply
    with, Regulations or Rules. Each Authorised Person and Recognised Body is also
    required to establish and implement systems and procedures to enable its compliance
    and compliance by its Auditors with notification requirements.

    '
  - "DocumentID: 18 | PassageID: 3.2 | Passage: Financial Services Permissions. VC\
    \ Managers operating in ADGM require a Financial Services Permission (“FSP”) to\
    \ undertake any Regulated Activity pertaining to VC Funds and/or co-investments\
    \ by third parties in VC Funds. The Regulated Activities covered by the FSP will\
    \ be dependent on the VC Managers’ investment strategy and business model.\n(a)\t\
    Managing a Collective Investment Fund: this includes carrying out fund management\
    \ activities in respect of a VC Fund.\n(b)\tAdvising on Investments or Credit\
    \ : for VC Managers these activities will be restricted to activities related\
    \ to co-investment alongside a VC Fund which the VC Manager manages, such as recommending\
    \ that a client invest in an investee company alongside the VC Fund and on the\
    \ strategy and structure required to make the investment.\n(c)\tArranging Deals\
    \ in Investments: VC Managers may also wish to make arrangements to facilitate\
    \ co-investments in the investee company.\nAuthorisation fees and supervision\
    \ fees for a VC Manager are capped at USD 10,000 regardless of whether one or\
    \ both of the additional Regulated Activities in b) and c) above in relation to\
    \ co-investments are included in its FSP. The FSP will include restrictions appropriate\
    \ to the business model of a VC Manager."
  - 'DocumentID: 13 | PassageID: APP2.A2.1.1.(4) | Passage: An Authorised Person must
    value every position included in its Trading Book and the Non Trading Book in
    accordance with the relevant accounting standards and practices.

    '
- source_sentence: What documentation and information are we required to maintain
    to demonstrate compliance with the rules pertaining to the cooperation with auditors,
    especially in terms of providing access and not interfering with their duties?
  sentences:
  - "DocumentID: 6 | PassageID: PART 5.16.3.5 | Passage: Co-operation with auditors.\
    \ A Fund Manager must take reasonable steps to ensure that it and its Employees:\n\
    (a)\tprovide any information to its auditor that its auditor reasonably requires,\
    \ or is entitled to receive as auditor;\n(b)\tgive the auditor right of access\
    \ at all reasonable times to relevant records and information within its possession;\n\
    (c)\tallow the auditor to make copies of any records or information referred to\
    \ in ‎(b);\n(d)\tdo not interfere with the auditor's ability to discharge its\
    \ duties;\n(e)\treport to the auditor any matter which may significantly affect\
    \ the financial position of the Fund; and\n(f)\tprovide such other assistance\
    \ as the auditor may reasonably request it to provide."
  - "DocumentID: 13 | PassageID: 4.3.1 | Passage: An Authorised Person must implement\
    \ and maintain comprehensive Credit Risk management systems which:\n(a)\tare appropriate\
    \ to the firm's type, scope, complexity and scale of operations;\n(b)\tare appropriate\
    \ to the diversity of its operations, including geographical diversity;\n(c)\t\
    enable the firm to effectively identify, assess, monitor and control Credit Risk\
    \ and to ensure that adequate Capital Resources are available at all times to\
    \ cover the risks assumed; and\n(d)\tensure effective implementation of the Credit\
    \ Risk strategy and policy."
  - 'DocumentID: 3 | PassageID: 3.8.9 | Passage: The Authorised Person acting as the
    Investment Manager of an ADGM Green Portfolio must provide a copy of the attestation
    obtained for the purposes of Rule ‎3.8.6 to each Client with whom it has entered
    into a Discretionary Portfolio Management Agreement in respect of such ADGM Green
    Portfolio at least on an annual basis and upon request by the Client.'
- source_sentence: Could you provide examples of circumstances that, when changed,
    would necessitate the reevaluation of a customer's risk assessment and the application
    of updated CDD measures?
  sentences:
  - 'DocumentID: 13 | PassageID: 9.2.1.Guidance.1. | Passage: The Regulator expects
    that an Authorised Person''s Liquidity Risk strategy will set out the approach
    that the Authorised Person will take to Liquidity Risk management, including various
    quantitative and qualitative targets. It should be communicated to all relevant
    functions and staff within the organisation and be set out in the Authorised Person''s
    Liquidity Risk policy.'
  - "DocumentID: 1 | PassageID: 8.1.2.(1) | Passage: A Relevant Person must also apply\
    \ CDD measures to each existing customer under Rules ‎8.3.1, ‎8.4.1 or ‎8.5.1\
    \ as applicable:\n(a)\twith a frequency appropriate to the outcome of the risk-based\
    \ approach taken in relation to each customer; and\n(b)\twhen the Relevant Person\
    \ becomes aware that any circumstances relevant to its risk assessment for a customer\
    \ have changed."
  - "DocumentID: 1 | PassageID: 8.1.1.Guidance.2. | Passage: The FIU has issued guides\
    \ that require:\n(a)\ta DNFBP that is a dealer in precious metals or precious\
    \ stones to obtain relevant identification documents, such as passport, emirates\
    \ ID, trade licence, as applicable, and register the information via goAML for\
    \ all cash transactions equal to or exceeding USD15,000 with individuals and all\
    \ cash or wire transfer transactions equal to or exceeding USD15,000 with entities.\
    \ The Regulator expects a dealer in any saleable item or a price equal to or greater\
    \ than USD15,000 to also comply with this requirement;\n(b)\ta DNFBP that is a\
    \ real estate agent to obtain relevant identification documents, such as passport,\
    \ emirates ID, trade licence, as applicable, and register the information via\
    \ goAML for all sales or purchases of Real Property where:\n(i)\tthe payment for\
    \ the sale/purchase includes a total cash payment of USD15,000 or more whether\
    \ in a single cash payment or multiple cash payments;\n(ii)\tthe payment for any\
    \ part or all of the sale/purchase amount includes payment(s) using Virtual Assets;\n\
    (iii)\tthe payment for any part or all of the sale/purchase amount includes funds\
    \ that were converted from or to a Virtual Asset."
pipeline_tag: sentence-similarity
library_name: sentence-transformers
---

# SentenceTransformer based on jxm/cde-small-v2

This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned from [jxm/cde-small-v2](https://huggingface.co/jxm/cde-small-v2) on the `csv` dataset (29,545 anchor/positive pairs). It maps sentences and paragraphs to a dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [jxm/cde-small-v2](https://huggingface.co/jxm/cde-small-v2) <!-- at revision 287bf0ea6ebfecf2339762d0ef28fb846959a8f2 -->
- **Maximum Sequence Length:** Not reported
- **Output Dimensionality:** Not reported
- **Similarity Function:** Cosine Similarity
- **Training Dataset:**
    - csv
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({}) with Transformer model: ContextualDocumentEmbeddingTransformer 
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

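# Download from the 🤗 Hub. The base model jxm/cde-small-v2 ships custom modeling code,
# so trust_remote_code=True is assumed to be needed here as well.
model = SentenceTransformer("jebish7/cde-v2-obliqa-1", trust_remote_code=True)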
# Run inference
sentences = [
    "Could you provide examples of circumstances that, when changed, would necessitate the reevaluation of a customer's risk assessment and the application of updated CDD measures?",
    'DocumentID: 1 | PassageID: 8.1.2.(1) | Passage: A Relevant Person must also apply CDD measures to each existing customer under Rules \u200e8.3.1, \u200e8.4.1 or \u200e8.5.1 as applicable:\n(a)\twith a frequency appropriate to the outcome of the risk-based approach taken in relation to each customer; and\n(b)\twhen the Relevant Person becomes aware that any circumstances relevant to its risk assessment for a customer have changed.',
    "DocumentID: 13 | PassageID: 9.2.1.Guidance.1. | Passage: The Regulator expects that an Authorised Person's Liquidity Risk strategy will set out the approach that the Authorised Person will take to Liquidity Risk management, including various quantitative and qualitative targets. It should be communicated to all relevant functions and staff within the organisation and be set out in the Authorised Person's Liquidity Risk policy.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, embedding_dim); this card does not record the output dimensionality

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
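
For retrieval, the similarity scores are typically used to rank passages against a query, and the `DocumentID: ... | PassageID: ... | Passage: ...` prefix seen in the examples above can be parsed back into fields. The following is a minimal sketch of that flow, not part of the original training or evaluation code. The toy vectors stand in for `model.encode(...)` output, the passages are shortened, and `cosine_similarity_matrix`, `parse_passage`, and `PASSAGE_PATTERN` are hypothetical helper names.

```python
import re

import numpy as np

# Hypothetical stand-in embeddings: row 0 is the query, rows 1-2 are passages.
# In practice these rows would come from model.encode(...).
embeddings = np.array([
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.1],
    [0.0, 0.3, 0.9],
])
# Shortened versions of the passages used in the usage example.
passages = [
    "DocumentID: 1 | PassageID: 8.1.2.(1) | Passage: A Relevant Person must also apply CDD measures.",
    "DocumentID: 13 | PassageID: 9.2.1.Guidance.1. | Passage: The Regulator expects a Liquidity Risk strategy.",
]

# Passage IDs may contain spaces (e.g. "PART 5.16.3.5"), so match lazily up to the next "|".
PASSAGE_PATTERN = re.compile(
    r"DocumentID:\s*(?P<doc>\d+)\s*\|\s*PassageID:\s*(?P<pid>.+?)\s*\|\s*Passage:\s*(?P<text>.*)",
    re.DOTALL,
)


def cosine_similarity_matrix(a, b):
    """Cosine similarity between every row of a and every row of b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T


def parse_passage(raw):
    """Split a formatted passage string into document id, passage id, and text."""
    match = PASSAGE_PATTERN.match(raw)
    if match is None:
        raise ValueError("passage does not follow the expected format")
    return {
        "document_id": match["doc"],
        "passage_id": match["pid"],
        "text": match["text"].strip(),
    }


query_vec, passage_vecs = embeddings[:1], embeddings[1:]
scores = cosine_similarity_matrix(query_vec, passage_vecs)[0]
ranking = np.argsort(-scores)  # passage indices, best match first
best = parse_passage(passages[ranking[0]])
print(best["document_id"], best["passage_id"], float(scores[ranking[0]]))
```

Keeping the identifiers inside the passage text means the model sees them during encoding. Whether to strip them before indexing is a design choice that this card does not address.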

<!--
### Direct Usage (Transformers)

<details><summary>Click to see the direct usage in Transformers</summary>

</details>
-->

<!--
### Downstream Usage (Sentence Transformers)

You can finetune this model on your own dataset.

<details><summary>Click to expand</summary>

</details>
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Dataset

#### csv

* Dataset: csv
* Size: 29,545 training samples
* Columns: <code>anchor</code> and <code>positive</code>
* Approximate statistics based on the first 1000 samples:
  |         | anchor                                                                             | positive                                                                             |
  |:--------|:-----------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|
  | type    | string                                                                             | string                                                                               |
  | details | <ul><li>min: 17 tokens</li><li>mean: 35.21 tokens</li><li>max: 66 tokens</li></ul> | <ul><li>min: 29 tokens</li><li>mean: 143.53 tokens</li><li>max: 512 tokens</li></ul> |
* Samples:
  | anchor                                                                                                                                                                              | positive                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
  |:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
  | <code>Could you outline the expected procedures for a Trade Repository to notify relevant authorities of any significant errors or omissions in previously submitted data?</code>   | <code>DocumentID: 7 | PassageID: APP2.A2.1.2 | Passage: Processes and procedures. A Trade Repository must have effective processes and procedures to provide data to relevant authorities in a timely and appropriate manner to enable them to meet their respective regulatory mandates and legal responsibilities.</code>                                                                                                                                                            |
  | <code>In the context of a non-binding MPO, how are commodities held by an Authorised Person treated for the purpose of determining the Commodities Risk Capital Requirement?</code> | <code>DocumentID: 9 | PassageID: 5.4.13.(a) | Passage: Commodities held by an Authorised Person for selling or leasing when executing a Murabaha, non-binding MPO, Salam or parallel Salam contract must be included in the calculation of its Commodities Risk Capital Requirement.</code>                                                                                                                                                                                            |
  | <code>Can the FSRA provide case studies or examples of best practices for RIEs operating MTFs or OTFs using spot commodities in line with the Spot Commodities Framework?</code>    | <code>DocumentID: 34 | PassageID: 77) | Passage: REGULATORY REQUIREMENTS - SPOT COMMODITY ACTIVITIES<br>RIEs operating an MTF or OTF using Accepted Spot Commodities<br>This means that an RIE (in addition to operating markets relating to the trading of Financial Instruments) can, where permitted by the FSRA and subject to MIR Rule 3.4.2, operate a separate MTF or OTF under its Recognition Order.  This MTF or OTF may operate using Accepted Spot Commodities.<br></code> |
* Loss: [<code>MultipleNegativesSymmetricRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativessymmetricrankingloss) with these parameters:
  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim"
  }
  ```
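
To make the loss parameters concrete, the sketch below recomputes a symmetric in-batch ranking loss in NumPy with `scale = 20.0` and cosine similarity. It is an illustrative reimplementation under the assumption that the loss averages an anchor-to-positive and a positive-to-anchor cross-entropy over the scaled similarity matrix. The function names are hypothetical, and this is not the library's code.

```python
import numpy as np


def cos_sim(a, b):
    """Pairwise cosine similarity between rows of a and rows of b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T


def cross_entropy_diagonal(scores):
    """Mean cross-entropy where the correct class for row i is column i."""
    shifted = scores - scores.max(axis=1, keepdims=True)  # for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))


def symmetric_ranking_loss(anchors, positives, scale=20.0):
    """Average of the anchor->positive and positive->anchor ranking losses."""
    scores = cos_sim(anchors, positives) * scale
    forward = cross_entropy_diagonal(scores)     # each anchor must pick its positive
    backward = cross_entropy_diagonal(scores.T)  # each positive must pick its anchor
    return (forward + backward) / 2


rng = np.random.default_rng(0)
anchors = rng.normal(size=(4, 8))
# Positives that nearly coincide with their anchors give a low loss ...
aligned = symmetric_ranking_loss(anchors, anchors + 0.01 * rng.normal(size=(4, 8)))
# ... while positives paired with the wrong anchors give a high one.
shuffled = symmetric_ranking_loss(anchors, anchors[::-1])
print(aligned, shuffled)
```

Every other pair in the batch acts as a negative, so the batch size and the batch sampler both affect how hard the ranking task is.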

### Training Hyperparameters
#### Non-Default Hyperparameters

- `per_device_train_batch_size`: 12
- `num_train_epochs`: 1
- `warmup_ratio`: 0.1
- `batch_sampler`: no_duplicates
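
The `no_duplicates` setting matters because the loss treats all other in-batch texts as negatives, so a repeated passage would be penalized as a false negative. The sketch below is a simplified, hypothetical illustration of the idea (greedily deferring pairs whose texts already appear in the current batch). It is not the sampler that Sentence Transformers implements, and `no_duplicate_batches` is an invented name.

```python
import random


def no_duplicate_batches(pairs, batch_size, seed=42):
    """Yield batches of (anchor, positive) pairs in which no text appears twice."""
    rng = random.Random(seed)
    remaining = list(range(len(pairs)))
    rng.shuffle(remaining)
    while remaining:
        batch, seen, deferred = [], set(), []
        for idx in remaining:
            anchor, positive = pairs[idx]
            fits = len(batch) < batch_size
            if fits and anchor not in seen and positive not in seen:
                batch.append(idx)
                seen.update((anchor, positive))
            else:
                deferred.append(idx)  # retry this pair in a later batch
        yield [pairs[i] for i in batch]
        remaining = deferred


pairs = [
    ("q1", "passage A"),
    ("q2", "passage A"),  # shares its positive with q1
    ("q3", "passage B"),
    ("q4", "passage C"),
]
batches = list(no_duplicate_batches(pairs, batch_size=3))
for batch in batches:
    print(batch)
```

In this toy data the two queries that share `passage A` always end up in different batches.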

#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: no
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 12
- `per_device_eval_batch_size`: 8
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 1
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional

</details>
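
The listed `learning_rate` (5e-05), `lr_scheduler_type` (linear), and `warmup_ratio` (0.1) together describe a ramp-up followed by a linear decay to zero. A minimal sketch of that shape follows, assuming the warmup length is the ceiling of `warmup_ratio * total_steps`. The `total_steps` value is hypothetical and `linear_schedule_lr` is an illustrative helper, not a library function.

```python
import math


def linear_schedule_lr(step, total_steps, base_lr=5e-05, warmup_ratio=0.1):
    """Learning rate at a given optimizer step for linear warmup plus linear decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Ramp linearly from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr down to 0 at the final step.
    return base_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))


total_steps = 1000  # hypothetical run length, chosen for round numbers
for step in (0, 50, 100, 550, 1000):
    print(step, linear_schedule_lr(step, total_steps))
```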

### Training Logs
| Epoch  | Step | Training Loss |
|:------:|:----:|:-------------:|
| 0.0812 | 100  | 1.7126        |
| 0.1623 | 200  | 0.7412        |
| 0.2435 | 300  | 0.6673        |
| 0.3247 | 400  | 0.6119        |
| 0.4058 | 500  | 0.5413        |
| 0.4870 | 600  | 0.5807        |
| 0.5682 | 700  | 0.506         |
| 0.6494 | 800  | 0.5132        |
| 0.7305 | 900  | 0.4641        |
| 0.8117 | 1000 | 0.456         |
| 0.8929 | 1100 | 0.4954        |
| 0.9740 | 1200 | 0.4088        |
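
The Epoch column can be cross-checked against the dataset size and batch size. The arithmetic below reproduces the logged values if one epoch is about 1,232 optimizer steps, which would correspond to an effective batch size of 24. The device count of 2 is an assumption inferred from the numbers, not something the card states, and the `no_duplicates` sampler could also shift the exact batch count.

```python
import math

num_samples = 29_545   # training samples, from the card
per_device_batch = 12  # per_device_train_batch_size, from the card
assumed_devices = 2    # assumption: inferred, not stated in the card

steps_per_epoch = math.ceil(num_samples / (per_device_batch * assumed_devices))

logged_steps = range(100, 1201, 100)
logged_epochs = [
    "0.0812", "0.1623", "0.2435", "0.3247", "0.4058", "0.4870",
    "0.5682", "0.6494", "0.7305", "0.8117", "0.8929", "0.9740",
]
# Epoch fraction at each logged step, formatted like the table.
reconstructed = [f"{step / steps_per_epoch:.4f}" for step in logged_steps]

print(steps_per_epoch)
print(reconstructed == logged_epochs)
```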


### Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.3.1
- Transformers: 4.48.3
- PyTorch: 2.5.1+cu121
- Accelerate: 1.2.1
- Datasets: 3.3.2
- Tokenizers: 0.21.0

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

<!--
## Glossary

*Clearly define terms in order to be accessible across audiences.*
-->

<!--
## Model Card Authors

*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->

<!--
## Model Card Contact

*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->