Milutin Studen committed
Commit 4582829 · verified · 1 Parent(s): bfc4648

Add new CrossEncoder model

Files changed (6):
1. README.md +467 -0
2. config.json +53 -0
3. model.safetensors +3 -0
4. special_tokens_map.json +37 -0
5. tokenizer.json +0 -0
6. tokenizer_config.json +945 -0
README.md ADDED
---
language:
- en
tags:
- sentence-transformers
- cross-encoder
- text-classification
- generated_from_trainer
- dataset_size:82326
- loss:ListNetLoss
base_model: answerdotai/ModernBERT-base
datasets:
- microsoft/ms_marco
pipeline_tag: text-classification
library_name: sentence-transformers
metrics:
- map
- mrr@10
- ndcg@10
model-index:
- name: CrossEncoder based on answerdotai/ModernBERT-base
  results: []
---

# CrossEncoder based on answerdotai/ModernBERT-base

This is a [Cross Encoder](https://www.sbert.net/docs/cross_encoder/usage/usage.html) model finetuned from [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) on the [ms_marco](https://huggingface.co/datasets/microsoft/ms_marco) dataset using the [sentence-transformers](https://www.SBERT.net) library. It computes scores for pairs of texts, which can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Cross Encoder
- **Base model:** [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) <!-- at revision 8949b909ec900327062f0ebf497f51aef5e6f0c8 -->
- **Maximum Sequence Length:** 8192 tokens
- **Number of Output Labels:** 1 label
- **Training Dataset:**
    - [ms_marco](https://huggingface.co/datasets/microsoft/ms_marco)
- **Language:** en
<!-- - **License:** Unknown -->

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Documentation:** [Cross Encoder Documentation](https://www.sbert.net/docs/cross_encoder/usage/usage.html)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Cross Encoders on Hugging Face](https://huggingface.co/models?library=sentence-transformers&other=cross-encoder)

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import CrossEncoder

# Download from the 🤗 Hub
model = CrossEncoder("Studeni/reranker-msmarco-v1.1-ModernBERT-base-listnet")
# Get scores for pairs of texts
pairs = [
    ['How many calories in an egg', 'There are on average between 55 and 80 calories in an egg depending on its size.'],
    ['How many calories in an egg', 'Egg whites are very low in calories, have no fat, no cholesterol, and are loaded with protein.'],
    ['How many calories in an egg', 'Most of the calories in an egg come from the yellow yolk in the center.'],
]
scores = model.predict(pairs)
print(scores.shape)
# (3,)

# Or rank different texts based on similarity to a single text
ranks = model.rank(
    'How many calories in an egg',
    [
        'There are on average between 55 and 80 calories in an egg depending on its size.',
        'Egg whites are very low in calories, have no fat, no cholesterol, and are loaded with protein.',
        'Most of the calories in an egg come from the yellow yolk in the center.',
    ]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
```
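Conceptually, `rank` is `predict` followed by a sort over the candidate scores. A minimal sketch of that ordering step, using a hypothetical `rank_by_scores` helper (not part of the sentence-transformers API) and stand-in scores rather than real model output:

```python
# Sketch of the ranking step: given one score per candidate document,
# return the candidates sorted best-first, in the same output shape
# that model.rank() produces. The scores here are made up for illustration.
def rank_by_scores(scores):
    """Return [{'corpus_id': i, 'score': s}, ...] sorted by descending score."""
    ranked = [{"corpus_id": i, "score": s} for i, s in enumerate(scores)]
    ranked.sort(key=lambda d: d["score"], reverse=True)
    return ranked

example_scores = [0.12, 0.87, 0.45]  # one hypothetical score per candidate
ranking = rank_by_scores(example_scores)
print([d["corpus_id"] for d in ranking])
# [1, 2, 0]
```

In practice you would pass `model.predict(pairs)` output in place of `example_scores`; the highest-scoring document index comes first.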

<!--
### Direct Usage (Transformers)

<details><summary>Click to see the direct usage in Transformers</summary>

</details>
-->

<!--
### Downstream Usage (Sentence Transformers)

You can finetune this model on your own dataset.

<details><summary>Click to expand</summary>

</details>
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

## Evaluation

### Metrics

#### Cross Encoder Reranking

* Datasets: `NanoMSMARCO`, `NanoNFCorpus` and `NanoNQ`
* Evaluated with [<code>CERerankingEvaluator</code>](https://sbert.net/docs/package_reference/cross_encoder/evaluation.html#sentence_transformers.cross_encoder.evaluation.CERerankingEvaluator)

| Metric      | NanoMSMARCO          | NanoNFCorpus         | NanoNQ               |
|:------------|:---------------------|:---------------------|:---------------------|
| map         | 0.4674 (-0.0222)     | 0.3153 (+0.0449)     | 0.5727 (+0.1520)     |
| mrr@10      | 0.4580 (-0.0195)     | 0.4976 (-0.0023)     | 0.5714 (+0.1447)     |
| **ndcg@10** | **0.5335 (-0.0069)** | **0.3530 (+0.0280)** | **0.6278 (+0.1272)** |

#### Cross Encoder Nano BEIR

* Dataset: `NanoBEIR_mean`
* Evaluated with [<code>CENanoBEIREvaluator</code>](https://sbert.net/docs/package_reference/cross_encoder/evaluation.html#sentence_transformers.cross_encoder.evaluation.CENanoBEIREvaluator)

| Metric      | Value                |
|:------------|:---------------------|
| map         | 0.4518 (+0.0582)     |
| mrr@10      | 0.5090 (+0.0410)     |
| **ndcg@10** | **0.5048 (+0.0494)** |
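The ndcg@10 figures above follow the standard NDCG definition: discounted cumulative gain of the predicted ranking, normalized by the gain of the ideal ordering. A small self-contained NumPy sketch of that computation (this is an illustration, not the evaluator's actual code):

```python
import numpy as np

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k results, log2 discount."""
    rel = np.asarray(relevances, dtype=float)[:k]
    discounts = np.log2(np.arange(2, rel.size + 2))  # positions 1..k -> log2(2)..log2(k+1)
    return float(np.sum(rel / discounts))

def ndcg_at_k(relevances, k=10):
    """DCG of the ranking, normalized by the DCG of the ideal (sorted) ordering."""
    ideal_dcg = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal_dcg if ideal_dcg > 0 else 0.0

# A perfect ordering of binary relevance labels scores 1.0:
print(ndcg_at_k([1, 1, 0, 0], k=10))
# 1.0
```

Swapping a relevant document below an irrelevant one (e.g. `[0, 1]`) drops the score below 1.0, which is what the reranker is being measured on.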

<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Dataset

#### ms_marco

* Dataset: [ms_marco](https://huggingface.co/datasets/microsoft/ms_marco) at [a47ee7a](https://huggingface.co/datasets/microsoft/ms_marco/tree/a47ee7aae8d7d466ba15f9f0bfac3b3681087b3a)
* Size: 82,326 training samples
* Columns: <code>query</code>, <code>docs</code>, and <code>labels</code>
* Approximate statistics based on the first 1000 samples:
  |         | query                                                                                           | docs                                | labels                              |
  |:--------|:------------------------------------------------------------------------------------------------|:------------------------------------|:------------------------------------|
  | type    | string                                                                                           | list                                | list                                |
  | details | <ul><li>min: 11 characters</li><li>mean: 34.16 characters</li><li>max: 96 characters</li></ul>   | <ul><li>size: 10 elements</li></ul> | <ul><li>size: 10 elements</li></ul> |
* Samples:
  | query | docs | labels |
  |:----------------------------------|:------|:----------------------------------|
  | <code>what does a bursa do</code> | <code>['Bursae (plural for bursa) are flattened fluid-filled sacs that function as cushions between your bones and the muscles (deep bursae) or bones and tendons (superficial bursae). Your bursae play an important role in leading a healthy, active life. When the bursae are not irritated and working properly, your joints move smoothly and painlessly. However, when a bursa becomes swollen and inflamed, the condition is known as bursitis.', 'A bursa is a small, fluid-filled sac that acts as a cushion between a bone and other moving parts: muscles, tendons, or skin. Bursae are found throughout the body. Bursitis occurs when a bursa becomes inflamed (redness and increased fluid in the bursa). A tendon is a flexible band of fibrous tissue that connects muscles to bones. Tendinitis is inflammation of a tendon. Tendons transmit the pull of the muscle to the bone to cause movement.', 'A bursa (plural bursae or bursas) is a small fluid-filled sac lined by synovial membrane with an inner capillary laye...</code> | <code>[1, 1, 0, 0, 0, ...]</code> |
  | <code>what is gluten in</code> | <code>['Gluten is a general name for the proteins found in wheat (durum, emmer, spelt, farina, farro, KAMUT® khorasan wheat and einkorn), rye, barley and triticale. Gluten helps foods maintain their shape, acting as a glue that holds food together.', 'Definition. A gluten-free diet is a diet that excludes the protein gluten. Gluten is found in grains such as wheat, barley, rye, and a cross between wheat and rye called triticale. A gluten-free diet is primarily used to treat celiac disease. Gluten causes inflammation in the small intestines of people with', 'A gluten-free diet is a diet that excludes the protein gluten. Gluten is found in grains such as wheat, barley, rye, and a cross between wheat and rye called triticale. A gluten-free diet is primarily used to treat celiac disease. Gluten causes inflammation in the small intestines of people with', 'Gluten is found in wheat, rye, barley and any foods made with these grains. Avoiding wheat can be especially hard because this means you shoul...</code> | <code>[1, 0, 0, 0, 0, ...]</code> |
  | <code>what is a payaway</code> | <code>['Playaway is the name of a solid-state prerecorded audio player introduced in 2005 by Findaway World, LLC, based in Solon, Ohio. About the size of a deck of playing cards and weighing 2 ounces, it can store up to 80 hours of audio. As of March 2010, the audiobooks are all produced in high definition audio. The digital content (audiobook or music compilation) is preloaded at the factory and cannot be changed or copied by the end user. A 3.5 mm stereo jack provides output to earphones or an external amplifier. Playaway was specifically designed to use most commonly available cassette adaptors and FM transmitters. Power is provided by a changeable 1.5V AAA cell, which the manufacturer claims allows it to operate approximately 20 hours before battery depletion, 30 hours for the newer versions.', "Playaway Audio Books. Playaway® is the simplest way to listen to audio on the go. Each Playaway is a self-contained audiobook, weighs only two ounces, and comes with a battery. Just plug in your ...</code> | <code>[1, 0, 0, 0, 0, ...]</code> |
* Loss: [<code>ListNetLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#listnetloss) with these parameters:
  ```json
  {
      "eps": 1e-10,
      "pad_value": -1
  }
  ```

### Evaluation Dataset

#### ms_marco

* Dataset: [ms_marco](https://huggingface.co/datasets/microsoft/ms_marco) at [a47ee7a](https://huggingface.co/datasets/microsoft/ms_marco/tree/a47ee7aae8d7d466ba15f9f0bfac3b3681087b3a)
* Size: 82,326 evaluation samples
* Columns: <code>query</code>, <code>docs</code>, and <code>labels</code>
* Approximate statistics based on the first 1000 samples:
  |         | query                                                                                           | docs                                | labels                              |
  |:--------|:------------------------------------------------------------------------------------------------|:------------------------------------|:------------------------------------|
  | type    | string                                                                                           | list                                | list                                |
  | details | <ul><li>min: 10 characters</li><li>mean: 34.53 characters</li><li>max: 93 characters</li></ul>   | <ul><li>size: 10 elements</li></ul> | <ul><li>size: 10 elements</li></ul> |
* Samples:
  | query | docs | labels |
  |:----------------------------------------------------------|:------|:----------------------------------|
  | <code>how long does ehic card take to arrive</code> | <code>['The quickest way is to apply online. Your EHIC will normally arrive within seven days and will usually be valid for five years. This means at a reduced cost or sometimes free of charge. Even with an EHIC, you may have to pay towards your treatment, depending on the rules of the country you’re visiting. You may be able to claim the money back – always try to apply for a refund before you return home. Find out how to do this in the country-by-country guide for the EHIC', "What is the EHIC? The European Health Insurance Card or EHIC was introduced in 2004 across the European Union. It allows Irish residents to access health services in any EU country and in Switzerland, Iceland, Liechtenstein and Norway, if they become ill or injured while on a temporary stay in that country. No. Your card will be valid for 4 to 5 years. Check that you and your family's cards are valid before you travel, and if they have expired, it's easy to renew them online at www.ehic.ie or at your Local Health Offi...</code> | <code>[1, 0, 0, 0, 0, ...]</code> |
  | <code>what are the muscles that dorsiflex the foot</code> | <code>['Dorsiflexion of the foot uses four muscles. These are the tibialis anterior, extensor digitorum longus, extensor hallucis longus, and the peroneus tertius. ', 'There are four muscles in the anterior compartment of the leg; tibialis anterior, extensor digitorun longus, extensor hallucis longus and fibularis tertius. Collectively, they act to dorsiflex and invert the foot at the ankle joint. The extensor digitorum longus and extensor hallucis longus also extend the toes. The muscles in this compartment are innervated by the deep fibular nerve (L4-L5), and blood is supplied via the anterior tibial artery.', 'Many muscles do the work of moving the ankle and foot. Some of the muscles that move the foot start higher up in the leg, and smaller muscles work right in the foot itself. The leg is divided into compartments: the anterior, lateral, and posterior compartments. The muscles in these compartments help move the ankle and the foot: Anterior compartment: This compartment lies in front ...</code> | <code>[1, 0, 0, 0, 0, ...]</code> |
  | <code>What does the thailand flag mean</code> | <code>["Thai Flag Meaning: The red stripes mean Thailand's nation. The white stands for the country's main religion. Blue is Thailand's national color and it represents the Thai monarchy/Royal Family. Hoped This helped Brought to you by: Firescream66 and the post starter. B … lue is Thailand's national color and it represents the Thai monarchy. The blue is also used to honor Thailand's World War I allies, Great Britain, France, United States and Russia, who all had red, white and blue flags.", "The red represents the blood spilled to maintain Thailand's independence. The white stands for purity and is the color of Buddhism. And the Blue represents the Thai monarchy. The pattern repeats so that the flag can be flown without ever appearing upside down. The old Siam (former name of Thailand) flag was a solid red with a white elephant in the middle.", 'The flag of the Kingdom of Thailand (Thai: ธงไตรรงค์, Thong Trairong, meaning tricolour flag”) shows five horizontal stripes in the colours red,...</code> | <code>[1, 0, 0, 0, 0, ...]</code> |
* Loss: [<code>ListNetLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#listnetloss) with these parameters:
  ```json
  {
      "eps": 1e-10,
      "pad_value": -1
  }
  ```
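ListNetLoss is the cross-entropy between the "top-one" probability distributions induced by the relevance labels and by the predicted scores (Cao et al., 2007). The library's actual implementation is in PyTorch and also handles list padding (the `pad_value: -1` parameter above); the NumPy sketch below shows only the core objective, with made-up score vectors:

```python
import numpy as np

def listnet_loss(scores, labels, eps=1e-10):
    """ListNet top-one loss: cross-entropy between softmax(labels)
    and softmax(scores) over one list of candidate documents."""
    def softmax(x):
        x = np.asarray(x, dtype=float)
        e = np.exp(x - x.max())  # shift for numerical stability
        return e / e.sum()
    p_true = softmax(labels)
    p_pred = softmax(scores)
    return float(-np.sum(p_true * np.log(p_pred + eps)))

labels = [1, 0, 0, 0, 0]                              # first document is relevant
better = listnet_loss([5.0, 0.0, 0.0, 0.0, 0.0], labels)  # relevant doc scored highest
worse = listnet_loss([0.0, 5.0, 0.0, 0.0, 0.0], labels)   # irrelevant doc scored highest
assert better < worse
```

Ranking the relevant document first yields a lower loss, which is exactly the gradient signal the reranker is trained on.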

### Training Hyperparameters
#### Non-Default Hyperparameters

- `eval_strategy`: steps
- `per_device_train_batch_size`: 6
- `per_device_eval_batch_size`: 16
- `torch_empty_cache_steps`: 2000
- `learning_rate`: 4e-06
- `warmup_ratio`: 0.1
- `seed`: 12
- `bf16`: True
- `load_best_model_at_end`: True

#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 6
- `per_device_eval_batch_size`: 16
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: 2000
- `learning_rate`: 4e-06
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 3
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 12
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: True
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: True
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: proportional

</details>
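With `lr_scheduler_type: linear` and `warmup_ratio: 0.1`, the learning rate ramps from 0 up to the peak of 4e-06 over the first 10% of training steps, then decays linearly back to 0. A small sketch of that schedule as a pure function (an illustration of the shape, not the Transformers implementation, and `total_steps=1000` is just an example value):

```python
def linear_schedule_with_warmup(step, total_steps, base_lr=4e-06, warmup_ratio=0.1):
    """Linear warmup to base_lr over the first warmup_ratio of steps,
    then linear decay to zero by the end of training."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    return base_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

# The peak learning rate is reached exactly at the end of warmup:
print(linear_schedule_with_warmup(100, 1000))
# 4e-06
```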

### Training Logs
| Epoch | Step | Training Loss | Validation Loss | NanoMSMARCO_ndcg@10 | NanoNFCorpus_ndcg@10 | NanoNQ_ndcg@10 | NanoBEIR_mean_ndcg@10 |
|:----------:|:---------:|:-------------:|:---------------:|:--------------------:|:--------------------:|:--------------------:|:---------------------:|
| -1 | -1 | - | - | 0.0264 (-0.5140) | 0.2585 (-0.0665) | 0.0606 (-0.4401) | 0.1152 (-0.3402) |
| 0.0001 | 1 | 1.8458 | - | - | - | - | - |
| 0.0430 | 500 | 2.1043 | - | - | - | - | - |
| 0.0861 | 1000 | 2.0906 | - | - | - | - | - |
| 0.1291 | 1500 | 2.0873 | - | - | - | - | - |
| 0.1721 | 2000 | 2.0848 | 2.0847 | 0.0655 (-0.4749) | 0.2309 (-0.0941) | 0.1177 (-0.3830) | 0.1380 (-0.3174) |
| 0.2152 | 2500 | 2.0864 | - | - | - | - | - |
| 0.2582 | 3000 | 2.0884 | - | - | - | - | - |
| 0.3013 | 3500 | 2.0783 | - | - | - | - | - |
| 0.3443 | 4000 | 2.0792 | 2.0791 | 0.3223 (-0.2181) | 0.3229 (-0.0021) | 0.2919 (-0.2088) | 0.3124 (-0.1430) |
| 0.3873 | 4500 | 2.0817 | - | - | - | - | - |
| 0.4304 | 5000 | 2.0828 | - | - | - | - | - |
| 0.4734 | 5500 | 2.0785 | - | - | - | - | - |
| 0.5164 | 6000 | 2.0751 | 2.0740 | 0.4743 (-0.0661) | 0.3450 (+0.0200) | 0.5233 (+0.0226) | 0.4475 (-0.0078) |
| 0.5595 | 6500 | 2.0719 | - | - | - | - | - |
| 0.6025 | 7000 | 2.0726 | - | - | - | - | - |
| 0.6456 | 7500 | 2.0734 | - | - | - | - | - |
| 0.6886 | 8000 | 2.0769 | 2.0722 | 0.5006 (-0.0398) | 0.3449 (+0.0198) | 0.4920 (-0.0087) | 0.4458 (-0.0095) |
| 0.7316 | 8500 | 2.0722 | - | - | - | - | - |
| 0.7747 | 9000 | 2.0669 | - | - | - | - | - |
| 0.8177 | 9500 | 2.0787 | - | - | - | - | - |
| 0.8607 | 10000 | 2.0661 | 2.0710 | 0.5646 (+0.0242) | 0.3363 (+0.0113) | 0.5672 (+0.0666) | 0.4894 (+0.0340) |
| 0.9038 | 10500 | 2.0754 | - | - | - | - | - |
| 0.9468 | 11000 | 2.0717 | - | - | - | - | - |
| 0.9898 | 11500 | 2.0779 | - | - | - | - | - |
| 1.0329 | 12000 | 2.0703 | 2.0706 | 0.5609 (+0.0205) | 0.3107 (-0.0144) | 0.5817 (+0.0811) | 0.4844 (+0.0291) |
| 1.0759 | 12500 | 2.0692 | - | - | - | - | - |
| 1.1190 | 13000 | 2.0665 | - | - | - | - | - |
| 1.1620 | 13500 | 2.0801 | - | - | - | - | - |
| 1.2050 | 14000 | 2.0723 | 2.0702 | 0.5413 (+0.0009) | 0.3249 (-0.0001) | 0.5961 (+0.0954) | 0.4874 (+0.0321) |
| 1.2481 | 14500 | 2.0707 | - | - | - | - | - |
| 1.2911 | 15000 | 2.0715 | - | - | - | - | - |
| 1.3341 | 15500 | 2.0664 | - | - | - | - | - |
| 1.3772 | 16000 | 2.0736 | 2.0700 | 0.5234 (-0.0171) | 0.3314 (+0.0064) | 0.6068 (+0.1061) | 0.4872 (+0.0318) |
| 1.4202 | 16500 | 2.0733 | - | - | - | - | - |
| 1.4632 | 17000 | 2.0728 | - | - | - | - | - |
| 1.5063 | 17500 | 2.0680 | - | - | - | - | - |
| **1.5493** | **18000** | **2.0669** | **2.0699** | **0.5335 (-0.0069)** | **0.3530 (+0.0280)** | **0.6278 (+0.1272)** | **0.5048 (+0.0494)** |
| 1.5924 | 18500 | 2.0713 | - | - | - | - | - |
| 1.6354 | 19000 | 2.0689 | - | - | - | - | - |
| 1.6784 | 19500 | 2.0700 | - | - | - | - | - |
| 1.7215 | 20000 | 2.0731 | 2.0696 | 0.5365 (-0.0039) | 0.3497 (+0.0247) | 0.5845 (+0.0838) | 0.4902 (+0.0349) |
| 1.7645 | 20500 | 2.0678 | - | - | - | - | - |
| 1.8075 | 21000 | 2.0646 | - | - | - | - | - |
| 1.8506 | 21500 | 2.0631 | - | - | - | - | - |
| 1.8936 | 22000 | 2.0714 | 2.0694 | 0.5340 (-0.0064) | 0.3490 (+0.0239) | 0.5653 (+0.0646) | 0.4828 (+0.0274) |
| 1.9367 | 22500 | 2.0590 | - | - | - | - | - |
| 1.9797 | 23000 | 2.0680 | - | - | - | - | - |
| 2.0227 | 23500 | 2.0664 | - | - | - | - | - |
| 2.0658 | 24000 | 2.0719 | 2.0699 | 0.5442 (+0.0038) | 0.3531 (+0.0281) | 0.5879 (+0.0873) | 0.4951 (+0.0397) |
| 2.1088 | 24500 | 2.0621 | - | - | - | - | - |
| 2.1518 | 25000 | 2.0669 | - | - | - | - | - |
| 2.1949 | 25500 | 2.0670 | - | - | - | - | - |
| 2.2379 | 26000 | 2.0676 | 2.0700 | 0.5449 (+0.0044) | 0.3334 (+0.0084) | 0.5656 (+0.0649) | 0.4813 (+0.0259) |
| 2.2809 | 26500 | 2.0621 | - | - | - | - | - |
| 2.3240 | 27000 | 2.0634 | - | - | - | - | - |
| 2.3670 | 27500 | 2.0650 | - | - | - | - | - |
| 2.4101 | 28000 | 2.0669 | 2.0704 | 0.5128 (-0.0276) | 0.3495 (+0.0244) | 0.5751 (+0.0744) | 0.4791 (+0.0237) |
| 2.4531 | 28500 | 2.0636 | - | - | - | - | - |
| 2.4961 | 29000 | 2.0623 | - | - | - | - | - |
| 2.5392 | 29500 | 2.0669 | - | - | - | - | - |
| 2.5822 | 30000 | 2.0615 | 2.0698 | 0.5448 (+0.0044) | 0.3406 (+0.0156) | 0.5768 (+0.0762) | 0.4874 (+0.0321) |
| 2.6252 | 30500 | 2.0708 | - | - | - | - | - |
| 2.6683 | 31000 | 2.0655 | - | - | - | - | - |
| 2.7113 | 31500 | 2.0511 | - | - | - | - | - |
| 2.7543 | 32000 | 2.0623 | 2.0699 | 0.5377 (-0.0027) | 0.3505 (+0.0255) | 0.5854 (+0.0847) | 0.4912 (+0.0358) |
| 2.7974 | 32500 | 2.0651 | - | - | - | - | - |
| 2.8404 | 33000 | 2.0675 | - | - | - | - | - |
| 2.8835 | 33500 | 2.0689 | - | - | - | - | - |
| 2.9265 | 34000 | 2.0670 | 2.0699 | 0.5221 (-0.0184) | 0.3605 (+0.0354) | 0.5695 (+0.0688) | 0.4840 (+0.0286) |
| 2.9695 | 34500 | 2.0634 | - | - | - | - | - |
| -1 | -1 | - | - | 0.5335 (-0.0069) | 0.3530 (+0.0280) | 0.6278 (+0.1272) | 0.5048 (+0.0494) |

* The bold row denotes the saved checkpoint.

### Framework Versions
- Python: 3.10.13
- Sentence Transformers: 3.5.0.dev0
- Transformers: 4.48.1
- PyTorch: 2.5.1+cu124
- Accelerate: 1.3.0
- Datasets: 3.2.0
- Tokenizers: 0.21.0

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### ListNetLoss
```bibtex
@inproceedings{cao2007learning,
    title={Learning to rank: from pairwise approach to listwise approach},
    author={Cao, Zhe and Qin, Tao and Liu, Tie-Yan and Tsai, Ming-Feng and Li, Hang},
    booktitle={Proceedings of the 24th international conference on Machine learning},
    pages={129--136},
    year={2007}
}
```

<!--
## Glossary

*Clearly define terms in order to be accessible across audiences.*
-->

<!--
## Model Card Authors

*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->

<!--
## Model Card Contact

*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->
config.json ADDED
{
  "_name_or_path": "answerdotai/ModernBERT-base",
  "architectures": [
    "ModernBertForSequenceClassification"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "bos_token_id": 50281,
  "classifier_activation": "gelu",
  "classifier_bias": false,
  "classifier_dropout": 0.0,
  "classifier_pooling": "mean",
  "cls_token_id": 50281,
  "decoder_bias": true,
  "deterministic_flash_attn": false,
  "embedding_dropout": 0.0,
  "eos_token_id": 50282,
  "global_attn_every_n_layers": 3,
  "global_rope_theta": 160000.0,
  "gradient_checkpointing": false,
  "hidden_activation": "gelu",
  "hidden_size": 768,
  "id2label": {
    "0": "LABEL_0"
  },
  "initializer_cutoff_factor": 2.0,
  "initializer_range": 0.02,
  "intermediate_size": 1152,
  "label2id": {
    "LABEL_0": 0
  },
  "layer_norm_eps": 1e-05,
  "local_attention": 128,
  "local_rope_theta": 10000.0,
  "max_position_embeddings": 8192,
  "mlp_bias": false,
  "mlp_dropout": 0.0,
  "model_type": "modernbert",
  "norm_bias": false,
  "norm_eps": 1e-05,
  "num_attention_heads": 12,
  "num_hidden_layers": 22,
  "pad_token_id": 50283,
  "position_embedding_type": "absolute",
  "reference_compile": true,
  "repad_logits_with_grad": false,
  "sep_token_id": 50282,
  "sparse_pred_ignore_index": -100,
  "sparse_prediction": false,
  "torch_dtype": "float32",
  "transformers_version": "4.48.1",
  "vocab_size": 50368
}
model.safetensors ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:a43f54e2a3f80bf5d7d2362dddabaa555b588cbb3cc30078aa7b824cc14f3352
size 598436708
special_tokens_map.json ADDED
+ {
+ "cls_token": {
+ "content": "[CLS]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "mask_token": {
+ "content": "[MASK]",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "pad_token": {
+ "content": "[PAD]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "sep_token": {
+ "content": "[SEP]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "unk_token": {
+ "content": "[UNK]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,945 @@
+ {
+ "added_tokens_decoder": {
+ "0": {
+ "content": "|||IP_ADDRESS|||",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "1": {
+ "content": "<|padding|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "50254": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50255": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50256": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50257": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50258": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50259": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50260": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50261": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50262": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50263": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50264": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50265": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50266": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50267": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50268": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50269": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50270": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50271": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50272": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50273": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50274": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50275": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50276": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50277": {
+ "content": "|||EMAIL_ADDRESS|||",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50278": {
+ "content": "|||PHONE_NUMBER|||",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50279": {
+ "content": "<|endoftext|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "50280": {
+ "content": "[UNK]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "50281": {
+ "content": "[CLS]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "50282": {
+ "content": "[SEP]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "50283": {
+ "content": "[PAD]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "50284": {
+ "content": "[MASK]",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "50285": {
+ "content": "[unused0]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50286": {
+ "content": "[unused1]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50287": {
+ "content": "[unused2]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50288": {
+ "content": "[unused3]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50289": {
+ "content": "[unused4]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50290": {
+ "content": "[unused5]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50291": {
+ "content": "[unused6]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50292": {
+ "content": "[unused7]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50293": {
+ "content": "[unused8]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50294": {
+ "content": "[unused9]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50295": {
+ "content": "[unused10]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50296": {
+ "content": "[unused11]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50297": {
+ "content": "[unused12]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50298": {
+ "content": "[unused13]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50299": {
+ "content": "[unused14]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50300": {
+ "content": "[unused15]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50301": {
+ "content": "[unused16]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50302": {
+ "content": "[unused17]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50303": {
+ "content": "[unused18]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50304": {
+ "content": "[unused19]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50305": {
+ "content": "[unused20]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50306": {
+ "content": "[unused21]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50307": {
+ "content": "[unused22]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50308": {
+ "content": "[unused23]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50309": {
+ "content": "[unused24]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50310": {
+ "content": "[unused25]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50311": {
+ "content": "[unused26]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50312": {
+ "content": "[unused27]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50313": {
+ "content": "[unused28]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50314": {
+ "content": "[unused29]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50315": {
+ "content": "[unused30]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50316": {
+ "content": "[unused31]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50317": {
+ "content": "[unused32]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50318": {
+ "content": "[unused33]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50319": {
+ "content": "[unused34]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50320": {
+ "content": "[unused35]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50321": {
+ "content": "[unused36]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50322": {
+ "content": "[unused37]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50323": {
+ "content": "[unused38]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50324": {
+ "content": "[unused39]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50325": {
+ "content": "[unused40]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50326": {
+ "content": "[unused41]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50327": {
+ "content": "[unused42]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50328": {
+ "content": "[unused43]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50329": {
+ "content": "[unused44]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50330": {
+ "content": "[unused45]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50331": {
+ "content": "[unused46]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50332": {
+ "content": "[unused47]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50333": {
+ "content": "[unused48]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50334": {
+ "content": "[unused49]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50335": {
+ "content": "[unused50]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50336": {
+ "content": "[unused51]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50337": {
+ "content": "[unused52]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50338": {
+ "content": "[unused53]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50339": {
+ "content": "[unused54]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50340": {
+ "content": "[unused55]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50341": {
+ "content": "[unused56]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50342": {
+ "content": "[unused57]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50343": {
+ "content": "[unused58]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50344": {
+ "content": "[unused59]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50345": {
+ "content": "[unused60]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50346": {
+ "content": "[unused61]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50347": {
+ "content": "[unused62]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50348": {
+ "content": "[unused63]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50349": {
+ "content": "[unused64]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50350": {
+ "content": "[unused65]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50351": {
+ "content": "[unused66]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50352": {
+ "content": "[unused67]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50353": {
+ "content": "[unused68]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50354": {
+ "content": "[unused69]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50355": {
+ "content": "[unused70]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50356": {
+ "content": "[unused71]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50357": {
+ "content": "[unused72]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50358": {
+ "content": "[unused73]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50359": {
+ "content": "[unused74]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50360": {
+ "content": "[unused75]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50361": {
+ "content": "[unused76]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50362": {
+ "content": "[unused77]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50363": {
+ "content": "[unused78]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50364": {
+ "content": "[unused79]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50365": {
+ "content": "[unused80]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50366": {
+ "content": "[unused81]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50367": {
+ "content": "[unused82]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ }
+ },
+ "clean_up_tokenization_spaces": true,
+ "cls_token": "[CLS]",
+ "extra_special_tokens": {},
+ "mask_token": "[MASK]",
+ "model_input_names": [
+ "input_ids",
+ "attention_mask"
+ ],
+ "model_max_length": 8192,
+ "pad_token": "[PAD]",
+ "sep_token": "[SEP]",
+ "tokenizer_class": "PreTrainedTokenizerFast",
+ "unk_token": "[UNK]"
+ }
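The ids marked `"special": true` in the `added_tokens_decoder` above should cover every token promoted in `special_tokens_map.json` earlier in this commit. A minimal cross-check sketch with the values inlined by hand:

```python
# Tokens with "special": true in tokenizer_config.json above.
special_marked = {
    1: "<|padding|>",
    50279: "<|endoftext|>",
    50280: "[UNK]",
    50281: "[CLS]",
    50282: "[SEP]",
    50283: "[PAD]",
    50284: "[MASK]",
}

# Tokens declared in special_tokens_map.json.
mapped = {"cls_token": "[CLS]", "mask_token": "[MASK]", "pad_token": "[PAD]",
          "sep_token": "[SEP]", "unk_token": "[UNK]"}

# Every mapped token must appear among the special-marked ids.
missing = set(mapped.values()) - set(special_marked.values())
assert not missing, missing
print(sorted(special_marked))
```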