bakrianoo committed on
Commit 9eb5073 · verified · 1 Parent(s): 509a077

Add new SentenceTransformer model

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
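The config above enables only `pooling_mode_mean_tokens`, so the sentence vector is the average of the token vectors. The following is a minimal pure-Python sketch of that idea, not the library's implementation: the helper name `mean_pool`, the attention-mask handling, and the toy 2-dimensional vectors (standing in for 768 dimensions) are assumptions made for illustration.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose attention-mask entry is 1 (hypothetical helper)."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    # Guard against an all-padding input so we never divide by zero.
    count = max(count, 1)
    return [value / count for value in summed]


# Two real tokens followed by one padding token that must be ignored.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)
```

The padding vector is excluded by the mask, so only the first two token vectors contribute to the average.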
README.md ADDED
@@ -0,0 +1,1069 @@
1
+ ---
2
+ base_model: aubmindlab/bert-base-arabertv02
3
+ library_name: sentence-transformers
4
+ metrics:
5
+ - pearson_cosine
6
+ - spearman_cosine
7
+ - pearson_manhattan
8
+ - spearman_manhattan
9
+ - pearson_euclidean
10
+ - spearman_euclidean
11
+ - pearson_dot
12
+ - spearman_dot
13
+ - pearson_max
14
+ - spearman_max
15
+ pipeline_tag: sentence-similarity
16
+ tags:
17
+ - sentence-transformers
18
+ - sentence-similarity
19
+ - feature-extraction
20
+ - generated_from_trainer
21
+ - dataset_size:2279719
22
+ - loss:MatryoshkaLoss
23
+ - loss:MultipleNegativesRankingLoss
24
+ widget:
25
+ - source_sentence: ما هو علاج الفطريات الجلدية؟
26
+ sentences:
27
+ - كيف سيؤثر ذلك على الطلاب الهنود الذين يدرسون أو يعملون في الولايات المتحدة إذا
28
+ أصبح ترامب رئيساً؟
29
+ - كيف يمكنك معالجة الأكزيما بشكل طبيعي؟
30
+ - كيف تعالج الفطريات الجلدية؟
31
+ - source_sentence: 'So Eric had an initial design idea for a robot, but we didn''t
32
+ have all the parts figured out, so we did what anybody would do in our situation:
33
+ we asked the Internet for help.'
34
+ sentences:
35
+ - وهكذا أول شيء فعلناه هو , بمجرد أن التسلسل خرج من الماكينات , نشرناه على الإنترنت
36
+ .
37
+ - وكانت لدى "إريك" فكرة مبدئية لصناعة روبوت، ولكن لم يكن لدينا فكرة عن القطع التي
38
+ نحتاجها لذلك قمنا بما يمكن أن يقوم به أي شخص بوضعنا قمنا بطلب المساعدة عبر الإنترنت
39
+ - ما هي مواقع الويب التي يجب اتباعها لتوصيات الأسهم خلال اليوم في سوق الأسهم الهندية؟
40
+ - source_sentence: Well, guess what? In England, it's seven per 100,000.
41
+ sentences:
42
+ - عندما نكون أطفالًا، نتعلم الضحك، ونتعلم الضحك بشكل أساسي في اللعب.
43
+ - هذا ليس 10000 دولارا، إنه بالعملة المحلية .
44
+ - خمنوا ماذا؟ في إنكلترا، النسبة سبع في كل 000 100.
45
+ - source_sentence: ما هي العوامل الحيوية وغير الحيوية؟ كيف تختلف عن بعضها البعض؟
46
+ sentences:
47
+ - ما هي بعض النصائح لتعلم لغة بايثون؟
48
+ - كما تم تسجيل نتائج إيجابية لثلاثة أيام متتالية.
49
+ - كيف تقارن العوامل الحيوية والعوامل غير الحيوية وتتناقض؟
50
+ - source_sentence: And the piece of art he bought at the yard sale is hanging in his
51
+ classroom; he's a teacher now.
52
+ sentences:
53
+ - هل الرياضيات لغة أخرى؟
54
+ - تدريجيا، أصبحت هذه العصافير بمثابة معلمين له.
55
+ - أما اللوحات التي أشتراها منّي فهي معلّقة الآن في غرفة الصف خاصّته؛ فقد أصبح مدرّساً.
56
+ model-index:
57
+ - name: SentenceTransformer based on aubmindlab/bert-base-arabertv02
58
+ results:
59
+ - task:
60
+ type: semantic-similarity
61
+ name: Semantic Similarity
62
+ dataset:
63
+ name: sts dev 768
64
+ type: sts-dev-768
65
+ metrics:
66
+ - type: pearson_cosine
67
+ value: 0.8410341962006318
68
+ name: Pearson Cosine
69
+ - type: spearman_cosine
70
+ value: 0.8422963798504417
71
+ name: Spearman Cosine
72
+ - type: pearson_manhattan
73
+ value: 0.8119358373898954
74
+ name: Pearson Manhattan
75
+ - type: spearman_manhattan
76
+ value: 0.8260328397910858
77
+ name: Spearman Manhattan
78
+ - type: pearson_euclidean
79
+ value: 0.8138598024349573
80
+ name: Pearson Euclidean
81
+ - type: spearman_euclidean
82
+ value: 0.831707795171752
83
+ name: Spearman Euclidean
84
+ - type: pearson_dot
85
+ value: 0.8371709698109359
86
+ name: Pearson Dot
87
+ - type: spearman_dot
88
+ value: 0.8389681969788781
89
+ name: Spearman Dot
90
+ - type: pearson_max
91
+ value: 0.8410341962006318
92
+ name: Pearson Max
93
+ - type: spearman_max
94
+ value: 0.8422963798504417
95
+ name: Spearman Max
96
+ - task:
97
+ type: semantic-similarity
98
+ name: Semantic Similarity
99
+ dataset:
100
+ name: sts dev 512
101
+ type: sts-dev-512
102
+ metrics:
103
+ - type: pearson_cosine
104
+ value: 0.8408199016320912
105
+ name: Pearson Cosine
106
+ - type: spearman_cosine
107
+ value: 0.8415754271206667
108
+ name: Spearman Cosine
109
+ - type: pearson_manhattan
110
+ value: 0.8114852653680014
111
+ name: Pearson Manhattan
112
+ - type: spearman_manhattan
113
+ value: 0.8231951698466913
114
+ name: Spearman Manhattan
115
+ - type: pearson_euclidean
116
+ value: 0.8125911836775428
117
+ name: Pearson Euclidean
118
+ - type: spearman_euclidean
119
+ value: 0.8267107276111355
120
+ name: Spearman Euclidean
121
+ - type: pearson_dot
122
+ value: 0.8357223021732401
123
+ name: Pearson Dot
124
+ - type: spearman_dot
125
+ value: 0.8377004761329118
126
+ name: Spearman Dot
127
+ - type: pearson_max
128
+ value: 0.8408199016320912
129
+ name: Pearson Max
130
+ - type: spearman_max
131
+ value: 0.8415754271206667
132
+ name: Spearman Max
133
+ ---
+
+ # SentenceTransformer based on aubmindlab/bert-base-arabertv02
+
+ This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned from [aubmindlab/bert-base-arabertv02](https://huggingface.co/aubmindlab/bert-base-arabertv02). It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ - **Base model:** [aubmindlab/bert-base-arabertv02](https://huggingface.co/aubmindlab/bert-base-arabertv02) <!-- at revision 016fb9d6768f522a59c6e0d2d5d5d43a4e1bff60 -->
+ - **Maximum Sequence Length:** 512 tokens
+ - **Output Dimensionality:** 768 dimensions
+ - **Similarity Function:** Cosine Similarity
+ <!-- - **Training Dataset:** Unknown -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
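The model description names cosine similarity as the similarity function. As a rough illustration of what that score measures, here is a small pure-Python sketch; the helper name `cosine_similarity` and the toy 2-dimensional vectors are assumptions, and real embeddings from this model would have 768 components.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (hypothetical helper)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Treat a zero vector as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)


same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
print(same_direction, orthogonal)
```

Because the score depends only on direction, scaling a vector does not change its similarity to another vector.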
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+
+ ### Full Model Architecture
+
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
+   (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ )
+ ```
+
+ ## Usage
+
+ ### Direct Usage (Sentence Transformers)
+
+ First install the Sentence Transformers library:
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
+
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("silma-ai/silma-embeddding-matryoshka-0.1")
+ # Run inference
+ sentences = [
+     "And the piece of art he bought at the yard sale is hanging in his classroom; he's a teacher now.",
+     'أما اللوحات التي أشتراها منّي فهي معلّقة الآن في غرفة الصف خاصّته؛ فقد أصبح مدرّساً.',
+     'تدريجيا، أصبحت هذه العصافير بمثابة معلمين له.',
+ ]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # [3, 768]
+
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities.shape)
+ # [3, 3]
+ ```
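The card reports training with Matryoshka dimensions 768 and 512, which suggests the first 512 components of an embedding can be used on their own. A minimal sketch of that truncation step in plain Python follows; the helper name `truncate_and_normalize` and the toy 3-dimensional vector are assumptions, and re-normalising after truncation is a common convention rather than something the card states.

```python
import math
from typing import List


def truncate_and_normalize(embedding: List[float], dim: int) -> List[float]:
    """Keep the first `dim` components and rescale to unit length (hypothetical helper)."""
    if dim > len(embedding):
        raise ValueError("dim cannot exceed the embedding length")
    head = embedding[:dim]
    norm = math.sqrt(sum(value * value for value in head))
    if norm == 0.0:
        # Nothing to rescale; return the zero vector unchanged.
        return head
    return [value / norm for value in head]


# Toy example: keep 2 of 3 components (stands in for keeping 512 of 768).
full = [3.0, 4.0, 12.0]
short = truncate_and_normalize(full, 2)
print(short)
```

Recent Sentence Transformers releases also accept a `truncate_dim` argument when loading a model, which may make a manual step like this unnecessary; check the documentation for your installed version.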
+
+ <!--
+ ### Direct Usage (Transformers)
+
+ <details><summary>Click to see the direct usage in Transformers</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+
+ You can finetune this model on your own dataset.
+
+ <details><summary>Click to expand</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ ## Evaluation
+
+ ### Metrics
+
+ #### Semantic Similarity
+ * Dataset: `sts-dev-768`
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
+
+ | Metric              | Value      |
+ |:--------------------|:-----------|
+ | pearson_cosine      | 0.841      |
+ | **spearman_cosine** | **0.8423** |
+ | pearson_manhattan   | 0.8119     |
+ | spearman_manhattan  | 0.826      |
+ | pearson_euclidean   | 0.8139     |
+ | spearman_euclidean  | 0.8317     |
+ | pearson_dot         | 0.8372     |
+ | spearman_dot        | 0.839      |
+ | pearson_max         | 0.841      |
+ | spearman_max        | 0.8423     |
+
+ #### Semantic Similarity
+ * Dataset: `sts-dev-512`
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
+
+ | Metric              | Value      |
+ |:--------------------|:-----------|
+ | pearson_cosine      | 0.8408     |
+ | **spearman_cosine** | **0.8416** |
+ | pearson_manhattan   | 0.8115     |
+ | spearman_manhattan  | 0.8232     |
+ | pearson_euclidean   | 0.8126     |
+ | spearman_euclidean  | 0.8267     |
+ | pearson_dot         | 0.8357     |
+ | spearman_dot        | 0.8377     |
+ | pearson_max         | 0.8408     |
+ | spearman_max        | 0.8416     |
+
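Both tables highlight `spearman_cosine`, the Spearman rank correlation between gold similarity scores and the model's cosine similarities. The sketch below shows one way such a rank correlation can be computed in plain Python. It is not the `EmbeddingSimilarityEvaluator` code; the helper names and the five toy score pairs are made up for illustration.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared_rank
        pos = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation is the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical gold scores and model cosine similarities for five sentence pairs.
gold = [0.0, 1.0, 2.5, 4.0, 5.0]
predicted = [0.10, 0.35, 0.30, 0.80, 0.95]
rho = spearman(gold, predicted)
print(round(rho, 4))
```

Only the ordering of the scores matters here, which is why a rank correlation suits comparing cosine similarities against human scores on a different scale.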
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Dataset
+
+ #### Unnamed Dataset
+
+
+ * Size: 2,279,719 training samples
+ * Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
+ * Approximate statistics based on the first 1000 samples:
+   |         | anchor | positive | negative |
+   |:--------|:-------|:---------|:---------|
+   | type    | string | string | string |
+   | details | <ul><li>min: 4 tokens</li><li>mean: 19.51 tokens</li><li>max: 139 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 12.47 tokens</li><li>max: 59 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 12.13 tokens</li><li>max: 72 tokens</li></ul> |
+ * Samples:
+   | anchor | positive | negative |
+   |:-------|:---------|:---------|
+   | <code>كيف أصنع صاروخاً؟</code> | <code>كيف أصنع صاروخاً صناعياً؟</code> | <code>كيف أصنع أول روبوت لي؟</code> |
+   | <code>فتاة شابة تجلس على طاولة مع وعاء على رأسها</code> | <code>فتاة صغيرة لديها وعاء على رأسها</code> | <code>رجل يأكل الحبوب في سيارته</code> |
+   | <code>كيف يمكنني الانضمام إلى الجيش الهندي بعد البكالوريوس؟</code> | <code>كيف تنضم للجيش الهندي بعد الهندسة؟</code> | <code>كيف لي أن أعرف ماذا أريد أن أفعل في حياتي؟</code> |
+ * Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
+   ```json
+   {
+       "loss": "MultipleNegativesRankingLoss",
+       "matryoshka_dims": [
+           768,
+           512
+       ],
+       "matryoshka_weights": [
+           1,
+           1
+       ],
+       "n_dims_per_step": -1
+   }
+   ```
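The parameters above wrap `MultipleNegativesRankingLoss` in `MatryoshkaLoss` with dimensions 768 and 512 and equal weights. As a rough sketch of the idea, not the library code, the block below applies an in-batch ranking loss to each truncated prefix of the embeddings and sums the weighted results. The helper names, the similarity scale of 20, and the toy 4-dimensional vectors are assumptions, and the dataset's hard `negative` column is left out to keep the example short.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity, returning 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def in_batch_ranking_loss(anchors: List[List[float]], positives: List[List[float]],
                          scale: float = 20.0) -> float:
    """Cross-entropy where positive i is the correct match for anchor i
    and every other positive in the batch acts as a negative."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, candidate) for candidate in positives]
        peak = max(logits)  # subtract the max for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(v - peak) for v in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)


def matryoshka_loss(anchors: List[List[float]], positives: List[List[float]],
                    dims: Sequence[int] = (768, 512),
                    weights: Sequence[float] = (1.0, 1.0)) -> float:
    """Weighted sum of the ranking loss over truncated embedding prefixes."""
    return sum(
        weight * in_batch_ranking_loss([a[:dim] for a in anchors],
                                       [p[:dim] for p in positives])
        for dim, weight in zip(dims, weights)
    )


# Toy batch: 4-dimensional vectors with prefixes of 4 and 2 instead of 768 and 512.
anchors = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
aligned = matryoshka_loss(anchors, anchors, dims=(4, 2))
shuffled = matryoshka_loss(anchors, anchors[::-1], dims=(4, 2))
print(aligned < shuffled)
```

When each anchor's positive is its best match at every prefix length, the loss is close to zero; pairing anchors with the wrong positives raises it at both lengths.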
+
+ ### Evaluation Dataset
+
+ #### Unnamed Dataset
+
+
+ * Size: 600 evaluation samples
+ * Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
+ * Approximate statistics based on the first 600 samples:
+   |         | anchor | positive | negative |
+   |:--------|:-------|:---------|:---------|
+   | type    | string | string | string |
+   | details | <ul><li>min: 4 tokens</li><li>mean: 19.5 tokens</li><li>max: 146 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 12.67 tokens</li><li>max: 43 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 12.15 tokens</li><li>max: 41 tokens</li></ul> |
+ * Samples:
+   | anchor | positive | negative |
+   |:-------|:---------|:---------|
+   | <code>And this explanation represents great progress.</code> | <code>وهذا التفسير يمثل تقدماً عظيماً</code> | <code>وأظهرت هذا الإتجاه المذهل.</code> |
+   | <code>ثلاثة رجال يلعبون كرة السلة</code> | <code>ثلاثة رجال يلعبون لعبة كرة السلة</code> | <code>رجلين يرتديان ملابس غريبة يقفزان على ملعب كرة السلة</code> |
+   | <code>الرجل جالس</code> | <code>رجل يرتدي قميصاً أحمر يعزف الطبول.</code> | <code>رجل في قميص رمادي يقف.</code> |
+ * Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
+   ```json
+   {
+       "loss": "MultipleNegativesRankingLoss",
+       "matryoshka_dims": [
+           768,
+           512
+       ],
+       "matryoshka_weights": [
+           1,
+           1
+       ],
+       "n_dims_per_step": -1
+   }
+   ```
+
+ ### Training Hyperparameters
+ #### Non-Default Hyperparameters
+
+ - `eval_strategy`: steps
+ - `per_device_train_batch_size`: 50
+ - `per_device_eval_batch_size`: 10
+ - `learning_rate`: 1e-05
+ - `bf16`: True
+ - `batch_sampler`: no_duplicates
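The card does not say how many devices were used, but the per-device batch size can be cross-checked against the Training Logs, where step 11400 is reported at epoch 1.0002. The arithmetic below is a back-of-the-envelope sketch under stated assumptions: the epoch column is rounded, `gradient_accumulation_steps` is 1 and `num_train_epochs` is 3 as listed under All Hyperparameters, and the `no_duplicates` sampler may skip some samples, so the results are approximate.

```python
# Values taken from the card.
train_samples = 2_279_719
per_device_batch = 50
num_train_epochs = 3
logged_step, logged_epoch = 11_400, 1.0002  # one row of the training log

# Steps in one epoch, inferred from the logged step/epoch ratio.
steps_per_epoch = logged_step / logged_epoch

# Samples consumed per optimizer step across all devices.
global_batch = train_samples / steps_per_epoch

# Inferred, not stated in the card: number of devices implied by the ratio.
implied_devices = round(global_batch / per_device_batch)

# With incomplete batches dropped, each epoch has this many full global batches.
full_batches_per_epoch = train_samples // (per_device_batch * implied_devices)
planned_steps = num_train_epochs * full_batches_per_epoch

print(implied_devices, full_batches_per_epoch, planned_steps)
```

Under these assumptions the log is consistent with a global batch of roughly 200 samples per step, which is an inference from the numbers rather than a documented setting.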
351
+
352
+ #### All Hyperparameters
353
+ <details><summary>Click to expand</summary>
354
+
355
+ - `overwrite_output_dir`: False
356
+ - `do_predict`: False
357
+ - `eval_strategy`: steps
358
+ - `prediction_loss_only`: True
359
+ - `per_device_train_batch_size`: 50
360
+ - `per_device_eval_batch_size`: 10
361
+ - `per_gpu_train_batch_size`: None
362
+ - `per_gpu_eval_batch_size`: None
363
+ - `gradient_accumulation_steps`: 1
364
+ - `eval_accumulation_steps`: None
365
+ - `torch_empty_cache_steps`: None
366
+ - `learning_rate`: 1e-05
367
+ - `weight_decay`: 0.0
368
+ - `adam_beta1`: 0.9
369
+ - `adam_beta2`: 0.999
370
+ - `adam_epsilon`: 1e-08
371
+ - `max_grad_norm`: 1.0
372
+ - `num_train_epochs`: 3
373
+ - `max_steps`: -1
374
+ - `lr_scheduler_type`: linear
375
+ - `lr_scheduler_kwargs`: {}
376
+ - `warmup_ratio`: 0.0
377
+ - `warmup_steps`: 0
378
+ - `log_level`: passive
379
+ - `log_level_replica`: warning
380
+ - `log_on_each_node`: True
381
+ - `logging_nan_inf_filter`: True
382
+ - `save_safetensors`: True
383
+ - `save_on_each_node`: False
384
+ - `save_only_model`: False
385
+ - `restore_callback_states_from_checkpoint`: False
386
+ - `no_cuda`: False
387
+ - `use_cpu`: False
388
+ - `use_mps_device`: False
389
+ - `seed`: 42
390
+ - `data_seed`: None
391
+ - `jit_mode_eval`: False
392
+ - `use_ipex`: False
393
+ - `bf16`: True
394
+ - `fp16`: False
395
+ - `fp16_opt_level`: O1
396
+ - `half_precision_backend`: auto
397
+ - `bf16_full_eval`: False
398
+ - `fp16_full_eval`: False
399
+ - `tf32`: None
400
+ - `local_rank`: 0
401
+ - `ddp_backend`: None
402
+ - `tpu_num_cores`: None
403
+ - `tpu_metrics_debug`: False
404
+ - `debug`: []
405
+ - `dataloader_drop_last`: True
406
+ - `dataloader_num_workers`: 0
407
+ - `dataloader_prefetch_factor`: None
408
+ - `past_index`: -1
409
+ - `disable_tqdm`: False
410
+ - `remove_unused_columns`: True
411
+ - `label_names`: None
412
+ - `load_best_model_at_end`: False
413
+ - `ignore_data_skip`: False
414
+ - `fsdp`: []
415
+ - `fsdp_min_num_params`: 0
416
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
417
+ - `fsdp_transformer_layer_cls_to_wrap`: None
418
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
419
+ - `deepspeed`: None
420
+ - `label_smoothing_factor`: 0.0
421
+ - `optim`: adamw_torch
422
+ - `optim_args`: None
423
+ - `adafactor`: False
424
+ - `group_by_length`: False
425
+ - `length_column_name`: length
426
+ - `ddp_find_unused_parameters`: None
427
+ - `ddp_bucket_cap_mb`: None
428
+ - `ddp_broadcast_buffers`: False
429
+ - `dataloader_pin_memory`: True
430
+ - `dataloader_persistent_workers`: False
431
+ - `skip_memory_metrics`: True
432
+ - `use_legacy_prediction_loop`: False
433
+ - `push_to_hub`: False
434
+ - `resume_from_checkpoint`: None
435
+ - `hub_model_id`: None
436
+ - `hub_strategy`: every_save
437
+ - `hub_private_repo`: False
438
+ - `hub_always_push`: False
439
+ - `gradient_checkpointing`: False
440
+ - `gradient_checkpointing_kwargs`: None
441
+ - `include_inputs_for_metrics`: False
442
+ - `eval_do_concat_batches`: True
443
+ - `fp16_backend`: auto
444
+ - `push_to_hub_model_id`: None
445
+ - `push_to_hub_organization`: None
446
+ - `mp_parameters`:
447
+ - `auto_find_batch_size`: False
448
+ - `full_determinism`: False
449
+ - `torchdynamo`: None
450
+ - `ray_scope`: last
451
+ - `ddp_timeout`: 1800
452
+ - `torch_compile`: False
453
+ - `torch_compile_backend`: None
454
+ - `torch_compile_mode`: None
455
+ - `dispatch_batches`: None
456
+ - `split_batches`: None
457
+ - `include_tokens_per_second`: False
458
+ - `include_num_input_tokens_seen`: False
459
+ - `neftune_noise_alpha`: None
460
+ - `optim_target_modules`: None
461
+ - `batch_eval_metrics`: False
462
+ - `eval_on_start`: False
463
+ - `use_liger_kernel`: False
464
+ - `eval_use_gather_object`: False
465
+ - `batch_sampler`: no_duplicates
466
+ - `multi_dataset_batch_sampler`: proportional
467
+
468
+ </details>
469
+
470
+ ### Training Logs
471
+ <details><summary>Click to expand</summary>
472
+
473
+ | Epoch | Step | Training Loss | Validation Loss | sts-dev-768_spearman_cosine | sts-dev-512_spearman_cosine |
474
+ |:------:|:-----:|:-------------:|:---------------:|:---------------------------:|:---------------------------:|
475
+ | 0.0044 | 50 | - | 0.7749 | 0.7784 | 0.7748 |
476
+ | 0.0088 | 100 | - | 0.6231 | 0.7854 | 0.7809 |
477
+ | 0.0132 | 150 | - | 0.5326 | 0.8028 | 0.7992 |
478
+ | 0.0175 | 200 | - | 0.4880 | 0.8103 | 0.8047 |
479
+ | 0.0219 | 250 | 1.1802 | 0.4398 | 0.8084 | 0.8043 |
480
+ | 0.0263 | 300 | - | 0.4203 | 0.8108 | 0.8058 |
481
+ | 0.0307 | 350 | - | 0.3880 | 0.8134 | 0.8075 |
482
+ | 0.0351 | 400 | - | 0.3998 | 0.8180 | 0.8145 |
483
+ | 0.0395 | 450 | - | 0.3840 | 0.8154 | 0.8114 |
484
+ | 0.0439 | 500 | 0.7483 | 0.3804 | 0.8105 | 0.8056 |
485
+ | 0.0483 | 550 | - | 0.3695 | 0.8147 | 0.8103 |
486
+ | 0.0526 | 600 | - | 0.3649 | 0.8145 | 0.8101 |
487
+ | 0.0570 | 650 | - | 0.3494 | 0.8192 | 0.8157 |
488
+ | 0.0614 | 700 | - | 0.3437 | 0.8159 | 0.8106 |
489
+ | 0.0658 | 750 | 0.6561 | 0.3302 | 0.8158 | 0.8104 |
490
+ | 0.0702 | 800 | - | 0.3359 | 0.8204 | 0.8174 |
491
+ | 0.0746 | 850 | - | 0.3446 | 0.8119 | 0.8094 |
492
+ | 0.0790 | 900 | - | 0.3419 | 0.8265 | 0.8252 |
493
+ | 0.0833 | 950 | - | 0.3197 | 0.8177 | 0.8141 |
494
+ | 0.0877 | 1000 | 0.6178 | 0.3250 | 0.8213 | 0.8185 |
495
+ | 0.0921 | 1050 | - | 0.3017 | 0.8161 | 0.8127 |
496
+ | 0.0965 | 1100 | - | 0.3058 | 0.8232 | 0.8180 |
497
+ | 0.1009 | 1150 | - | 0.3066 | 0.8236 | 0.8193 |
498
+ | 0.1053 | 1200 | - | 0.2924 | 0.8275 | 0.8237 |
499
+ | 0.1097 | 1250 | 0.5633 | 0.3096 | 0.8206 | 0.8173 |
500
+ | 0.1141 | 1300 | - | 0.3009 | 0.8299 | 0.8277 |
501
+ | 0.1184 | 1350 | - | 0.3067 | 0.8158 | 0.8111 |
502
+ | 0.1228 | 1400 | - | 0.2898 | 0.8215 | 0.8180 |
503
+ | 0.1272 | 1450 | - | 0.2810 | 0.8272 | 0.8261 |
504
+ | 0.1316 | 1500 | 0.5337 | 0.2810 | 0.8228 | 0.8187 |
505
+ | 0.1360 | 1550 | - | 0.2772 | 0.8167 | 0.8139 |
506
+ | 0.1404 | 1600 | - | 0.2772 | 0.8228 | 0.8194 |
507
+ | 0.1448 | 1650 | - | 0.2751 | 0.8193 | 0.8153 |
508
+ | 0.1491 | 1700 | - | 0.2579 | 0.8182 | 0.8147 |
509
+ | 0.1535 | 1750 | 0.5154 | 0.2542 | 0.8199 | 0.8166 |
510
+ | 0.1579 | 1800 | - | 0.2607 | 0.8243 | 0.8224 |
511
+ | 0.1623 | 1850 | - | 0.2595 | 0.8280 | 0.8254 |
512
+ | 0.1667 | 1900 | - | 0.2612 | 0.8272 | 0.8255 |
513
+ | 0.1711 | 1950 | - | 0.2644 | 0.8273 | 0.8242 |
514
+ | 0.1755 | 2000 | 0.4838 | 0.2618 | 0.8276 | 0.8246 |
515
+ | 0.1799 | 2050 | - | 0.2553 | 0.8219 | 0.8200 |
516
+ | 0.1842 | 2100 | - | 0.2581 | 0.8232 | 0.8217 |
517
+ | 0.1886 | 2150 | - | 0.2620 | 0.8254 | 0.8232 |
518
+ | 0.1930 | 2200 | - | 0.2627 | 0.8235 | 0.8193 |
519
+ | 0.1974 | 2250 | 0.486 | 0.2597 | 0.8170 | 0.8142 |
520
+ | 0.2018 | 2300 | - | 0.2605 | 0.8261 | 0.8231 |
521
+ | 0.2062 | 2350 | - | 0.2584 | 0.8252 | 0.8222 |
522
+ | 0.2106 | 2400 | - | 0.2663 | 0.8247 | 0.8228 |
523
+ | 0.2149 | 2450 | - | 0.2527 | 0.8285 | 0.8280 |
524
+ | 0.2193 | 2500 | 0.4523 | 0.2487 | 0.8291 | 0.8270 |
525
+ | 0.2237 | 2550 | - | 0.2524 | 0.8257 | 0.8244 |
526
+ | 0.2281 | 2600 | - | 0.2513 | 0.8228 | 0.8210 |
527
+ | 0.2325 | 2650 | - | 0.2531 | 0.8287 | 0.8265 |
528
+ | 0.2369 | 2700 | - | 0.2510 | 0.8224 | 0.8198 |
529
+ | 0.2413 | 2750 | 0.4522 | 0.2523 | 0.8275 | 0.8260 |
530
+ | 0.2457 | 2800 | - | 0.2563 | 0.8301 | 0.8278 |
531
+ | 0.2500 | 2850 | - | 0.2531 | 0.8242 | 0.8242 |
532
+ | 0.2544 | 2900 | - | 0.2527 | 0.8268 | 0.8268 |
533
+ | 0.2588 | 2950 | - | 0.2465 | 0.8228 | 0.8223 |
534
+ | 0.2632 | 3000 | 0.4472 | 0.2422 | 0.8263 | 0.8237 |
535
+ | 0.2676 | 3050 | - | 0.2484 | 0.8223 | 0.8195 |
536
+ | 0.2720 | 3100 | - | 0.2469 | 0.8209 | 0.8206 |
537
+ | 0.2764 | 3150 | - | 0.2419 | 0.8283 | 0.8281 |
538
+ | 0.2808 | 3200 | - | 0.2370 | 0.8303 | 0.8286 |
539
+ | 0.2851 | 3250 | 0.4499 | 0.2374 | 0.8293 | 0.8275 |
540
+ | 0.2895 | 3300 | - | 0.2340 | 0.8255 | 0.8255 |
541
+ | 0.2939 | 3350 | - | 0.2461 | 0.8277 | 0.8292 |
542
+ | 0.2983 | 3400 | - | 0.2421 | 0.8320 | 0.8307 |
543
+ | 0.3027 | 3450 | - | 0.2366 | 0.8286 | 0.8281 |
544
+ | 0.3071 | 3500 | 0.4305 | 0.2389 | 0.8312 | 0.8293 |
545
+ | 0.3115 | 3550 | - | 0.2360 | 0.8305 | 0.8310 |
546
+ | 0.3158 | 3600 | - | 0.2313 | 0.8271 | 0.8256 |
547
+ | 0.3202 | 3650 | - | 0.2182 | 0.8231 | 0.8197 |
548
+ | 0.3246 | 3700 | - | 0.2220 | 0.8274 | 0.8246 |
549
+ | 0.3290 | 3750 | 0.4221 | 0.2305 | 0.8301 | 0.8292 |
550
+ | 0.3334 | 3800 | - | 0.2244 | 0.8285 | 0.8265 |
551
+ | 0.3378 | 3850 | - | 0.2355 | 0.8349 | 0.8331 |
552
+ | 0.3422 | 3900 | - | 0.2256 | 0.8355 | 0.8330 |
553
+ | 0.3466 | 3950 | - | 0.2273 | 0.8330 | 0.8299 |
554
+ | 0.3509 | 4000 | 0.4203 | 0.2334 | 0.8304 | 0.8275 |
555
+ | 0.3553 | 4050 | - | 0.2223 | 0.8323 | 0.8305 |
556
+ | 0.3597 | 4100 | - | 0.2314 | 0.8323 | 0.8299 |
557
+ | 0.3641 | 4150 | - | 0.2196 | 0.8272 | 0.8244 |
558
+ | 0.3685 | 4200 | - | 0.2275 | 0.8342 | 0.8353 |
559
+ | 0.3729 | 4250 | 0.4039 | 0.2209 | 0.8348 | 0.8333 |
560
+ | 0.3773 | 4300 | - | 0.2152 | 0.8314 | 0.8307 |
561
+ | 0.3816 | 4350 | - | 0.2115 | 0.8353 | 0.8325 |
562
+ | 0.3860 | 4400 | - | 0.2195 | 0.8347 | 0.8310 |
563
+ | 0.3904 | 4450 | - | 0.2110 | 0.8293 | 0.8264 |
564
+ | 0.3948 | 4500 | 0.4065 | 0.2115 | 0.8321 | 0.8293 |
565
+ | 0.3992 | 4550 | - | 0.2139 | 0.8312 | 0.8286 |
566
+ | 0.4036 | 4600 | - | 0.2145 | 0.8319 | 0.8285 |
567
+ | 0.4080 | 4650 | - | 0.2127 | 0.8281 | 0.8255 |
568
+ | 0.4124 | 4700 | - | 0.2122 | 0.8292 | 0.8268 |
569
+ | 0.4167 | 4750 | 0.4019 | 0.2160 | 0.8354 | 0.8329 |
570
+ | 0.4211 | 4800 | - | 0.2069 | 0.8296 | 0.8258 |
571
+ | 0.4255 | 4850 | - | 0.2106 | 0.8362 | 0.8335 |
572
+ | 0.4299 | 4900 | - | 0.2130 | 0.8345 | 0.8321 |
573
+ | 0.4343 | 4950 | - | 0.2080 | 0.8307 | 0.8277 |
574
+ | 0.4387 | 5000 | 0.3941 | 0.2184 | 0.8394 | 0.8370 |
575
+ | 0.4431 | 5050 | - | 0.2061 | 0.8334 | 0.8325 |
576
+ | 0.4474 | 5100 | - | 0.2092 | 0.8318 | 0.8307 |
577
+ | 0.4518 | 5150 | - | 0.2108 | 0.8319 | 0.8289 |
578
+ | 0.4562 | 5200 | - | 0.2046 | 0.8359 | 0.8337 |
579
+ | 0.4606 | 5250 | 0.3873 | 0.1990 | 0.8327 | 0.8305 |
580
+ | 0.4650 | 5300 | - | 0.2007 | 0.8332 | 0.8305 |
581
+ | 0.4694 | 5350 | - | 0.1989 | 0.8284 | 0.8247 |
582
+ | 0.4738 | 5400 | - | 0.2117 | 0.8363 | 0.8346 |
583
+ | 0.4782 | 5450 | - | 0.2036 | 0.8329 | 0.8296 |
584
+ | 0.4825 | 5500 | 0.3808 | 0.1999 | 0.8341 | 0.8295 |
585
+ | 0.4869 | 5550 | - | 0.1998 | 0.8336 | 0.8300 |
586
+ | 0.4913 | 5600 | - | 0.2040 | 0.8348 | 0.8331 |
587
+ | 0.4957 | 5650 | - | 0.2068 | 0.8367 | 0.8346 |
588
+ | 0.5001 | 5700 | - | 0.1947 | 0.8333 | 0.8305 |
589
+ | 0.5045 | 5750 | 0.3779 | 0.1969 | 0.8352 | 0.8329 |
590
+ | 0.5089 | 5800 | - | 0.2028 | 0.8372 | 0.8369 |
591
+ | 0.5132 | 5850 | - | 0.2029 | 0.8336 | 0.8319 |
592
+ | 0.5176 | 5900 | - | 0.2029 | 0.8317 | 0.8309 |
593
+ | 0.5220 | 5950 | - | 0.2059 | 0.8270 | 0.8270 |
594
+ | 0.5264 | 6000 | 0.3704 | 0.1997 | 0.8263 | 0.8236 |
595
+ | 0.5308 | 6050 | - | 0.2001 | 0.8280 | 0.8252 |
596
+ | 0.5352 | 6100 | - | 0.1985 | 0.8275 | 0.8241 |
597
+ | 0.5396 | 6150 | - | 0.1976 | 0.8281 | 0.8281 |
598
+ | 0.5440 | 6200 | - | 0.1987 | 0.8270 | 0.8247 |
599
+ | 0.5483 | 6250 | 0.3722 | 0.2045 | 0.8320 | 0.8303 |
600
+ | 0.5527 | 6300 | - | 0.2013 | 0.8292 | 0.8278 |
601
+ | 0.5571 | 6350 | - | 0.2007 | 0.8302 | 0.8279 |
602
+ | 0.5615 | 6400 | - | 0.1949 | 0.8297 | 0.8274 |
603
+ | 0.5659 | 6450 | - | 0.2037 | 0.8335 | 0.8313 |
604
+ | 0.5703 | 6500 | 0.3638 | 0.2060 | 0.8316 | 0.8280 |
605
+ | 0.5747 | 6550 | - | 0.2030 | 0.8372 | 0.8348 |
606
+ | 0.5790 | 6600 | - | 0.1982 | 0.8317 | 0.8295 |
607
+ | 0.5834 | 6650 | - | 0.2075 | 0.8324 | 0.8325 |
608
+ | 0.5878 | 6700 | - | 0.2014 | 0.8306 | 0.8284 |
609
+ | 0.5922 | 6750 | 0.3581 | 0.1983 | 0.8360 | 0.8344 |
610
+ | 0.5966 | 6800 | - | 0.2007 | 0.8337 | 0.8313 |
611
+ | 0.6010 | 6850 | - | 0.2003 | 0.8349 | 0.8338 |
612
+ | 0.6054 | 6900 | - | 0.2018 | 0.8313 | 0.8305 |
613
+ | 0.6098 | 6950 | - | 0.1978 | 0.8323 | 0.8307 |
614
+ | 0.6141 | 7000 | 0.3596 | 0.1991 | 0.8370 | 0.8340 |
615
+ | 0.6185 | 7050 | - | 0.1963 | 0.8330 | 0.8302 |
616
+ | 0.6229 | 7100 | - | 0.1918 | 0.8334 | 0.8320 |
617
+ | 0.6273 | 7150 | - | 0.2008 | 0.8338 | 0.8327 |
618
+ | 0.6317 | 7200 | - | 0.1973 | 0.8320 | 0.8295 |
619
+ | 0.6361 | 7250 | 0.3614 | 0.1891 | 0.8339 | 0.8322 |
620
+ | 0.6405 | 7300 | - | 0.1961 | 0.8355 | 0.8332 |
621
+ | 0.6448 | 7350 | - | 0.1910 | 0.8322 | 0.8304 |
622
+ | 0.6492 | 7400 | - | 0.1926 | 0.8343 | 0.8331 |
623
+ | 0.6536 | 7450 | - | 0.1935 | 0.8310 | 0.8292 |
624
+ | 0.6580 | 7500 | 0.3513 | 0.1969 | 0.8337 | 0.8346 |
625
+ | 0.6624 | 7550 | - | 0.1891 | 0.8331 | 0.8311 |
626
+ | 0.6668 | 7600 | - | 0.1932 | 0.8369 | 0.8341 |
627
+ | 0.6712 | 7650 | - | 0.2041 | 0.8370 | 0.8357 |
628
+ | 0.6756 | 7700 | - | 0.1946 | 0.8335 | 0.8314 |
629
+ | 0.6799 | 7750 | 0.3426 | 0.1955 | 0.8364 | 0.8330 |
630
+ | 0.6843 | 7800 | - | 0.1940 | 0.8316 | 0.8307 |
631
+ | 0.6887 | 7850 | - | 0.1893 | 0.8323 | 0.8322 |
632
+ | 0.6931 | 7900 | - | 0.1839 | 0.8296 | 0.8286 |
633
+ | 0.6975 | 7950 | - | 0.1895 | 0.8321 | 0.8296 |
634
+ | 0.7019 | 8000 | 0.3406 | 0.1901 | 0.8277 | 0.8263 |
635
+ | 0.7063 | 8050 | - | 0.1835 | 0.8331 | 0.8284 |
636
+ | 0.7107 | 8100 | - | 0.1847 | 0.8359 | 0.8342 |
637
+ | 0.7150 | 8150 | - | 0.1892 | 0.8362 | 0.8348 |
638
+ | 0.7194 | 8200 | - | 0.1775 | 0.8339 | 0.8305 |
639
+ | 0.7238 | 8250 | 0.3357 | 0.1921 | 0.8359 | 0.8340 |
640
+ | 0.7282 | 8300 | - | 0.1881 | 0.8369 | 0.8344 |
641
+ | 0.7326 | 8350 | - | 0.1891 | 0.8371 | 0.8363 |
642
+ | 0.7370 | 8400 | - | 0.1880 | 0.8394 | 0.8364 |
643
+ | 0.7414 | 8450 | - | 0.1892 | 0.8348 | 0.8306 |
644
+ | 0.7457 | 8500 | 0.327 | 0.1868 | 0.8388 | 0.8353 |
645
+ | 0.7501 | 8550 | - | 0.1815 | 0.8378 | 0.8352 |
646
+ | 0.7545 | 8600 | - | 0.1877 | 0.8398 | 0.8370 |
647
+ | 0.7589 | 8650 | - | 0.1878 | 0.8392 | 0.8378 |
648
+ | 0.7633 | 8700 | - | 0.1778 | 0.8330 | 0.8304 |
649
+ | 0.7677 | 8750 | 0.3288 | 0.1791 | 0.8390 | 0.8360 |
650
+ | 0.7721 | 8800 | - | 0.1803 | 0.8298 | 0.8270 |
651
+ | 0.7765 | 8850 | - | 0.1803 | 0.8358 | 0.8323 |
652
+ | 0.7808 | 8900 | - | 0.1832 | 0.8330 | 0.8322 |
653
+ | 0.7852 | 8950 | - | 0.1767 | 0.8316 | 0.8286 |
654
+ | 0.7896 | 9000 | 0.329 | 0.1808 | 0.8283 | 0.8254 |
655
+ | 0.7940 | 9050 | - | 0.1842 | 0.8331 | 0.8293 |
656
+ | 0.7984 | 9100 | - | 0.1750 | 0.8304 | 0.8275 |
657
+ | 0.8028 | 9150 | - | 0.1779 | 0.8299 | 0.8270 |
658
+ | 0.8072 | 9200 | - | 0.1799 | 0.8332 | 0.8332 |
659
+ | 0.8115 | 9250 | 0.3283 | 0.1872 | 0.8399 | 0.8371 |
660
+ | 0.8159 | 9300 | - | 0.1842 | 0.8364 | 0.8352 |
661
+ | 0.8203 | 9350 | - | 0.1785 | 0.8415 | 0.8382 |
662
+ | 0.8247 | 9400 | - | 0.1822 | 0.8432 | 0.8407 |
663
+ | 0.8291 | 9450 | - | 0.1745 | 0.8380 | 0.8364 |
664
+ | 0.8335 | 9500 | 0.3271 | 0.1745 | 0.8374 | 0.8352 |
665
+ | 0.8379 | 9550 | - | 0.1746 | 0.8363 | 0.8332 |
666
+ | 0.8423 | 9600 | - | 0.1776 | 0.8391 | 0.8374 |
667
+ | 0.8466 | 9650 | - | 0.1760 | 0.8379 | 0.8353 |
668
+ | 0.8510 | 9700 | - | 0.1806 | 0.8360 | 0.8335 |
669
+ | 0.8554 | 9750 | 0.3309 | 0.1822 | 0.8368 | 0.8337 |
670
+ | 0.8598 | 9800 | - | 0.1765 | 0.8366 | 0.8336 |
671
+ | 0.8642 | 9850 | - | 0.1766 | 0.8353 | 0.8323 |
672
+ | 0.8686 | 9900 | - | 0.1698 | 0.8353 | 0.8315 |
673
+ | 0.8730 | 9950 | - | 0.1715 | 0.8378 | 0.8338 |
674
+ | 0.8773 | 10000 | 0.318 | 0.1782 | 0.8396 | 0.8357 |
675
+ | 0.8817 | 10050 | - | 0.1727 | 0.8382 | 0.8368 |
676
+ | 0.8861 | 10100 | - | 0.1740 | 0.8356 | 0.8330 |
677
+ | 0.8905 | 10150 | - | 0.1723 | 0.8347 | 0.8319 |
678
+ | 0.8949 | 10200 | - | 0.1656 | 0.8336 | 0.8314 |
679
+ | 0.8993 | 10250 | 0.3284 | 0.1742 | 0.8288 | 0.8264 |
680
+ | 0.9037 | 10300 | - | 0.1679 | 0.8315 | 0.8296 |
681
+ | 0.9081 | 10350 | - | 0.1694 | 0.8325 | 0.8296 |
682
+ | 0.9124 | 10400 | - | 0.1723 | 0.8319 | 0.8305 |
683
+ | 0.9168 | 10450 | - | 0.1638 | 0.8340 | 0.8310 |
684
+ | 0.9212 | 10500 | 0.313 | 0.1730 | 0.8371 | 0.8368 |
685
+ | 0.9256 | 10550 | - | 0.1639 | 0.8351 | 0.8327 |
686
+ | 0.9300 | 10600 | - | 0.1634 | 0.8379 | 0.8350 |
687
+ | 0.9344 | 10650 | - | 0.1745 | 0.8353 | 0.8340 |
688
+ | 0.9388 | 10700 | - | 0.1731 | 0.8349 | 0.8346 |
689
+ | 0.9431 | 10750 | 0.3145 | 0.1668 | 0.8333 | 0.8314 |
690
+ | 0.9475 | 10800 | - | 0.1653 | 0.8351 | 0.8338 |
691
+ | 0.9519 | 10850 | - | 0.1655 | 0.8401 | 0.8390 |
692
+ | 0.9563 | 10900 | - | 0.1708 | 0.8376 | 0.8360 |
693
+ | 0.9607 | 10950 | - | 0.1740 | 0.8382 | 0.8364 |
694
+ | 0.9651 | 11000 | 0.3002 | 0.1714 | 0.8401 | 0.8382 |
695
+ | 0.9695 | 11050 | - | 0.1647 | 0.8411 | 0.8393 |
696
+ | 0.9739 | 11100 | - | 0.1701 | 0.8418 | 0.8396 |
697
+ | 0.9782 | 11150 | - | 0.1665 | 0.8394 | 0.8379 |
698
+ | 0.9826 | 11200 | - | 0.1652 | 0.8377 | 0.8376 |
699
+ | 0.9870 | 11250 | 0.3094 | 0.1665 | 0.8408 | 0.8397 |
700
+ | 0.9914 | 11300 | - | 0.1689 | 0.8412 | 0.8393 |
701
+ | 0.9958 | 11350 | - | 0.1674 | 0.8400 | 0.8374 |
702
+ | 1.0002 | 11400 | - | 0.1694 | 0.8395 | 0.8376 |
703
+ | 1.0046 | 11450 | - | 0.1697 | 0.8434 | 0.8419 |
704
+ | 1.0089 | 11500 | 0.3004 | 0.1640 | 0.8399 | 0.8388 |
705
+ | 1.0133 | 11550 | - | 0.1731 | 0.8445 | 0.8426 |
706
+ | 1.0177 | 11600 | - | 0.1618 | 0.8430 | 0.8389 |
707
+ | 1.0221 | 11650 | - | 0.1646 | 0.8414 | 0.8377 |
708
+ | 1.0265 | 11700 | - | 0.1679 | 0.8435 | 0.8401 |
709
+ | 1.0309 | 11750 | 0.2984 | 0.1646 | 0.8413 | 0.8385 |
710
+ | 1.0353 | 11800 | - | 0.1797 | 0.8465 | 0.8432 |
711
+ | 1.0397 | 11850 | - | 0.1758 | 0.8393 | 0.8390 |
712
+ | 1.0440 | 11900 | - | 0.1690 | 0.8401 | 0.8379 |
713
+ | 1.0484 | 11950 | - | 0.1735 | 0.8423 | 0.8404 |
714
+ | 1.0528 | 12000 | 0.2896 | 0.1719 | 0.8384 | 0.8367 |
715
+ | 1.0572 | 12050 | - | 0.1759 | 0.8420 | 0.8403 |
716
+ | 1.0616 | 12100 | - | 0.1659 | 0.8360 | 0.8340 |
717
+ | 1.0660 | 12150 | - | 0.1645 | 0.8368 | 0.8362 |
718
+ | 1.0704 | 12200 | - | 0.1601 | 0.8380 | 0.8355 |
719
+ | 1.0747 | 12250 | 0.2954 | 0.1711 | 0.8406 | 0.8387 |
720
+ | 1.0791 | 12300 | - | 0.1691 | 0.8389 | 0.8370 |
721
+ | 1.0835 | 12350 | - | 0.1721 | 0.8397 | 0.8385 |
722
+ | 1.0879 | 12400 | - | 0.1689 | 0.8379 | 0.8351 |
723
+ | 1.0923 | 12450 | - | 0.1663 | 0.8424 | 0.8402 |
724
+ | 1.0967 | 12500 | 0.2864 | 0.1672 | 0.8418 | 0.8403 |
725
+ | 1.1011 | 12550 | - | 0.1689 | 0.8389 | 0.8386 |
726
+ | 1.1055 | 12600 | - | 0.1664 | 0.8410 | 0.8402 |
727
+ | 1.1098 | 12650 | - | 0.1685 | 0.8387 | 0.8376 |
728
+ | 1.1142 | 12700 | - | 0.1715 | 0.8419 | 0.8402 |
729
+ | 1.1186 | 12750 | 0.2745 | 0.1607 | 0.8373 | 0.8336 |
730
+ | 1.1230 | 12800 | - | 0.1620 | 0.8388 | 0.8379 |
731
+ | 1.1274 | 12850 | - | 0.1623 | 0.8417 | 0.8396 |
732
+ | 1.1318 | 12900 | - | 0.1589 | 0.8360 | 0.8342 |
733
+ | 1.1362 | 12950 | - | 0.1567 | 0.8300 | 0.8298 |
734
+ | 1.1406 | 13000 | 0.2768 | 0.1557 | 0.8406 | 0.8365 |
735
+ | 1.1449 | 13050 | - | 0.1581 | 0.8389 | 0.8363 |
736
+ | 1.1493 | 13100 | - | 0.1611 | 0.8399 | 0.8366 |
737
+ | 1.1537 | 13150 | - | 0.1583 | 0.8358 | 0.8348 |
738
+ | 1.1581 | 13200 | - | 0.1619 | 0.8405 | 0.8387 |
739
+ | 1.1625 | 13250 | 0.2737 | 0.1567 | 0.8373 | 0.8339 |
740
+ | 1.1669 | 13300 | - | 0.1642 | 0.8393 | 0.8374 |
741
+ | 1.1713 | 13350 | - | 0.1646 | 0.8404 | 0.8376 |
742
+ | 1.1756 | 13400 | - | 0.1601 | 0.8419 | 0.8402 |
743
+ | 1.1800 | 13450 | - | 0.1648 | 0.8412 | 0.8391 |
744
+ | 1.1844 | 13500 | 0.2627 | 0.1635 | 0.8403 | 0.8403 |
745
+ | 1.1888 | 13550 | - | 0.1662 | 0.8427 | 0.8407 |
746
+ | 1.1932 | 13600 | - | 0.1687 | 0.8381 | 0.8368 |
747
+ | 1.1976 | 13650 | - | 0.1693 | 0.8366 | 0.8365 |
748
+ | 1.2020 | 13700 | - | 0.1665 | 0.8410 | 0.8397 |
749
+ | 1.2064 | 13750 | 0.2738 | 0.1665 | 0.8373 | 0.8360 |
750
+ | 1.2107 | 13800 | - | 0.1667 | 0.8388 | 0.8389 |
751
+ | 1.2151 | 13850 | - | 0.1674 | 0.8455 | 0.8413 |
752
+ | 1.2195 | 13900 | - | 0.1704 | 0.8419 | 0.8382 |
753
+ | 1.2239 | 13950 | - | 0.1654 | 0.8417 | 0.8398 |
754
+ | 1.2283 | 14000 | 0.2563 | 0.1610 | 0.8414 | 0.8403 |
755
+ | 1.2327 | 14050 | - | 0.1625 | 0.8416 | 0.8380 |
756
+ | 1.2371 | 14100 | - | 0.1705 | 0.8411 | 0.8400 |
757
+ | 1.2414 | 14150 | - | 0.1628 | 0.8400 | 0.8384 |
758
+ | 1.2458 | 14200 | - | 0.1667 | 0.8448 | 0.8435 |
759
+ | 1.2502 | 14250 | 0.2693 | 0.1651 | 0.8406 | 0.8396 |
760
+ | 1.2546 | 14300 | - | 0.1673 | 0.8404 | 0.8388 |
761
+ | 1.2590 | 14350 | - | 0.1630 | 0.8392 | 0.8375 |
762
+ | 1.2634 | 14400 | - | 0.1633 | 0.8413 | 0.8403 |
763
+ | 1.2678 | 14450 | - | 0.1636 | 0.8412 | 0.8398 |
764
+ | 1.2722 | 14500 | 0.266 | 0.1613 | 0.8404 | 0.8379 |
765
+ | 1.2765 | 14550 | - | 0.1625 | 0.8392 | 0.8380 |
766
+ | 1.2809 | 14600 | - | 0.1634 | 0.8418 | 0.8397 |
767
+ | 1.2853 | 14650 | - | 0.1689 | 0.8426 | 0.8428 |
768
+ | 1.2897 | 14700 | - | 0.1617 | 0.8410 | 0.8405 |
769
+ | 1.2941 | 14750 | 0.2643 | 0.1661 | 0.8437 | 0.8417 |
770
+ | 1.2985 | 14800 | - | 0.1629 | 0.8409 | 0.8394 |
771
+ | 1.3029 | 14850 | - | 0.1584 | 0.8413 | 0.8387 |
772
+ | 1.3072 | 14900 | - | 0.1638 | 0.8446 | 0.8433 |
773
+ | 1.3116 | 14950 | - | 0.1644 | 0.8429 | 0.8426 |
774
+ | 1.3160 | 15000 | 0.2624 | 0.1570 | 0.8391 | 0.8386 |
775
+ | 1.3204 | 15050 | - | 0.1535 | 0.8367 | 0.8348 |
776
+ | 1.3248 | 15100 | - | 0.1591 | 0.8381 | 0.8367 |
777
+ | 1.3292 | 15150 | - | 0.1618 | 0.8421 | 0.8409 |
778
+ | 1.3336 | 15200 | - | 0.1554 | 0.8402 | 0.8381 |
779
+ | 1.3380 | 15250 | 0.2621 | 0.1595 | 0.8431 | 0.8427 |
780
+ | 1.3423 | 15300 | - | 0.1595 | 0.8447 | 0.8435 |
781
+ | 1.3467 | 15350 | - | 0.1585 | 0.8408 | 0.8394 |
782
+ | 1.3511 | 15400 | - | 0.1635 | 0.8403 | 0.8389 |
783
+ | 1.3555 | 15450 | - | 0.1569 | 0.8453 | 0.8444 |
784
+ | 1.3599 | 15500 | 0.2552 | 0.1605 | 0.8434 | 0.8412 |
785
+ | 1.3643 | 15550 | - | 0.1542 | 0.8420 | 0.8397 |
786
+ | 1.3687 | 15600 | - | 0.1622 | 0.8456 | 0.8451 |
787
+ | 1.3730 | 15650 | - | 0.1569 | 0.8466 | 0.8443 |
788
+ | 1.3774 | 15700 | - | 0.1550 | 0.8440 | 0.8416 |
789
+ | 1.3818 | 15750 | 0.2532 | 0.1569 | 0.8459 | 0.8445 |
790
+ | 1.3862 | 15800 | - | 0.1567 | 0.8462 | 0.8451 |
791
+ | 1.3906 | 15850 | - | 0.1504 | 0.8442 | 0.8422 |
792
+ | 1.3950 | 15900 | - | 0.1524 | 0.8437 | 0.8419 |
793
+ | 1.3994 | 15950 | - | 0.1491 | 0.8438 | 0.8413 |
794
+ | 1.4038 | 16000 | 0.265 | 0.1533 | 0.8428 | 0.8406 |
795
+ | 1.4081 | 16050 | - | 0.1492 | 0.8425 | 0.8399 |
796
+ | 1.4125 | 16100 | - | 0.1486 | 0.8410 | 0.8386 |
797
+ | 1.4169 | 16150 | - | 0.1530 | 0.8458 | 0.8433 |
798
+ | 1.4213 | 16200 | - | 0.1535 | 0.8437 | 0.8427 |
799
+ | 1.4257 | 16250 | 0.2512 | 0.1508 | 0.8453 | 0.8446 |
800
+ | 1.4301 | 16300 | - | 0.1540 | 0.8427 | 0.8411 |
801
+ | 1.4345 | 16350 | - | 0.1513 | 0.8414 | 0.8388 |
802
+ | 1.4388 | 16400 | - | 0.1553 | 0.8464 | 0.8461 |
803
+ | 1.4432 | 16450 | - | 0.1528 | 0.8434 | 0.8412 |
804
+ | 1.4476 | 16500 | 0.2545 | 0.1522 | 0.8419 | 0.8399 |
805
+ | 1.4520 | 16550 | - | 0.1521 | 0.8423 | 0.8416 |
806
+ | 1.4564 | 16600 | - | 0.1433 | 0.8427 | 0.8410 |
807
+ | 1.4608 | 16650 | - | 0.1500 | 0.8419 | 0.8401 |
808
+ | 1.4652 | 16700 | - | 0.1442 | 0.8425 | 0.8392 |
809
+ | 1.4696 | 16750 | 0.2549 | 0.1496 | 0.8397 | 0.8376 |
810
+ | 1.4739 | 16800 | - | 0.1556 | 0.8463 | 0.8435 |
811
+ | 1.4783 | 16850 | - | 0.1510 | 0.8458 | 0.8432 |
812
+ | 1.4827 | 16900 | - | 0.1469 | 0.8431 | 0.8423 |
813
+ | 1.4871 | 16950 | - | 0.1481 | 0.8456 | 0.8441 |
814
+ | 1.4915 | 17000 | 0.2522 | 0.1512 | 0.8456 | 0.8437 |
815
+ | 1.4959 | 17050 | - | 0.1471 | 0.8455 | 0.8430 |
816
+ | 1.5003 | 17100 | - | 0.1397 | 0.8409 | 0.8383 |
817
+ | 1.5046 | 17150 | - | 0.1414 | 0.8427 | 0.8404 |
818
+ | 1.5090 | 17200 | - | 0.1474 | 0.8432 | 0.8420 |
819
+ | 1.5134 | 17250 | 0.2489 | 0.1499 | 0.8414 | 0.8412 |
820
+ | 1.5178 | 17300 | - | 0.1442 | 0.8390 | 0.8376 |
821
+ | 1.5222 | 17350 | - | 0.1474 | 0.8373 | 0.8370 |
822
+ | 1.5266 | 17400 | - | 0.1435 | 0.8353 | 0.8352 |
823
+ | 1.5310 | 17450 | - | 0.1461 | 0.8380 | 0.8363 |
824
+ | 1.5354 | 17500 | 0.2493 | 0.1477 | 0.8362 | 0.8353 |
825
+ | 1.5397 | 17550 | - | 0.1503 | 0.8398 | 0.8385 |
826
+ | 1.5441 | 17600 | - | 0.1474 | 0.8372 | 0.8376 |
827
+ | 1.5485 | 17650 | - | 0.1499 | 0.8408 | 0.8390 |
828
+ | 1.5529 | 17700 | - | 0.1501 | 0.8386 | 0.8369 |
829
+ | 1.5573 | 17750 | 0.2499 | 0.1474 | 0.8367 | 0.8351 |
830
+ | 1.5617 | 17800 | - | 0.1406 | 0.8380 | 0.8362 |
831
+ | 1.5661 | 17850 | - | 0.1457 | 0.8399 | 0.8396 |
832
+ | 1.5705 | 17900 | - | 0.1486 | 0.8409 | 0.8399 |
833
+ | 1.5748 | 17950 | - | 0.1493 | 0.8407 | 0.8397 |
834
+ | 1.5792 | 18000 | 0.2419 | 0.1490 | 0.8400 | 0.8386 |
835
+ | 1.5836 | 18050 | - | 0.1496 | 0.8403 | 0.8388 |
836
+ | 1.5880 | 18100 | - | 0.1509 | 0.8422 | 0.8401 |
837
+ | 1.5924 | 18150 | - | 0.1513 | 0.8433 | 0.8420 |
838
+ | 1.5968 | 18200 | - | 0.1546 | 0.8420 | 0.8408 |
839
+ | 1.6012 | 18250 | 0.2458 | 0.1529 | 0.8414 | 0.8398 |
840
+ | 1.6055 | 18300 | - | 0.1580 | 0.8414 | 0.8391 |
841
+ | 1.6099 | 18350 | - | 0.1483 | 0.8389 | 0.8363 |
842
+ | 1.6143 | 18400 | - | 0.1501 | 0.8419 | 0.8405 |
843
+ | 1.6187 | 18450 | - | 0.1488 | 0.8413 | 0.8388 |
844
+ | 1.6231 | 18500 | 0.2532 | 0.1499 | 0.8418 | 0.8410 |
845
+ | 1.6275 | 18550 | - | 0.1520 | 0.8409 | 0.8408 |
846
+ | 1.6319 | 18600 | - | 0.1521 | 0.8407 | 0.8392 |
847
+ | 1.6363 | 18650 | - | 0.1459 | 0.8402 | 0.8382 |
848
+ | 1.6406 | 18700 | - | 0.1556 | 0.8433 | 0.8427 |
849
+ | 1.6450 | 18750 | 0.24 | 0.1501 | 0.8421 | 0.8410 |
850
+ | 1.6494 | 18800 | - | 0.1485 | 0.8439 | 0.8425 |
851
+ | 1.6538 | 18850 | - | 0.1526 | 0.8412 | 0.8406 |
852
+ | 1.6582 | 18900 | - | 0.1522 | 0.8422 | 0.8425 |
853
+ | 1.6626 | 18950 | - | 0.1456 | 0.8406 | 0.8390 |
854
+ | 1.6670 | 19000 | 0.2404 | 0.1483 | 0.8412 | 0.8408 |
855
+ | 1.6713 | 19050 | - | 0.1550 | 0.8424 | 0.8428 |
856
+ | 1.6757 | 19100 | - | 0.1493 | 0.8387 | 0.8384 |
857
+ | 1.6801 | 19150 | - | 0.1523 | 0.8391 | 0.8379 |
858
+ | 1.6845 | 19200 | - | 0.1512 | 0.8366 | 0.8343 |
859
+ | 1.6889 | 19250 | 0.2401 | 0.1506 | 0.8372 | 0.8348 |
860
+ | 1.6933 | 19300 | - | 0.1457 | 0.8375 | 0.8343 |
861
+ | 1.6977 | 19350 | - | 0.1500 | 0.8403 | 0.8379 |
862
+ | 1.7021 | 19400 | - | 0.1464 | 0.8380 | 0.8367 |
863
+ | 1.7064 | 19450 | - | 0.1485 | 0.8403 | 0.8397 |
864
+ | 1.7108 | 19500 | 0.2329 | 0.1469 | 0.8450 | 0.8417 |
865
+ | 1.7152 | 19550 | - | 0.1498 | 0.8418 | 0.8391 |
866
+ | 1.7196 | 19600 | - | 0.1427 | 0.8394 | 0.8384 |
867
+ | 1.7240 | 19650 | - | 0.1493 | 0.8399 | 0.8392 |
868
+ | 1.7284 | 19700 | - | 0.1487 | 0.8423 | 0.8406 |
869
+ | 1.7328 | 19750 | 0.2397 | 0.1464 | 0.8420 | 0.8398 |
870
+ | 1.7371 | 19800 | - | 0.1511 | 0.8433 | 0.8406 |
871
+ | 1.7415 | 19850 | - | 0.1502 | 0.8391 | 0.8365 |
872
+ | 1.7459 | 19900 | - | 0.1527 | 0.8404 | 0.8386 |
873
+ | 1.7503 | 19950 | - | 0.1498 | 0.8397 | 0.8390 |
874
+ | 1.7547 | 20000 | 0.2312 | 0.1505 | 0.8413 | 0.8389 |
875
+ | 1.7591 | 20050 | - | 0.1525 | 0.8411 | 0.8396 |
876
+ | 1.7635 | 20100 | - | 0.1491 | 0.8380 | 0.8370 |
877
+ | 1.7679 | 20150 | - | 0.1431 | 0.8395 | 0.8382 |
878
+ | 1.7722 | 20200 | - | 0.1451 | 0.8365 | 0.8352 |
879
+ | 1.7766 | 20250 | 0.2319 | 0.1485 | 0.8388 | 0.8366 |
880
+ | 1.7810 | 20300 | - | 0.1499 | 0.8376 | 0.8367 |
881
+ | 1.7854 | 20350 | - | 0.1448 | 0.8364 | 0.8349 |
882
+ | 1.7898 | 20400 | - | 0.1485 | 0.8346 | 0.8328 |
883
+ | 1.7942 | 20450 | - | 0.1470 | 0.8376 | 0.8364 |
884
+ | 1.7986 | 20500 | 0.2295 | 0.1471 | 0.8386 | 0.8363 |
885
+ | 1.8029 | 20550 | - | 0.1501 | 0.8351 | 0.8329 |
886
+ | 1.8073 | 20600 | - | 0.1494 | 0.8382 | 0.8364 |
887
+ | 1.8117 | 20650 | - | 0.1489 | 0.8405 | 0.8386 |
888
+ | 1.8161 | 20700 | - | 0.1465 | 0.8381 | 0.8372 |
889
+ | 1.8205 | 20750 | 0.2408 | 0.1435 | 0.8398 | 0.8390 |
890
+ | 1.8249 | 20800 | - | 0.1498 | 0.8449 | 0.8431 |
891
+ | 1.8293 | 20850 | - | 0.1487 | 0.8431 | 0.8416 |
892
+ | 1.8337 | 20900 | - | 0.1456 | 0.8419 | 0.8394 |
893
+ | 1.8380 | 20950 | - | 0.1437 | 0.8423 | 0.8408 |
894
+ | 1.8424 | 21000 | 0.2374 | 0.1408 | 0.8425 | 0.8414 |
895
+ | 1.8468 | 21050 | - | 0.1434 | 0.8434 | 0.8418 |
896
+ | 1.8512 | 21100 | - | 0.1486 | 0.8422 | 0.8403 |
897
+ | 1.8556 | 21150 | - | 0.1467 | 0.8429 | 0.8421 |
898
+ | 1.8600 | 21200 | - | 0.1458 | 0.8409 | 0.8402 |
899
+ | 1.8644 | 21250 | 0.2385 | 0.1449 | 0.8411 | 0.8395 |
900
+ | 1.8687 | 21300 | - | 0.1415 | 0.8401 | 0.8390 |
901
+ | 1.8731 | 21350 | - | 0.1462 | 0.8417 | 0.8403 |
902
+ | 1.8775 | 21400 | - | 0.1468 | 0.8423 | 0.8403 |
903
+ | 1.8819 | 21450 | - | 0.1459 | 0.8417 | 0.8394 |
904
+ | 1.8863 | 21500 | 0.2302 | 0.1466 | 0.8396 | 0.8372 |
905
+ | 1.8907 | 21550 | - | 0.1479 | 0.8391 | 0.8363 |
906
+ | 1.8951 | 21600 | - | 0.1407 | 0.8382 | 0.8365 |
907
+ | 1.8995 | 21650 | - | 0.1462 | 0.8377 | 0.8355 |
908
+ | 1.9038 | 21700 | - | 0.1438 | 0.8348 | 0.8343 |
909
+ | 1.9082 | 21750 | 0.2383 | 0.1451 | 0.8371 | 0.8363 |
910
+ | 1.9126 | 21800 | - | 0.1448 | 0.8375 | 0.8360 |
911
+ | 1.9170 | 21850 | - | 0.1389 | 0.8383 | 0.8377 |
912
+ | 1.9214 | 21900 | - | 0.1409 | 0.8379 | 0.8367 |
913
+ | 1.9258 | 21950 | - | 0.1397 | 0.8374 | 0.8352 |
914
+ | 1.9302 | 22000 | 0.2321 | 0.1408 | 0.8405 | 0.8385 |
915
+ | 1.9345 | 22050 | - | 0.1451 | 0.8381 | 0.8363 |
916
+ | 1.9389 | 22100 | - | 0.1467 | 0.8363 | 0.8353 |
917
+ | 1.9433 | 22150 | - | 0.1459 | 0.8352 | 0.8337 |
918
+ | 1.9477 | 22200 | - | 0.1431 | 0.8382 | 0.8355 |
919
+ | 1.9521 | 22250 | 0.2282 | 0.1457 | 0.8385 | 0.8371 |
920
+ | 1.9565 | 22300 | - | 0.1475 | 0.8364 | 0.8359 |
921
+ | 1.9609 | 22350 | - | 0.1483 | 0.8370 | 0.8336 |
922
+ | 1.9653 | 22400 | - | 0.1469 | 0.8406 | 0.8373 |
923
+ | 1.9696 | 22450 | - | 0.1430 | 0.8415 | 0.8391 |
924
+ | 1.9740 | 22500 | 0.2294 | 0.1471 | 0.8417 | 0.8399 |
925
+ | 1.9784 | 22550 | - | 0.1467 | 0.8414 | 0.8413 |
926
+ | 1.9828 | 22600 | - | 0.1464 | 0.8423 | 0.8410 |
927
+ | 1.9872 | 22650 | - | 0.1475 | 0.8431 | 0.8432 |
928
+ | 1.9916 | 22700 | - | 0.1476 | 0.8450 | 0.8442 |
929
+ | 1.9960 | 22750 | 0.2242 | 0.1463 | 0.8443 | 0.8418 |
930
+ | 2.0004 | 22800 | - | 0.1472 | 0.8422 | 0.8412 |
931
+ | 2.0047 | 22850 | - | 0.1506 | 0.8452 | 0.8435 |
932
+ | 2.0091 | 22900 | - | 0.1478 | 0.8463 | 0.8432 |
933
+ | 2.0135 | 22950 | - | 0.1536 | 0.8479 | 0.8454 |
934
+ | 2.0179 | 23000 | 0.2249 | 0.1487 | 0.8453 | 0.8422 |
935
+ | 2.0223 | 23050 | - | 0.1484 | 0.8430 | 0.8410 |
936
+ | 2.0267 | 23100 | - | 0.1524 | 0.8454 | 0.8440 |
937
+ | 2.0311 | 23150 | - | 0.1475 | 0.8450 | 0.8422 |
938
+ | 2.0354 | 23200 | - | 0.1533 | 0.8460 | 0.8435 |
939
+ | 2.0398 | 23250 | 0.2165 | 0.1551 | 0.8428 | 0.8410 |
940
+ | 2.0442 | 23300 | - | 0.1507 | 0.8425 | 0.8400 |
941
+ | 2.0486 | 23350 | - | 0.1517 | 0.8427 | 0.8410 |
942
+ | 2.0530 | 23400 | - | 0.1524 | 0.8404 | 0.8391 |
943
+ | 2.0574 | 23450 | - | 0.1515 | 0.8415 | 0.8408 |
944
+ | 2.0618 | 23500 | 0.2258 | 0.1500 | 0.8392 | 0.8384 |
945
+ | 2.0662 | 23550 | - | 0.1461 | 0.8387 | 0.8362 |
946
+ | 2.0705 | 23600 | - | 0.1429 | 0.8408 | 0.8378 |
947
+ | 2.0749 | 23650 | - | 0.1473 | 0.8410 | 0.8398 |
948
+ | 2.0793 | 23700 | - | 0.1474 | 0.8415 | 0.8402 |
949
+ | 2.0837 | 23750 | 0.2309 | 0.1479 | 0.8425 | 0.8408 |
950
+ | 2.0881 | 23800 | - | 0.1493 | 0.8427 | 0.8390 |
951
+ | 2.0925 | 23850 | - | 0.1469 | 0.8419 | 0.8394 |
952
+ | 2.0969 | 23900 | - | 0.1460 | 0.8426 | 0.8406 |
953
+ | 2.1012 | 23950 | - | 0.1502 | 0.8433 | 0.8418 |
954
+ | 2.1056 | 24000 | 0.2113 | 0.1462 | 0.8423 | 0.8406 |
955
+ | 2.1100 | 24050 | - | 0.1463 | 0.8429 | 0.8398 |
956
+ | 2.1144 | 24100 | - | 0.1459 | 0.8431 | 0.8400 |
957
+ | 2.1188 | 24150 | - | 0.1417 | 0.8403 | 0.8381 |
958
+ | 2.1232 | 24200 | - | 0.1396 | 0.8376 | 0.8371 |
959
+ | 2.1276 | 24250 | 0.2132 | 0.1419 | 0.8382 | 0.8380 |
960
+ | 2.1320 | 24300 | - | 0.1444 | 0.8378 | 0.8377 |
961
+ | 2.1363 | 24350 | - | 0.1399 | 0.8334 | 0.8342 |
962
+ | 2.1407 | 24400 | - | 0.1363 | 0.8382 | 0.8361 |
963
+ | 2.1451 | 24450 | - | 0.1379 | 0.8381 | 0.8369 |
964
+ | 2.1495 | 24500 | 0.2124 | 0.1421 | 0.8403 | 0.8391 |
965
+ | 2.1539 | 24550 | - | 0.1445 | 0.8399 | 0.8391 |
966
+ | 2.1583 | 24600 | - | 0.1452 | 0.8416 | 0.8401 |
967
+ | 2.1627 | 24650 | - | 0.1426 | 0.8411 | 0.8385 |
968
+ | 2.1670 | 24700 | - | 0.1447 | 0.8424 | 0.8407 |
969
+ | 2.1714 | 24750 | 0.2058 | 0.1460 | 0.8422 | 0.8413 |
970
+ | 2.1758 | 24800 | - | 0.1434 | 0.8422 | 0.8418 |
971
+ | 2.1802 | 24850 | - | 0.1443 | 0.8438 | 0.8416 |
972
+ | 2.1846 | 24900 | - | 0.1414 | 0.8422 | 0.8405 |
973
+ | 2.1890 | 24950 | - | 0.1437 | 0.8424 | 0.8407 |
974
+ | 2.1934 | 25000 | 0.2111 | 0.1466 | 0.8401 | 0.8394 |
975
+ | 2.1978 | 25050 | - | 0.1437 | 0.8390 | 0.8377 |
976
+ | 2.2021 | 25100 | - | 0.1446 | 0.8402 | 0.8394 |
977
+ | 2.2065 | 25150 | - | 0.1457 | 0.8394 | 0.8380 |
978
+ | 2.2109 | 25200 | - | 0.1432 | 0.8406 | 0.8380 |
979
+ | 2.2153 | 25250 | 0.2013 | 0.1464 | 0.8412 | 0.8397 |
980
+ | 2.2197 | 25300 | - | 0.1499 | 0.8419 | 0.8388 |
981
+ | 2.2241 | 25350 | - | 0.1466 | 0.8425 | 0.8402 |
982
+ | 2.2285 | 25400 | - | 0.1429 | 0.8424 | 0.8397 |
983
+ | 2.2328 | 25450 | - | 0.1433 | 0.8430 | 0.8404 |
984
+ | 2.2372 | 25500 | 0.2064 | 0.1472 | 0.8410 | 0.8404 |
985
+ | 2.2416 | 25550 | - | 0.1451 | 0.8406 | 0.8386 |
986
+ | 2.2460 | 25600 | - | 0.1480 | 0.8427 | 0.8419 |
987
+ | 2.2504 | 25650 | - | 0.1507 | 0.8409 | 0.8412 |
988
+ | 2.2548 | 25700 | - | 0.1488 | 0.8407 | 0.8398 |
989
+ | 2.2592 | 25750 | 0.2084 | 0.1476 | 0.8401 | 0.8392 |
990
+ | 2.2636 | 25800 | - | 0.1478 | 0.8403 | 0.8388 |
991
+ | 2.2679 | 25850 | - | 0.1509 | 0.8420 | 0.8417 |
992
+ | 2.2723 | 25900 | - | 0.1464 | 0.8417 | 0.8396 |
993
+ | 2.2767 | 25950 | - | 0.1469 | 0.8406 | 0.8388 |
994
+ | 2.2811 | 26000 | 0.2113 | 0.1470 | 0.8422 | 0.8404 |
995
+ | 2.2855 | 26050 | - | 0.1479 | 0.8414 | 0.8411 |
996
+ | 2.2899 | 26100 | - | 0.1488 | 0.8424 | 0.8418 |
997
+ | 2.2943 | 26150 | - | 0.1508 | 0.8429 | 0.8428 |
998
+ | 2.2986 | 26200 | - | 0.1507 | 0.8425 | 0.8422 |
999
+ | 2.3030 | 26250 | 0.2045 | 0.1496 | 0.8423 | 0.8416 |
+
+ </details>
+
+ ### Framework Versions
+ - Python: 3.10.14
+ - Sentence Transformers: 3.2.0
+ - Transformers: 4.45.2
+ - PyTorch: 2.3.1
+ - Accelerate: 1.0.1
+ - Datasets: 3.0.1
+ - Tokenizers: 0.20.1
+
+ ## Citation
+
+ ### BibTeX
+
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ #### MatryoshkaLoss
+ ```bibtex
+ @misc{kusupati2024matryoshka,
+     title={Matryoshka Representation Learning},
+     author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
+     year={2024},
+     eprint={2205.13147},
+     archivePrefix={arXiv},
+     primaryClass={cs.LG}
+ }
+ ```
+
+ #### MultipleNegativesRankingLoss
+ ```bibtex
+ @misc{henderson2017efficient,
+     title={Efficient Natural Language Response Suggestion for Smart Reply},
+     author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+     year={2017},
+     eprint={1705.00652},
+     archivePrefix={arXiv},
+     primaryClass={cs.CL}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
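Note on the MatryoshkaLoss citation in the README: Matryoshka training aims to keep the leading components of an embedding useful on their own, so a vector can be shortened and re-normalized before comparison. The sketch below illustrates that idea in plain Python. The 4-dimensional vectors and the truncation size of 2 are made up for illustration (the real model outputs 768 dimensions, and this section does not state which truncation sizes were trained); `truncate_and_normalize` and `cosine` are hypothetical helper names, not part of any library.

```python
import math


def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` components and rescale them to unit L2 norm."""
    if not 0 < dim <= len(embedding):
        raise ValueError("dim must be between 1 and the embedding length")
    head = embedding[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return list(head)  # nothing to rescale for an all-zero prefix
    return [x / norm for x in head]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy embeddings (illustrative values only, not model output).
full_a = [0.6, 0.8, 0.1, -0.2]
full_b = [0.5, 0.9, -0.3, 0.4]

small_a = truncate_and_normalize(full_a, 2)
small_b = truncate_and_normalize(full_b, 2)

print("full-size similarity:", cosine(full_a, full_b))
print("truncated similarity:", cosine(small_a, small_b))
```

In practice the Sentence Transformers library can do the truncation itself through the `truncate_dim` argument of `SentenceTransformer`, so a helper like this is mainly useful for embeddings that are already stored at full size.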
config.json ADDED
@@ -0,0 +1,25 @@
+ {
+   "_name_or_path": "/workspace/v3-matryoshka_aubmindlab-bert-base-arabertv02-2024-10-12_13-55-06/checkpoint-26250",
+   "architectures": [
+     "BertModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "torch_dtype": "float32",
+   "transformers_version": "4.45.2",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 64000
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.2.0",
+     "transformers": "4.45.2",
+     "pytorch": "2.3.1"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": null
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c15b891f76f7e12add432f68cf9e51c200ddd9179fa6435324737f81063eb5b4
+ size 540795752
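Note on the LFS pointer size: as a sanity check, the byte count can be compared with the parameter count implied by `config.json` in this commit. The sketch below assumes a standard BERT encoder layout (biases on every linear layer, two LayerNorms per block) and assumes the checkpoint was saved with the BERT pooler head; neither assumption is confirmed by the diff itself.

```python
# Values copied from config.json in this commit.
vocab_size = 64000
hidden = 768
intermediate = 3072
num_layers = 12
max_positions = 512
type_vocab = 2


def linear(n_in, n_out):
    """Parameters of a dense layer: weight matrix plus bias vector."""
    return n_in * n_out + n_out


layer_norm = 2 * hidden  # scale and shift vectors

# Word, position and token-type embeddings, followed by one LayerNorm.
embeddings = (vocab_size + max_positions + type_vocab) * hidden + layer_norm

# One encoder block: Q, K, V and output projections, then the feed-forward pair.
per_layer = (
    4 * linear(hidden, hidden)
    + layer_norm
    + linear(hidden, intermediate)
    + linear(intermediate, hidden)
    + layer_norm
)

pooler = linear(hidden, hidden)  # assumption: pooler head is included in the file

total_params = embeddings + num_layers * per_layer + pooler
weight_bytes = total_params * 4  # "torch_dtype": "float32" means 4 bytes each

lfs_size = 540795752  # from the model.safetensors pointer above
print("parameters:", total_params)
print("bytes not explained by weights:", lfs_size - weight_bytes)
```

Under these assumptions the estimate is about 135.2M parameters, and the weights account for all but roughly 22 KB of the file, which is a plausible size for the safetensors metadata header.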
modules.json ADDED
@@ -0,0 +1,14 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   }
+ ]
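Note on the module chain: `modules.json` runs the Transformer first and the Pooling module second, and `1_Pooling/config.json` earlier in this commit enables only `pooling_mode_mean_tokens`. A minimal sketch of what masked mean pooling computes for a single sentence is shown below. It uses plain Python lists and a hypothetical function name, `mean_pool`; the library's own Pooling module operates on batched tensors, so this is an illustration of the arithmetic rather than its implementation.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose attention mask entry is 1.

    token_embeddings: list of per-token vectors (lists of floats).
    attention_mask:   list of 0/1 flags, 0 marking padding positions.
    """
    kept = [vec for vec, flag in zip(token_embeddings, attention_mask) if flag == 1]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Three token positions, the last one being padding that must be ignored.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
sentence_embedding = mean_pool(tokens, mask)
print(sentence_embedding)
```

Masking matters because padded positions would otherwise pull the average toward whatever values the encoder emits for `[PAD]` tokens.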
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "cls_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "[MASK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "[PAD]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,93 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "1": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "2": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "3": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "4": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "5": {
+       "content": "[رابط]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": true,
+       "special": true
+     },
+     "6": {
+       "content": "[بريد]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": true,
+       "special": true
+     },
+     "7": {
+       "content": "[مستخدم]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": true,
+       "special": true
+     }
+   },
+   "clean_up_tokenization_spaces": false,
+   "cls_token": "[CLS]",
+   "do_basic_tokenize": true,
+   "do_lower_case": false,
+   "mask_token": "[MASK]",
+   "max_len": 512,
+   "max_length": 512,
+   "model_max_length": 512,
+   "never_split": [
+     "[بريد]",
+     "[مستخدم]",
+     "[رابط]"
+   ],
+   "pad_to_multiple_of": null,
+   "pad_token": "[PAD]",
+   "pad_token_type_id": 0,
+   "padding_side": "right",
+   "sep_token": "[SEP]",
+   "stride": 0,
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "truncation_side": "right",
+   "truncation_strategy": "longest_first",
+   "unk_token": "[UNK]"
+ }
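Note on the added tokens: ids 5 to 7 and the `never_split` list register three Arabic placeholder tokens, `[رابط]` (link), `[بريد]` (email) and `[مستخدم]` (user). Placeholders like these are normally produced by a text-cleaning step that runs before tokenization. The sketch below shows one way such a step could look. The regular expressions and the function name `replace_placeholders` are illustrative assumptions, not the preprocessing used to train this model, and they are deliberately simple.

```python
import re

# Placeholder strings copied from tokenizer_config.json in this commit.
URL_TOKEN = "[رابط]"
EMAIL_TOKEN = "[بريد]"
USER_TOKEN = "[مستخدم]"

# Deliberately simple patterns (assumptions made for this sketch).
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
URL_RE = re.compile(r"(?:https?://|www\.)\S+")
MENTION_RE = re.compile(r"(?<!\w)@\w+")


def replace_placeholders(text):
    """Swap emails, URLs and @mentions for the tokenizer's placeholder tokens."""
    # Emails go first so that their '@' is not mistaken for a mention.
    text = EMAIL_RE.sub(EMAIL_TOKEN, text)
    text = URL_RE.sub(URL_TOKEN, text)
    text = MENTION_RE.sub(USER_TOKEN, text)
    return text


cleaned = replace_placeholders(
    "contact me@example.com or @someone via https://example.com/page"
)
print(cleaned)
```

Because the three tokens are listed in `never_split`, the tokenizer should keep each placeholder as a single token instead of breaking it at the brackets.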
vocab.txt ADDED
The diff for this file is too large to render. See raw diff