martinhillebrandtd commited on
Commit
2f8ac1d
·
1 Parent(s): b7318ba
README.md CHANGED
@@ -1,3 +1,2653 @@
1
- ---
2
- license: mit
3
- ---
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ---
2
+ language:
3
+ - en
4
+ license: mit
5
+ model-index:
6
+ - name: bge-large-en-v1.5
7
+ results:
8
+ - dataset:
9
+ config: en
10
+ name: MTEB AmazonCounterfactualClassification (en)
11
+ revision: e8379541af4e31359cca9fbcf4b00f2671dba205
12
+ split: test
13
+ type: mteb/amazon_counterfactual
14
+ metrics:
15
+ - type: accuracy
16
+ value: 75.8507462686567
17
+ - type: ap
18
+ value: 38.566457320228245
19
+ - type: f1
20
+ value: 69.69386648043475
21
+ task:
22
+ type: Classification
23
+ - dataset:
24
+ config: default
25
+ name: MTEB AmazonPolarityClassification
26
+ revision: e2d317d38cd51312af73b3d32a06d1a08b442046
27
+ split: test
28
+ type: mteb/amazon_polarity
29
+ metrics:
30
+ - type: accuracy
31
+ value: 92.416675
32
+ - type: ap
33
+ value: 89.1928861155922
34
+ - type: f1
35
+ value: 92.39477019574215
36
+ task:
37
+ type: Classification
38
+ - dataset:
39
+ config: en
40
+ name: MTEB AmazonReviewsClassification (en)
41
+ revision: 1399c76144fd37290681b995c656ef9b2e06e26d
42
+ split: test
43
+ type: mteb/amazon_reviews_multi
44
+ metrics:
45
+ - type: accuracy
46
+ value: 48.175999999999995
47
+ - type: f1
48
+ value: 47.80712792870253
49
+ task:
50
+ type: Classification
51
+ - dataset:
52
+ config: default
53
+ name: MTEB ArguAna
54
+ revision: None
55
+ split: test
56
+ type: arguana
57
+ metrics:
58
+ - type: map_at_1
59
+ value: 40.184999999999995
60
+ - type: map_at_10
61
+ value: 55.654
62
+ - type: map_at_100
63
+ value: 56.25
64
+ - type: map_at_1000
65
+ value: 56.255
66
+ - type: map_at_3
67
+ value: 51.742999999999995
68
+ - type: map_at_5
69
+ value: 54.129000000000005
70
+ - type: mrr_at_1
71
+ value: 40.967
72
+ - type: mrr_at_10
73
+ value: 55.96
74
+ - type: mrr_at_100
75
+ value: 56.54900000000001
76
+ - type: mrr_at_1000
77
+ value: 56.554
78
+ - type: mrr_at_3
79
+ value: 51.980000000000004
80
+ - type: mrr_at_5
81
+ value: 54.44
82
+ - type: ndcg_at_1
83
+ value: 40.184999999999995
84
+ - type: ndcg_at_10
85
+ value: 63.542
86
+ - type: ndcg_at_100
87
+ value: 65.96499999999999
88
+ - type: ndcg_at_1000
89
+ value: 66.08699999999999
90
+ - type: ndcg_at_3
91
+ value: 55.582
92
+ - type: ndcg_at_5
93
+ value: 59.855000000000004
94
+ - type: precision_at_1
95
+ value: 40.184999999999995
96
+ - type: precision_at_10
97
+ value: 8.841000000000001
98
+ - type: precision_at_100
99
+ value: 0.987
100
+ - type: precision_at_1000
101
+ value: 0.1
102
+ - type: precision_at_3
103
+ value: 22.238
104
+ - type: precision_at_5
105
+ value: 15.405
106
+ - type: recall_at_1
107
+ value: 40.184999999999995
108
+ - type: recall_at_10
109
+ value: 88.407
110
+ - type: recall_at_100
111
+ value: 98.72
112
+ - type: recall_at_1000
113
+ value: 99.644
114
+ - type: recall_at_3
115
+ value: 66.714
116
+ - type: recall_at_5
117
+ value: 77.027
118
+ task:
119
+ type: Retrieval
120
+ - dataset:
121
+ config: default
122
+ name: MTEB ArxivClusteringP2P
123
+ revision: a122ad7f3f0291bf49cc6f4d32aa80929df69d5d
124
+ split: test
125
+ type: mteb/arxiv-clustering-p2p
126
+ metrics:
127
+ - type: v_measure
128
+ value: 48.567077926750066
129
+ task:
130
+ type: Clustering
131
+ - dataset:
132
+ config: default
133
+ name: MTEB ArxivClusteringS2S
134
+ revision: f910caf1a6075f7329cdf8c1a6135696f37dbd53
135
+ split: test
136
+ type: mteb/arxiv-clustering-s2s
137
+ metrics:
138
+ - type: v_measure
139
+ value: 43.19453389182364
140
+ task:
141
+ type: Clustering
142
+ - dataset:
143
+ config: default
144
+ name: MTEB AskUbuntuDupQuestions
145
+ revision: 2000358ca161889fa9c082cb41daa8dcfb161a54
146
+ split: test
147
+ type: mteb/askubuntudupquestions-reranking
148
+ metrics:
149
+ - type: map
150
+ value: 64.46555939623092
151
+ - type: mrr
152
+ value: 77.82361605768807
153
+ task:
154
+ type: Reranking
155
+ - dataset:
156
+ config: default
157
+ name: MTEB BIOSSES
158
+ revision: d3fb88f8f02e40887cd149695127462bbcf29b4a
159
+ split: test
160
+ type: mteb/biosses-sts
161
+ metrics:
162
+ - type: cos_sim_pearson
163
+ value: 84.9554128814735
164
+ - type: cos_sim_spearman
165
+ value: 84.65373612172036
166
+ - type: euclidean_pearson
167
+ value: 83.2905059954138
168
+ - type: euclidean_spearman
169
+ value: 84.52240782811128
170
+ - type: manhattan_pearson
171
+ value: 82.99533802997436
172
+ - type: manhattan_spearman
173
+ value: 84.20673798475734
174
+ task:
175
+ type: STS
176
+ - dataset:
177
+ config: default
178
+ name: MTEB Banking77Classification
179
+ revision: 0fd18e25b25c072e09e0d92ab615fda904d66300
180
+ split: test
181
+ type: mteb/banking77
182
+ metrics:
183
+ - type: accuracy
184
+ value: 87.78896103896103
185
+ - type: f1
186
+ value: 87.77189310964883
187
+ task:
188
+ type: Classification
189
+ - dataset:
190
+ config: default
191
+ name: MTEB BiorxivClusteringP2P
192
+ revision: 65b79d1d13f80053f67aca9498d9402c2d9f1f40
193
+ split: test
194
+ type: mteb/biorxiv-clustering-p2p
195
+ metrics:
196
+ - type: v_measure
197
+ value: 39.714538337650495
198
+ task:
199
+ type: Clustering
200
+ - dataset:
201
+ config: default
202
+ name: MTEB BiorxivClusteringS2S
203
+ revision: 258694dd0231531bc1fd9de6ceb52a0853c6d908
204
+ split: test
205
+ type: mteb/biorxiv-clustering-s2s
206
+ metrics:
207
+ - type: v_measure
208
+ value: 36.90108349284447
209
+ task:
210
+ type: Clustering
211
+ - dataset:
212
+ config: default
213
+ name: MTEB CQADupstackAndroidRetrieval
214
+ revision: None
215
+ split: test
216
+ type: BeIR/cqadupstack
217
+ metrics:
218
+ - type: map_at_1
219
+ value: 32.795
220
+ - type: map_at_10
221
+ value: 43.669000000000004
222
+ - type: map_at_100
223
+ value: 45.151
224
+ - type: map_at_1000
225
+ value: 45.278
226
+ - type: map_at_3
227
+ value: 40.006
228
+ - type: map_at_5
229
+ value: 42.059999999999995
230
+ - type: mrr_at_1
231
+ value: 39.771
232
+ - type: mrr_at_10
233
+ value: 49.826
234
+ - type: mrr_at_100
235
+ value: 50.504000000000005
236
+ - type: mrr_at_1000
237
+ value: 50.549
238
+ - type: mrr_at_3
239
+ value: 47.115
240
+ - type: mrr_at_5
241
+ value: 48.832
242
+ - type: ndcg_at_1
243
+ value: 39.771
244
+ - type: ndcg_at_10
245
+ value: 50.217999999999996
246
+ - type: ndcg_at_100
247
+ value: 55.454
248
+ - type: ndcg_at_1000
249
+ value: 57.37
250
+ - type: ndcg_at_3
251
+ value: 44.885000000000005
252
+ - type: ndcg_at_5
253
+ value: 47.419
254
+ - type: precision_at_1
255
+ value: 39.771
256
+ - type: precision_at_10
257
+ value: 9.642000000000001
258
+ - type: precision_at_100
259
+ value: 1.538
260
+ - type: precision_at_1000
261
+ value: 0.198
262
+ - type: precision_at_3
263
+ value: 21.268
264
+ - type: precision_at_5
265
+ value: 15.536
266
+ - type: recall_at_1
267
+ value: 32.795
268
+ - type: recall_at_10
269
+ value: 62.580999999999996
270
+ - type: recall_at_100
271
+ value: 84.438
272
+ - type: recall_at_1000
273
+ value: 96.492
274
+ - type: recall_at_3
275
+ value: 47.071000000000005
276
+ - type: recall_at_5
277
+ value: 54.079
278
+ - type: map_at_1
279
+ value: 32.671
280
+ - type: map_at_10
281
+ value: 43.334
282
+ - type: map_at_100
283
+ value: 44.566
284
+ - type: map_at_1000
285
+ value: 44.702999999999996
286
+ - type: map_at_3
287
+ value: 40.343
288
+ - type: map_at_5
289
+ value: 41.983
290
+ - type: mrr_at_1
291
+ value: 40.764
292
+ - type: mrr_at_10
293
+ value: 49.382
294
+ - type: mrr_at_100
295
+ value: 49.988
296
+ - type: mrr_at_1000
297
+ value: 50.03300000000001
298
+ - type: mrr_at_3
299
+ value: 47.293
300
+ - type: mrr_at_5
301
+ value: 48.51
302
+ - type: ndcg_at_1
303
+ value: 40.764
304
+ - type: ndcg_at_10
305
+ value: 49.039
306
+ - type: ndcg_at_100
307
+ value: 53.259
308
+ - type: ndcg_at_1000
309
+ value: 55.253
310
+ - type: ndcg_at_3
311
+ value: 45.091
312
+ - type: ndcg_at_5
313
+ value: 46.839999999999996
314
+ - type: precision_at_1
315
+ value: 40.764
316
+ - type: precision_at_10
317
+ value: 9.191
318
+ - type: precision_at_100
319
+ value: 1.476
320
+ - type: precision_at_1000
321
+ value: 0.19499999999999998
322
+ - type: precision_at_3
323
+ value: 21.72
324
+ - type: precision_at_5
325
+ value: 15.299
326
+ - type: recall_at_1
327
+ value: 32.671
328
+ - type: recall_at_10
329
+ value: 58.816
330
+ - type: recall_at_100
331
+ value: 76.654
332
+ - type: recall_at_1000
333
+ value: 89.05999999999999
334
+ - type: recall_at_3
335
+ value: 46.743
336
+ - type: recall_at_5
337
+ value: 51.783
338
+ - type: map_at_1
339
+ value: 40.328
340
+ - type: map_at_10
341
+ value: 53.32599999999999
342
+ - type: map_at_100
343
+ value: 54.37499999999999
344
+ - type: map_at_1000
345
+ value: 54.429
346
+ - type: map_at_3
347
+ value: 49.902
348
+ - type: map_at_5
349
+ value: 52.002
350
+ - type: mrr_at_1
351
+ value: 46.332
352
+ - type: mrr_at_10
353
+ value: 56.858
354
+ - type: mrr_at_100
355
+ value: 57.522
356
+ - type: mrr_at_1000
357
+ value: 57.54899999999999
358
+ - type: mrr_at_3
359
+ value: 54.472
360
+ - type: mrr_at_5
361
+ value: 55.996
362
+ - type: ndcg_at_1
363
+ value: 46.332
364
+ - type: ndcg_at_10
365
+ value: 59.313
366
+ - type: ndcg_at_100
367
+ value: 63.266999999999996
368
+ - type: ndcg_at_1000
369
+ value: 64.36
370
+ - type: ndcg_at_3
371
+ value: 53.815000000000005
372
+ - type: ndcg_at_5
373
+ value: 56.814
374
+ - type: precision_at_1
375
+ value: 46.332
376
+ - type: precision_at_10
377
+ value: 9.53
378
+ - type: precision_at_100
379
+ value: 1.238
380
+ - type: precision_at_1000
381
+ value: 0.13699999999999998
382
+ - type: precision_at_3
383
+ value: 24.054000000000002
384
+ - type: precision_at_5
385
+ value: 16.589000000000002
386
+ - type: recall_at_1
387
+ value: 40.328
388
+ - type: recall_at_10
389
+ value: 73.421
390
+ - type: recall_at_100
391
+ value: 90.059
392
+ - type: recall_at_1000
393
+ value: 97.81
394
+ - type: recall_at_3
395
+ value: 59.009
396
+ - type: recall_at_5
397
+ value: 66.352
398
+ - type: map_at_1
399
+ value: 27.424
400
+ - type: map_at_10
401
+ value: 36.332
402
+ - type: map_at_100
403
+ value: 37.347
404
+ - type: map_at_1000
405
+ value: 37.422
406
+ - type: map_at_3
407
+ value: 33.743
408
+ - type: map_at_5
409
+ value: 35.176
410
+ - type: mrr_at_1
411
+ value: 29.153000000000002
412
+ - type: mrr_at_10
413
+ value: 38.233
414
+ - type: mrr_at_100
415
+ value: 39.109
416
+ - type: mrr_at_1000
417
+ value: 39.164
418
+ - type: mrr_at_3
419
+ value: 35.876000000000005
420
+ - type: mrr_at_5
421
+ value: 37.169000000000004
422
+ - type: ndcg_at_1
423
+ value: 29.153000000000002
424
+ - type: ndcg_at_10
425
+ value: 41.439
426
+ - type: ndcg_at_100
427
+ value: 46.42
428
+ - type: ndcg_at_1000
429
+ value: 48.242000000000004
430
+ - type: ndcg_at_3
431
+ value: 36.362
432
+ - type: ndcg_at_5
433
+ value: 38.743
434
+ - type: precision_at_1
435
+ value: 29.153000000000002
436
+ - type: precision_at_10
437
+ value: 6.315999999999999
438
+ - type: precision_at_100
439
+ value: 0.927
440
+ - type: precision_at_1000
441
+ value: 0.11199999999999999
442
+ - type: precision_at_3
443
+ value: 15.443000000000001
444
+ - type: precision_at_5
445
+ value: 10.644
446
+ - type: recall_at_1
447
+ value: 27.424
448
+ - type: recall_at_10
449
+ value: 55.364000000000004
450
+ - type: recall_at_100
451
+ value: 78.211
452
+ - type: recall_at_1000
453
+ value: 91.74600000000001
454
+ - type: recall_at_3
455
+ value: 41.379
456
+ - type: recall_at_5
457
+ value: 47.14
458
+ - type: map_at_1
459
+ value: 19.601
460
+ - type: map_at_10
461
+ value: 27.826
462
+ - type: map_at_100
463
+ value: 29.017
464
+ - type: map_at_1000
465
+ value: 29.137
466
+ - type: map_at_3
467
+ value: 25.125999999999998
468
+ - type: map_at_5
469
+ value: 26.765
470
+ - type: mrr_at_1
471
+ value: 24.005000000000003
472
+ - type: mrr_at_10
473
+ value: 32.716
474
+ - type: mrr_at_100
475
+ value: 33.631
476
+ - type: mrr_at_1000
477
+ value: 33.694
478
+ - type: mrr_at_3
479
+ value: 29.934
480
+ - type: mrr_at_5
481
+ value: 31.630999999999997
482
+ - type: ndcg_at_1
483
+ value: 24.005000000000003
484
+ - type: ndcg_at_10
485
+ value: 33.158
486
+ - type: ndcg_at_100
487
+ value: 38.739000000000004
488
+ - type: ndcg_at_1000
489
+ value: 41.495
490
+ - type: ndcg_at_3
491
+ value: 28.185
492
+ - type: ndcg_at_5
493
+ value: 30.796
494
+ - type: precision_at_1
495
+ value: 24.005000000000003
496
+ - type: precision_at_10
497
+ value: 5.908
498
+ - type: precision_at_100
499
+ value: 1.005
500
+ - type: precision_at_1000
501
+ value: 0.13899999999999998
502
+ - type: precision_at_3
503
+ value: 13.391
504
+ - type: precision_at_5
505
+ value: 9.876
506
+ - type: recall_at_1
507
+ value: 19.601
508
+ - type: recall_at_10
509
+ value: 44.746
510
+ - type: recall_at_100
511
+ value: 68.82300000000001
512
+ - type: recall_at_1000
513
+ value: 88.215
514
+ - type: recall_at_3
515
+ value: 31.239
516
+ - type: recall_at_5
517
+ value: 37.695
518
+ - type: map_at_1
519
+ value: 30.130000000000003
520
+ - type: map_at_10
521
+ value: 40.96
522
+ - type: map_at_100
523
+ value: 42.282
524
+ - type: map_at_1000
525
+ value: 42.392
526
+ - type: map_at_3
527
+ value: 37.889
528
+ - type: map_at_5
529
+ value: 39.661
530
+ - type: mrr_at_1
531
+ value: 36.958999999999996
532
+ - type: mrr_at_10
533
+ value: 46.835
534
+ - type: mrr_at_100
535
+ value: 47.644
536
+ - type: mrr_at_1000
537
+ value: 47.688
538
+ - type: mrr_at_3
539
+ value: 44.562000000000005
540
+ - type: mrr_at_5
541
+ value: 45.938
542
+ - type: ndcg_at_1
543
+ value: 36.958999999999996
544
+ - type: ndcg_at_10
545
+ value: 47.06
546
+ - type: ndcg_at_100
547
+ value: 52.345
548
+ - type: ndcg_at_1000
549
+ value: 54.35
550
+ - type: ndcg_at_3
551
+ value: 42.301
552
+ - type: ndcg_at_5
553
+ value: 44.635999999999996
554
+ - type: precision_at_1
555
+ value: 36.958999999999996
556
+ - type: precision_at_10
557
+ value: 8.479000000000001
558
+ - type: precision_at_100
559
+ value: 1.284
560
+ - type: precision_at_1000
561
+ value: 0.163
562
+ - type: precision_at_3
563
+ value: 20.244
564
+ - type: precision_at_5
565
+ value: 14.224999999999998
566
+ - type: recall_at_1
567
+ value: 30.130000000000003
568
+ - type: recall_at_10
569
+ value: 59.27
570
+ - type: recall_at_100
571
+ value: 81.195
572
+ - type: recall_at_1000
573
+ value: 94.21199999999999
574
+ - type: recall_at_3
575
+ value: 45.885
576
+ - type: recall_at_5
577
+ value: 52.016
578
+ - type: map_at_1
579
+ value: 26.169999999999998
580
+ - type: map_at_10
581
+ value: 36.451
582
+ - type: map_at_100
583
+ value: 37.791000000000004
584
+ - type: map_at_1000
585
+ value: 37.897
586
+ - type: map_at_3
587
+ value: 33.109
588
+ - type: map_at_5
589
+ value: 34.937000000000005
590
+ - type: mrr_at_1
591
+ value: 32.877
592
+ - type: mrr_at_10
593
+ value: 42.368
594
+ - type: mrr_at_100
595
+ value: 43.201
596
+ - type: mrr_at_1000
597
+ value: 43.259
598
+ - type: mrr_at_3
599
+ value: 39.763999999999996
600
+ - type: mrr_at_5
601
+ value: 41.260000000000005
602
+ - type: ndcg_at_1
603
+ value: 32.877
604
+ - type: ndcg_at_10
605
+ value: 42.659000000000006
606
+ - type: ndcg_at_100
607
+ value: 48.161
608
+ - type: ndcg_at_1000
609
+ value: 50.345
610
+ - type: ndcg_at_3
611
+ value: 37.302
612
+ - type: ndcg_at_5
613
+ value: 39.722
614
+ - type: precision_at_1
615
+ value: 32.877
616
+ - type: precision_at_10
617
+ value: 7.9
618
+ - type: precision_at_100
619
+ value: 1.236
620
+ - type: precision_at_1000
621
+ value: 0.158
622
+ - type: precision_at_3
623
+ value: 17.846
624
+ - type: precision_at_5
625
+ value: 12.9
626
+ - type: recall_at_1
627
+ value: 26.169999999999998
628
+ - type: recall_at_10
629
+ value: 55.35
630
+ - type: recall_at_100
631
+ value: 78.755
632
+ - type: recall_at_1000
633
+ value: 93.518
634
+ - type: recall_at_3
635
+ value: 40.176
636
+ - type: recall_at_5
637
+ value: 46.589000000000006
638
+ - type: map_at_1
639
+ value: 27.15516666666667
640
+ - type: map_at_10
641
+ value: 36.65741666666667
642
+ - type: map_at_100
643
+ value: 37.84991666666666
644
+ - type: map_at_1000
645
+ value: 37.96316666666667
646
+ - type: map_at_3
647
+ value: 33.74974999999999
648
+ - type: map_at_5
649
+ value: 35.3765
650
+ - type: mrr_at_1
651
+ value: 32.08233333333334
652
+ - type: mrr_at_10
653
+ value: 41.033833333333334
654
+ - type: mrr_at_100
655
+ value: 41.84524999999999
656
+ - type: mrr_at_1000
657
+ value: 41.89983333333333
658
+ - type: mrr_at_3
659
+ value: 38.62008333333333
660
+ - type: mrr_at_5
661
+ value: 40.03441666666666
662
+ - type: ndcg_at_1
663
+ value: 32.08233333333334
664
+ - type: ndcg_at_10
665
+ value: 42.229
666
+ - type: ndcg_at_100
667
+ value: 47.26716666666667
668
+ - type: ndcg_at_1000
669
+ value: 49.43466666666667
670
+ - type: ndcg_at_3
671
+ value: 37.36408333333333
672
+ - type: ndcg_at_5
673
+ value: 39.6715
674
+ - type: precision_at_1
675
+ value: 32.08233333333334
676
+ - type: precision_at_10
677
+ value: 7.382583333333334
678
+ - type: precision_at_100
679
+ value: 1.16625
680
+ - type: precision_at_1000
681
+ value: 0.15408333333333332
682
+ - type: precision_at_3
683
+ value: 17.218
684
+ - type: precision_at_5
685
+ value: 12.21875
686
+ - type: recall_at_1
687
+ value: 27.15516666666667
688
+ - type: recall_at_10
689
+ value: 54.36683333333333
690
+ - type: recall_at_100
691
+ value: 76.37183333333333
692
+ - type: recall_at_1000
693
+ value: 91.26183333333333
694
+ - type: recall_at_3
695
+ value: 40.769916666666674
696
+ - type: recall_at_5
697
+ value: 46.702333333333335
698
+ - type: map_at_1
699
+ value: 25.749
700
+ - type: map_at_10
701
+ value: 33.001999999999995
702
+ - type: map_at_100
703
+ value: 33.891
704
+ - type: map_at_1000
705
+ value: 33.993
706
+ - type: map_at_3
707
+ value: 30.703999999999997
708
+ - type: map_at_5
709
+ value: 31.959
710
+ - type: mrr_at_1
711
+ value: 28.834
712
+ - type: mrr_at_10
713
+ value: 35.955
714
+ - type: mrr_at_100
715
+ value: 36.709
716
+ - type: mrr_at_1000
717
+ value: 36.779
718
+ - type: mrr_at_3
719
+ value: 33.947
720
+ - type: mrr_at_5
721
+ value: 35.089
722
+ - type: ndcg_at_1
723
+ value: 28.834
724
+ - type: ndcg_at_10
725
+ value: 37.329
726
+ - type: ndcg_at_100
727
+ value: 41.79
728
+ - type: ndcg_at_1000
729
+ value: 44.169000000000004
730
+ - type: ndcg_at_3
731
+ value: 33.184999999999995
732
+ - type: ndcg_at_5
733
+ value: 35.107
734
+ - type: precision_at_1
735
+ value: 28.834
736
+ - type: precision_at_10
737
+ value: 5.7669999999999995
738
+ - type: precision_at_100
739
+ value: 0.876
740
+ - type: precision_at_1000
741
+ value: 0.11399999999999999
742
+ - type: precision_at_3
743
+ value: 14.213000000000001
744
+ - type: precision_at_5
745
+ value: 9.754999999999999
746
+ - type: recall_at_1
747
+ value: 25.749
748
+ - type: recall_at_10
749
+ value: 47.791
750
+ - type: recall_at_100
751
+ value: 68.255
752
+ - type: recall_at_1000
753
+ value: 85.749
754
+ - type: recall_at_3
755
+ value: 36.199
756
+ - type: recall_at_5
757
+ value: 41.071999999999996
758
+ - type: map_at_1
759
+ value: 17.777
760
+ - type: map_at_10
761
+ value: 25.201
762
+ - type: map_at_100
763
+ value: 26.423999999999996
764
+ - type: map_at_1000
765
+ value: 26.544
766
+ - type: map_at_3
767
+ value: 22.869
768
+ - type: map_at_5
769
+ value: 24.023
770
+ - type: mrr_at_1
771
+ value: 21.473
772
+ - type: mrr_at_10
773
+ value: 29.12
774
+ - type: mrr_at_100
775
+ value: 30.144
776
+ - type: mrr_at_1000
777
+ value: 30.215999999999998
778
+ - type: mrr_at_3
779
+ value: 26.933
780
+ - type: mrr_at_5
781
+ value: 28.051
782
+ - type: ndcg_at_1
783
+ value: 21.473
784
+ - type: ndcg_at_10
785
+ value: 30.003
786
+ - type: ndcg_at_100
787
+ value: 35.766
788
+ - type: ndcg_at_1000
789
+ value: 38.501000000000005
790
+ - type: ndcg_at_3
791
+ value: 25.773000000000003
792
+ - type: ndcg_at_5
793
+ value: 27.462999999999997
794
+ - type: precision_at_1
795
+ value: 21.473
796
+ - type: precision_at_10
797
+ value: 5.482
798
+ - type: precision_at_100
799
+ value: 0.975
800
+ - type: precision_at_1000
801
+ value: 0.13799999999999998
802
+ - type: precision_at_3
803
+ value: 12.205
804
+ - type: precision_at_5
805
+ value: 8.692
806
+ - type: recall_at_1
807
+ value: 17.777
808
+ - type: recall_at_10
809
+ value: 40.582
810
+ - type: recall_at_100
811
+ value: 66.305
812
+ - type: recall_at_1000
813
+ value: 85.636
814
+ - type: recall_at_3
815
+ value: 28.687
816
+ - type: recall_at_5
817
+ value: 33.089
818
+ - type: map_at_1
819
+ value: 26.677
820
+ - type: map_at_10
821
+ value: 36.309000000000005
822
+ - type: map_at_100
823
+ value: 37.403999999999996
824
+ - type: map_at_1000
825
+ value: 37.496
826
+ - type: map_at_3
827
+ value: 33.382
828
+ - type: map_at_5
829
+ value: 34.98
830
+ - type: mrr_at_1
831
+ value: 31.343
832
+ - type: mrr_at_10
833
+ value: 40.549
834
+ - type: mrr_at_100
835
+ value: 41.342
836
+ - type: mrr_at_1000
837
+ value: 41.397
838
+ - type: mrr_at_3
839
+ value: 38.029
840
+ - type: mrr_at_5
841
+ value: 39.451
842
+ - type: ndcg_at_1
843
+ value: 31.343
844
+ - type: ndcg_at_10
845
+ value: 42.1
846
+ - type: ndcg_at_100
847
+ value: 47.089999999999996
848
+ - type: ndcg_at_1000
849
+ value: 49.222
850
+ - type: ndcg_at_3
851
+ value: 36.836999999999996
852
+ - type: ndcg_at_5
853
+ value: 39.21
854
+ - type: precision_at_1
855
+ value: 31.343
856
+ - type: precision_at_10
857
+ value: 7.164
858
+ - type: precision_at_100
859
+ value: 1.0959999999999999
860
+ - type: precision_at_1000
861
+ value: 0.13899999999999998
862
+ - type: precision_at_3
863
+ value: 16.915
864
+ - type: precision_at_5
865
+ value: 11.940000000000001
866
+ - type: recall_at_1
867
+ value: 26.677
868
+ - type: recall_at_10
869
+ value: 55.54599999999999
870
+ - type: recall_at_100
871
+ value: 77.094
872
+ - type: recall_at_1000
873
+ value: 92.01
874
+ - type: recall_at_3
875
+ value: 41.191
876
+ - type: recall_at_5
877
+ value: 47.006
878
+ - type: map_at_1
879
+ value: 24.501
880
+ - type: map_at_10
881
+ value: 33.102
882
+ - type: map_at_100
883
+ value: 34.676
884
+ - type: map_at_1000
885
+ value: 34.888000000000005
886
+ - type: map_at_3
887
+ value: 29.944
888
+ - type: map_at_5
889
+ value: 31.613999999999997
890
+ - type: mrr_at_1
891
+ value: 29.447000000000003
892
+ - type: mrr_at_10
893
+ value: 37.996
894
+ - type: mrr_at_100
895
+ value: 38.946
896
+ - type: mrr_at_1000
897
+ value: 38.995000000000005
898
+ - type: mrr_at_3
899
+ value: 35.079
900
+ - type: mrr_at_5
901
+ value: 36.69
902
+ - type: ndcg_at_1
903
+ value: 29.447000000000003
904
+ - type: ndcg_at_10
905
+ value: 39.232
906
+ - type: ndcg_at_100
907
+ value: 45.247
908
+ - type: ndcg_at_1000
909
+ value: 47.613
910
+ - type: ndcg_at_3
911
+ value: 33.922999999999995
912
+ - type: ndcg_at_5
913
+ value: 36.284
914
+ - type: precision_at_1
915
+ value: 29.447000000000003
916
+ - type: precision_at_10
917
+ value: 7.648000000000001
918
+ - type: precision_at_100
919
+ value: 1.516
920
+ - type: precision_at_1000
921
+ value: 0.23900000000000002
922
+ - type: precision_at_3
923
+ value: 16.008
924
+ - type: precision_at_5
925
+ value: 11.779
926
+ - type: recall_at_1
927
+ value: 24.501
928
+ - type: recall_at_10
929
+ value: 51.18899999999999
930
+ - type: recall_at_100
931
+ value: 78.437
932
+ - type: recall_at_1000
933
+ value: 92.842
934
+ - type: recall_at_3
935
+ value: 35.808
936
+ - type: recall_at_5
937
+ value: 42.197
938
+ - type: map_at_1
939
+ value: 22.039
940
+ - type: map_at_10
941
+ value: 30.377
942
+ - type: map_at_100
943
+ value: 31.275
944
+ - type: map_at_1000
945
+ value: 31.379
946
+ - type: map_at_3
947
+ value: 27.98
948
+ - type: map_at_5
949
+ value: 29.358
950
+ - type: mrr_at_1
951
+ value: 24.03
952
+ - type: mrr_at_10
953
+ value: 32.568000000000005
954
+ - type: mrr_at_100
955
+ value: 33.403
956
+ - type: mrr_at_1000
957
+ value: 33.475
958
+ - type: mrr_at_3
959
+ value: 30.436999999999998
960
+ - type: mrr_at_5
961
+ value: 31.796000000000003
962
+ - type: ndcg_at_1
963
+ value: 24.03
964
+ - type: ndcg_at_10
965
+ value: 35.198
966
+ - type: ndcg_at_100
967
+ value: 39.668
968
+ - type: ndcg_at_1000
969
+ value: 42.296
970
+ - type: ndcg_at_3
971
+ value: 30.709999999999997
972
+ - type: ndcg_at_5
973
+ value: 33.024
974
+ - type: precision_at_1
975
+ value: 24.03
976
+ - type: precision_at_10
977
+ value: 5.564
978
+ - type: precision_at_100
979
+ value: 0.828
980
+ - type: precision_at_1000
981
+ value: 0.117
982
+ - type: precision_at_3
983
+ value: 13.309000000000001
984
+ - type: precision_at_5
985
+ value: 9.39
986
+ - type: recall_at_1
987
+ value: 22.039
988
+ - type: recall_at_10
989
+ value: 47.746
990
+ - type: recall_at_100
991
+ value: 68.23599999999999
992
+ - type: recall_at_1000
993
+ value: 87.852
994
+ - type: recall_at_3
995
+ value: 35.852000000000004
996
+ - type: recall_at_5
997
+ value: 41.410000000000004
998
+ task:
999
+ type: Retrieval
1000
+ - dataset:
1001
+ config: default
1002
+ name: MTEB ClimateFEVER
1003
+ revision: None
1004
+ split: test
1005
+ type: climate-fever
1006
+ metrics:
1007
+ - type: map_at_1
1008
+ value: 15.692999999999998
1009
+ - type: map_at_10
1010
+ value: 26.903
1011
+ - type: map_at_100
1012
+ value: 28.987000000000002
1013
+ - type: map_at_1000
1014
+ value: 29.176999999999996
1015
+ - type: map_at_3
1016
+ value: 22.137
1017
+ - type: map_at_5
1018
+ value: 24.758
1019
+ - type: mrr_at_1
1020
+ value: 35.57
1021
+ - type: mrr_at_10
1022
+ value: 47.821999999999996
1023
+ - type: mrr_at_100
1024
+ value: 48.608000000000004
1025
+ - type: mrr_at_1000
1026
+ value: 48.638999999999996
1027
+ - type: mrr_at_3
1028
+ value: 44.452000000000005
1029
+ - type: mrr_at_5
1030
+ value: 46.546
1031
+ - type: ndcg_at_1
1032
+ value: 35.57
1033
+ - type: ndcg_at_10
1034
+ value: 36.567
1035
+ - type: ndcg_at_100
1036
+ value: 44.085
1037
+ - type: ndcg_at_1000
1038
+ value: 47.24
1039
+ - type: ndcg_at_3
1040
+ value: 29.964000000000002
1041
+ - type: ndcg_at_5
1042
+ value: 32.511
1043
+ - type: precision_at_1
1044
+ value: 35.57
1045
+ - type: precision_at_10
1046
+ value: 11.485
1047
+ - type: precision_at_100
1048
+ value: 1.9619999999999997
1049
+ - type: precision_at_1000
1050
+ value: 0.256
1051
+ - type: precision_at_3
1052
+ value: 22.237000000000002
1053
+ - type: precision_at_5
1054
+ value: 17.471999999999998
1055
+ - type: recall_at_1
1056
+ value: 15.692999999999998
1057
+ - type: recall_at_10
1058
+ value: 43.056
1059
+ - type: recall_at_100
1060
+ value: 68.628
1061
+ - type: recall_at_1000
1062
+ value: 86.075
1063
+ - type: recall_at_3
1064
+ value: 26.918999999999997
1065
+ - type: recall_at_5
1066
+ value: 34.14
1067
+ task:
1068
+ type: Retrieval
1069
+ - dataset:
1070
+ config: default
1071
+ name: MTEB DBPedia
1072
+ revision: None
1073
+ split: test
1074
+ type: dbpedia-entity
1075
+ metrics:
1076
+ - type: map_at_1
1077
+ value: 9.53
1078
+ - type: map_at_10
1079
+ value: 20.951
1080
+ - type: map_at_100
1081
+ value: 30.136000000000003
1082
+ - type: map_at_1000
1083
+ value: 31.801000000000002
1084
+ - type: map_at_3
1085
+ value: 15.021
1086
+ - type: map_at_5
1087
+ value: 17.471999999999998
1088
+ - type: mrr_at_1
1089
+ value: 71.0
1090
+ - type: mrr_at_10
1091
+ value: 79.176
1092
+ - type: mrr_at_100
1093
+ value: 79.418
1094
+ - type: mrr_at_1000
1095
+ value: 79.426
1096
+ - type: mrr_at_3
1097
+ value: 78.125
1098
+ - type: mrr_at_5
1099
+ value: 78.61200000000001
1100
+ - type: ndcg_at_1
1101
+ value: 58.5
1102
+ - type: ndcg_at_10
1103
+ value: 44.106
1104
+ - type: ndcg_at_100
1105
+ value: 49.268
1106
+ - type: ndcg_at_1000
1107
+ value: 56.711999999999996
1108
+ - type: ndcg_at_3
1109
+ value: 48.934
1110
+ - type: ndcg_at_5
1111
+ value: 45.826
1112
+ - type: precision_at_1
1113
+ value: 71.0
1114
+ - type: precision_at_10
1115
+ value: 35.0
1116
+ - type: precision_at_100
1117
+ value: 11.360000000000001
1118
+ - type: precision_at_1000
1119
+ value: 2.046
1120
+ - type: precision_at_3
1121
+ value: 52.833
1122
+ - type: precision_at_5
1123
+ value: 44.15
1124
+ - type: recall_at_1
1125
+ value: 9.53
1126
+ - type: recall_at_10
1127
+ value: 26.811
1128
+ - type: recall_at_100
1129
+ value: 55.916999999999994
1130
+ - type: recall_at_1000
1131
+ value: 79.973
1132
+ - type: recall_at_3
1133
+ value: 16.413
1134
+ - type: recall_at_5
1135
+ value: 19.980999999999998
1136
+ task:
1137
+ type: Retrieval
1138
+ - dataset:
1139
+ config: default
1140
+ name: MTEB EmotionClassification
1141
+ revision: 4f58c6b202a23cf9a4da393831edf4f9183cad37
1142
+ split: test
1143
+ type: mteb/emotion
1144
+ metrics:
1145
+ - type: accuracy
1146
+ value: 51.519999999999996
1147
+ - type: f1
1148
+ value: 46.36601294761231
1149
+ task:
1150
+ type: Classification
1151
+ - dataset:
1152
+ config: default
1153
+ name: MTEB FEVER
1154
+ revision: None
1155
+ split: test
1156
+ type: fever
1157
+ metrics:
1158
+ - type: map_at_1
1159
+ value: 74.413
1160
+ - type: map_at_10
1161
+ value: 83.414
1162
+ - type: map_at_100
1163
+ value: 83.621
1164
+ - type: map_at_1000
1165
+ value: 83.635
1166
+ - type: map_at_3
1167
+ value: 82.337
1168
+ - type: map_at_5
1169
+ value: 83.039
1170
+ - type: mrr_at_1
1171
+ value: 80.19800000000001
1172
+ - type: mrr_at_10
1173
+ value: 87.715
1174
+ - type: mrr_at_100
1175
+ value: 87.778
1176
+ - type: mrr_at_1000
1177
+ value: 87.779
1178
+ - type: mrr_at_3
1179
+ value: 87.106
1180
+ - type: mrr_at_5
1181
+ value: 87.555
1182
+ - type: ndcg_at_1
1183
+ value: 80.19800000000001
1184
+ - type: ndcg_at_10
1185
+ value: 87.182
1186
+ - type: ndcg_at_100
1187
+ value: 87.90299999999999
1188
+ - type: ndcg_at_1000
1189
+ value: 88.143
1190
+ - type: ndcg_at_3
1191
+ value: 85.60600000000001
1192
+ - type: ndcg_at_5
1193
+ value: 86.541
1194
+ - type: precision_at_1
1195
+ value: 80.19800000000001
1196
+ - type: precision_at_10
1197
+ value: 10.531
1198
+ - type: precision_at_100
1199
+ value: 1.113
1200
+ - type: precision_at_1000
1201
+ value: 0.11499999999999999
1202
+ - type: precision_at_3
1203
+ value: 32.933
1204
+ - type: precision_at_5
1205
+ value: 20.429
1206
+ - type: recall_at_1
1207
+ value: 74.413
1208
+ - type: recall_at_10
1209
+ value: 94.363
1210
+ - type: recall_at_100
1211
+ value: 97.165
1212
+ - type: recall_at_1000
1213
+ value: 98.668
1214
+ - type: recall_at_3
1215
+ value: 90.108
1216
+ - type: recall_at_5
1217
+ value: 92.52
1218
+ task:
1219
+ type: Retrieval
1220
+ - dataset:
1221
+ config: default
1222
+ name: MTEB FiQA2018
1223
+ revision: None
1224
+ split: test
1225
+ type: fiqa
1226
+ metrics:
1227
+ - type: map_at_1
1228
+ value: 22.701
1229
+ - type: map_at_10
1230
+ value: 37.122
1231
+ - type: map_at_100
1232
+ value: 39.178000000000004
1233
+ - type: map_at_1000
1234
+ value: 39.326
1235
+ - type: map_at_3
1236
+ value: 32.971000000000004
1237
+ - type: map_at_5
1238
+ value: 35.332
1239
+ - type: mrr_at_1
1240
+ value: 44.753
1241
+ - type: mrr_at_10
1242
+ value: 53.452
1243
+ - type: mrr_at_100
1244
+ value: 54.198
1245
+ - type: mrr_at_1000
1246
+ value: 54.225
1247
+ - type: mrr_at_3
1248
+ value: 50.952
1249
+ - type: mrr_at_5
1250
+ value: 52.464
1251
+ - type: ndcg_at_1
1252
+ value: 44.753
1253
+ - type: ndcg_at_10
1254
+ value: 45.021
1255
+ - type: ndcg_at_100
1256
+ value: 52.028
1257
+ - type: ndcg_at_1000
1258
+ value: 54.596000000000004
1259
+ - type: ndcg_at_3
1260
+ value: 41.622
1261
+ - type: ndcg_at_5
1262
+ value: 42.736000000000004
1263
+ - type: precision_at_1
1264
+ value: 44.753
1265
+ - type: precision_at_10
1266
+ value: 12.284
1267
+ - type: precision_at_100
1268
+ value: 1.955
1269
+ - type: precision_at_1000
1270
+ value: 0.243
1271
+ - type: precision_at_3
1272
+ value: 27.828999999999997
1273
+ - type: precision_at_5
1274
+ value: 20.061999999999998
1275
+ - type: recall_at_1
1276
+ value: 22.701
1277
+ - type: recall_at_10
1278
+ value: 51.432
1279
+ - type: recall_at_100
1280
+ value: 77.009
1281
+ - type: recall_at_1000
1282
+ value: 92.511
1283
+ - type: recall_at_3
1284
+ value: 37.919000000000004
1285
+ - type: recall_at_5
1286
+ value: 44.131
1287
+ task:
1288
+ type: Retrieval
1289
+ - dataset:
1290
+ config: default
1291
+ name: MTEB HotpotQA
1292
+ revision: None
1293
+ split: test
1294
+ type: hotpotqa
1295
+ metrics:
1296
+ - type: map_at_1
1297
+ value: 40.189
1298
+ - type: map_at_10
1299
+ value: 66.24600000000001
1300
+ - type: map_at_100
1301
+ value: 67.098
1302
+ - type: map_at_1000
1303
+ value: 67.149
1304
+ - type: map_at_3
1305
+ value: 62.684
1306
+ - type: map_at_5
1307
+ value: 64.974
1308
+ - type: mrr_at_1
1309
+ value: 80.378
1310
+ - type: mrr_at_10
1311
+ value: 86.127
1312
+ - type: mrr_at_100
1313
+ value: 86.29299999999999
1314
+ - type: mrr_at_1000
1315
+ value: 86.297
1316
+ - type: mrr_at_3
1317
+ value: 85.31400000000001
1318
+ - type: mrr_at_5
1319
+ value: 85.858
1320
+ - type: ndcg_at_1
1321
+ value: 80.378
1322
+ - type: ndcg_at_10
1323
+ value: 74.101
1324
+ - type: ndcg_at_100
1325
+ value: 76.993
1326
+ - type: ndcg_at_1000
1327
+ value: 77.948
1328
+ - type: ndcg_at_3
1329
+ value: 69.232
1330
+ - type: ndcg_at_5
1331
+ value: 72.04599999999999
1332
+ - type: precision_at_1
1333
+ value: 80.378
1334
+ - type: precision_at_10
1335
+ value: 15.595999999999998
1336
+ - type: precision_at_100
1337
+ value: 1.7840000000000003
1338
+ - type: precision_at_1000
1339
+ value: 0.191
1340
+ - type: precision_at_3
1341
+ value: 44.884
1342
+ - type: precision_at_5
1343
+ value: 29.145
1344
+ - type: recall_at_1
1345
+ value: 40.189
1346
+ - type: recall_at_10
1347
+ value: 77.981
1348
+ - type: recall_at_100
1349
+ value: 89.21
1350
+ - type: recall_at_1000
1351
+ value: 95.48299999999999
1352
+ - type: recall_at_3
1353
+ value: 67.326
1354
+ - type: recall_at_5
1355
+ value: 72.863
1356
+ task:
1357
+ type: Retrieval
1358
+ - dataset:
1359
+ config: default
1360
+ name: MTEB ImdbClassification
1361
+ revision: 3d86128a09e091d6018b6d26cad27f2739fc2db7
1362
+ split: test
1363
+ type: mteb/imdb
1364
+ metrics:
1365
+ - type: accuracy
1366
+ value: 92.84599999999999
1367
+ - type: ap
1368
+ value: 89.4710787567357
1369
+ - type: f1
1370
+ value: 92.83752676932258
1371
+ task:
1372
+ type: Classification
1373
+ - dataset:
1374
+ config: default
1375
+ name: MTEB MSMARCO
1376
+ revision: None
1377
+ split: dev
1378
+ type: msmarco
1379
+ metrics:
1380
+ - type: map_at_1
1381
+ value: 23.132
1382
+ - type: map_at_10
1383
+ value: 35.543
1384
+ - type: map_at_100
1385
+ value: 36.702
1386
+ - type: map_at_1000
1387
+ value: 36.748999999999995
1388
+ - type: map_at_3
1389
+ value: 31.737
1390
+ - type: map_at_5
1391
+ value: 33.927
1392
+ - type: mrr_at_1
1393
+ value: 23.782
1394
+ - type: mrr_at_10
1395
+ value: 36.204
1396
+ - type: mrr_at_100
1397
+ value: 37.29
1398
+ - type: mrr_at_1000
1399
+ value: 37.330999999999996
1400
+ - type: mrr_at_3
1401
+ value: 32.458999999999996
1402
+ - type: mrr_at_5
1403
+ value: 34.631
1404
+ - type: ndcg_at_1
1405
+ value: 23.782
1406
+ - type: ndcg_at_10
1407
+ value: 42.492999999999995
1408
+ - type: ndcg_at_100
1409
+ value: 47.985
1410
+ - type: ndcg_at_1000
1411
+ value: 49.141
1412
+ - type: ndcg_at_3
1413
+ value: 34.748000000000005
1414
+ - type: ndcg_at_5
1415
+ value: 38.651
1416
+ - type: precision_at_1
1417
+ value: 23.782
1418
+ - type: precision_at_10
1419
+ value: 6.665
1420
+ - type: precision_at_100
1421
+ value: 0.941
1422
+ - type: precision_at_1000
1423
+ value: 0.104
1424
+ - type: precision_at_3
1425
+ value: 14.776
1426
+ - type: precision_at_5
1427
+ value: 10.84
1428
+ - type: recall_at_1
1429
+ value: 23.132
1430
+ - type: recall_at_10
1431
+ value: 63.794
1432
+ - type: recall_at_100
1433
+ value: 89.027
1434
+ - type: recall_at_1000
1435
+ value: 97.807
1436
+ - type: recall_at_3
1437
+ value: 42.765
1438
+ - type: recall_at_5
1439
+ value: 52.11
1440
+ task:
1441
+ type: Retrieval
1442
+ - dataset:
1443
+ config: en
1444
+ name: MTEB MTOPDomainClassification (en)
1445
+ revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf
1446
+ split: test
1447
+ type: mteb/mtop_domain
1448
+ metrics:
1449
+ - type: accuracy
1450
+ value: 94.59188326493388
1451
+ - type: f1
1452
+ value: 94.3842594786827
1453
+ task:
1454
+ type: Classification
1455
+ - dataset:
1456
+ config: en
1457
+ name: MTEB MTOPIntentClassification (en)
1458
+ revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba
1459
+ split: test
1460
+ type: mteb/mtop_intent
1461
+ metrics:
1462
+ - type: accuracy
1463
+ value: 79.49384404924761
1464
+ - type: f1
1465
+ value: 59.7580539534629
1466
+ task:
1467
+ type: Classification
1468
+ - dataset:
1469
+ config: en
1470
+ name: MTEB MassiveIntentClassification (en)
1471
+ revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
1472
+ split: test
1473
+ type: mteb/amazon_massive_intent
1474
+ metrics:
1475
+ - type: accuracy
1476
+ value: 77.56220578345663
1477
+ - type: f1
1478
+ value: 75.27228165561478
1479
+ task:
1480
+ type: Classification
1481
+ - dataset:
1482
+ config: en
1483
+ name: MTEB MassiveScenarioClassification (en)
1484
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1485
+ split: test
1486
+ type: mteb/amazon_massive_scenario
1487
+ metrics:
1488
+ - type: accuracy
1489
+ value: 80.53463349024884
1490
+ - type: f1
1491
+ value: 80.4893958236536
1492
+ task:
1493
+ type: Classification
1494
+ - dataset:
1495
+ config: default
1496
+ name: MTEB MedrxivClusteringP2P
1497
+ revision: e7a26af6f3ae46b30dde8737f02c07b1505bcc73
1498
+ split: test
1499
+ type: mteb/medrxiv-clustering-p2p
1500
+ metrics:
1501
+ - type: v_measure
1502
+ value: 32.56100273484962
1503
+ task:
1504
+ type: Clustering
1505
+ - dataset:
1506
+ config: default
1507
+ name: MTEB MedrxivClusteringS2S
1508
+ revision: 35191c8c0dca72d8ff3efcd72aa802307d469663
1509
+ split: test
1510
+ type: mteb/medrxiv-clustering-s2s
1511
+ metrics:
1512
+ - type: v_measure
1513
+ value: 31.470380028839607
1514
+ task:
1515
+ type: Clustering
1516
+ - dataset:
1517
+ config: default
1518
+ name: MTEB MindSmallReranking
1519
+ revision: 3bdac13927fdc888b903db93b2ffdbd90b295a69
1520
+ split: test
1521
+ type: mteb/mind_small
1522
+ metrics:
1523
+ - type: map
1524
+ value: 32.06102792457849
1525
+ - type: mrr
1526
+ value: 33.30709199672238
1527
+ task:
1528
+ type: Reranking
1529
+ - dataset:
1530
+ config: default
1531
+ name: MTEB NFCorpus
1532
+ revision: None
1533
+ split: test
1534
+ type: nfcorpus
1535
+ metrics:
1536
+ - type: map_at_1
1537
+ value: 6.776999999999999
1538
+ - type: map_at_10
1539
+ value: 14.924000000000001
1540
+ - type: map_at_100
1541
+ value: 18.955
1542
+ - type: map_at_1000
1543
+ value: 20.538999999999998
1544
+ - type: map_at_3
1545
+ value: 10.982
1546
+ - type: map_at_5
1547
+ value: 12.679000000000002
1548
+ - type: mrr_at_1
1549
+ value: 47.988
1550
+ - type: mrr_at_10
1551
+ value: 57.232000000000006
1552
+ - type: mrr_at_100
1553
+ value: 57.818999999999996
1554
+ - type: mrr_at_1000
1555
+ value: 57.847
1556
+ - type: mrr_at_3
1557
+ value: 54.901999999999994
1558
+ - type: mrr_at_5
1559
+ value: 56.481
1560
+ - type: ndcg_at_1
1561
+ value: 46.594
1562
+ - type: ndcg_at_10
1563
+ value: 38.129000000000005
1564
+ - type: ndcg_at_100
1565
+ value: 35.54
1566
+ - type: ndcg_at_1000
1567
+ value: 44.172
1568
+ - type: ndcg_at_3
1569
+ value: 43.025999999999996
1570
+ - type: ndcg_at_5
1571
+ value: 41.052
1572
+ - type: precision_at_1
1573
+ value: 47.988
1574
+ - type: precision_at_10
1575
+ value: 28.111000000000004
1576
+ - type: precision_at_100
1577
+ value: 8.929
1578
+ - type: precision_at_1000
1579
+ value: 2.185
1580
+ - type: precision_at_3
1581
+ value: 40.144000000000005
1582
+ - type: precision_at_5
1583
+ value: 35.232
1584
+ - type: recall_at_1
1585
+ value: 6.776999999999999
1586
+ - type: recall_at_10
1587
+ value: 19.289
1588
+ - type: recall_at_100
1589
+ value: 36.359
1590
+ - type: recall_at_1000
1591
+ value: 67.54
1592
+ - type: recall_at_3
1593
+ value: 11.869
1594
+ - type: recall_at_5
1595
+ value: 14.999
1596
+ task:
1597
+ type: Retrieval
1598
+ - dataset:
1599
+ config: default
1600
+ name: MTEB NQ
1601
+ revision: None
1602
+ split: test
1603
+ type: nq
1604
+ metrics:
1605
+ - type: map_at_1
1606
+ value: 31.108000000000004
1607
+ - type: map_at_10
1608
+ value: 47.126000000000005
1609
+ - type: map_at_100
1610
+ value: 48.171
1611
+ - type: map_at_1000
1612
+ value: 48.199
1613
+ - type: map_at_3
1614
+ value: 42.734
1615
+ - type: map_at_5
1616
+ value: 45.362
1617
+ - type: mrr_at_1
1618
+ value: 34.936
1619
+ - type: mrr_at_10
1620
+ value: 49.571
1621
+ - type: mrr_at_100
1622
+ value: 50.345
1623
+ - type: mrr_at_1000
1624
+ value: 50.363
1625
+ - type: mrr_at_3
1626
+ value: 45.959
1627
+ - type: mrr_at_5
1628
+ value: 48.165
1629
+ - type: ndcg_at_1
1630
+ value: 34.936
1631
+ - type: ndcg_at_10
1632
+ value: 55.028999999999996
1633
+ - type: ndcg_at_100
1634
+ value: 59.244
1635
+ - type: ndcg_at_1000
1636
+ value: 59.861
1637
+ - type: ndcg_at_3
1638
+ value: 46.872
1639
+ - type: ndcg_at_5
1640
+ value: 51.217999999999996
1641
+ - type: precision_at_1
1642
+ value: 34.936
1643
+ - type: precision_at_10
1644
+ value: 9.099
1645
+ - type: precision_at_100
1646
+ value: 1.145
1647
+ - type: precision_at_1000
1648
+ value: 0.12
1649
+ - type: precision_at_3
1650
+ value: 21.456
1651
+ - type: precision_at_5
1652
+ value: 15.411
1653
+ - type: recall_at_1
1654
+ value: 31.108000000000004
1655
+ - type: recall_at_10
1656
+ value: 76.53999999999999
1657
+ - type: recall_at_100
1658
+ value: 94.39
1659
+ - type: recall_at_1000
1660
+ value: 98.947
1661
+ - type: recall_at_3
1662
+ value: 55.572
1663
+ - type: recall_at_5
1664
+ value: 65.525
1665
+ task:
1666
+ type: Retrieval
1667
+ - dataset:
1668
+ config: default
1669
+ name: MTEB QuoraRetrieval
1670
+ revision: None
1671
+ split: test
1672
+ type: quora
1673
+ metrics:
1674
+ - type: map_at_1
1675
+ value: 71.56400000000001
1676
+ - type: map_at_10
1677
+ value: 85.482
1678
+ - type: map_at_100
1679
+ value: 86.114
1680
+ - type: map_at_1000
1681
+ value: 86.13
1682
+ - type: map_at_3
1683
+ value: 82.607
1684
+ - type: map_at_5
1685
+ value: 84.405
1686
+ - type: mrr_at_1
1687
+ value: 82.42
1688
+ - type: mrr_at_10
1689
+ value: 88.304
1690
+ - type: mrr_at_100
1691
+ value: 88.399
1692
+ - type: mrr_at_1000
1693
+ value: 88.399
1694
+ - type: mrr_at_3
1695
+ value: 87.37
1696
+ - type: mrr_at_5
1697
+ value: 88.024
1698
+ - type: ndcg_at_1
1699
+ value: 82.45
1700
+ - type: ndcg_at_10
1701
+ value: 89.06500000000001
1702
+ - type: ndcg_at_100
1703
+ value: 90.232
1704
+ - type: ndcg_at_1000
1705
+ value: 90.305
1706
+ - type: ndcg_at_3
1707
+ value: 86.375
1708
+ - type: ndcg_at_5
1709
+ value: 87.85300000000001
1710
+ - type: precision_at_1
1711
+ value: 82.45
1712
+ - type: precision_at_10
1713
+ value: 13.486999999999998
1714
+ - type: precision_at_100
1715
+ value: 1.534
1716
+ - type: precision_at_1000
1717
+ value: 0.157
1718
+ - type: precision_at_3
1719
+ value: 37.813
1720
+ - type: precision_at_5
1721
+ value: 24.773999999999997
1722
+ - type: recall_at_1
1723
+ value: 71.56400000000001
1724
+ - type: recall_at_10
1725
+ value: 95.812
1726
+ - type: recall_at_100
1727
+ value: 99.7
1728
+ - type: recall_at_1000
1729
+ value: 99.979
1730
+ - type: recall_at_3
1731
+ value: 87.966
1732
+ - type: recall_at_5
1733
+ value: 92.268
1734
+ task:
1735
+ type: Retrieval
1736
+ - dataset:
1737
+ config: default
1738
+ name: MTEB RedditClustering
1739
+ revision: 24640382cdbf8abc73003fb0fa6d111a705499eb
1740
+ split: test
1741
+ type: mteb/reddit-clustering
1742
+ metrics:
1743
+ - type: v_measure
1744
+ value: 57.241876648614145
1745
+ task:
1746
+ type: Clustering
1747
+ - dataset:
1748
+ config: default
1749
+ name: MTEB RedditClusteringP2P
1750
+ revision: 282350215ef01743dc01b456c7f5241fa8937f16
1751
+ split: test
1752
+ type: mteb/reddit-clustering-p2p
1753
+ metrics:
1754
+ - type: v_measure
1755
+ value: 64.66212576446223
1756
+ task:
1757
+ type: Clustering
1758
+ - dataset:
1759
+ config: default
1760
+ name: MTEB SCIDOCS
1761
+ revision: None
1762
+ split: test
1763
+ type: scidocs
1764
+ metrics:
1765
+ - type: map_at_1
1766
+ value: 5.308
1767
+ - type: map_at_10
1768
+ value: 13.803
1769
+ - type: map_at_100
1770
+ value: 16.176
1771
+ - type: map_at_1000
1772
+ value: 16.561
1773
+ - type: map_at_3
1774
+ value: 9.761000000000001
1775
+ - type: map_at_5
1776
+ value: 11.802
1777
+ - type: mrr_at_1
1778
+ value: 26.200000000000003
1779
+ - type: mrr_at_10
1780
+ value: 37.621
1781
+ - type: mrr_at_100
1782
+ value: 38.767
1783
+ - type: mrr_at_1000
1784
+ value: 38.815
1785
+ - type: mrr_at_3
1786
+ value: 34.117
1787
+ - type: mrr_at_5
1788
+ value: 36.107
1789
+ - type: ndcg_at_1
1790
+ value: 26.200000000000003
1791
+ - type: ndcg_at_10
1792
+ value: 22.64
1793
+ - type: ndcg_at_100
1794
+ value: 31.567
1795
+ - type: ndcg_at_1000
1796
+ value: 37.623
1797
+ - type: ndcg_at_3
1798
+ value: 21.435000000000002
1799
+ - type: ndcg_at_5
1800
+ value: 18.87
1801
+ - type: precision_at_1
1802
+ value: 26.200000000000003
1803
+ - type: precision_at_10
1804
+ value: 11.74
1805
+ - type: precision_at_100
1806
+ value: 2.465
1807
+ - type: precision_at_1000
1808
+ value: 0.391
1809
+ - type: precision_at_3
1810
+ value: 20.033
1811
+ - type: precision_at_5
1812
+ value: 16.64
1813
+ - type: recall_at_1
1814
+ value: 5.308
1815
+ - type: recall_at_10
1816
+ value: 23.794999999999998
1817
+ - type: recall_at_100
1818
+ value: 50.015
1819
+ - type: recall_at_1000
1820
+ value: 79.283
1821
+ - type: recall_at_3
1822
+ value: 12.178
1823
+ - type: recall_at_5
1824
+ value: 16.882
1825
+ task:
1826
+ type: Retrieval
1827
+ - dataset:
1828
+ config: default
1829
+ name: MTEB SICK-R
1830
+ revision: a6ea5a8cab320b040a23452cc28066d9beae2cee
1831
+ split: test
1832
+ type: mteb/sickr-sts
1833
+ metrics:
1834
+ - type: cos_sim_pearson
1835
+ value: 84.93231134675553
1836
+ - type: cos_sim_spearman
1837
+ value: 81.68319292603205
1838
+ - type: euclidean_pearson
1839
+ value: 81.8396814380367
1840
+ - type: euclidean_spearman
1841
+ value: 81.24641903349945
1842
+ - type: manhattan_pearson
1843
+ value: 81.84698799204274
1844
+ - type: manhattan_spearman
1845
+ value: 81.24269997904105
1846
+ task:
1847
+ type: STS
1848
+ - dataset:
1849
+ config: default
1850
+ name: MTEB STS12
1851
+ revision: a0d554a64d88156834ff5ae9920b964011b16384
1852
+ split: test
1853
+ type: mteb/sts12-sts
1854
+ metrics:
1855
+ - type: cos_sim_pearson
1856
+ value: 86.73241671587446
1857
+ - type: cos_sim_spearman
1858
+ value: 79.05091082971826
1859
+ - type: euclidean_pearson
1860
+ value: 83.91146869578044
1861
+ - type: euclidean_spearman
1862
+ value: 79.87978465370936
1863
+ - type: manhattan_pearson
1864
+ value: 83.90888338917678
1865
+ - type: manhattan_spearman
1866
+ value: 79.87482848584241
1867
+ task:
1868
+ type: STS
1869
+ - dataset:
1870
+ config: default
1871
+ name: MTEB STS13
1872
+ revision: 7e90230a92c190f1bf69ae9002b8cea547a64cca
1873
+ split: test
1874
+ type: mteb/sts13-sts
1875
+ metrics:
1876
+ - type: cos_sim_pearson
1877
+ value: 85.14970731146177
1878
+ - type: cos_sim_spearman
1879
+ value: 86.37363490084627
1880
+ - type: euclidean_pearson
1881
+ value: 83.02154218530433
1882
+ - type: euclidean_spearman
1883
+ value: 83.80258761957367
1884
+ - type: manhattan_pearson
1885
+ value: 83.01664495119347
1886
+ - type: manhattan_spearman
1887
+ value: 83.77567458007952
1888
+ task:
1889
+ type: STS
1890
+ - dataset:
1891
+ config: default
1892
+ name: MTEB STS14
1893
+ revision: 6031580fec1f6af667f0bd2da0a551cf4f0b2375
1894
+ split: test
1895
+ type: mteb/sts14-sts
1896
+ metrics:
1897
+ - type: cos_sim_pearson
1898
+ value: 83.40474139886784
1899
+ - type: cos_sim_spearman
1900
+ value: 82.77768789165984
1901
+ - type: euclidean_pearson
1902
+ value: 80.7065877443695
1903
+ - type: euclidean_spearman
1904
+ value: 81.375940662505
1905
+ - type: manhattan_pearson
1906
+ value: 80.6507552270278
1907
+ - type: manhattan_spearman
1908
+ value: 81.32782179098741
1909
+ task:
1910
+ type: STS
1911
+ - dataset:
1912
+ config: default
1913
+ name: MTEB STS15
1914
+ revision: ae752c7c21bf194d8b67fd573edf7ae58183cbe3
1915
+ split: test
1916
+ type: mteb/sts15-sts
1917
+ metrics:
1918
+ - type: cos_sim_pearson
1919
+ value: 87.08585968722274
1920
+ - type: cos_sim_spearman
1921
+ value: 88.03110031451399
1922
+ - type: euclidean_pearson
1923
+ value: 85.74012019602384
1924
+ - type: euclidean_spearman
1925
+ value: 86.13592849438209
1926
+ - type: manhattan_pearson
1927
+ value: 85.74404842369206
1928
+ - type: manhattan_spearman
1929
+ value: 86.14492318960154
1930
+ task:
1931
+ type: STS
1932
+ - dataset:
1933
+ config: default
1934
+ name: MTEB STS16
1935
+ revision: 4d8694f8f0e0100860b497b999b3dbed754a0513
1936
+ split: test
1937
+ type: mteb/sts16-sts
1938
+ metrics:
1939
+ - type: cos_sim_pearson
1940
+ value: 84.95069052788875
1941
+ - type: cos_sim_spearman
1942
+ value: 86.4867991595147
1943
+ - type: euclidean_pearson
1944
+ value: 84.31013325754635
1945
+ - type: euclidean_spearman
1946
+ value: 85.01529258006482
1947
+ - type: manhattan_pearson
1948
+ value: 84.26995570085374
1949
+ - type: manhattan_spearman
1950
+ value: 84.96982104986162
1951
+ task:
1952
+ type: STS
1953
+ - dataset:
1954
+ config: en-en
1955
+ name: MTEB STS17 (en-en)
1956
+ revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d
1957
+ split: test
1958
+ type: mteb/sts17-crosslingual-sts
1959
+ metrics:
1960
+ - type: cos_sim_pearson
1961
+ value: 87.54617647971897
1962
+ - type: cos_sim_spearman
1963
+ value: 87.49834181751034
1964
+ - type: euclidean_pearson
1965
+ value: 86.01015322577122
1966
+ - type: euclidean_spearman
1967
+ value: 84.63362652063199
1968
+ - type: manhattan_pearson
1969
+ value: 86.13807574475706
1970
+ - type: manhattan_spearman
1971
+ value: 84.7772370721132
1972
+ task:
1973
+ type: STS
1974
+ - dataset:
1975
+ config: en
1976
+ name: MTEB STS22 (en)
1977
+ revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80
1978
+ split: test
1979
+ type: mteb/sts22-crosslingual-sts
1980
+ metrics:
1981
+ - type: cos_sim_pearson
1982
+ value: 67.20047755786615
1983
+ - type: cos_sim_spearman
1984
+ value: 67.05324077987636
1985
+ - type: euclidean_pearson
1986
+ value: 66.91930642976601
1987
+ - type: euclidean_spearman
1988
+ value: 65.21491856099105
1989
+ - type: manhattan_pearson
1990
+ value: 66.78756851976624
1991
+ - type: manhattan_spearman
1992
+ value: 65.12356257740728
1993
+ task:
1994
+ type: STS
1995
+ - dataset:
1996
+ config: default
1997
+ name: MTEB STSBenchmark
1998
+ revision: b0fddb56ed78048fa8b90373c8a3cfc37b684831
1999
+ split: test
2000
+ type: mteb/stsbenchmark-sts
2001
+ metrics:
2002
+ - type: cos_sim_pearson
2003
+ value: 86.19852871539686
2004
+ - type: cos_sim_spearman
2005
+ value: 87.5161895296395
2006
+ - type: euclidean_pearson
2007
+ value: 84.59848645207485
2008
+ - type: euclidean_spearman
2009
+ value: 85.26427328757919
2010
+ - type: manhattan_pearson
2011
+ value: 84.59747366996524
2012
+ - type: manhattan_spearman
2013
+ value: 85.24045855146915
2014
+ task:
2015
+ type: STS
2016
+ - dataset:
2017
+ config: default
2018
+ name: MTEB SciDocsRR
2019
+ revision: d3c5e1fc0b855ab6097bf1cda04dd73947d7caab
2020
+ split: test
2021
+ type: mteb/scidocs-reranking
2022
+ metrics:
2023
+ - type: map
2024
+ value: 87.63320317811032
2025
+ - type: mrr
2026
+ value: 96.26242947321379
2027
+ task:
2028
+ type: Reranking
2029
+ - dataset:
2030
+ config: default
2031
+ name: MTEB SciFact
2032
+ revision: None
2033
+ split: test
2034
+ type: scifact
2035
+ metrics:
2036
+ - type: map_at_1
2037
+ value: 60.928000000000004
2038
+ - type: map_at_10
2039
+ value: 70.112
2040
+ - type: map_at_100
2041
+ value: 70.59299999999999
2042
+ - type: map_at_1000
2043
+ value: 70.623
2044
+ - type: map_at_3
2045
+ value: 66.846
2046
+ - type: map_at_5
2047
+ value: 68.447
2048
+ - type: mrr_at_1
2049
+ value: 64.0
2050
+ - type: mrr_at_10
2051
+ value: 71.212
2052
+ - type: mrr_at_100
2053
+ value: 71.616
2054
+ - type: mrr_at_1000
2055
+ value: 71.64500000000001
2056
+ - type: mrr_at_3
2057
+ value: 68.77799999999999
2058
+ - type: mrr_at_5
2059
+ value: 70.094
2060
+ - type: ndcg_at_1
2061
+ value: 64.0
2062
+ - type: ndcg_at_10
2063
+ value: 74.607
2064
+ - type: ndcg_at_100
2065
+ value: 76.416
2066
+ - type: ndcg_at_1000
2067
+ value: 77.102
2068
+ - type: ndcg_at_3
2069
+ value: 69.126
2070
+ - type: ndcg_at_5
2071
+ value: 71.41300000000001
2072
+ - type: precision_at_1
2073
+ value: 64.0
2074
+ - type: precision_at_10
2075
+ value: 9.933
2076
+ - type: precision_at_100
2077
+ value: 1.077
2078
+ - type: precision_at_1000
2079
+ value: 0.11299999999999999
2080
+ - type: precision_at_3
2081
+ value: 26.556
2082
+ - type: precision_at_5
2083
+ value: 17.467
2084
+ - type: recall_at_1
2085
+ value: 60.928000000000004
2086
+ - type: recall_at_10
2087
+ value: 87.322
2088
+ - type: recall_at_100
2089
+ value: 94.833
2090
+ - type: recall_at_1000
2091
+ value: 100.0
2092
+ - type: recall_at_3
2093
+ value: 72.628
2094
+ - type: recall_at_5
2095
+ value: 78.428
2096
+ task:
2097
+ type: Retrieval
2098
+ - dataset:
2099
+ config: default
2100
+ name: MTEB SprintDuplicateQuestions
2101
+ revision: d66bd1f72af766a5cc4b0ca5e00c162f89e8cc46
2102
+ split: test
2103
+ type: mteb/sprintduplicatequestions-pairclassification
2104
+ metrics:
2105
+ - type: cos_sim_accuracy
2106
+ value: 99.86237623762376
2107
+ - type: cos_sim_ap
2108
+ value: 96.72586477206649
2109
+ - type: cos_sim_f1
2110
+ value: 93.01858362631845
2111
+ - type: cos_sim_precision
2112
+ value: 93.4409687184662
2113
+ - type: cos_sim_recall
2114
+ value: 92.60000000000001
2115
+ - type: dot_accuracy
2116
+ value: 99.78019801980199
2117
+ - type: dot_ap
2118
+ value: 93.72748205246228
2119
+ - type: dot_f1
2120
+ value: 89.04109589041096
2121
+ - type: dot_precision
2122
+ value: 87.16475095785441
2123
+ - type: dot_recall
2124
+ value: 91.0
2125
+ - type: euclidean_accuracy
2126
+ value: 99.85445544554456
2127
+ - type: euclidean_ap
2128
+ value: 96.6661459876145
2129
+ - type: euclidean_f1
2130
+ value: 92.58337481333997
2131
+ - type: euclidean_precision
2132
+ value: 92.17046580773042
2133
+ - type: euclidean_recall
2134
+ value: 93.0
2135
+ - type: manhattan_accuracy
2136
+ value: 99.85445544554456
2137
+ - type: manhattan_ap
2138
+ value: 96.6883549244056
2139
+ - type: manhattan_f1
2140
+ value: 92.57598405580468
2141
+ - type: manhattan_precision
2142
+ value: 92.25422045680239
2143
+ - type: manhattan_recall
2144
+ value: 92.9
2145
+ - type: max_accuracy
2146
+ value: 99.86237623762376
2147
+ - type: max_ap
2148
+ value: 96.72586477206649
2149
+ - type: max_f1
2150
+ value: 93.01858362631845
2151
+ task:
2152
+ type: PairClassification
2153
+ - dataset:
2154
+ config: default
2155
+ name: MTEB StackExchangeClustering
2156
+ revision: 6cbc1f7b2bc0622f2e39d2c77fa502909748c259
2157
+ split: test
2158
+ type: mteb/stackexchange-clustering
2159
+ metrics:
2160
+ - type: v_measure
2161
+ value: 66.39930057069995
2162
+ task:
2163
+ type: Clustering
2164
+ - dataset:
2165
+ config: default
2166
+ name: MTEB StackExchangeClusteringP2P
2167
+ revision: 815ca46b2622cec33ccafc3735d572c266efdb44
2168
+ split: test
2169
+ type: mteb/stackexchange-clustering-p2p
2170
+ metrics:
2171
+ - type: v_measure
2172
+ value: 34.96398659903402
2173
+ task:
2174
+ type: Clustering
2175
+ - dataset:
2176
+ config: default
2177
+ name: MTEB StackOverflowDupQuestions
2178
+ revision: e185fbe320c72810689fc5848eb6114e1ef5ec69
2179
+ split: test
2180
+ type: mteb/stackoverflowdupquestions-reranking
2181
+ metrics:
2182
+ - type: map
2183
+ value: 55.946944700355395
2184
+ - type: mrr
2185
+ value: 56.97151398438164
2186
+ task:
2187
+ type: Reranking
2188
+ - dataset:
2189
+ config: default
2190
+ name: MTEB SummEval
2191
+ revision: cda12ad7615edc362dbf25a00fdd61d3b1eaf93c
2192
+ split: test
2193
+ type: mteb/summeval
2194
+ metrics:
2195
+ - type: cos_sim_pearson
2196
+ value: 31.541657650692905
2197
+ - type: cos_sim_spearman
2198
+ value: 31.605804192286303
2199
+ - type: dot_pearson
2200
+ value: 28.26905996736398
2201
+ - type: dot_spearman
2202
+ value: 27.864801765851187
2203
+ task:
2204
+ type: Summarization
2205
+ - dataset:
2206
+ config: default
2207
+ name: MTEB TRECCOVID
2208
+ revision: None
2209
+ split: test
2210
+ type: trec-covid
2211
+ metrics:
2212
+ - type: map_at_1
2213
+ value: 0.22599999999999998
2214
+ - type: map_at_10
2215
+ value: 1.8870000000000002
2216
+ - type: map_at_100
2217
+ value: 9.78
2218
+ - type: map_at_1000
2219
+ value: 22.514
2220
+ - type: map_at_3
2221
+ value: 0.6669999999999999
2222
+ - type: map_at_5
2223
+ value: 1.077
2224
+ - type: mrr_at_1
2225
+ value: 82.0
2226
+ - type: mrr_at_10
2227
+ value: 89.86699999999999
2228
+ - type: mrr_at_100
2229
+ value: 89.86699999999999
2230
+ - type: mrr_at_1000
2231
+ value: 89.86699999999999
2232
+ - type: mrr_at_3
2233
+ value: 89.667
2234
+ - type: mrr_at_5
2235
+ value: 89.667
2236
+ - type: ndcg_at_1
2237
+ value: 79.0
2238
+ - type: ndcg_at_10
2239
+ value: 74.818
2240
+ - type: ndcg_at_100
2241
+ value: 53.715999999999994
2242
+ - type: ndcg_at_1000
2243
+ value: 47.082
2244
+ - type: ndcg_at_3
2245
+ value: 82.134
2246
+ - type: ndcg_at_5
2247
+ value: 79.81899999999999
2248
+ - type: precision_at_1
2249
+ value: 82.0
2250
+ - type: precision_at_10
2251
+ value: 78.0
2252
+ - type: precision_at_100
2253
+ value: 54.48
2254
+ - type: precision_at_1000
2255
+ value: 20.518
2256
+ - type: precision_at_3
2257
+ value: 87.333
2258
+ - type: precision_at_5
2259
+ value: 85.2
2260
+ - type: recall_at_1
2261
+ value: 0.22599999999999998
2262
+ - type: recall_at_10
2263
+ value: 2.072
2264
+ - type: recall_at_100
2265
+ value: 13.013
2266
+ - type: recall_at_1000
2267
+ value: 43.462
2268
+ - type: recall_at_3
2269
+ value: 0.695
2270
+ - type: recall_at_5
2271
+ value: 1.139
2272
+ task:
2273
+ type: Retrieval
2274
+ - dataset:
2275
+ config: default
2276
+ name: MTEB Touche2020
2277
+ revision: None
2278
+ split: test
2279
+ type: webis-touche2020
2280
+ metrics:
2281
+ - type: map_at_1
2282
+ value: 2.328
2283
+ - type: map_at_10
2284
+ value: 9.795
2285
+ - type: map_at_100
2286
+ value: 15.801000000000002
2287
+ - type: map_at_1000
2288
+ value: 17.23
2289
+ - type: map_at_3
2290
+ value: 4.734
2291
+ - type: map_at_5
2292
+ value: 6.644
2293
+ - type: mrr_at_1
2294
+ value: 30.612000000000002
2295
+ - type: mrr_at_10
2296
+ value: 46.902
2297
+ - type: mrr_at_100
2298
+ value: 47.495
2299
+ - type: mrr_at_1000
2300
+ value: 47.495
2301
+ - type: mrr_at_3
2302
+ value: 41.156
2303
+ - type: mrr_at_5
2304
+ value: 44.218
2305
+ - type: ndcg_at_1
2306
+ value: 28.571
2307
+ - type: ndcg_at_10
2308
+ value: 24.806
2309
+ - type: ndcg_at_100
2310
+ value: 36.419000000000004
2311
+ - type: ndcg_at_1000
2312
+ value: 47.272999999999996
2313
+ - type: ndcg_at_3
2314
+ value: 25.666
2315
+ - type: ndcg_at_5
2316
+ value: 25.448999999999998
2317
+ - type: precision_at_1
2318
+ value: 30.612000000000002
2319
+ - type: precision_at_10
2320
+ value: 23.061
2321
+ - type: precision_at_100
2322
+ value: 7.714
2323
+ - type: precision_at_1000
2324
+ value: 1.484
2325
+ - type: precision_at_3
2326
+ value: 26.531
2327
+ - type: precision_at_5
2328
+ value: 26.122
2329
+ - type: recall_at_1
2330
+ value: 2.328
2331
+ - type: recall_at_10
2332
+ value: 16.524
2333
+ - type: recall_at_100
2334
+ value: 47.179
2335
+ - type: recall_at_1000
2336
+ value: 81.22200000000001
2337
+ - type: recall_at_3
2338
+ value: 5.745
2339
+ - type: recall_at_5
2340
+ value: 9.339
2341
+ task:
2342
+ type: Retrieval
2343
+ - dataset:
2344
+ config: default
2345
+ name: MTEB ToxicConversationsClassification
2346
+ revision: d7c0de2777da35d6aae2200a62c6e0e5af397c4c
2347
+ split: test
2348
+ type: mteb/toxic_conversations_50k
2349
+ metrics:
2350
+ - type: accuracy
2351
+ value: 70.9142
2352
+ - type: ap
2353
+ value: 14.335574772555415
2354
+ - type: f1
2355
+ value: 54.62839595194111
2356
+ task:
2357
+ type: Classification
2358
+ - dataset:
2359
+ config: default
2360
+ name: MTEB TweetSentimentExtractionClassification
2361
+ revision: d604517c81ca91fe16a244d1248fc021f9ecee7a
2362
+ split: test
2363
+ type: mteb/tweet_sentiment_extraction
2364
+ metrics:
2365
+ - type: accuracy
2366
+ value: 59.94340690435768
2367
+ - type: f1
2368
+ value: 60.286487936731916
2369
+ task:
2370
+ type: Classification
2371
+ - dataset:
2372
+ config: default
2373
+ name: MTEB TwentyNewsgroupsClustering
2374
+ revision: 6125ec4e24fa026cec8a478383ee943acfbd5449
2375
+ split: test
2376
+ type: mteb/twentynewsgroups-clustering
2377
+ metrics:
2378
+ - type: v_measure
2379
+ value: 51.26597708987974
2380
+ task:
2381
+ type: Clustering
2382
+ - dataset:
2383
+ config: default
2384
+ name: MTEB TwitterSemEval2015
2385
+ revision: 70970daeab8776df92f5ea462b6173c0b46fd2d1
2386
+ split: test
2387
+ type: mteb/twittersemeval2015-pairclassification
2388
+ metrics:
2389
+ - type: cos_sim_accuracy
2390
+ value: 87.48882398521786
2391
+ - type: cos_sim_ap
2392
+ value: 79.04326607602204
2393
+ - type: cos_sim_f1
2394
+ value: 71.64566826860633
2395
+ - type: cos_sim_precision
2396
+ value: 70.55512918905092
2397
+ - type: cos_sim_recall
2398
+ value: 72.77044854881267
2399
+ - type: dot_accuracy
2400
+ value: 84.19264469213805
2401
+ - type: dot_ap
2402
+ value: 67.96360043562528
2403
+ - type: dot_f1
2404
+ value: 64.06418393006827
2405
+ - type: dot_precision
2406
+ value: 58.64941898706424
2407
+ - type: dot_recall
2408
+ value: 70.58047493403694
2409
+ - type: euclidean_accuracy
2410
+ value: 87.45902127913214
2411
+ - type: euclidean_ap
2412
+ value: 78.9742237648272
2413
+ - type: euclidean_f1
2414
+ value: 71.5553235908142
2415
+ - type: euclidean_precision
2416
+ value: 70.77955601445535
2417
+ - type: euclidean_recall
2418
+ value: 72.34828496042216
2419
+ - type: manhattan_accuracy
2420
+ value: 87.41729749061214
2421
+ - type: manhattan_ap
2422
+ value: 78.90073137580596
2423
+ - type: manhattan_f1
2424
+ value: 71.3942611553533
2425
+ - type: manhattan_precision
2426
+ value: 68.52705653967483
2427
+ - type: manhattan_recall
2428
+ value: 74.51187335092348
2429
+ - type: max_accuracy
2430
+ value: 87.48882398521786
2431
+ - type: max_ap
2432
+ value: 79.04326607602204
2433
+ - type: max_f1
2434
+ value: 71.64566826860633
2435
+ task:
2436
+ type: PairClassification
2437
+ - dataset:
2438
+ config: default
2439
+ name: MTEB TwitterURLCorpus
2440
+ revision: 8b6510b0b1fa4e4c4f879467980e9be563ec1cdf
2441
+ split: test
2442
+ type: mteb/twitterurlcorpus-pairclassification
2443
+ metrics:
2444
+ - type: cos_sim_accuracy
2445
+ value: 88.68125897465751
2446
+ - type: cos_sim_ap
2447
+ value: 85.6003454431979
2448
+ - type: cos_sim_f1
2449
+ value: 77.6957163958641
2450
+ - type: cos_sim_precision
2451
+ value: 73.0110366307807
2452
+ - type: cos_sim_recall
2453
+ value: 83.02279026793964
2454
+ - type: dot_accuracy
2455
+ value: 87.7672992587418
2456
+ - type: dot_ap
2457
+ value: 82.4971301112899
2458
+ - type: dot_f1
2459
+ value: 75.90528233151184
2460
+ - type: dot_precision
2461
+ value: 72.0370626469368
2462
+ - type: dot_recall
2463
+ value: 80.21250384970742
2464
+ - type: euclidean_accuracy
2465
+ value: 88.4503434625684
2466
+ - type: euclidean_ap
2467
+ value: 84.91949884748384
2468
+ - type: euclidean_f1
2469
+ value: 76.92365018444684
2470
+ - type: euclidean_precision
2471
+ value: 74.53245721712759
2472
+ - type: euclidean_recall
2473
+ value: 79.47336002463813
2474
+ - type: manhattan_accuracy
2475
+ value: 88.47556952691427
2476
+ - type: manhattan_ap
2477
+ value: 84.8963689101517
2478
+ - type: manhattan_f1
2479
+ value: 76.85901249256395
2480
+ - type: manhattan_precision
2481
+ value: 74.31693989071039
2482
+ - type: manhattan_recall
2483
+ value: 79.58115183246073
2484
+ - type: max_accuracy
2485
+ value: 88.68125897465751
2486
+ - type: max_ap
2487
+ value: 85.6003454431979
2488
+ - type: max_f1
2489
+ value: 77.6957163958641
2490
+ task:
2491
+ type: PairClassification
2492
+ tags:
2493
+ - sentence-transformers
2494
+ - feature-extraction
2495
+ - sentence-similarity
2496
+ - transformers
2497
+ - mteb
2498
+ - onnx
2499
+ - teradata
2500
+
2501
+ ---
2502
+ # A Teradata Vantage compatible Embeddings Model
2503
+
2504
+ # BAAI/bge-large-en-v1.5
2505
+
2506
+ ## Overview of this Model
2507
+
2508
+ An embedding model that maps text (sentences/paragraphs) into a vector. The [BAAI/bge-large-en-v1.5](https://huggingface.co/BAAI/bge-large-en-v1.5) model is well known for its effectiveness in capturing semantic meaning in text data. It is a state-of-the-art model trained on a large corpus, capable of generating high-quality text embeddings.
2509
+
2510
+ - 335.14M params (Sizes in ONNX format - "fp32": 1275.11MB, "int8": 320.63MB, "uint8": 320.63MB)
2511
+ - 512 maximum input tokens
2512
+ - 1024 dimensions of output vector
2513
+ - License: MIT. The released model can be used for commercial purposes free of charge.
2514
+ - Reference to Original Model: https://huggingface.co/BAAI/bge-large-en-v1.5
2515
+
2516
+
2517
+ ## Quickstart: Deploying this Model in Teradata Vantage
2518
+
2519
+ We have pre-converted the model into the ONNX format compatible with BYOM 6.0, eliminating the need for manual conversion.
2520
+
2521
+ **Note:** Ensure you have access to a Teradata Database with BYOM 6.0 installed.
2522
+
2523
+ To get started, download the pre-converted model files directly from the Teradata Hugging Face repository.
2524
+
2525
+
2526
+ ```python
2527
+
2528
+ import teradataml as tdml
2529
+ import getpass
2530
+ from huggingface_hub import hf_hub_download
2531
+
2532
+ model_name = "bge-large-en-v1.5"
2533
+ number_dimensions_output = 1024
2534
+ model_file_name = "model.onnx"
2535
+
2536
+ # Step 1: Download Model from Teradata HuggingFace Page
2537
+
2538
+ hf_hub_download(repo_id=f"Teradata/{model_name}", filename=f"onnx/{model_file_name}", local_dir="./")
2539
+ hf_hub_download(repo_id=f"Teradata/{model_name}", filename=f"tokenizer.json", local_dir="./")
2540
+
2541
+ # Step 2: Create Connection to Vantage
2542
+
2543
+ tdml.create_context(host = input('enter your hostname'),
2544
+ username=input('enter your username'),
2545
+ password = getpass.getpass("enter your password"))
2546
+
2547
+ # Step 3: Load Models into Vantage
2548
+ # a) Embedding model
2549
+ tdml.save_byom(model_id = model_name, # must be unique in the models table
2550
+ model_file = f"onnx/{model_file_name}", # hf_hub_download stored the file under ./onnx/
2551
+ table_name = 'embeddings_models' )
2552
+ # b) Tokenizer
2553
+ tdml.save_byom(model_id = model_name, # must be unique in the models table
2554
+ model_file = 'tokenizer.json',
2555
+ table_name = 'embeddings_tokenizers')
2556
+
2557
+ # Step 4: Test ONNXEmbeddings Function
2558
+ # Note that ONNXEmbeddings expects the text payload column to be named 'txt'.
2559
+ # If it has a different name, just rename it in a subquery/CTE.
2560
+ input_table = "emails.emails"
2561
+ embeddings_query = f"""
2562
+ SELECT
2563
+ *
2564
+ from mldb.ONNXEmbeddings(
2565
+ on {input_table} as InputTable
2566
+ on (select * from embeddings_models where model_id = '{model_name}') as ModelTable DIMENSION
2567
+ on (select model as tokenizer from embeddings_tokenizers where model_id = '{model_name}') as TokenizerTable DIMENSION
2568
+ using
2569
+ Accumulate('id', 'txt')
2570
+ ModelOutputTensor('sentence_embedding')
2571
+ EnableMemoryCheck('false')
2572
+ OutputFormat('FLOAT32({number_dimensions_output})')
2573
+ OverwriteCachedModel('true')
2574
+ ) a
2575
+ """
2576
+ DF_embeddings = tdml.DataFrame.from_query(embeddings_query)
2577
+ DF_embeddings
2578
+ ```
2579
+
2580
+
2581
+
2582
+ ## What Can I Do with the Embeddings?
2583
+
2584
+ Teradata Vantage includes pre-built in-database functions to process embeddings further. Explore the following examples:
2585
+
2586
+ - **Semantic Clustering with TD_KMeans:** [Semantic Clustering Python Notebook](https://github.com/Teradata/jupyter-demos/blob/main/UseCases/Language_Models_InVantage/Semantic_Clustering_Python.ipynb)
2587
+ - **Semantic Distance with TD_VectorDistance:** [Semantic Similarity Python Notebook](https://github.com/Teradata/jupyter-demos/blob/main/UseCases/Language_Models_InVantage/Semantic_Similarity_Python.ipynb) (a minimal query sketch follows this list)
2588
+ - **RAG-Based Application with TD_VectorDistance:** [RAG and Bedrock Query PDF Notebook](https://github.com/Teradata/jupyter-demos/blob/main/UseCases/Language_Models_InVantage/RAG_and_Bedrock_QueryPDF.ipynb)
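+
+ As a minimal illustration of the TD_VectorDistance pattern, the sketch below performs a cosine-similarity search over stored embeddings. The table name `emails_embeddings`, the target `id = 3`, and the column range `emb_0:emb_1023` are illustrative assumptions, not part of this model card; the file [test_teradata.py](./test_teradata.py) further down contains a full, working flow.
+
+ ```python
+ import teradataml as tdml
+
+ # Hypothetical table produced by ONNXEmbeddings: an 'id' column plus the
+ # embedding columns emb_0 ... emb_1023 (matching the 1024 output dimensions).
+ cos_sim_df = tdml.DataFrame.from_query("""
+ SELECT dt.target_id, dt.reference_id, (1.0 - dt.distance) AS similarity
+ FROM TD_VECTORDISTANCE (
+     ON (SELECT * FROM emails_embeddings WHERE id = 3) AS TargetTable
+     ON (SELECT * FROM emails_embeddings WHERE id <> 3) AS ReferenceTable DIMENSION
+     USING
+     TargetIDColumn('id')
+     TargetFeatureColumns('[emb_0:emb_1023]')
+     RefIDColumn('id')
+     RefFeatureColumns('[emb_0:emb_1023]')
+     DistanceMeasure('cosine')
+     TopK(3)
+ ) AS dt
+ """)
+ cos_sim_df
+ ```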
2589
+
2590
+
2591
+ ## Deep Dive into Model Conversion to ONNX
2592
+
2593
+ **The steps below outline how we converted the open-source Hugging Face model into an ONNX file compatible with the in-database ONNXEmbeddings function.**
2594
+
2595
+ You do not need to perform these steps—they are provided solely for documentation and transparency. However, they may be helpful if you wish to convert another model to the required format.
2596
+
2597
+
2598
+ ### Part 1. Importing and Converting Model using optimum
2599
+
2600
+ We start by importing the pre-trained [BAAI/bge-large-en-v1.5](https://huggingface.co/BAAI/bge-large-en-v1.5) model from Hugging Face.
2601
+
2602
+ To enhance performance and ensure compatibility with various execution environments, we'll use the [Optimum](https://github.com/huggingface/optimum) utility to convert the model into the ONNX (Open Neural Network Exchange) format.
2603
+
2604
+ After conversion to ONNX, we fix the opset version in the ONNX file to ensure compatibility with the ONNX runtime used in Teradata Vantage.
2605
+
2606
+ We generate ONNX files for multiple precisions: fp32, int8, and uint8.
2607
+
2608
+ You can find the detailed conversion steps in the file [convert.py](./convert.py).
2609
+
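+ As a minimal illustration of the opset fix mentioned above (a sketch, not a verbatim excerpt of [convert.py](./convert.py); the target opset 16 and IR version 8 are taken from [conversion_config.json](./conversion_config.json)), the exported model's versions can be pinned with the `onnx` package:
+
+ ```python
+ import onnx
+ from onnx import version_converter
+
+ # Load the exported model and pin the opset / IR version expected by the
+ # ONNX runtime used in Teradata Vantage (target values from conversion_config.json).
+ model = onnx.load("model.onnx")
+ model = version_converter.convert_version(model, 16)  # fix the opset
+ model.ir_version = 8                                  # fix the IR version
+ onnx.save(model, "onnx/model.onnx")
+ ```
+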
2610
+ ### Part 2. Running the model in Python with onnxruntime & comparing results
2611
+
2612
+ Once the fixes are applied, we test the correctness of the ONNX model by calculating the cosine similarity between two texts with both the native SentenceTransformers model and the ONNX runtime, and comparing the results.
2613
+
2614
+ Identical results confirm that the ONNX model produces the same embeddings as the native model, validating its correctness and its suitability for use in the database.
2615
+
2616
+
2617
+ ```python
2618
+ import onnxruntime as rt
2619
+
2620
+ from sentence_transformers.util import cos_sim
2621
+ from sentence_transformers import SentenceTransformer
2622
+
2623
+ import transformers
2624
+
2625
+
2626
+ sentences_1 = 'How is the weather today?'
2627
+ sentences_2 = 'What is the current weather like today?'
2628
+
2629
+ # Calculate ONNX result
2630
+ tokenizer = transformers.AutoTokenizer.from_pretrained("BAAI/bge-large-en-v1.5")
2631
+ predef_sess = rt.InferenceSession("onnx/model.onnx")
2632
+
2633
+ enc1 = tokenizer(sentences_1)
2634
+ embeddings_1_onnx = predef_sess.run(None, {"input_ids": [enc1.input_ids],
2635
+ "attention_mask": [enc1.attention_mask]})
2636
+
2637
+ enc2 = tokenizer(sentences_2)
2638
+ embeddings_2_onnx = predef_sess.run(None, {"input_ids": [enc2.input_ids],
2639
+ "attention_mask": [enc2.attention_mask]})
2640
+
2641
+
2642
+ # Calculate embeddings with SentenceTransformer
2643
+ model = SentenceTransformer("BAAI/bge-large-en-v1.5", trust_remote_code=True)
2644
+ embeddings_1_sentence_transformer = model.encode(sentences_1, normalize_embeddings=True)
2645
+ embeddings_2_sentence_transformer = model.encode(sentences_2, normalize_embeddings=True)
2646
+
2647
+ # Compare results
2648
+ print("Cosine similarity for embeddings calculated with ONNX: " + str(cos_sim(embeddings_1_onnx[1][0], embeddings_2_onnx[1][0])))
2649
+ print("Cosine similarity for embeddings calculated with SentenceTransformer: " + str(cos_sim(embeddings_1_sentence_transformer, embeddings_2_sentence_transformer)))
2650
+ ```
2651
+
2652
+ You can find the detailed ONNX vs. SentenceTransformer result comparison steps in the file [test_local.py](./test_local.py).
2653
+
config.json ADDED
@@ -0,0 +1,34 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "_attn_implementation_autoset": true,
3
+ "_name_or_path": "BAAI/bge-large-en-v1.5",
4
+ "architectures": [
5
+ "BertModel"
6
+ ],
7
+ "attention_probs_dropout_prob": 0.1,
8
+ "classifier_dropout": null,
9
+ "export_model_type": "transformer",
10
+ "gradient_checkpointing": false,
11
+ "hidden_act": "gelu",
12
+ "hidden_dropout_prob": 0.1,
13
+ "hidden_size": 1024,
14
+ "id2label": {
15
+ "0": "LABEL_0"
16
+ },
17
+ "initializer_range": 0.02,
18
+ "intermediate_size": 4096,
19
+ "label2id": {
20
+ "LABEL_0": 0
21
+ },
22
+ "layer_norm_eps": 1e-12,
23
+ "max_position_embeddings": 512,
24
+ "model_type": "bert",
25
+ "num_attention_heads": 16,
26
+ "num_hidden_layers": 24,
27
+ "pad_token_id": 0,
28
+ "position_embedding_type": "absolute",
29
+ "torch_dtype": "float32",
30
+ "transformers_version": "4.47.1",
31
+ "type_vocab_size": 2,
32
+ "use_cache": true,
33
+ "vocab_size": 30522
34
+ }
conversion_config.json ADDED
@@ -0,0 +1,11 @@
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "model_id": "BAAI/bge-large-en-v1.5",
3
+ "number_of_generated_embeddings": 1024,
4
+ "precision_to_filename_map": {
5
+ "fp32": "onnx/model.onnx",
6
+ "int8": "onnx/model_int8.onnx",
7
+ "uint8": "onnx/model_uint8.onnx"
8
+ },
9
+ "opset": 16,
10
+ "IR": 8
11
+ }
convert.py ADDED
@@ -0,0 +1,51 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import os
2
+ import json
3
+ import shutil
4
+
5
+ from optimum.exporters.onnx import main_export
6
+ import onnx
7
+ from onnxconverter_common import float16
8
+ import onnxruntime as rt
9
+ from onnxruntime.tools.onnx_model_utils import *
10
+ from onnxruntime.quantization import quantize_dynamic, QuantType
11
+
12
+ with open('conversion_config.json') as json_file:
13
+ conversion_config = json.load(json_file)
14
+
15
+
16
+ model_id = conversion_config["model_id"]
17
+ number_of_generated_embeddings = conversion_config["number_of_generated_embeddings"]
18
+ precision_to_filename_map = conversion_config["precision_to_filename_map"]
19
+ opset = conversion_config["opset"]
20
+ IR = conversion_config["IR"]
21
+
22
+
23
+ op = onnx.OperatorSetIdProto()
24
+ op.version = opset
25
+
26
+
27
+ if not os.path.exists("onnx"):
28
+ os.makedirs("onnx")
29
+
30
+ print("Exporting the main model version")
31
+
32
+ main_export(model_name_or_path=model_id, output="./", opset=opset, trust_remote_code=True, task="feature-extraction", dtype="fp32")
33
+
34
+ if "fp32" in precision_to_filename_map:
35
+ print("Exporting the fp32 onnx file...")
36
+
37
+ shutil.copyfile('model.onnx', precision_to_filename_map["fp32"])
38
+
39
+ print("Done\n\n")
40
+
41
+ if "int8" in precision_to_filename_map:
42
+ print("Quantizing fp32 model to int8...")
43
+ quantize_dynamic("model.onnx", precision_to_filename_map["int8"], weight_type=QuantType.QInt8)
44
+ print("Done\n\n")
45
+
46
+ if "uint8" in precision_to_filename_map:
47
+ print("Quantizing fp32 model to uint8...")
48
+ quantize_dynamic("model.onnx", precision_to_filename_map["uint8"], weight_type=QuantType.QUInt8)
49
+ print("Done\n\n")
50
+
51
+ os.remove("model.onnx")
onnx/model.onnx ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:21b0701524af665e4d0b56087af48ad6e9974d6f2eb7da9a90e9c4729eee13aa
3
+ size 1337046449
onnx/model_int8.onnx ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:b3ed71cba4129cf775616f18528c48868f89d2aa54413a848d8d1ab9fa037e4d
3
+ size 336200444
onnx/model_uint8.onnx ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:1138b1bd289f2299d4a727a1e1343ce601ecd6bb137d71d30bc4c07883e5d181
3
+ size 336200509
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "cls_token": {
3
+ "content": "[CLS]",
4
+ "lstrip": false,
5
+ "normalized": false,
6
+ "rstrip": false,
7
+ "single_word": false
8
+ },
9
+ "mask_token": {
10
+ "content": "[MASK]",
11
+ "lstrip": false,
12
+ "normalized": false,
13
+ "rstrip": false,
14
+ "single_word": false
15
+ },
16
+ "pad_token": {
17
+ "content": "[PAD]",
18
+ "lstrip": false,
19
+ "normalized": false,
20
+ "rstrip": false,
21
+ "single_word": false
22
+ },
23
+ "sep_token": {
24
+ "content": "[SEP]",
25
+ "lstrip": false,
26
+ "normalized": false,
27
+ "rstrip": false,
28
+ "single_word": false
29
+ },
30
+ "unk_token": {
31
+ "content": "[UNK]",
32
+ "lstrip": false,
33
+ "normalized": false,
34
+ "rstrip": false,
35
+ "single_word": false
36
+ }
37
+ }
test_local.py ADDED
@@ -0,0 +1,49 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import onnxruntime as rt
2
+
3
+ from sentence_transformers.util import cos_sim
4
+ from sentence_transformers import SentenceTransformer
5
+
6
+ import transformers
7
+
8
+ import gc
9
+ import json
10
+
11
+
12
+ with open('conversion_config.json') as json_file:
13
+ conversion_config = json.load(json_file)
14
+
15
+
16
+ model_id = conversion_config["model_id"]
17
+ number_of_generated_embeddings = conversion_config["number_of_generated_embeddings"]
18
+ precision_to_filename_map = conversion_config["precision_to_filename_map"]
19
+
20
+ sentences_1 = 'How is the weather today?'
21
+ sentences_2 = 'What is the current weather like today?'
22
+
23
+ print(f"Testing cosine similarity between sentences: \n'{sentences_1}'\n'{sentences_2}'\n\n\n")
24
+
25
+ tokenizer = transformers.AutoTokenizer.from_pretrained("./")
26
+ enc1 = tokenizer(sentences_1)
27
+ enc2 = tokenizer(sentences_2)
28
+
29
+ for precision, file_name in precision_to_filename_map.items():
30
+
31
+
32
+ onnx_session = rt.InferenceSession(file_name)
33
+ embeddings_1_onnx = onnx_session.run(None, {"input_ids": [enc1.input_ids],
34
+ "attention_mask": [enc1.attention_mask]})[1][0]
35
+
36
+ embeddings_2_onnx = onnx_session.run(None, {"input_ids": [enc2.input_ids],
37
+ "attention_mask": [enc2.attention_mask]})[1][0]
38
+
39
+ del onnx_session
40
+ gc.collect()
41
+ print(f'Cosine similarity for ONNX model with precision "{precision}" is {str(cos_sim(embeddings_1_onnx, embeddings_2_onnx))}')
42
+
43
+
44
+
45
+
46
+ model = SentenceTransformer(model_id, trust_remote_code=True)
47
+ embeddings_1_sentence_transformer = model.encode(sentences_1, normalize_embeddings=True)
48
+ embeddings_2_sentence_transformer = model.encode(sentences_2, normalize_embeddings=True)
49
+ print('Cosine similarity for original sentence transformer model is ' + str(cos_sim(embeddings_1_sentence_transformer, embeddings_2_sentence_transformer)))
test_teradata.py ADDED
@@ -0,0 +1,106 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import sys
2
+ import teradataml as tdml
3
+ from tabulate import tabulate
4
+
5
+ import json
6
+
7
+
8
+ with open('conversion_config.json') as json_file:
9
+ conversion_config = json.load(json_file)
10
+
11
+
12
+ model_id = conversion_config["model_id"]
13
+ number_of_generated_embeddings = conversion_config["number_of_generated_embeddings"]
14
+ precision_to_filename_map = conversion_config["precision_to_filename_map"]
15
+
16
+ host = sys.argv[1]
17
+ username = sys.argv[2]
18
+ password = sys.argv[3]
19
+
20
+ print("Setting up connection to teradata...")
21
+ tdml.create_context(host = host, username = username, password = password)
22
+ print("Done\n\n")
23
+
24
+
25
+ print("Deploying tokenizer...")
26
+ try:
27
+ tdml.db_drop_table('tokenizer_table')
28
+ except:
29
+ print("Can't drop tokenizer table - it doesn't exist")
30
+ tdml.save_byom('tokenizer',
31
+ 'tokenizer.json',
32
+ 'tokenizer_table')
33
+ print("Done\n\n")
34
+
35
+ print("Testing models...")
36
+ try:
37
+ tdml.db_drop_table('model_table')
38
+ except:
39
+ print("Can't drop model table - it doesn't exist")
40
+
41
+ for precision, file_name in precision_to_filename_map.items():
42
+ print(f"Deploying {precision} model...")
43
+ tdml.save_byom(precision,
44
+ file_name,
45
+ 'model_table')
46
+ print(f"Model {precision} is deployed\n")
47
+
48
+ print(f"Calculating embeddings with {precision} model...")
49
+ try:
50
+ tdml.db_drop_table('emails_embeddings_store')
51
+ except:
52
+ print("Can't drop embeddings table - it doesn't exist")
53
+
54
+ tdml.execute_sql(f"""
55
+ create volatile table emails_embeddings_store as (
56
+ select
57
+ *
58
+ from mldb.ONNXEmbeddings(
59
+ on emails.emails as InputTable
60
+ on (select * from model_table where model_id = '{precision}') as ModelTable DIMENSION
61
+ on (select model as tokenizer from tokenizer_table where model_id = 'tokenizer') as TokenizerTable DIMENSION
62
+
63
+ using
64
+ Accumulate('id', 'txt')
65
+ ModelOutputTensor('sentence_embedding')
66
+ EnableMemoryCheck('false')
67
+ OutputFormat('FLOAT32({number_of_generated_embeddings})')
68
+ OverwriteCachedModel('true')
69
+ ) a
70
+ ) with data on commit preserve rows
71
+
72
+ """)
73
+ print("Embeddings calculated")
74
+ print(f"Testing semantic search with cosine similarity on the output of the model with precision '{precision}'...")
75
+ tdf_embeddings_store = tdml.DataFrame('emails_embeddings_store')
76
+ tdf_embeddings_store_tgt = tdf_embeddings_store[tdf_embeddings_store.id == 3]
77
+
78
+ tdf_embeddings_store_ref = tdf_embeddings_store[tdf_embeddings_store.id != 3]
79
+
80
+ cos_sim_pd = tdml.DataFrame.from_query(f"""
81
+ SELECT
82
+ dt.target_id,
83
+ dt.reference_id,
84
+ e_tgt.txt as target_txt,
85
+ e_ref.txt as reference_txt,
86
+ (1.0 - dt.distance) as similarity
87
+ FROM
88
+ TD_VECTORDISTANCE (
89
+ ON ({tdf_embeddings_store_tgt.show_query()}) AS TargetTable
90
+ ON ({tdf_embeddings_store_ref.show_query()}) AS ReferenceTable DIMENSION
91
+ USING
92
+ TargetIDColumn('id')
93
+ TargetFeatureColumns('[emb_0:emb_{number_of_generated_embeddings - 1}]')
94
+ RefIDColumn('id')
95
+ RefFeatureColumns('[emb_0:emb_{number_of_generated_embeddings - 1}]')
96
+ DistanceMeasure('cosine')
97
+ topk(3)
98
+ ) AS dt
99
+ JOIN emails.emails e_tgt on e_tgt.id = dt.target_id
100
+ JOIN emails.emails e_ref on e_ref.id = dt.reference_id;
101
+ """).to_pandas()
102
+ print(tabulate(cos_sim_pd, headers='keys', tablefmt='fancy_grid'))
103
+ print("Done\n\n")
104
+
105
+
106
+ tdml.remove_context()
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,58 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "added_tokens_decoder": {
3
+ "0": {
4
+ "content": "[PAD]",
5
+ "lstrip": false,
6
+ "normalized": false,
7
+ "rstrip": false,
8
+ "single_word": false,
9
+ "special": true
10
+ },
11
+ "100": {
12
+ "content": "[UNK]",
13
+ "lstrip": false,
14
+ "normalized": false,
15
+ "rstrip": false,
16
+ "single_word": false,
17
+ "special": true
18
+ },
19
+ "101": {
20
+ "content": "[CLS]",
21
+ "lstrip": false,
22
+ "normalized": false,
23
+ "rstrip": false,
24
+ "single_word": false,
25
+ "special": true
26
+ },
27
+ "102": {
28
+ "content": "[SEP]",
29
+ "lstrip": false,
30
+ "normalized": false,
31
+ "rstrip": false,
32
+ "single_word": false,
33
+ "special": true
34
+ },
35
+ "103": {
36
+ "content": "[MASK]",
37
+ "lstrip": false,
38
+ "normalized": false,
39
+ "rstrip": false,
40
+ "single_word": false,
41
+ "special": true
42
+ }
43
+ },
44
+ "clean_up_tokenization_spaces": true,
45
+ "cls_token": "[CLS]",
46
+ "do_basic_tokenize": true,
47
+ "do_lower_case": true,
48
+ "extra_special_tokens": {},
49
+ "mask_token": "[MASK]",
50
+ "model_max_length": 512,
51
+ "never_split": null,
52
+ "pad_token": "[PAD]",
53
+ "sep_token": "[SEP]",
54
+ "strip_accents": null,
55
+ "tokenize_chinese_chars": true,
56
+ "tokenizer_class": "BertTokenizer",
57
+ "unk_token": "[UNK]"
58
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff