bbmb committed on
Commit 7bc0867 · verified · 1 Parent(s): 1e7cd4f

Add new SentenceTransformer model

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 384,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
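The pooling configuration above enables only `pooling_mode_mean_tokens`, so each sentence embedding is the average of its token embeddings. The following is a minimal sketch of that averaging, assuming padding positions are excluded through an attention mask. The function name `mean_pool` and the toy vectors are hypothetical and chosen for illustration; this is not code from the commit or the library's own implementation.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    # Guard against an all-padding input to avoid dividing by zero.
    return [value / max(count, 1) for value in total]


# Toy input: three token vectors of width 2, the last one is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)
```

In the real model the vectors have width 384 (`word_embedding_dimension`), and the architecture listed in the README applies a `Normalize()` step after pooling.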
README.md ADDED
@@ -0,0 +1,817 @@
+ ---
+ base_model: sentence-transformers/all-MiniLM-L6-v2
+ language:
+ - en
+ library_name: sentence-transformers
+ license: apache-2.0
+ metrics:
+ - cosine_accuracy@1
+ - cosine_accuracy@3
+ - cosine_accuracy@5
+ - cosine_accuracy@10
+ - cosine_precision@1
+ - cosine_precision@3
+ - cosine_precision@5
+ - cosine_precision@10
+ - cosine_recall@1
+ - cosine_recall@3
+ - cosine_recall@5
+ - cosine_recall@10
+ - cosine_ndcg@10
+ - cosine_mrr@10
+ - cosine_map@100
+ pipeline_tag: sentence-similarity
+ tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ - generated_from_trainer
+ - dataset_size:4820
+ - loss:MatryoshkaLoss
+ - loss:MultipleNegativesRankingLoss
+ widget:
+ - source_sentence: "Defense \n11 January 2024 ATP 3-21.8 5-57\n Reaction to enemy\
34
+ \ fires (for example, artillery and/or aviation) and CBRN\nattacks.\n Reports\
35
+ \ to higher, monitoring stockage levels, and cross leveling or resupply.\n CASEVAC\
36
+ \ and MEDEVAC procedures.\n Criteria to commitment the reserve.\nFigure 5-13.\
37
+ \ Main battle area (platoon engagements), example \nFOLLOW THROUGH \n5-167. During\
38
+ \ the planning for the defensive operation, the platoon leader must discern \n\
39
+ from the company OPORD what the potential follow-on missions are and begin to\
40
+ \ plan\nhow to achieve them. During this planning , the leader determines the\
41
+ \ possible timeline\nand location for defeat in detail , consolidate, reorganize,\
42
+ \ and transition which best\nf\nacilitates future operations and provides adequate\
43
+ \ protection."
44
+ sentences:
45
+ - What are some methods for distributing fires effectively in a platoon?
46
+ - What should I consider when selecting battle positions for my unit?
47
+ - What key factors are involved in planning for a defense scenario?
48
+ - source_sentence: "Chapter 6 \n6-24 ATP 3-21.8 11 January 2024 \nFigure 6-7. Movement\
49
+ \ to maneuver \nTERRAIN \n6-67. Platoon and squads enhance their own security\
50
+ \ during movement using covered\nand concealed terrain ; the use of the appropriate\
51
+ \ movement formation and movement\ntechnique; the actions taken to secure danger\
52
+ \ areas during crossing ; the enforcement of\nnoise, light, and emissions control\
53
+ \ (for example, thermal and electronic) discipline; and\nus\ne of proper individual\
54
+ \ camouflage techniques. When planning and preparing for\nmovement, leaders must\
55
+ \ consider how terrain affects security while simultaneously\nconsidering METT-TC\
56
+ \ (I). Some missions may require the platoon or individual squad\nto move on other\
57
+ \ than covered and concealed routes. While leaders may not be able t o\npr\nevent\
58
+ \ the unit ’s detection, they can ensure it moves on the battlefield in a time\
59
+ \ a nd\npl\nace for which the enemy is unprepared. Particularly when moving in\
60
+ \ the open, leaders\nmust avoid predictability and continue to use terrain to\
61
+ \ their advantage.\nEXECUTION \n6-68. During execution, leaders enforce camouflage\
62
+ \ discipline (Soldiers and their\nequipment). Leaders ensure the camouflage used\
63
+ \ by their Soldiers is appropriate to the\nterrain and season. Platoon standard\
64
+ \ operating procedures ( SOPs) specify elements of\ncamouflage, noise and light\
65
+ \ discipline and emissions control; security halts; and actions\nat security halts.\n\
66
+ CAMOUFLAGE, NOISE, AND LIGHT DISCIPLINE AND EMISSIONS CONTROL \n6-69. The platoon\
67
+ \ is visible to enemy forces and target acquisition capabilities on every\nspectrum,\
68
+ \ including visible light, sound, and across the electromagnetic spectrum. The"
69
+ sentences:
70
+ - What is the process for preparing a machine gun range card?
71
+ - How does terrain impact the security of a platoon during movement?
72
+ - What’s the practical rate of fire for the M3 MAAWS?
73
+ - source_sentence: "Machine Gun Employment and Theory \n11 January 2024 ATP 3-21.8\
74
+ \ C-35\nSECURITY \nC-115. Security includes all command measures to protect against\
75
+ \ surprise ,\nobservation, and annoyance by the enemy. The gun team is responsible\
76
+ \ for its immediate \nlocal security, specifically provided by the assistant gunner\
77
+ \ and/or ammunition bearer\nfor close in local security to the gunner, who is\
78
+ \ fixated on deeper targets. Though the\nprincipal unit security measures against\
79
+ \ ground forces include employment of\nobservation posts, security patrols, and\
80
+ \ detachments covering the front flanks and rear\nof the unit’s most valuable\
81
+ \ weapons systems and vulnerable areas. The composition and\nstrength of these\
82
+ \ detachments depends on the size of the main body, its mission, and\nna\nture\
83
+ \ of the opposition expected. The presence of machine guns with security\ndetachments\
84
+ \ augments their firepower to delay, attack, and defend, by virtue of inherent\n\
85
+ firepower.\nC-116. The potential of air and ground attacks on the unit demands\
86
+ \ every possible\nprecaution for maximum security while on the move. Where this\
87
+ \ situation exists , the\nmachine gun crew must be thoroughly trained in the hasty\
88
+ \ delivery of antiaircraft fire\nand of counterfire against enemy ground forces.\
89
+ \ The distribution of the medium machine \nguns in the formation is critical.\
90
+ \ The medium machine gun crew is constantly on the\nalert, particularly at halts\
91
+ \ , ready to deliver fire as soon as possible. If leaders expect a\nhalt to exceed\
92
+ \ a brief period , they carefully choose medium machine gun positions to\navoid\
93
+ \ unduly tiring the medium machine gun crew. If they expect the halt to extend\
94
+ \ for\na long period, they can have the medium machine gun crew take up positions\
95
+ \ in support\nof the unit. The crew covers the direction from which they expect\
96
+ \ enemy activity as well\nas the direction from which the unit came. Leaders select\
97
+ \ positions permitting the\ndelivery of fire in the most probable direction of\
98
+ \ enemy attack, such as valleys, draws,\nridges, and spurs. They choose positions\
99
+ \ offering obstructed fire from potential enemy\nlocations.\nEMPLOYMENT OF FIRE\
100
+ \ AND MOVEMENT \nC-117. The employment of fire and movement is essential and greatly\
101
+ \ depends upon\nthe other during maneuver. Without the support of covering fires\
102
+ \ , maneuvering in the\npresence of enemy fire can result in disastrous losses.\
103
+ \ Covering fires , especially\nproviding fire superiority, allow maneuvering in\
104
+ \ the offense. However , fire superiority\nalone rarely wins battles. The primary\
105
+ \ objective of the offense is to advance , occupy,\nand hold the enemy position.\n\
106
+ Machine Gun as a Base of Fire \nC-118. Machine gun fire from a support by fire\
107
+ \ position must be the minimum possible\nto keep the enemy from returning fire.\
108
+ \ Ammunition must be conserved so the guns do\nnot run out of ammunition.\nC-119.\
109
+ \ The weapons squad leader positions and controls the fires of all medium\nmachine\
110
+ \ guns in the element. Machine gun targets include essential enemy weapons or\n\
111
+ groups of enemy targets either on the objective or attempting to reinforce or\n\
112
+ counterattack. In terms of engagement ranges, medium machine guns in the base-of-fire"
113
+ sentences:
114
+ - How do observation posts aid in machine gun unit security?
115
+ - Why would a unit choose to defend on a reverse slope rather than a forward slope?
116
+ - What does the publication say about engagement area development for defense?
117
+ - source_sentence: "Chapter 7 \n7-18 ATP 3-21.8 11 January 2024 \nThe PACE plan is\
118
+ \ a communication plan that exists for a specific mission or task, not a \nspecific\
119
+ \ unit, as the plan considers both intra- and inter-unit sharing of information.\
120
+ \ The \nPACE plan designates the order in which an element will move through available\
121
+ \ \ncommunications systems until contact can be established with the desired distant\
122
+ \ \nelement. \nCHALLENGE AND PASSWORD OUTSIDE OF FRIENDLY LINES \n7-61. The challenge\
123
+ \ and password from the signal operating instructions must not be\nused when the\
124
+ \ patrol is outside friendly lines. The unit ’s tactical SOP should state the\n\
125
+ procedure for establishing a patrol challenge and password as well as other combat\n\
126
+ identification features and patrol markings. Two methods for establishing a challenge\n\
127
+ and password are the odd number system and running password.\nOdd Number System\
128
+ \ \n7-62. The leader specifies an odd number. The challenge can be any number\
129
+ \ less than\nthe specified number. The password will be the number that must be\
130
+ \ added to it to equal\nthe specified number, for example, the number is 9, the\
131
+ \ challenge is 4, and the password\nis 5.\nRunning Password \n7-63. Signal operating\
132
+ \ instructions also may designate a running password. This code\nword alerts a\
133
+ \ unit that friendly are approaching in a less than organized manner and\npossibly\
134
+ \ under pressure. The number of friendly approaching follows the running\npassword.\
135
+ \ For example, if the running password is “eagle,” and seven friendl ies are\n\
136
+ a\npproaching, they would say “eagle seven.”\nLOCATIONS OF KEY LEADERS \n7-64.\
137
+ \ The patrol leader considers where best to locate throughout each phase of the\n\
138
+ patrol, and where to locate the APL , and other essential leaders for each phase\
139
+ \ of the\npatrol. The APL normally is with the following elements for each type\
140
+ \ of patrol:\n On a raid or ambush, the APL can be with the patrol leader on\
141
+ \ the objective\nor control the support element from the support position.\n\
142
+ \ On an area reconnaissance, the APL can move with one of the area\nreconnaissance\
143
+ \ elements or supervise security in the ORP.\n On a zone reconnaissance, the\
144
+ \ APL can move with one of the zone\nreconnaissance elements or move with the\
145
+ \ reconnaissance element setting up\nthe linkup point.\nACTIONS ON CHANCE CONTACT\
146
+ \ \n7-65. The leader ’s plan must address actions on chance contact at each phase\
147
+ \ of the\npatrol. (See paragraphs 2-48 to 2-52 for additional information on actions\
148
+ \ on contact.)\nFor the patrol’s mission the plan must address—"
149
+ sentences:
150
+ - How does a platoon deal with obstacles during an assault?
151
+ - What are some methods for setting up a challenge and password in the field?
152
+ - What is the purpose of having one squad engage while others observe in an observed
153
+ fire scenario?
154
+ - source_sentence: "Offense \n11 January 2024 ATP 3-21.8 4-61\nlight the target, making\
155
+ \ it easier to acquire effectively. Leaders and Soldiers \nuse the infrared devices\
156
+ \ to identify enemy or friendly personnel and then \nengage targets using their\
157
+ \ aiming lights. \n4-172. Illuminating rounds fired to burn on the ground can\
158
+ \ mark objectives. This helps\nthe platoon orient on the objective but may adversely\
159
+ \ affect night vision devices.\n4-173. Leaders plan but do not always use illumination\
160
+ \ during limited visibility\nattacks. Battalion commanders normally control conventional\
161
+ \ illumination but ma y\na\nuthorize the company commander to do so. If the commander\
162
+ \ decides to use\nconventional illumination , the commander should not call for\
163
+ \ it until the assault is\ninitiated or the attack is detected. It should be placed\
164
+ \ on several locations over a wide\narea to confuse the enemy as to the exact\
165
+ \ place of the attack. It should be placed beyond\nthe objective to help assaulting\
166
+ \ Soldiers see and fire at withdrawing or counterattacking\nenemy Soldiers. Infrared\
167
+ \ illumination is a good capability to light the objective without\nlighting it\
168
+ \ for enemy forces without night vision devices. This advantage is degraded\n\
169
+ when used against a peer threat with the same night vision capabilities.\n4-174.\
170
+ \ The platoon leader , squad leaders , and vehicle commanders must know unit\n\
171
+ tactical SOP and develop sound COAs to synchronize the employment of infrared\n\
172
+ illumination devices , target designators , and aiming lights during their assault\
173
+ \ on the\nobjective. These include using luminous tape or chemical lights to mark\
174
+ \ personnel and\nusing weapons control restrictions.\n4-175. The platoon leader\
175
+ \ may use the following techniques to increase control during\nthe assault:\n\
176
+  Use no flares, grenades, or obscuration on the objective.\n Use mortar or artillery\
177
+ \ rounds to orient attacking units.\n Use a base squad or fire team to pace and\
178
+ \ guide others.\n Reduce intervals between Soldiers and squads.\n4-176. Like\
179
+ \ a daylight attack , indirect and direct fires are planned for a limited\nvisibility\
180
+ \ attack but are not executed unless the platoon is detected or is ready to assault.\n\
181
+ Some weapons may fire before the attack and maintain a pattern to deceive the\
182
+ \ enemy\nor to help cover noise ma de by the platoon ’s movement. This is not\
183
+ \ done if it will\ndisclose the attack.\n4-177. Obscuration further reduces the\
184
+ \ enemy’s visibility, particularly if the enemy has\nnight vision devices. The\
185
+ \ FO fires obscuration rounds close to or on enemy positions ,\nso it does not\
186
+ \ restrict friendly movement or hinder the reduction of obstacles. Employing \n\
187
+ obscuration on the objective during the assault may make it hard for assaulting\
188
+ \ Soldiers\nto find enemy fighting positions. If enough thermal sights are available\
189
+ \ , obscuration on\nthe objective may provide a decisive advantage for a well-trained\
190
+ \ platoon.\nNote. I f the enemy is equipped with night vision devices , leaders\
191
+ \ must evaluate \nthe risk of using each technique and ensure the mission is not\
192
+ \ compromised by \nthe enemy’s ability to detect infrared light sources."
193
+ sentences:
194
+ - Can obscurants be used to hamper enemy fire support? How?
195
+ - How can leaders effectively provide command and control during defensive operations?
196
+ - What are the advantages of using infrared illumination in assaults?
197
+ model-index:
198
+ - name: deep learning project 2
199
+ results:
200
+ - task:
201
+ type: information-retrieval
202
+ name: Information Retrieval
203
+ dataset:
204
+ name: dim 384
205
+ type: dim_384
206
+ metrics:
207
+ - type: cosine_accuracy@1
208
+ value: 0.0037313432835820895
209
+ name: Cosine Accuracy@1
210
+ - type: cosine_accuracy@3
211
+ value: 0.013059701492537313
212
+ name: Cosine Accuracy@3
213
+ - type: cosine_accuracy@5
214
+ value: 0.048507462686567165
215
+ name: Cosine Accuracy@5
216
+ - type: cosine_accuracy@10
217
+ value: 0.4496268656716418
218
+ name: Cosine Accuracy@10
219
+ - type: cosine_precision@1
220
+ value: 0.0037313432835820895
221
+ name: Cosine Precision@1
222
+ - type: cosine_precision@3
223
+ value: 0.00435323383084577
224
+ name: Cosine Precision@3
225
+ - type: cosine_precision@5
226
+ value: 0.009701492537313432
227
+ name: Cosine Precision@5
228
+ - type: cosine_precision@10
229
+ value: 0.04496268656716418
230
+ name: Cosine Precision@10
231
+ - type: cosine_recall@1
232
+ value: 0.0037313432835820895
233
+ name: Cosine Recall@1
234
+ - type: cosine_recall@3
235
+ value: 0.013059701492537313
236
+ name: Cosine Recall@3
237
+ - type: cosine_recall@5
238
+ value: 0.048507462686567165
239
+ name: Cosine Recall@5
240
+ - type: cosine_recall@10
241
+ value: 0.4496268656716418
242
+ name: Cosine Recall@10
243
+ - type: cosine_ndcg@10
244
+ value: 0.15012636108139818
245
+ name: Cosine Ndcg@10
246
+ - type: cosine_mrr@10
247
+ value: 0.06590188936271034
248
+ name: Cosine Mrr@10
249
+ - type: cosine_map@100
250
+ value: 0.08623119999483674
251
+ name: Cosine Map@100
252
+ - task:
253
+ type: information-retrieval
254
+ name: Information Retrieval
255
+ dataset:
256
+ name: dim 256
257
+ type: dim_256
258
+ metrics:
259
+ - type: cosine_accuracy@1
260
+ value: 0.0037313432835820895
261
+ name: Cosine Accuracy@1
262
+ - type: cosine_accuracy@3
263
+ value: 0.011194029850746268
264
+ name: Cosine Accuracy@3
265
+ - type: cosine_accuracy@5
266
+ value: 0.03731343283582089
267
+ name: Cosine Accuracy@5
268
+ - type: cosine_accuracy@10
269
+ value: 0.4458955223880597
270
+ name: Cosine Accuracy@10
271
+ - type: cosine_precision@1
272
+ value: 0.0037313432835820895
273
+ name: Cosine Precision@1
274
+ - type: cosine_precision@3
275
+ value: 0.003731343283582089
276
+ name: Cosine Precision@3
277
+ - type: cosine_precision@5
278
+ value: 0.007462686567164179
279
+ name: Cosine Precision@5
280
+ - type: cosine_precision@10
281
+ value: 0.04458955223880597
282
+ name: Cosine Precision@10
283
+ - type: cosine_recall@1
284
+ value: 0.0037313432835820895
285
+ name: Cosine Recall@1
286
+ - type: cosine_recall@3
287
+ value: 0.011194029850746268
288
+ name: Cosine Recall@3
289
+ - type: cosine_recall@5
290
+ value: 0.03731343283582089
291
+ name: Cosine Recall@5
292
+ - type: cosine_recall@10
293
+ value: 0.4458955223880597
294
+ name: Cosine Recall@10
295
+ - type: cosine_ndcg@10
296
+ value: 0.14887734118005805
297
+ name: Cosine Ndcg@10
298
+ - type: cosine_mrr@10
299
+ value: 0.06525334636342103
300
+ name: Cosine Mrr@10
301
+ - type: cosine_map@100
302
+ value: 0.08587360417470279
303
+ name: Cosine Map@100
304
+ - task:
305
+ type: information-retrieval
306
+ name: Information Retrieval
307
+ dataset:
308
+ name: dim 128
309
+ type: dim_128
310
+ metrics:
311
+ - type: cosine_accuracy@1
312
+ value: 0.0037313432835820895
313
+ name: Cosine Accuracy@1
314
+ - type: cosine_accuracy@3
315
+ value: 0.009328358208955223
316
+ name: Cosine Accuracy@3
317
+ - type: cosine_accuracy@5
318
+ value: 0.04664179104477612
319
+ name: Cosine Accuracy@5
320
+ - type: cosine_accuracy@10
321
+ value: 0.43656716417910446
322
+ name: Cosine Accuracy@10
323
+ - type: cosine_precision@1
324
+ value: 0.0037313432835820895
325
+ name: Cosine Precision@1
326
+ - type: cosine_precision@3
327
+ value: 0.0031094527363184077
328
+ name: Cosine Precision@3
329
+ - type: cosine_precision@5
330
+ value: 0.009328358208955225
331
+ name: Cosine Precision@5
332
+ - type: cosine_precision@10
333
+ value: 0.043656716417910454
334
+ name: Cosine Precision@10
335
+ - type: cosine_recall@1
336
+ value: 0.0037313432835820895
337
+ name: Cosine Recall@1
338
+ - type: cosine_recall@3
339
+ value: 0.009328358208955223
340
+ name: Cosine Recall@3
341
+ - type: cosine_recall@5
342
+ value: 0.04664179104477612
343
+ name: Cosine Recall@5
344
+ - type: cosine_recall@10
345
+ value: 0.43656716417910446
346
+ name: Cosine Recall@10
347
+ - type: cosine_ndcg@10
348
+ value: 0.14645163034094227
349
+ name: Cosine Ndcg@10
350
+ - type: cosine_mrr@10
351
+ value: 0.06459073679222935
352
+ name: Cosine Mrr@10
353
+ - type: cosine_map@100
354
+ value: 0.08473376158047675
355
+ name: Cosine Map@100
356
+ - task:
357
+ type: information-retrieval
358
+ name: Information Retrieval
359
+ dataset:
360
+ name: dim 64
361
+ type: dim_64
362
+ metrics:
363
+ - type: cosine_accuracy@1
364
+ value: 0.0018656716417910447
365
+ name: Cosine Accuracy@1
366
+ - type: cosine_accuracy@3
367
+ value: 0.007462686567164179
368
+ name: Cosine Accuracy@3
369
+ - type: cosine_accuracy@5
370
+ value: 0.04291044776119403
371
+ name: Cosine Accuracy@5
372
+ - type: cosine_accuracy@10
373
+ value: 0.4216417910447761
374
+ name: Cosine Accuracy@10
375
+ - type: cosine_precision@1
376
+ value: 0.0018656716417910447
377
+ name: Cosine Precision@1
378
+ - type: cosine_precision@3
379
+ value: 0.0024875621890547263
380
+ name: Cosine Precision@3
381
+ - type: cosine_precision@5
382
+ value: 0.008582089552238806
383
+ name: Cosine Precision@5
384
+ - type: cosine_precision@10
385
+ value: 0.04216417910447762
386
+ name: Cosine Precision@10
387
+ - type: cosine_recall@1
388
+ value: 0.0018656716417910447
389
+ name: Cosine Recall@1
390
+ - type: cosine_recall@3
391
+ value: 0.007462686567164179
392
+ name: Cosine Recall@3
393
+ - type: cosine_recall@5
394
+ value: 0.04291044776119403
395
+ name: Cosine Recall@5
396
+ - type: cosine_recall@10
397
+ value: 0.4216417910447761
398
+ name: Cosine Recall@10
399
+ - type: cosine_ndcg@10
400
+ value: 0.13895211086835252
401
+ name: Cosine Ndcg@10
402
+ - type: cosine_mrr@10
403
+ value: 0.05946680289031035
404
+ name: Cosine Mrr@10
405
+ - type: cosine_map@100
406
+ value: 0.07896404458930699
407
+ name: Cosine Map@100
408
+ ---
+
+ # deep learning project 2
+
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) on the json dataset. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ - **Base model:** [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) <!-- at revision fa97f6e7cb1a59073dff9e6b13e2715cf7475ac9 -->
+ - **Maximum Sequence Length:** 256 tokens
+ - **Output Dimensionality:** 384 dimensions
+ - **Similarity Function:** Cosine Similarity
+ - **Training Dataset:**
+     - json
+ - **Language:** en
+ - **License:** apache-2.0
+
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+
+ ### Full Model Architecture
+
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel
+   (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+   (2): Normalize()
+ )
+ ```
+
+ ## Usage
+
+ ### Direct Usage (Sentence Transformers)
+
+ First install the Sentence Transformers library:
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
+
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("bbmb/deep-learning-for-embedding-model-ssilwal-qpham6_army_doc")
+ # Run inference
+ sentences = [
+ 'Offense \n11 January 2024 ATP 3-21.8 4-61\nlight the target, making it easier to acquire effectively. Leaders and Soldiers \nuse the infrared devices to identify enemy or friendly personnel and then \nengage targets using their aiming lights. \n4-172. Illuminating rounds fired to burn on the ground can mark objectives. This helps\nthe platoon orient on the objective but may adversely affect night vision devices.\n4-173. Leaders plan but do not always use illumination during limited visibility\nattacks. Battalion commanders normally control conventional illumination but ma y\na\nuthorize the company commander to do so. If the commander decides to use\nconventional illumination , the commander should not call for it until the assault is\ninitiated or the attack is detected. It should be placed on several locations over a wide\narea to confuse the enemy as to the exact place of the attack. It should be placed beyond\nthe objective to help assaulting Soldiers see and fire at withdrawing or counterattacking\nenemy Soldiers. Infrared illumination is a good capability to light the objective without\nlighting it for enemy forces without night vision devices. This advantage is degraded\nwhen used against a peer threat with the same night vision capabilities.\n4-174. The platoon leader , squad leaders , and vehicle commanders must know unit\ntactical SOP and develop sound COAs to synchronize the employment of infrared\nillumination devices , target designators , and aiming lights during their assault on the\nobjective. These include using luminous tape or chemical lights to mark personnel and\nusing weapons control restrictions.\n4-175. 
The platoon leader may use the following techniques to increase control during\nthe assault:\n\uf06c Use no flares, grenades, or obscuration on the objective.\n\uf06c Use mortar or artillery rounds to orient attacking units.\n\uf06c Use a base squad or fire team to pace and guide others.\n\uf06c Reduce intervals between Soldiers and squads.\n4-176. Like a daylight attack , indirect and direct fires are planned for a limited\nvisibility attack but are not executed unless the platoon is detected or is ready to assault.\nSome weapons may fire before the attack and maintain a pattern to deceive the enemy\nor to help cover noise ma de by the platoon ’s movement. This is not done if it will\ndisclose the attack.\n4-177. Obscuration further reduces the enemy’s visibility, particularly if the enemy has\nnight vision devices. The FO fires obscuration rounds close to or on enemy positions ,\nso it does not restrict friendly movement or hinder the reduction of obstacles. Employing \nobscuration on the objective during the assault may make it hard for assaulting Soldiers\nto find enemy fighting positions. If enough thermal sights are available , obscuration on\nthe objective may provide a decisive advantage for a well-trained platoon.\nNote. I f the enemy is equipped with night vision devices , leaders must evaluate \nthe risk of using each technique and ensure the mission is not compromised by \nthe enemy’s ability to detect infrared light sources.',
+ 'What are the advantages of using infrared illumination in assaults?',
+ 'How can leaders effectively provide command and control during defensive operations?',
+ ]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # (3, 384)
+
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities.shape)
+ # torch.Size([3, 3])
+ ```
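The front matter lists `loss:MatryoshkaLoss` and reports metrics at 384, 256, 128 and 64 dimensions, which suggests the embeddings are meant to stay usable when only their leading components are kept. The sketch below shows that truncation step on toy vectors, assuming the shortened vector is re-normalized before cosine similarity is computed. The helper names `truncate_and_normalize` and `cosine` and the example vectors are hypothetical; recent versions of Sentence Transformers also accept a `truncate_dim` argument when loading a model, so check the documentation for the version in use before relying on either approach.

```python
import math


def truncate_and_normalize(vector, dim):
    """Keep the first `dim` components and rescale them to unit length."""
    head = vector[:dim]
    norm = math.sqrt(sum(v * v for v in head))
    return [v / norm for v in head] if norm > 0 else head


def cosine(a, b):
    """Dot product; equals cosine similarity only for unit-length inputs."""
    return sum(x * y for x, y in zip(a, b))


# Toy 4-dimensional "embeddings" standing in for the model's 384-dimensional output.
full_a = [3.0, 4.0, 12.0, 0.0]
full_b = [4.0, 3.0, 0.0, 5.0]

short_a = truncate_and_normalize(full_a, 2)
short_b = truncate_and_normalize(full_b, 2)
print(short_a, short_b, cosine(short_a, short_b))
```

Re-normalizing matters because the model's final `Normalize()` layer only guarantees unit length for the full 384-dimensional vector, not for a prefix of it.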
+
+ <!--
+ ### Direct Usage (Transformers)
+
+ <details><summary>Click to see the direct usage in Transformers</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+
+ You can finetune this model on your own dataset.
+
+ <details><summary>Click to expand</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ ## Evaluation
+
+ ### Metrics
+
+ #### Information Retrieval
+
+ * Datasets: `dim_384`, `dim_256`, `dim_128` and `dim_64`
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
+
+ | Metric | dim_384 | dim_256 | dim_128 | dim_64 |
+ |:--------------------|:-----------|:-----------|:-----------|:----------|
+ | cosine_accuracy@1 | 0.0037 | 0.0037 | 0.0037 | 0.0019 |
+ | cosine_accuracy@3 | 0.0131 | 0.0112 | 0.0093 | 0.0075 |
+ | cosine_accuracy@5 | 0.0485 | 0.0373 | 0.0466 | 0.0429 |
+ | cosine_accuracy@10 | 0.4496 | 0.4459 | 0.4366 | 0.4216 |
+ | cosine_precision@1 | 0.0037 | 0.0037 | 0.0037 | 0.0019 |
+ | cosine_precision@3 | 0.0044 | 0.0037 | 0.0031 | 0.0025 |
+ | cosine_precision@5 | 0.0097 | 0.0075 | 0.0093 | 0.0086 |
+ | cosine_precision@10 | 0.045 | 0.0446 | 0.0437 | 0.0422 |
+ | cosine_recall@1 | 0.0037 | 0.0037 | 0.0037 | 0.0019 |
+ | cosine_recall@3 | 0.0131 | 0.0112 | 0.0093 | 0.0075 |
+ | cosine_recall@5 | 0.0485 | 0.0373 | 0.0466 | 0.0429 |
+ | cosine_recall@10 | 0.4496 | 0.4459 | 0.4366 | 0.4216 |
+ | **cosine_ndcg@10** | **0.1501** | **0.1489** | **0.1465** | **0.139** |
+ | cosine_mrr@10 | 0.0659 | 0.0653 | 0.0646 | 0.0595 |
+ | cosine_map@100 | 0.0862 | 0.0859 | 0.0847 | 0.079 |
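In the table above, recall@k equals accuracy@k in every column, and precision@k equals accuracy@k divided by k (for example, 0.4496 / 10 ≈ 0.045 for `dim_384`). That pattern is what one would expect when every query has exactly one relevant passage; this is an assumption inferred from the numbers, not something the card states. Under that assumption, the three metrics can be sketched as follows. The function name `metrics_at_k` and the example ranks are hypothetical, and this is not the evaluator's own code.

```python
def metrics_at_k(ranks, k):
    """Compute accuracy@k, precision@k and recall@k for single-answer queries.

    `ranks` holds, per query, the 1-based rank of its only relevant passage,
    or None when that passage was not retrieved at all.
    """
    hits = sum(1 for rank in ranks if rank is not None and rank <= k)
    accuracy = hits / len(ranks)
    return {
        "accuracy": accuracy,        # share of queries with a hit in the top k
        "precision": accuracy / k,   # at most 1 of the k results can be relevant
        "recall": accuracy,          # the single relevant passage is found or not
    }


# Four toy queries: hits at ranks 1 and 4, one at rank 12, one never retrieved.
example_ranks = [1, 4, 12, None]
result = metrics_at_k(example_ranks, k=5)
print(result)
```

With one relevant passage per query, precision@k is capped at 1/k, so the low precision values in the table follow directly from the accuracy values and carry no extra information.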
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Dataset
+
+ #### json
+
+ * Dataset: json
+ * Size: 4,820 training samples
+ * Columns: <code>positive</code> and <code>anchor</code>
+ * Approximate statistics based on the first 1000 samples:
+ | | positive | anchor |
+ |:--------|:--------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
+ | type | string | string |
+ | details | <ul><li>min: 100 tokens</li><li>mean: 248.18 tokens</li><li>max: 256 tokens</li></ul> | <ul><li>min: 9 tokens</li><li>mean: 15.06 tokens</li><li>max: 27 tokens</li></ul> |
+ * Samples:
+ | positive | anchor |
+ |:---------|:-------|
+ | <code>Appendix A <br>A-22 ATP 3-21.8 11 January 2024 <br>A-68. Observed fire. Usually is used when the platoon is in protected defensive positions <br>with engagement ranges more than 2,500 meters for stabilized systems (when attached)<br>and 1,500 meters for unstabilized systems. It can be employed between elements of the<br>platoon, such as the squad lasing and observing while the weapons squad engages. The<br>platoon leader directs one squad to engage. The remaining squads observe fires and<br>prepare to engage on order in case the engaging element consistently misses its targets ,<br>experiences a malfunction, or runs low on ammunition. Observed fire allows for mutual<br>observation and assistance while protecting the location of the observing elements.<br>A-69. Sequential fire. Entails the subordinate elements of a unit engaging the same point <br>or area target one after another in an arranged sequence. Sequential fire also can help to<br>prevent the waste of ammunition, as when a platoon waits to see the effects of the ...</code> | <code>What is the purpose of having one squad engage while others observe in an observed fire scenario?</code> |
556
+ | <code>Glossary <br>Glossary-4 ATP 3-21.8 11 January 2024 <br>PLD probable line of deployment <br>PPEP personal protective equipment posture <br>RFL restrictive fire line <br>RM risk management <br>ROE rules of engagement <br>RS reduced sensitivity <br>RTO radiotelephone operator <br>S-2 battalion or brigade intelligence staff officer <br>SALUTE size, activity, location, unit, time, and equipment <br>SDM squad-designated marksman <br>SITEMP situation template <br>SLM shoulder-launched munition <br>SOP standard operating procedure <br>STP Soldier training publication <br>TAA tactical assembly area <br>TC training circular <br>TCCC tactical combat casualty care <br>TLP troop leading procedures <br>TM technical manual <br>TRP target reference point <br>U.S. United States <br>WARNORD warning order <br>WCS weapons control status <br>WP white phosphorous <br>SECTION II – TERMS <br>actions on contact <br>A process to help leaders understand what is happening and to take action. <br>(FM 3-90) <br>air-ground operations <br>The simultaneous or synchronized employment of ground forces with avi...</code> | <code>How is the term SDM used in the military?</code> |
557
+ | <code>Chapter 1 <br>1-2 ATP 3-21.8 11 January 2024 <br>MISSION, CAPABILITIES, AND LIMITATIONS <br>1-2. The mission of the Infantry rifle platoon is to close with the enemy using fire and<br>movement to destroy or capture enemy forces , or to repel enemy attacks by fire , close<br>co<br>mbat, and counterattack to control land areas , including populations and resources.<br>The Infantry rifle platoon leader exercises command and control and directs the<br>operation of the platoon and attached units while conducting combined arms warfare<br>throughout the depth of the platoon’s area of operations (AO). Platoon missions ,<br>although not inclusive, may include reducing fortified areas , infiltrating and seizing<br>objectives in the enemy’ s rear, eliminating enemy force remnants in restricted terrain ,<br>securing key facilities and activities, and conducting operations in support of stability<br>operations tasks in the wake of maneuvering forces. Reconnaissance and surveillance<br>operations and security operations remain a core compe...</code> | <code>What offensive and defensive actions can an Infantry rifle platoon perform?</code> |
558
+ * Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
559
+ ```json
560
+ {
561
+ "loss": "MultipleNegativesRankingLoss",
562
+ "matryoshka_dims": [
563
+ 384,
564
+ 256,
565
+ 128,
566
+ 64
567
+ ],
568
+ "matryoshka_weights": [
569
+ 1,
570
+ 1,
571
+ 1,
572
+ 1
573
+ ],
574
+ "n_dims_per_step": -1
575
+ }
576
+ ```
577
+
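The `matryoshka_dims` above mean the loss is applied to the first 384, 256, 128 and 64 components of each embedding, so a shorter prefix can be used on its own at inference time. The sketch below illustrates only the arithmetic involved: keep the first `dim` components, then re-normalize, because a prefix of a unit-length vector is generally no longer unit length. The helper names are hypothetical and plain Python lists stand in for real embeddings; Sentence Transformers also offers a `truncate_dim` argument for this purpose.

```python
import math

# Dimensions listed in the loss configuration of this model card.
MATRYOSHKA_DIMS = [384, 256, 128, 64]


def truncate_and_normalize(vec, dim):
    """Keep the first `dim` components of `vec` and rescale to unit length."""
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # an all-zero prefix cannot be normalized
    return [x / norm for x in head]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)


# Toy 3-dimensional "embeddings" truncated to 2 dimensions.
query = truncate_and_normalize([3.0, 4.0, 12.0], 2)
passage = truncate_and_normalize([4.0, 3.0, -5.0], 2)
score = cosine(query, passage)
```

After truncation and re-normalization, a dot product and a cosine similarity give the same ranking, which matters when the index is built for dot-product search.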
578
+ ### Training Hyperparameters
579
+ #### Non-Default Hyperparameters
580
+
581
+ - `eval_strategy`: epoch
582
+ - `per_device_train_batch_size`: 64
583
+ - `per_device_eval_batch_size`: 16
584
+ - `gradient_accumulation_steps`: 8
585
+ - `num_train_epochs`: 20
586
+ - `lr_scheduler_type`: cosine
587
+ - `warmup_ratio`: 0.2
588
+ - `bf16`: True
589
+ - `tf32`: True
590
+ - `load_best_model_at_end`: True
591
+ - `optim`: adamw_torch_fused
592
+ - `batch_sampler`: no_duplicates
593
+
594
+ #### All Hyperparameters
595
+ <details><summary>Click to expand</summary>
596
+
597
+ - `overwrite_output_dir`: False
598
+ - `do_predict`: False
599
+ - `eval_strategy`: epoch
600
+ - `prediction_loss_only`: True
601
+ - `per_device_train_batch_size`: 64
602
+ - `per_device_eval_batch_size`: 16
603
+ - `per_gpu_train_batch_size`: None
604
+ - `per_gpu_eval_batch_size`: None
605
+ - `gradient_accumulation_steps`: 8
606
+ - `eval_accumulation_steps`: None
607
+ - `learning_rate`: 5e-05
608
+ - `weight_decay`: 0.0
609
+ - `adam_beta1`: 0.9
610
+ - `adam_beta2`: 0.999
611
+ - `adam_epsilon`: 1e-08
612
+ - `max_grad_norm`: 1.0
613
+ - `num_train_epochs`: 20
614
+ - `max_steps`: -1
615
+ - `lr_scheduler_type`: cosine
616
+ - `lr_scheduler_kwargs`: {}
617
+ - `warmup_ratio`: 0.2
618
+ - `warmup_steps`: 0
619
+ - `log_level`: passive
620
+ - `log_level_replica`: warning
621
+ - `log_on_each_node`: True
622
+ - `logging_nan_inf_filter`: True
623
+ - `save_safetensors`: True
624
+ - `save_on_each_node`: False
625
+ - `save_only_model`: False
626
+ - `restore_callback_states_from_checkpoint`: False
627
+ - `no_cuda`: False
628
+ - `use_cpu`: False
629
+ - `use_mps_device`: False
630
+ - `seed`: 42
631
+ - `data_seed`: None
632
+ - `jit_mode_eval`: False
633
+ - `use_ipex`: False
634
+ - `bf16`: True
635
+ - `fp16`: False
636
+ - `fp16_opt_level`: O1
637
+ - `half_precision_backend`: auto
638
+ - `bf16_full_eval`: False
639
+ - `fp16_full_eval`: False
640
+ - `tf32`: True
641
+ - `local_rank`: 0
642
+ - `ddp_backend`: None
643
+ - `tpu_num_cores`: None
644
+ - `tpu_metrics_debug`: False
645
+ - `debug`: []
646
+ - `dataloader_drop_last`: False
647
+ - `dataloader_num_workers`: 0
648
+ - `dataloader_prefetch_factor`: None
649
+ - `past_index`: -1
650
+ - `disable_tqdm`: False
651
+ - `remove_unused_columns`: True
652
+ - `label_names`: None
653
+ - `load_best_model_at_end`: True
654
+ - `ignore_data_skip`: False
655
+ - `fsdp`: []
656
+ - `fsdp_min_num_params`: 0
657
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
658
+ - `fsdp_transformer_layer_cls_to_wrap`: None
659
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
660
+ - `deepspeed`: None
661
+ - `label_smoothing_factor`: 0.0
662
+ - `optim`: adamw_torch_fused
663
+ - `optim_args`: None
664
+ - `adafactor`: False
665
+ - `group_by_length`: False
666
+ - `length_column_name`: length
667
+ - `ddp_find_unused_parameters`: None
668
+ - `ddp_bucket_cap_mb`: None
669
+ - `ddp_broadcast_buffers`: False
670
+ - `dataloader_pin_memory`: True
671
+ - `dataloader_persistent_workers`: False
672
+ - `skip_memory_metrics`: True
673
+ - `use_legacy_prediction_loop`: False
674
+ - `push_to_hub`: False
675
+ - `resume_from_checkpoint`: None
676
+ - `hub_model_id`: None
677
+ - `hub_strategy`: every_save
678
+ - `hub_private_repo`: False
679
+ - `hub_always_push`: False
680
+ - `gradient_checkpointing`: False
681
+ - `gradient_checkpointing_kwargs`: None
682
+ - `include_inputs_for_metrics`: False
683
+ - `eval_do_concat_batches`: True
684
+ - `fp16_backend`: auto
685
+ - `push_to_hub_model_id`: None
686
+ - `push_to_hub_organization`: None
687
+ - `mp_parameters`:
688
+ - `auto_find_batch_size`: False
689
+ - `full_determinism`: False
690
+ - `torchdynamo`: None
691
+ - `ray_scope`: last
692
+ - `ddp_timeout`: 1800
693
+ - `torch_compile`: False
694
+ - `torch_compile_backend`: None
695
+ - `torch_compile_mode`: None
696
+ - `dispatch_batches`: None
697
+ - `split_batches`: None
698
+ - `include_tokens_per_second`: False
699
+ - `include_num_input_tokens_seen`: False
700
+ - `neftune_noise_alpha`: None
701
+ - `optim_target_modules`: None
702
+ - `batch_eval_metrics`: False
703
+ - `prompts`: None
704
+ - `batch_sampler`: no_duplicates
705
+ - `multi_dataset_batch_sampler`: proportional
706
+
707
+ </details>
708
+
709
+ ### Training Logs
710
+ | Epoch | Step | Training Loss | dim_384_cosine_ndcg@10 | dim_256_cosine_ndcg@10 | dim_128_cosine_ndcg@10 | dim_64_cosine_ndcg@10 |
711
+ |:--------:|:-------:|:-------------:|:----------------------:|:----------------------:|:----------------------:|:---------------------:|
712
+ | 0.9474 | 9 | - | 0.1225 | 0.1221 | 0.1145 | 0.0915 |
713
+ | 1.0526 | 10 | 7.2521 | - | - | - | - |
714
+ | 2.0 | 19 | - | 0.1296 | 0.1261 | 0.1157 | 0.1089 |
715
+ | 2.1053 | 20 | 5.4977 | - | - | - | - |
716
+ | 2.9474 | 28 | - | 0.1294 | 0.1377 | 0.1262 | 0.1090 |
717
+ | 3.1579 | 30 | 4.3477 | - | - | - | - |
718
+ | 4.0 | 38 | - | 0.1330 | 0.1378 | 0.1260 | 0.1126 |
719
+ | 4.2105 | 40 | 3.3767 | - | - | - | - |
720
+ | 4.9474 | 47 | - | 0.1415 | 0.1388 | 0.1294 | 0.1221 |
721
+ | 5.2632 | 50 | 2.6443 | - | - | - | - |
722
+ | 6.0 | 57 | - | 0.1515 | 0.1395 | 0.1348 | 0.1218 |
723
+ | 6.3158 | 60 | 2.0824 | - | - | - | - |
724
+ | 6.9474 | 66 | - | 0.1480 | 0.1411 | 0.1335 | 0.1242 |
725
+ | 7.3684 | 70 | 1.6734 | - | - | - | - |
726
+ | 8.0 | 76 | - | 0.1491 | 0.1481 | 0.1428 | 0.1313 |
727
+ | 8.4211 | 80 | 1.3894 | - | - | - | - |
728
+ | 8.9474 | 85 | - | 0.1449 | 0.1497 | 0.1419 | 0.1341 |
729
+ | 9.4737 | 90 | 1.1443 | - | - | - | - |
730
+ | 10.0 | 95 | - | 0.1466 | 0.1494 | 0.1399 | 0.1396 |
731
+ | 10.5263 | 100 | 1.0121 | - | - | - | - |
732
+ | 10.9474 | 104 | - | 0.1458 | 0.1477 | 0.1415 | 0.1371 |
733
+ | 11.5789 | 110 | 0.8833 | - | - | - | - |
734
+ | 12.0 | 114 | - | 0.1479 | 0.1474 | 0.1445 | 0.1374 |
735
+ | 12.6316 | 120 | 0.8201 | - | - | - | - |
736
+ | 12.9474 | 123 | - | 0.1519 | 0.1486 | 0.1458 | 0.1360 |
737
+ | 13.6842 | 130 | 0.736 | - | - | - | - |
738
+ | **14.0** | **133** | **-** | **0.1505** | **0.1471** | **0.1484** | **0.1376** |
739
+ | 14.7368 | 140 | 0.6924 | - | - | - | - |
740
+ | 14.9474 | 142 | - | 0.1496 | 0.1486 | 0.1451 | 0.1396 |
741
+ | 15.7895 | 150 | 0.672 | - | - | - | - |
742
+ | 16.0 | 152 | - | 0.1492 | 0.1489 | 0.1464 | 0.1404 |
743
+ | 16.8421 | 160 | 0.6455 | - | - | - | - |
744
+ | 16.9474 | 161 | - | 0.1496 | 0.1493 | 0.1468 | 0.1389 |
745
+ | 17.8947 | 170 | 0.6538 | - | - | - | - |
746
+ | 18.0 | 171 | - | 0.1501 | 0.1470 | 0.1461 | 0.1393 |
747
+ | 18.9474 | 180 | 0.628 | 0.1501 | 0.1489 | 0.1465 | 0.1390 |
748
+
749
+ * The bold row denotes the saved checkpoint.
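The fractional epochs in the log follow from the hyperparameters listed earlier: 4,820 samples at `per_device_train_batch_size` 64 give 76 batches per epoch, and `gradient_accumulation_steps` 8 turns those into 9.5 optimizer steps per epoch, so step 9 falls at epoch 0.9474. The sketch below reproduces that arithmetic. It assumes a single training device, that the `no_duplicates` sampler does not change the batch count, and that the trainer floors the steps per epoch before multiplying by the epoch count; these are assumptions that happen to match the final logged step of 180, not statements taken from the card.

```python
import math


def schedule_summary(num_samples, batch_size, grad_accum, epochs, warmup_ratio):
    """Derive step counts from the training hyperparameters (single device assumed)."""
    batches_per_epoch = math.ceil(num_samples / batch_size)   # last partial batch is kept
    steps_per_epoch = max(batches_per_epoch // grad_accum, 1)  # floored, as assumed above
    total_steps = math.ceil(epochs * steps_per_epoch)
    return {
        "effective_batch_size": batch_size * grad_accum,
        "batches_per_epoch": batches_per_epoch,
        "total_optimizer_steps": total_steps,
        "warmup_steps": math.ceil(total_steps * warmup_ratio),
    }


def epoch_at_step(step, batches_per_epoch, grad_accum):
    """Fractional epoch reached after `step` optimizer steps."""
    return step * grad_accum / batches_per_epoch


summary = schedule_summary(
    num_samples=4820, batch_size=64, grad_accum=8, epochs=20, warmup_ratio=0.2
)
first_eval_epoch = round(epoch_at_step(9, summary["batches_per_epoch"], 8), 4)
```

Under these assumptions the learning rate warms up over roughly the first 36 optimizer steps, which is about the first four rows of evaluation in the log.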
750
+
751
+ ### Framework Versions
752
+ - Python: 3.10.12
753
+ - Sentence Transformers: 3.3.1
754
+ - Transformers: 4.41.2
755
+ - PyTorch: 2.1.2+cu121
756
+ - Accelerate: 0.34.2
757
+ - Datasets: 2.19.1
758
+ - Tokenizers: 0.19.1
759
+
760
+ ## Citation
761
+
762
+ ### BibTeX
763
+
764
+ #### Sentence Transformers
765
+ ```bibtex
766
+ @inproceedings{reimers-2019-sentence-bert,
767
+ title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
768
+ author = "Reimers, Nils and Gurevych, Iryna",
769
+ booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
770
+ month = "11",
771
+ year = "2019",
772
+ publisher = "Association for Computational Linguistics",
773
+ url = "https://arxiv.org/abs/1908.10084",
774
+ }
775
+ ```
776
+
777
+ #### MatryoshkaLoss
778
+ ```bibtex
779
+ @misc{kusupati2024matryoshka,
780
+ title={Matryoshka Representation Learning},
781
+ author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
782
+ year={2024},
783
+ eprint={2205.13147},
784
+ archivePrefix={arXiv},
785
+ primaryClass={cs.LG}
786
+ }
787
+ ```
788
+
789
+ #### MultipleNegativesRankingLoss
790
+ ```bibtex
791
+ @misc{henderson2017efficient,
792
+ title={Efficient Natural Language Response Suggestion for Smart Reply},
793
+ author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
794
+ year={2017},
795
+ eprint={1705.00652},
796
+ archivePrefix={arXiv},
797
+ primaryClass={cs.CL}
798
+ }
799
+ ```
800
+
801
+ <!--
802
+ ## Glossary
803
+
804
+ *Clearly define terms in order to be accessible across audiences.*
805
+ -->
806
+
807
+ <!--
808
+ ## Model Card Authors
809
+
810
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
811
+ -->
812
+
813
+ <!--
814
+ ## Model Card Contact
815
+
816
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
817
+ -->
config.json ADDED
@@ -0,0 +1,26 @@
1
+ {
2
+ "_name_or_path": "sentence-transformers/all-MiniLM-L6-v2",
3
+ "architectures": [
4
+ "BertModel"
5
+ ],
6
+ "attention_probs_dropout_prob": 0.1,
7
+ "classifier_dropout": null,
8
+ "gradient_checkpointing": false,
9
+ "hidden_act": "gelu",
10
+ "hidden_dropout_prob": 0.1,
11
+ "hidden_size": 384,
12
+ "initializer_range": 0.02,
13
+ "intermediate_size": 1536,
14
+ "layer_norm_eps": 1e-12,
15
+ "max_position_embeddings": 512,
16
+ "model_type": "bert",
17
+ "num_attention_heads": 12,
18
+ "num_hidden_layers": 6,
19
+ "pad_token_id": 0,
20
+ "position_embedding_type": "absolute",
21
+ "torch_dtype": "float32",
22
+ "transformers_version": "4.41.2",
23
+ "type_vocab_size": 2,
24
+ "use_cache": true,
25
+ "vocab_size": 30522
26
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
1
+ {
2
+ "__version__": {
3
+ "sentence_transformers": "3.3.1",
4
+ "transformers": "4.41.2",
5
+ "pytorch": "2.1.2+cu121"
6
+ },
7
+ "prompts": {},
8
+ "default_prompt_name": null,
9
+ "similarity_fn_name": "cosine"
10
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:acd7883d31509be9a9b2ca9d92d9b96b6d9be4abe4a4c027615ffb2541183bf9
3
+ size 90864192
modules.json ADDED
@@ -0,0 +1,20 @@
1
+ [
2
+ {
3
+ "idx": 0,
4
+ "name": "0",
5
+ "path": "",
6
+ "type": "sentence_transformers.models.Transformer"
7
+ },
8
+ {
9
+ "idx": 1,
10
+ "name": "1",
11
+ "path": "1_Pooling",
12
+ "type": "sentence_transformers.models.Pooling"
13
+ },
14
+ {
15
+ "idx": 2,
16
+ "name": "2",
17
+ "path": "2_Normalize",
18
+ "type": "sentence_transformers.models.Normalize"
19
+ }
20
+ ]
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
1
+ {
2
+ "max_seq_length": 256,
3
+ "do_lower_case": false
4
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
1
+ {
2
+ "cls_token": {
3
+ "content": "[CLS]",
4
+ "lstrip": false,
5
+ "normalized": false,
6
+ "rstrip": false,
7
+ "single_word": false
8
+ },
9
+ "mask_token": {
10
+ "content": "[MASK]",
11
+ "lstrip": false,
12
+ "normalized": false,
13
+ "rstrip": false,
14
+ "single_word": false
15
+ },
16
+ "pad_token": {
17
+ "content": "[PAD]",
18
+ "lstrip": false,
19
+ "normalized": false,
20
+ "rstrip": false,
21
+ "single_word": false
22
+ },
23
+ "sep_token": {
24
+ "content": "[SEP]",
25
+ "lstrip": false,
26
+ "normalized": false,
27
+ "rstrip": false,
28
+ "single_word": false
29
+ },
30
+ "unk_token": {
31
+ "content": "[UNK]",
32
+ "lstrip": false,
33
+ "normalized": false,
34
+ "rstrip": false,
35
+ "single_word": false
36
+ }
37
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,64 @@
1
+ {
2
+ "added_tokens_decoder": {
3
+ "0": {
4
+ "content": "[PAD]",
5
+ "lstrip": false,
6
+ "normalized": false,
7
+ "rstrip": false,
8
+ "single_word": false,
9
+ "special": true
10
+ },
11
+ "100": {
12
+ "content": "[UNK]",
13
+ "lstrip": false,
14
+ "normalized": false,
15
+ "rstrip": false,
16
+ "single_word": false,
17
+ "special": true
18
+ },
19
+ "101": {
20
+ "content": "[CLS]",
21
+ "lstrip": false,
22
+ "normalized": false,
23
+ "rstrip": false,
24
+ "single_word": false,
25
+ "special": true
26
+ },
27
+ "102": {
28
+ "content": "[SEP]",
29
+ "lstrip": false,
30
+ "normalized": false,
31
+ "rstrip": false,
32
+ "single_word": false,
33
+ "special": true
34
+ },
35
+ "103": {
36
+ "content": "[MASK]",
37
+ "lstrip": false,
38
+ "normalized": false,
39
+ "rstrip": false,
40
+ "single_word": false,
41
+ "special": true
42
+ }
43
+ },
44
+ "clean_up_tokenization_spaces": true,
45
+ "cls_token": "[CLS]",
46
+ "do_basic_tokenize": true,
47
+ "do_lower_case": true,
48
+ "mask_token": "[MASK]",
49
+ "max_length": 128,
50
+ "model_max_length": 256,
51
+ "never_split": null,
52
+ "pad_to_multiple_of": null,
53
+ "pad_token": "[PAD]",
54
+ "pad_token_type_id": 0,
55
+ "padding_side": "right",
56
+ "sep_token": "[SEP]",
57
+ "stride": 0,
58
+ "strip_accents": null,
59
+ "tokenize_chinese_chars": true,
60
+ "tokenizer_class": "BertTokenizer",
61
+ "truncation_side": "right",
62
+ "truncation_strategy": "longest_first",
63
+ "unk_token": "[UNK]"
64
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff