---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:78279102
- loss:RZTKMatryoshka2dLoss
base_model: intfloat/multilingual-e5-base
widget:
- source_sentence: 'query: мужские кроссовки'
sentences:
- 'passage: Мужские кроссовки Bona 886TB 41 27 см Коричневые (BN_2000000205885)'
- 'passage: Чоловічі кросівки New Balance Габарити С Стандарт (до 300x200x250 мм)
Колір Чорний Матеріал верху Комбінований верх Матеріал підкладки Текстиль Матеріал
підошви Гума Розмір 40.5 Сезон Весняний Сезон Осінній Сезон Літній Кількість вантажних
місць 1 Країна реєстрації бренда США Країна-виробник товару Китай Призначення
Повсякденні Сегмент Спорт Мембрана Немає Розпродаж Товари зі знижкою Підошва Товста,
понад 2 см Наявність товара по містах Київ і область Доставка Готовий до відправлення
Доставка Доставка в магазини ROZETKA'
- 'passage: Чоловічі кросівки New Balance 574 Classic ML574EGN 40 (7.5) 25.5 см
Темно-сині (739655803697)'
- source_sentence: 'query: мужские кроссовки'
sentences:
- 'passage: Чоловічі кросівки New Balance Габарити С Стандарт (до 300x200x250 мм)
Колір Чорний Матеріал верху Текстиль Матеріал верху Шкіра Матеріал підкладки Текстиль
Матеріал підошви Гума Розмір 40.5 Сезон Весняний Сезон Осінній Сезон Літній Кількість
вантажних місць 1 Країна реєстрації бренда США Країна-виробник товару В''єтнам
Призначення Повсякденні Сегмент Спорт Мембрана Немає Тип гарантійного талона Гарантія
по чеку Доставка Premium Немає Доставка Доставка в магазини ROZETKA'
- 'passage: Мужские кроссовки ASICS Габариты_old C Стандарт (до 300x200x250 мм)
Цвет Серый Цвет Черный Материал верха Текстиль Материал верха Искусственная кожа
Материал подкладки Текстиль Материал подошвы Резина Размер 42.5 Сезон Осенний
Сезон Летний Сезон Весенний Количество грузовых мест 1 Страна регистрации бренда
Япония Страна-производитель товара Камбоджа Сегмент Спорт Конечная категория 1:С
Мужские кроссовки Мембрана Нет Подошва Толстая, более 2 см Наличие товара по городам
Киев и область Доставка Доставка в магазины ROZETKA'
- 'passage: Чоловічі кросівки New Balance Колір Темно-синій Матеріал верху Текстиль
Матеріал верху Синтетика Матеріал підкладки Текстиль Матеріал підошви Гума Розмір
47.5 Сезон Весняний Сезон Осінній Сезон Літній Кількість вантажних місць 1 Країна
реєстрації бренда США Країна-виробник товару В''єтнам Призначення Повсякденні
Сегмент Спорт Мембрана Немає Доставка Доставка в магазини ROZETKA'
- source_sentence: 'query: мужские кроссовки'
sentences:
- 'passage: Чоловічі кросівки Bona 884 VB\\4 47 31 см Чорні (BN_2000000231044)'
- 'passage: Мужские кроссовки Материал верха Искусственная кожа Материал верха Экокожа
Материал подкладки Текстиль Размер 44 Сезон Осенний Сезон Весенний Страна-производитель
товара Китай'
- 'passage: Навігаційні карти для GPS Garmin'
- source_sentence: 'query: полотенце уголок'
sentences:
- 'passage: Полотенце Baby Line Уголок 80х85 см Молочный с серым 302764Ж'
- 'passage: Мужские кроссовки New Balance Габариты_old C Стандарт (до 300x200x250
мм) Цвет Зеленый Материал верха Замша Материал верха Канвас Материал подкладки
Текстиль Материал подошвы EVA (этиленвинилацетат) Размер 46.5 Сезон Летний Сезон
Осенний Сезон Весенний Количество грузовых мест 1 Страна регистрации бренда США
Страна-производитель товара Индонезия Назначение Повседневные Сегмент Спорт Мембрана
Нет Тип кроссовок Сникеры Доставка Доставка в магазины ROZETKA'
- 'passage: Кросівки чоловічі Bonote 4415-755 44р текстиль червоні'
- source_sentence: 'query: мужские кроссовки'
sentences:
- 'passage: Чоловічі кросівки New Balance Колір Чорний Матеріал верху Замша Матеріал
верху Синтетика Матеріал підкладки Текстиль Матеріал підошви Гума Розмір 45.5
Сезон Весняний Сезон Осінній Кількість вантажних місць 1 Країна реєстрації бренда
США Країна-виробник товару В''єтнам Призначення Повсякденні Сегмент Спорт Мембрана
Немає Розпродаж Товари зі знижкою Підошва Товста, понад 2 см Наявність товара
по містах Київ і область Доставка Доставка в магазини ROZETKA Доставка Готовий
до відправлення'
- 'passage: Мужские кроссовки New Balance 393 ML393SS1 46.5 (12US) 30 см Синие (739980526742)'
- 'passage: Чоловічі кросівки New Balance Габарити С Стандарт (до 300x200x250 мм)
Колір Синій Матеріал верху Шкіра Матеріал верху Текстиль Матеріал підкладки Текстиль
Матеріал підошви EVA (етиленвінілоцетат) Розмір 40.5 Сезон Осінній Сезон Весняний
Сезон Літній Кількість вантажних місць 1 Країна реєстрації бренда США Країна-виробник
товару Індонезія Призначення Повсякденні Сегмент Спорт Кінцева категорія 1:С Чоловіче
взуття KSHCH Мембрана Немає Підошва Тонка, менш ніж 2 см Доставка Premium Немає
Наявність товара по містах Київ і область Доставка Доставка в магазини ROZETKA'
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- dot_accuracy_10
- dot_precision_10
- dot_recall_10
- dot_ndcg_10
- dot_mrr_10
- dot_map_60
- dot_accuracy_1
- dot_accuracy_3
- dot_accuracy_5
- dot_precision_1
- dot_precision_3
- dot_precision_5
- dot_recall_1
- dot_recall_3
- dot_recall_5
- dot_map_100
- dot_ndcg_1
- dot_mrr_1
model-index:
- name: SentenceTransformer based on intfloat/multilingual-e5-base
results:
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: validation matryoshka dim 768
type: validation--matryoshka_dim-768--
metrics:
- type: dot_accuracy_10
value: 0.42588860687792074
name: Dot Accuracy 10
- type: dot_precision_10
value: 0.06357991743009335
name: Dot Precision 10
- type: dot_recall_10
value: 0.33517275284626125
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.2212029948255964
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.2060910608402061
name: Dot Mrr 10
- type: dot_map_60
value: 0.18852453983912593
name: Dot Map 60
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: bm full
type: bm-full
metrics:
- type: dot_accuracy_1
value: 0.6880692167577414
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.7950819672131147
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.8401639344262295
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 0.900728597449909
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.6880692167577414
name: Dot Precision 1
- type: dot_precision_3
value: 0.6765330904675166
name: Dot Precision 3
- type: dot_precision_5
value: 0.6573770491803279
name: Dot Precision 5
- type: dot_precision_10
value: 0.6168032786885246
name: Dot Precision 10
- type: dot_recall_1
value: 0.04919472778546643
name: Dot Recall 1
- type: dot_recall_3
value: 0.14140951546955147
name: Dot Recall 3
- type: dot_recall_5
value: 0.2199045739985279
name: Dot Recall 5
- type: dot_recall_10
value: 0.3814062276483767
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.6616967554768699
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.750769797900945
name: Dot Mrr 10
- type: dot_map_100
value: 0.6325215080911376
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: core uk title
type: core-uk-title
metrics:
- type: dot_accuracy_1
value: 0.7979002624671916
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.9396325459317585
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.968503937007874
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 0.9934383202099738
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.7979002624671916
name: Dot Precision 1
- type: dot_precision_3
value: 0.7405949256342957
name: Dot Precision 3
- type: dot_precision_5
value: 0.647769028871391
name: Dot Precision 5
- type: dot_precision_10
value: 0.3937007874015748
name: Dot Precision 10
- type: dot_recall_1
value: 0.25151283172936717
name: Dot Recall 1
- type: dot_recall_3
value: 0.5937393242511353
name: Dot Recall 3
- type: dot_recall_5
value: 0.8012008915552222
name: Dot Recall 5
- type: dot_recall_10
value: 0.9422051410240386
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.8743553654025616
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.8704469233012537
name: Dot Mrr 10
- type: dot_map_100
value: 0.8369869170050552
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: core ru title
type: core-ru-title
metrics:
- type: dot_accuracy_1
value: 0.800524934383202
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.9356955380577427
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.963254593175853
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 0.9921259842519685
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.800524934383202
name: Dot Precision 1
- type: dot_precision_3
value: 0.7436570428696414
name: Dot Precision 3
- type: dot_precision_5
value: 0.6517060367454067
name: Dot Precision 5
- type: dot_precision_10
value: 0.39435695538057736
name: Dot Precision 10
- type: dot_recall_1
value: 0.2526918510186227
name: Dot Recall 1
- type: dot_recall_3
value: 0.5919515268924718
name: Dot Recall 3
- type: dot_recall_5
value: 0.8057399075115611
name: Dot Recall 5
- type: dot_recall_10
value: 0.9417807149106362
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.8763214781030534
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.8702183060450777
name: Dot Mrr 10
- type: dot_map_100
value: 0.8401260583425492
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: core uk options
type: core-uk-options
metrics:
- type: dot_accuracy_1
value: 0.7139107611548556
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.8700787401574803
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.9251968503937008
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 0.9698162729658792
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.7139107611548556
name: Dot Precision 1
- type: dot_precision_3
value: 0.6596675415573053
name: Dot Precision 3
- type: dot_precision_5
value: 0.5847769028871391
name: Dot Precision 5
- type: dot_precision_10
value: 0.378740157480315
name: Dot Precision 10
- type: dot_recall_1
value: 0.21595842186393369
name: Dot Recall 1
- type: dot_recall_3
value: 0.5151022788818065
name: Dot Recall 3
- type: dot_recall_5
value: 0.7152309086364205
name: Dot Recall 5
- type: dot_recall_10
value: 0.8968248760571595
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.8049762538427163
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.802705390992793
name: Dot Mrr 10
- type: dot_map_100
value: 0.7555902593514171
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: core ru options
type: core-ru-options
metrics:
- type: dot_accuracy_1
value: 0.7152230971128609
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.8740157480314961
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.9212598425196851
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 0.973753280839895
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.7152230971128609
name: Dot Precision 1
- type: dot_precision_3
value: 0.6544181977252843
name: Dot Precision 3
- type: dot_precision_5
value: 0.5818897637795275
name: Dot Precision 5
- type: dot_precision_10
value: 0.3775590551181102
name: Dot Precision 10
- type: dot_recall_1
value: 0.21795400574928137
name: Dot Recall 1
- type: dot_recall_3
value: 0.5125494729825439
name: Dot Recall 3
- type: dot_recall_5
value: 0.7149486522518018
name: Dot Recall 5
- type: dot_recall_10
value: 0.896090072074324
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.8034790842980009
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.8032173061700623
name: Dot Mrr 10
- type: dot_map_100
value: 0.7534769650854939
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: options uk title
type: options-uk-title
metrics:
- type: dot_accuracy_1
value: 0.8058252427184466
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.9563106796116505
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.9733009708737864
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 0.9902912621359223
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.8058252427184466
name: Dot Precision 1
- type: dot_precision_3
value: 0.7508090614886731
name: Dot Precision 3
- type: dot_precision_5
value: 0.587378640776699
name: Dot Precision 5
- type: dot_precision_10
value: 0.33519417475728147
name: Dot Precision 10
- type: dot_recall_1
value: 0.25022538141470185
name: Dot Recall 1
- type: dot_recall_3
value: 0.6713881183541378
name: Dot Recall 3
- type: dot_recall_5
value: 0.8486592695330559
name: Dot Recall 5
- type: dot_recall_10
value: 0.9589603559870551
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.8791414383223461
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.8813309061488673
name: Dot Mrr 10
- type: dot_map_100
value: 0.8250647619341923
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: options ru title
type: options-ru-title
metrics:
- type: dot_accuracy_1
value: 0.8155339805825242
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.9563106796116505
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.9781553398058253
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 0.9975728155339806
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.8155339805825242
name: Dot Precision 1
- type: dot_precision_3
value: 0.7508090614886731
name: Dot Precision 3
- type: dot_precision_5
value: 0.5868932038834951
name: Dot Precision 5
- type: dot_precision_10
value: 0.3364077669902912
name: Dot Precision 10
- type: dot_recall_1
value: 0.2539875173370319
name: Dot Recall 1
- type: dot_recall_3
value: 0.670659963014332
name: Dot Recall 3
- type: dot_recall_5
value: 0.8476277161349977
name: Dot Recall 5
- type: dot_recall_10
value: 0.9617718446601942
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.8813867571126252
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.8869394359685622
name: Dot Mrr 10
- type: dot_map_100
value: 0.8258079727120101
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: options uk options
type: options-uk-options
metrics:
- type: dot_accuracy_1
value: 0.6820388349514563
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.8616504854368932
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.9199029126213593
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 0.9733009708737864
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.6820388349514563
name: Dot Precision 1
- type: dot_precision_3
value: 0.6310679611650486
name: Dot Precision 3
- type: dot_precision_5
value: 0.5101941747572816
name: Dot Precision 5
- type: dot_precision_10
value: 0.31092233009708736
name: Dot Precision 10
- type: dot_recall_1
value: 0.21201745261211283
name: Dot Recall 1
- type: dot_recall_3
value: 0.5627716134997689
name: Dot Recall 3
- type: dot_recall_5
value: 0.7359367776236707
name: Dot Recall 5
- type: dot_recall_10
value: 0.884203074433657
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.780013643445193
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.7837686854677148
name: Dot Mrr 10
- type: dot_map_100
value: 0.7174231143401113
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: options ru options
type: options-ru-options
metrics:
- type: dot_accuracy_1
value: 0.6966019417475728
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.8592233009708737
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.9150485436893204
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 0.9611650485436893
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.6966019417475728
name: Dot Precision 1
- type: dot_precision_3
value: 0.6351132686084143
name: Dot Precision 3
- type: dot_precision_5
value: 0.5077669902912622
name: Dot Precision 5
- type: dot_precision_10
value: 0.30946601941747565
name: Dot Precision 10
- type: dot_recall_1
value: 0.21699318076745258
name: Dot Recall 1
- type: dot_recall_3
value: 0.5668169209431345
name: Dot Recall 3
- type: dot_recall_5
value: 0.7341568423485899
name: Dot Recall 5
- type: dot_recall_10
value: 0.8813309061488673
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.781594298753812
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.7883071351517954
name: Dot Mrr 10
- type: dot_map_100
value: 0.7214648145989936
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: rusisms uk title
type: rusisms-uk-title
metrics:
- type: dot_accuracy_1
value: 0.8692307692307693
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.9153846153846154
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.9384615384615385
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 0.9615384615384616
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.8692307692307693
name: Dot Precision 1
- type: dot_precision_3
value: 0.8102564102564103
name: Dot Precision 3
- type: dot_precision_5
value: 0.7676923076923077
name: Dot Precision 5
- type: dot_precision_10
value: 0.6692307692307692
name: Dot Precision 10
- type: dot_recall_1
value: 0.1772882378234196
name: Dot Recall 1
- type: dot_recall_3
value: 0.3654320398636721
name: Dot Recall 3
- type: dot_recall_5
value: 0.5078467457656214
name: Dot Recall 5
- type: dot_recall_10
value: 0.734180656185287
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.885617714259767
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.8986538461538461
name: Dot Mrr 10
- type: dot_map_100
value: 0.8819193356867251
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: rusisms ru title
type: rusisms-ru-title
metrics:
- type: dot_accuracy_1
value: 0.8769230769230769
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.9230769230769231
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.9307692307692308
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 0.9615384615384616
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.8769230769230769
name: Dot Precision 1
- type: dot_precision_3
value: 0.8153846153846154
name: Dot Precision 3
- type: dot_precision_5
value: 0.7784615384615384
name: Dot Precision 5
- type: dot_precision_10
value: 0.6684615384615386
name: Dot Precision 10
- type: dot_recall_1
value: 0.17684867738385918
name: Dot Recall 1
- type: dot_recall_3
value: 0.37118778738412556
name: Dot Recall 3
- type: dot_recall_5
value: 0.5151083730272487
name: Dot Recall 5
- type: dot_recall_10
value: 0.7306347757372329
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.887297925013266
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.9031715506715506
name: Dot Mrr 10
- type: dot_map_100
value: 0.8851857301797741
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: rusisms uk options
type: rusisms-uk-options
metrics:
- type: dot_accuracy_1
value: 0.7538461538461538
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.8461538461538461
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.8692307692307693
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 0.9307692307692308
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.7538461538461538
name: Dot Precision 1
- type: dot_precision_3
value: 0.7256410256410257
name: Dot Precision 3
- type: dot_precision_5
value: 0.6938461538461538
name: Dot Precision 5
- type: dot_precision_10
value: 0.6230769230769231
name: Dot Precision 10
- type: dot_recall_1
value: 0.1452272791032537
name: Dot Recall 1
- type: dot_recall_3
value: 0.3350024563042932
name: Dot Recall 3
- type: dot_recall_5
value: 0.4603957653734645
name: Dot Recall 5
- type: dot_recall_10
value: 0.6929500632749596
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.8086707693665091
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.8050732600732601
name: Dot Mrr 10
- type: dot_map_100
value: 0.8080217578154502
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: rusisms ru options
type: rusisms-ru-options
metrics:
- type: dot_accuracy_1
value: 0.7846153846153846
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.8615384615384616
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.8923076923076924
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 0.9384615384615385
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.7846153846153846
name: Dot Precision 1
- type: dot_precision_3
value: 0.7435897435897435
name: Dot Precision 3
- type: dot_precision_5
value: 0.7092307692307693
name: Dot Precision 5
- type: dot_precision_10
value: 0.6276923076923077
name: Dot Precision 10
- type: dot_recall_1
value: 0.15657018137006903
name: Dot Recall 1
- type: dot_recall_3
value: 0.34203614157327256
name: Dot Recall 3
- type: dot_recall_5
value: 0.4736550306180239
name: Dot Recall 5
- type: dot_recall_10
value: 0.7020854572217345
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.8248849777294873
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.8334157509157508
name: Dot Mrr 10
- type: dot_map_100
value: 0.8218542435311202
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: rusisms corrected uk title
type: rusisms_corrected-uk-title
metrics:
- type: dot_accuracy_1
value: 0.9076923076923077
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.9769230769230769
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.9846153846153847
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 0.9923076923076923
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.9076923076923077
name: Dot Precision 1
- type: dot_precision_3
value: 0.8512820512820513
name: Dot Precision 3
- type: dot_precision_5
value: 0.8015384615384615
name: Dot Precision 5
- type: dot_precision_10
value: 0.6915384615384615
name: Dot Precision 10
- type: dot_recall_1
value: 0.18307445643572517
name: Dot Recall 1
- type: dot_recall_3
value: 0.39418935103786307
name: Dot Recall 3
- type: dot_recall_5
value: 0.5402311488021982
name: Dot Recall 5
- type: dot_recall_10
value: 0.7692994472968094
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.9264250742035229
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.9442307692307693
name: Dot Mrr 10
- type: dot_map_100
value: 0.9165774988153887
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: rusisms corrected ru title
type: rusisms_corrected-ru-title
metrics:
- type: dot_accuracy_1
value: 0.9153846153846154
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.9538461538461539
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.9769230769230769
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 1.0
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.9153846153846154
name: Dot Precision 1
- type: dot_precision_3
value: 0.8461538461538461
name: Dot Precision 3
- type: dot_precision_5
value: 0.8
name: Dot Precision 5
- type: dot_precision_10
value: 0.6953846153846154
name: Dot Precision 10
- type: dot_recall_1
value: 0.18422830258957132
name: Dot Recall 1
- type: dot_recall_3
value: 0.3931672032510094
name: Dot Recall 3
- type: dot_recall_5
value: 0.5419787067997563
name: Dot Recall 5
- type: dot_recall_10
value: 0.7813827806301428
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.9303098957888767
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.9422435897435898
name: Dot Mrr 10
- type: dot_map_100
value: 0.9161565073213329
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: rusisms corrected uk options
type: rusisms_corrected-uk-options
metrics:
- type: dot_accuracy_1
value: 0.8307692307692308
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.8846153846153846
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.9307692307692308
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 0.9692307692307692
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.8307692307692308
name: Dot Precision 1
- type: dot_precision_3
value: 0.7846153846153846
name: Dot Precision 3
- type: dot_precision_5
value: 0.7476923076923078
name: Dot Precision 5
- type: dot_precision_10
value: 0.6684615384615384
name: Dot Precision 10
- type: dot_recall_1
value: 0.1620099408859155
name: Dot Recall 1
- type: dot_recall_3
value: 0.3498019084151263
name: Dot Recall 3
- type: dot_recall_5
value: 0.48840890886463323
name: Dot Recall 5
- type: dot_recall_10
value: 0.7411561720134413
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.8712426951769946
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.8702014652014651
name: Dot Mrr 10
- type: dot_map_100
value: 0.8644624071354806
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: rusisms corrected ru options
type: rusisms_corrected-ru-options
metrics:
- type: dot_accuracy_1
value: 0.8538461538461538
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.9153846153846154
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.9384615384615385
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 0.9923076923076923
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.8538461538461538
name: Dot Precision 1
- type: dot_precision_3
value: 0.8
name: Dot Precision 3
- type: dot_precision_5
value: 0.7569230769230769
name: Dot Precision 5
- type: dot_precision_10
value: 0.6792307692307691
name: Dot Precision 10
- type: dot_recall_1
value: 0.1651776310388998
name: Dot Recall 1
- type: dot_recall_3
value: 0.36901355461206664
name: Dot Recall 3
- type: dot_recall_5
value: 0.5018008192418377
name: Dot Recall 5
- type: dot_recall_10
value: 0.7564499451477719
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.890585132383007
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.8962912087912088
name: Dot Mrr 10
- type: dot_map_100
value: 0.8764066453581607
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: core typos uk title
type: core_typos-uk-title
metrics:
- type: dot_accuracy_1
value: 0.7335958005249343
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.8923884514435696
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.9383202099737533
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 0.9711286089238845
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.7335958005249343
name: Dot Precision 1
- type: dot_precision_3
value: 0.6771653543307087
name: Dot Precision 3
- type: dot_precision_5
value: 0.5965879265091864
name: Dot Precision 5
- type: dot_precision_10
value: 0.37099737532808397
name: Dot Precision 10
- type: dot_recall_1
value: 0.22345696371286924
name: Dot Recall 1
- type: dot_recall_3
value: 0.537916614589843
name: Dot Recall 3
- type: dot_recall_5
value: 0.7394685039370078
name: Dot Recall 5
- type: dot_recall_10
value: 0.8864074282381368
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.8103439591838156
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.8195814064908555
name: Dot Mrr 10
- type: dot_map_100
value: 0.7667606510148323
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: core typos ru title
type: core_typos-ru-title
metrics:
- type: dot_accuracy_1
value: 0.7349081364829396
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.8884514435695539
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.931758530183727
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 0.9698162729658792
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.7349081364829396
name: Dot Precision 1
- type: dot_precision_3
value: 0.6824146981627297
name: Dot Precision 3
- type: dot_precision_5
value: 0.5979002624671916
name: Dot Precision 5
- type: dot_precision_10
value: 0.3728346456692913
name: Dot Precision 10
- type: dot_recall_1
value: 0.2256493979919177
name: Dot Recall 1
- type: dot_recall_3
value: 0.5402949631296088
name: Dot Recall 3
- type: dot_recall_5
value: 0.7408495813023372
name: Dot Recall 5
- type: dot_recall_10
value: 0.8911636045494314
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.8135683948289655
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.8157193892430111
name: Dot Mrr 10
- type: dot_map_100
value: 0.7702420513632691
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: core typos uk options
type: core_typos-uk-options
metrics:
- type: dot_accuracy_1
value: 0.636482939632546
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.7979002624671916
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.8622047244094488
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 0.931758530183727
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.636482939632546
name: Dot Precision 1
- type: dot_precision_3
value: 0.5796150481189851
name: Dot Precision 3
- type: dot_precision_5
value: 0.5183727034120735
name: Dot Precision 5
- type: dot_precision_10
value: 0.34409448818897637
name: Dot Precision 10
- type: dot_recall_1
value: 0.18990490771986834
name: Dot Recall 1
- type: dot_recall_3
value: 0.448730887805691
name: Dot Recall 3
- type: dot_recall_5
value: 0.6348623088780568
name: Dot Recall 5
- type: dot_recall_10
value: 0.8162078698496021
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.7214244648930405
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.7316762488022333
name: Dot Mrr 10
- type: dot_map_100
value: 0.6729057291822153
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: core typos ru options
type: core_typos-ru-options
metrics:
- type: dot_accuracy_1
value: 0.6351706036745407
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.7847769028871391
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.8582677165354331
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 0.937007874015748
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.6351706036745407
name: Dot Precision 1
- type: dot_precision_3
value: 0.5761154855643045
name: Dot Precision 3
- type: dot_precision_5
value: 0.5173228346456693
name: Dot Precision 5
- type: dot_precision_10
value: 0.34593175853018376
name: Dot Precision 10
- type: dot_recall_1
value: 0.19054441111527726
name: Dot Recall 1
- type: dot_recall_3
value: 0.44878973461650623
name: Dot Recall 3
- type: dot_recall_5
value: 0.6357241803107945
name: Dot Recall 5
- type: dot_recall_10
value: 0.8196230679498395
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.7230020232672213
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.7299978127734036
name: Dot Mrr 10
- type: dot_map_100
value: 0.6728700841478042
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: vespa uk title
type: vespa-uk-title
metrics:
- type: dot_accuracy_1
value: 0.8709677419354839
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.8817204301075269
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.8817204301075269
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 0.9247311827956989
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.8709677419354839
name: Dot Precision 1
- type: dot_precision_3
value: 0.8028673835125448
name: Dot Precision 3
- type: dot_precision_5
value: 0.7612903225806451
name: Dot Precision 5
- type: dot_precision_10
value: 0.6946236559139786
name: Dot Precision 10
- type: dot_recall_1
value: 0.14356256029238396
name: Dot Recall 1
- type: dot_recall_3
value: 0.2696089420858031
name: Dot Recall 3
- type: dot_recall_5
value: 0.34637894043861595
name: Dot Recall 5
- type: dot_recall_10
value: 0.47693319426808783
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.8727078333970919
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.8814217443249702
name: Dot Mrr 10
- type: dot_map_100
value: 0.8669030577552067
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: vespa ru title
type: vespa-ru-title
metrics:
- type: dot_accuracy_1
value: 0.8494623655913979
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.8817204301075269
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.8924731182795699
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 0.9032258064516129
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.8494623655913979
name: Dot Precision 1
- type: dot_precision_3
value: 0.7956989247311828
name: Dot Precision 3
- type: dot_precision_5
value: 0.7548387096774194
name: Dot Precision 5
- type: dot_precision_10
value: 0.6860215053763441
name: Dot Precision 10
- type: dot_recall_1
value: 0.14082995565362097
name: Dot Recall 1
- type: dot_recall_3
value: 0.26920608655098277
name: Dot Recall 3
- type: dot_recall_5
value: 0.34688986646273845
name: Dot Recall 5
- type: dot_recall_10
value: 0.4599598002785648
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.8595557706598426
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.8672939068100358
name: Dot Mrr 10
- type: dot_map_100
value: 0.8558400344610504
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: vespa uk options
type: vespa-uk-options
metrics:
- type: dot_accuracy_1
value: 0.7526881720430108
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.7956989247311828
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.8279569892473119
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 0.8387096774193549
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.7526881720430108
name: Dot Precision 1
- type: dot_precision_3
value: 0.7168458781362006
name: Dot Precision 3
- type: dot_precision_5
value: 0.6903225806451613
name: Dot Precision 5
- type: dot_precision_10
value: 0.643010752688172
name: Dot Precision 10
- type: dot_recall_1
value: 0.10681817889990873
name: Dot Recall 1
- type: dot_recall_3
value: 0.2094140038954732
name: Dot Recall 3
- type: dot_recall_5
value: 0.2810704576746759
name: Dot Recall 5
- type: dot_recall_10
value: 0.4028083354913771
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.7757647000893699
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.7811827956989247
name: Dot Mrr 10
- type: dot_map_100
value: 0.7471494381391597
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: vespa ru options
type: vespa-ru-options
metrics:
- type: dot_accuracy_1
value: 0.7526881720430108
name: Dot Accuracy 1
- type: dot_accuracy_3
value: 0.8172043010752689
name: Dot Accuracy 3
- type: dot_accuracy_5
value: 0.8387096774193549
name: Dot Accuracy 5
- type: dot_accuracy_10
value: 0.8387096774193549
name: Dot Accuracy 10
- type: dot_precision_1
value: 0.7526881720430108
name: Dot Precision 1
- type: dot_precision_3
value: 0.7132616487455197
name: Dot Precision 3
- type: dot_precision_5
value: 0.6731182795698925
name: Dot Precision 5
- type: dot_precision_10
value: 0.6215053763440861
name: Dot Precision 10
- type: dot_recall_1
value: 0.10197946922248938
name: Dot Recall 1
- type: dot_recall_3
value: 0.2060788464076136
name: Dot Recall 3
- type: dot_recall_5
value: 0.27440671690955265
name: Dot Recall 5
- type: dot_recall_10
value: 0.3816496590177595
name: Dot Recall 10
- type: dot_ndcg_10
value: 0.7524234839353887
name: Dot Ndcg 10
- type: dot_mrr_10
value: 0.7838709677419355
name: Dot Mrr 10
- type: dot_map_100
value: 0.7264872998412837
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: 'bm full matryoshka dim 768 '
type: bm-full--matryoshka_dim-768--
metrics:
- type: dot_accuracy_1
value: 0.6880692167577414
name: Dot Accuracy 1
- type: dot_precision_1
value: 0.6880692167577414
name: Dot Precision 1
- type: dot_recall_1
value: 0.04919472778546643
name: Dot Recall 1
- type: dot_ndcg_1
value: 0.6880692167577414
name: Dot Ndcg 1
- type: dot_mrr_1
value: 0.6880692167577414
name: Dot Mrr 1
- type: dot_map_100
value: 0.6325215080911376
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: 'bm full matryoshka dim 512 '
type: bm-full--matryoshka_dim-512--
metrics:
- type: dot_accuracy_1
value: 0.6771402550091075
name: Dot Accuracy 1
- type: dot_precision_1
value: 0.6771402550091075
name: Dot Precision 1
- type: dot_recall_1
value: 0.04812670377587935
name: Dot Recall 1
- type: dot_ndcg_1
value: 0.6771402550091075
name: Dot Ndcg 1
- type: dot_mrr_1
value: 0.6771402550091075
name: Dot Mrr 1
- type: dot_map_100
value: 0.6267606209459006
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: 'bm full matryoshka dim 256 '
type: bm-full--matryoshka_dim-256--
metrics:
- type: dot_accuracy_1
value: 0.6653005464480874
name: Dot Accuracy 1
- type: dot_precision_1
value: 0.6653005464480874
name: Dot Precision 1
- type: dot_recall_1
value: 0.04734072895659965
name: Dot Recall 1
- type: dot_ndcg_1
value: 0.6653005464480874
name: Dot Ndcg 1
- type: dot_mrr_1
value: 0.6653005464480874
name: Dot Mrr 1
- type: dot_map_100
value: 0.6160451076839756
name: Dot Map 100
- task:
type: rztkinformation-retrieval
name: RZTKInformation Retrieval
dataset:
name: 'bm full matryoshka dim 128 '
type: bm-full--matryoshka_dim-128--
metrics:
- type: dot_accuracy_1
value: 0.6530054644808743
name: Dot Accuracy 1
- type: dot_precision_1
value: 0.6530054644808743
name: Dot Precision 1
- type: dot_recall_1
value: 0.045669118220938276
name: Dot Recall 1
- type: dot_ndcg_1
value: 0.6530054644808743
name: Dot Ndcg 1
- type: dot_mrr_1
value: 0.6530054644808743
name: Dot Mrr 1
- type: dot_map_100
value: 0.586734716577342
name: Dot Map 100
---
# SentenceTransformer based on intfloat/multilingual-e5-base
This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned from [intfloat/multilingual-e5-base](https://huggingface.co/intfloat/multilingual-e5-base) on the rozetka_positive_pairs dataset. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
## Model Details
### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [intfloat/multilingual-e5-base](https://huggingface.co/intfloat/multilingual-e5-base)
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Dot Product
- **Training Dataset:**
- rozetka_positive_pairs
### Model Sources
- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
### Full Model Architecture
```
RZTKSentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: XLMRobertaModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
```
## Usage
### Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("yklymchuk-rztk/multilingual-e5-base-matryoshka2d-mnr-13")
# Run inference
# Inputs carry the 'query: ' and 'passage: ' prefixes that were used as prompts during training
sentences = [
    'query: мужские кроссовки',
    "passage: Чоловічі кросівки New Balance Колір Чорний Матеріал верху Замша Матеріал верху Синтетика Матеріал підкладки Текстиль Матеріал підошви Гума Розмір 45.5 Сезон Весняний Сезон Осінній Кількість вантажних місць 1 Країна реєстрації бренда США Країна-виробник товару В'єтнам Призначення Повсякденні Сегмент Спорт Мембрана Немає Розпродаж Товари зі знижкою Підошва Товста, понад 2 см Наявність товара по містах Київ і область Доставка Доставка в магазини ROZETKA Доставка Готовий до відправлення",
    'passage: Мужские кроссовки New Balance 393 ML393SS1 46.5 (12US) 30 см Синие (739980526742)',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
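Because training used the Matryoshka dimensions 768, 512, 256 and 128, a leading slice of each embedding can serve as a smaller embedding. The following is a minimal sketch of that idea, not the authors' code: it assumes embeddings arrive unit-normalized (the architecture ends in `Normalize()`), re-normalizes after truncation so dot products stay comparable, and uses small stand-in vectors instead of real model output. The helper names `truncate_and_normalize` and `rank_passages` are hypothetical.

```python
import math


def truncate_and_normalize(vec, dim):
    """Keep the first `dim` components and rescale the result to unit length."""
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head
    return [x / norm for x in head]


def dot(a, b):
    """Dot product, the similarity function listed for this model."""
    return sum(x * y for x, y in zip(a, b))


def rank_passages(query_vec, passage_vecs, dim):
    """Return passage indices ordered from most to least similar at `dim` dimensions."""
    q = truncate_and_normalize(query_vec, dim)
    scores = [dot(q, truncate_and_normalize(p, dim)) for p in passage_vecs]
    return sorted(range(len(scores)), key=lambda i: -scores[i])


# Stand-in 8-dimensional vectors (assumption: real embeddings would have 768 dimensions).
query = [0.9, 0.1, 0.3, 0.0, 0.2, 0.1, 0.0, 0.1]
passages = [
    [0.1, 0.9, 0.0, 0.3, 0.1, 0.0, 0.2, 0.1],  # points away from the query
    [0.8, 0.2, 0.4, 0.1, 0.1, 0.2, 0.1, 0.0],  # points close to the query
]

for dim in (8, 4):
    print(dim, rank_passages(query, passages, dim))
```

With real output, each row of `embeddings` would take the place of the stand-in vectors, and `dim` would be one of 768, 512, 256 or 128. Recent Sentence Transformers releases also accept a `truncate_dim` argument when constructing `SentenceTransformer`; whether it behaves as expected with this model's custom class has not been verified here.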
## Evaluation
### Metrics
#### RZTKInformation Retrieval
* Dataset: `validation--matryoshka_dim-768--`
* Evaluated with sentence_transformers_training.evaluation.information_retrieval_evaluator.RZTKInformationRetrievalEvaluator
| Metric | Value |
|:-----------------|:-----------|
| dot_accuracy_10 | 0.4259 |
| dot_precision_10 | 0.0636 |
| dot_recall_10 | 0.3352 |
| **dot_ndcg_10** | **0.2212** |
| dot_mrr_10 | 0.2061 |
| dot_map_60 | 0.1885 |
#### RZTKInformation Retrieval
* Datasets: `bm-full`, `core-uk-title`, `core-ru-title`, `core-uk-options`, `core-ru-options`, `options-uk-title`, `options-ru-title`, `options-uk-options`, `options-ru-options`, `rusisms-uk-title`, `rusisms-ru-title`, `rusisms-uk-options`, `rusisms-ru-options`, `rusisms_corrected-uk-title`, `rusisms_corrected-ru-title`, `rusisms_corrected-uk-options`, `rusisms_corrected-ru-options`, `core_typos-uk-title`, `core_typos-ru-title`, `core_typos-uk-options`, `core_typos-ru-options`, `vespa-uk-title`, `vespa-ru-title`, `vespa-uk-options` and `vespa-ru-options`
* Evaluated with sentence_transformers_training.evaluation.information_retrieval_evaluator.RZTKInformationRetrievalEvaluator
| Metric | bm-full | core-uk-title | core-ru-title | core-uk-options | core-ru-options | options-uk-title | options-ru-title | options-uk-options | options-ru-options | rusisms-uk-title | rusisms-ru-title | rusisms-uk-options | rusisms-ru-options | rusisms_corrected-uk-title | rusisms_corrected-ru-title | rusisms_corrected-uk-options | rusisms_corrected-ru-options | core_typos-uk-title | core_typos-ru-title | core_typos-uk-options | core_typos-ru-options | vespa-uk-title | vespa-ru-title | vespa-uk-options | vespa-ru-options |
|:-----------------|:-----------|:--------------|:--------------|:----------------|:----------------|:-----------------|:-----------------|:-------------------|:-------------------|:-----------------|:-----------------|:-------------------|:-------------------|:---------------------------|:---------------------------|:-----------------------------|:-----------------------------|:--------------------|:--------------------|:----------------------|:----------------------|:---------------|:---------------|:-----------------|:-----------------|
| dot_accuracy_1 | 0.6881 | 0.7979 | 0.8005 | 0.7139 | 0.7152 | 0.8058 | 0.8155 | 0.682 | 0.6966 | 0.8692 | 0.8769 | 0.7538 | 0.7846 | 0.9077 | 0.9154 | 0.8308 | 0.8538 | 0.7336 | 0.7349 | 0.6365 | 0.6352 | 0.871 | 0.8495 | 0.7527 | 0.7527 |
| dot_accuracy_3 | 0.7951 | 0.9396 | 0.9357 | 0.8701 | 0.874 | 0.9563 | 0.9563 | 0.8617 | 0.8592 | 0.9154 | 0.9231 | 0.8462 | 0.8615 | 0.9769 | 0.9538 | 0.8846 | 0.9154 | 0.8924 | 0.8885 | 0.7979 | 0.7848 | 0.8817 | 0.8817 | 0.7957 | 0.8172 |
| dot_accuracy_5 | 0.8402 | 0.9685 | 0.9633 | 0.9252 | 0.9213 | 0.9733 | 0.9782 | 0.9199 | 0.915 | 0.9385 | 0.9308 | 0.8692 | 0.8923 | 0.9846 | 0.9769 | 0.9308 | 0.9385 | 0.9383 | 0.9318 | 0.8622 | 0.8583 | 0.8817 | 0.8925 | 0.828 | 0.8387 |
| dot_accuracy_10 | 0.9007 | 0.9934 | 0.9921 | 0.9698 | 0.9738 | 0.9903 | 0.9976 | 0.9733 | 0.9612 | 0.9615 | 0.9615 | 0.9308 | 0.9385 | 0.9923 | 1.0 | 0.9692 | 0.9923 | 0.9711 | 0.9698 | 0.9318 | 0.937 | 0.9247 | 0.9032 | 0.8387 | 0.8387 |
| dot_precision_1 | 0.6881 | 0.7979 | 0.8005 | 0.7139 | 0.7152 | 0.8058 | 0.8155 | 0.682 | 0.6966 | 0.8692 | 0.8769 | 0.7538 | 0.7846 | 0.9077 | 0.9154 | 0.8308 | 0.8538 | 0.7336 | 0.7349 | 0.6365 | 0.6352 | 0.871 | 0.8495 | 0.7527 | 0.7527 |
| dot_precision_3 | 0.6765 | 0.7406 | 0.7437 | 0.6597 | 0.6544 | 0.7508 | 0.7508 | 0.6311 | 0.6351 | 0.8103 | 0.8154 | 0.7256 | 0.7436 | 0.8513 | 0.8462 | 0.7846 | 0.8 | 0.6772 | 0.6824 | 0.5796 | 0.5761 | 0.8029 | 0.7957 | 0.7168 | 0.7133 |
| dot_precision_5 | 0.6574 | 0.6478 | 0.6517 | 0.5848 | 0.5819 | 0.5874 | 0.5869 | 0.5102 | 0.5078 | 0.7677 | 0.7785 | 0.6938 | 0.7092 | 0.8015 | 0.8 | 0.7477 | 0.7569 | 0.5966 | 0.5979 | 0.5184 | 0.5173 | 0.7613 | 0.7548 | 0.6903 | 0.6731 |
| dot_precision_10 | 0.6168 | 0.3937 | 0.3944 | 0.3787 | 0.3776 | 0.3352 | 0.3364 | 0.3109 | 0.3095 | 0.6692 | 0.6685 | 0.6231 | 0.6277 | 0.6915 | 0.6954 | 0.6685 | 0.6792 | 0.371 | 0.3728 | 0.3441 | 0.3459 | 0.6946 | 0.686 | 0.643 | 0.6215 |
| dot_recall_1 | 0.0492 | 0.2515 | 0.2527 | 0.216 | 0.218 | 0.2502 | 0.254 | 0.212 | 0.217 | 0.1773 | 0.1768 | 0.1452 | 0.1566 | 0.1831 | 0.1842 | 0.162 | 0.1652 | 0.2235 | 0.2256 | 0.1899 | 0.1905 | 0.1436 | 0.1408 | 0.1068 | 0.102 |
| dot_recall_3 | 0.1414 | 0.5937 | 0.592 | 0.5151 | 0.5125 | 0.6714 | 0.6707 | 0.5628 | 0.5668 | 0.3654 | 0.3712 | 0.335 | 0.342 | 0.3942 | 0.3932 | 0.3498 | 0.369 | 0.5379 | 0.5403 | 0.4487 | 0.4488 | 0.2696 | 0.2692 | 0.2094 | 0.2061 |
| dot_recall_5 | 0.2199 | 0.8012 | 0.8057 | 0.7152 | 0.7149 | 0.8487 | 0.8476 | 0.7359 | 0.7342 | 0.5078 | 0.5151 | 0.4604 | 0.4737 | 0.5402 | 0.542 | 0.4884 | 0.5018 | 0.7395 | 0.7408 | 0.6349 | 0.6357 | 0.3464 | 0.3469 | 0.2811 | 0.2744 |
| dot_recall_10 | 0.3814 | 0.9422 | 0.9418 | 0.8968 | 0.8961 | 0.959 | 0.9618 | 0.8842 | 0.8813 | 0.7342 | 0.7306 | 0.693 | 0.7021 | 0.7693 | 0.7814 | 0.7412 | 0.7564 | 0.8864 | 0.8912 | 0.8162 | 0.8196 | 0.4769 | 0.46 | 0.4028 | 0.3816 |
| **dot_ndcg_10** | **0.6617** | **0.8744** | **0.8763** | **0.805** | **0.8035** | **0.8791** | **0.8814** | **0.78** | **0.7816** | **0.8856** | **0.8873** | **0.8087** | **0.8249** | **0.9264** | **0.9303** | **0.8712** | **0.8906** | **0.8103** | **0.8136** | **0.7214** | **0.723** | **0.8727** | **0.8596** | **0.7758** | **0.7524** |
| dot_mrr_10 | 0.7508 | 0.8704 | 0.8702 | 0.8027 | 0.8032 | 0.8813 | 0.8869 | 0.7838 | 0.7883 | 0.8987 | 0.9032 | 0.8051 | 0.8334 | 0.9442 | 0.9422 | 0.8702 | 0.8963 | 0.8196 | 0.8157 | 0.7317 | 0.73 | 0.8814 | 0.8673 | 0.7812 | 0.7839 |
| dot_map_100 | 0.6325 | 0.837 | 0.8401 | 0.7556 | 0.7535 | 0.8251 | 0.8258 | 0.7174 | 0.7215 | 0.8819 | 0.8852 | 0.808 | 0.8219 | 0.9166 | 0.9162 | 0.8645 | 0.8764 | 0.7668 | 0.7702 | 0.6729 | 0.6729 | 0.8669 | 0.8558 | 0.7471 | 0.7265 |
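For readers unfamiliar with the metric names, the sketch below computes the usual binary-relevance definitions of accuracy@k, precision@k, recall@k, MRR@k and NDCG@k for a single query. It is an illustration under assumptions: `RZTKInformationRetrievalEvaluator` is a custom evaluator whose exact definitions are not shown in this card, and the function name `metrics_at_k` and the document IDs are hypothetical. Under these definitions accuracy@1 equals precision@1, which matches the identical `dot_accuracy_1` and `dot_precision_1` rows above.

```python
import math


def metrics_at_k(ranked_ids, relevant_ids, k):
    """Binary-relevance retrieval metrics for one query, cut off at rank k."""
    hits = [1 if doc_id in relevant_ids else 0 for doc_id in ranked_ids[:k]]
    num_hits = sum(hits)

    # accuracy@k: did at least one relevant document appear in the top k?
    accuracy = 1.0 if num_hits > 0 else 0.0
    # precision@k: fraction of the top k that is relevant.
    precision = num_hits / k
    # recall@k: fraction of all relevant documents found in the top k.
    recall = num_hits / len(relevant_ids) if relevant_ids else 0.0

    # MRR@k: reciprocal rank of the first relevant document, 0 if none.
    mrr = 0.0
    for rank, hit in enumerate(hits, start=1):
        if hit:
            mrr = 1.0 / rank
            break

    # NDCG@k: discounted gain divided by the gain of an ideal ordering.
    dcg = sum(hit / math.log2(rank + 1) for rank, hit in enumerate(hits, start=1))
    ideal_hits = min(len(relevant_ids), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    ndcg = dcg / idcg if idcg > 0 else 0.0

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "mrr": mrr,
        "ndcg": ndcg,
    }


# Hypothetical ranking for one query: only the second result is relevant.
result = metrics_at_k(["p3", "p1", "p7", "p2"], {"p1", "p2", "p9"}, k=3)
print(result)
```

Reported scores would then be averages of these per-query values over all queries in a dataset. When a query has many relevant products, recall@k stays low even if precision@k is high, which is consistent with the `bm-full` column.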
#### RZTKInformation Retrieval
* Datasets: `bm-full--matryoshka_dim-768--`, `bm-full--matryoshka_dim-512--`, `bm-full--matryoshka_dim-256--` and `bm-full--matryoshka_dim-128--`
* Evaluated with sentence_transformers_training.evaluation.information_retrieval_evaluator.RZTKInformationRetrievalEvaluator
| Metric | bm-full--matryoshka_dim-768-- | bm-full--matryoshka_dim-512-- | bm-full--matryoshka_dim-256-- | bm-full--matryoshka_dim-128-- |
|:----------------|:------------------------------|:------------------------------|:------------------------------|:------------------------------|
| dot_accuracy_1 | 0.6881 | 0.6771 | 0.6653 | 0.653 |
| dot_precision_1 | 0.6881 | 0.6771 | 0.6653 | 0.653 |
| dot_recall_1 | 0.0492 | 0.0481 | 0.0473 | 0.0457 |
| **dot_ndcg_1** | **0.6881** | **0.6771** | **0.6653** | **0.653** |
| dot_mrr_1 | 0.6881 | 0.6771 | 0.6653 | 0.653 |
| dot_map_100 | 0.6325 | 0.6268 | 0.616 | 0.5867 |
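To make the cost of truncation easier to read, the snippet below expresses each score in the table as a fraction of the 768-dimensional score. It only re-uses the values reported above; the function name `retention` is hypothetical. At 128 dimensions the model keeps roughly 95% of its NDCG@1 and roughly 93% of its MAP@100 on `bm-full`.

```python
# Values copied from the table above, keyed by Matryoshka dimension.
ndcg_at_1 = {768: 0.6881, 512: 0.6771, 256: 0.6653, 128: 0.653}
map_at_100 = {768: 0.6325, 512: 0.6268, 256: 0.616, 128: 0.5867}


def retention(scores, full_dim=768):
    """Return each score as a fraction of the full-dimension score."""
    full = scores[full_dim]
    return {dim: value / full for dim, value in scores.items()}


for name, scores in (("dot_ndcg_1", ndcg_at_1), ("dot_map_100", map_at_100)):
    for dim, ratio in retention(scores).items():
        print(f"{name} dim={dim}: {ratio:.1%} of the 768-dim score")
```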
## Training Details
### Training Dataset
#### rozetka_positive_pairs
* Dataset: rozetka_positive_pairs
* Size: 78,279,102 training samples
* Columns: `query` and `text`
* Approximate statistics based on the first 1000 samples:
| | query | text |
|:-----|:-------|:-------|
| type | string | string |
* Samples:
| query | text |
|:------|:-----|
| query: campingaz fold n cool classic 10l dark blue | passage: Термосумка Campingaz Fold'n Cool Classic 10L Dark Blue (4823082704729) |
| query: campingaz fold n cool classic 10l dark blue | passage: Термопродукція Campingaz Гарантія 14 днів Вид Термосумки Колір Синій з білим Режим роботи Охолодження Країна реєстрації бренда Франція Країна-виробник товару Китай Тип гарантійного талона Гарантія по чеку Можливість доставки Почтомати Доставка Premium Немає |
| query: campingaz fold n cool classic 10l dark blue | passage: Термосумка Campingaz Fold'n Cool Classic 10L Dark Blue (4823082704729) |
* Loss: `sentence_transformers_training.model.matryoshka2d_loss.RZTKMatryoshka2dLoss` with these parameters:
```json
{
"loss": "RZTKMultipleNegativesRankingLoss",
"n_layers_per_step": 1,
"last_layer_weight": 1.0,
"prior_layers_weight": 1.0,
"kl_div_weight": 1.0,
"kl_temperature": 0.3,
"matryoshka_dims": [
768,
512,
256,
128
],
"matryoshka_weights": [
1,
1,
1,
1
],
"n_dims_per_step": 1
}
```
### Evaluation Dataset
#### rozetka_positive_pairs
* Dataset: rozetka_positive_pairs
* Size: 1,000,000 evaluation samples
* Columns: `query` and `text`
* Approximate statistics based on the first 1000 samples:
| | query | text |
|:-----|:-------|:-------|
| type | string | string |
* Samples:
| query | text |
|:------|:-----|
| query: топографическая карта | passage: Топографічна карта України Garmin |
| query: топографическая карта | passage: Навігаційні карти для GPS Garmin |
| query: топографическая карта | passage: Топографическая карта Украины Garmin |
* Loss: `sentence_transformers_training.model.matryoshka2d_loss.RZTKMatryoshka2dLoss` with these parameters:
```json
{
"loss": "RZTKMultipleNegativesRankingLoss",
"n_layers_per_step": 1,
"last_layer_weight": 1.0,
"prior_layers_weight": 1.0,
"kl_div_weight": 1.0,
"kl_temperature": 0.3,
"matryoshka_dims": [
768,
512,
256,
128
],
"matryoshka_weights": [
1,
1,
1,
1
],
"n_dims_per_step": 1
}
```
### Training Hyperparameters
#### Non-Default Hyperparameters
- `eval_strategy`: steps
- `per_device_train_batch_size`: 88
- `per_device_eval_batch_size`: 88
- `learning_rate`: 2e-05
- `num_train_epochs`: 1.0
- `warmup_ratio`: 0.1
- `bf16`: True
- `bf16_full_eval`: True
- `tf32`: True
- `dataloader_num_workers`: 4
- `load_best_model_at_end`: True
- `optim`: adafactor
- `push_to_hub`: True
- `hub_model_id`: yklymchuk-rztk/multilingual-e5-base-matryoshka2d-mnr-13
- `hub_private_repo`: True
- `prompts`: {'query': 'query: ', 'text': 'passage: '}
- `batch_sampler`: no_duplicates
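As a rough guide to what these settings imply, the sketch below estimates the number of optimizer steps and warmup steps for one epoch over the 78,279,102 training samples. It rests on assumptions this list does not confirm: a single device, no gradient accumulation, and a batch count equal to the sample count divided by the batch size (the `no_duplicates` sampler may yield a slightly different number). The function name `estimate_schedule` is hypothetical.

```python
import math


def estimate_schedule(num_samples, per_device_batch_size, warmup_ratio,
                      num_devices=1, grad_accum_steps=1, epochs=1.0):
    """Estimate total optimizer steps and warmup steps for a training run."""
    effective_batch = per_device_batch_size * num_devices
    batches_per_epoch = math.ceil(num_samples / effective_batch)
    steps_per_epoch = max(batches_per_epoch // grad_accum_steps, 1)
    total_steps = math.ceil(steps_per_epoch * epochs)
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    return total_steps, warmup_steps


# Values from this card; device count and accumulation are assumed to be 1.
total_steps, warmup_steps = estimate_schedule(
    num_samples=78_279_102,
    per_device_batch_size=88,
    warmup_ratio=0.1,
)
print(total_steps, warmup_steps)
```

With more devices the step count shrinks proportionally, so these figures are an upper bound for this batch size rather than a record of the actual run.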
#### All Hyperparameters