---
pipeline_tag: sentence-similarity
tags:
  - sentence-similarity
  - sentence-transformers
license: mit
language:
  - multilingual
  - af
  - am
  - ar
  - as
  - az
  - be
  - bg
  - bn
  - br
  - bs
  - ca
  - cs
  - cy
  - da
  - de
  - el
  - en
  - eo
  - es
  - et
  - eu
  - fa
  - fi
  - fr
  - fy
  - ga
  - gd
  - gl
  - gu
  - ha
  - he
  - hi
  - hr
  - hu
  - hy
  - id
  - is
  - it
  - ja
  - jv
  - ka
  - kk
  - km
  - kn
  - ko
  - ku
  - ky
  - la
  - lo
  - lt
  - lv
  - mg
  - mk
  - ml
  - mn
  - mr
  - ms
  - my
  - ne
  - nl
  - 'no'
  - om
  - or
  - pa
  - pl
  - ps
  - pt
  - ro
  - ru
  - sa
  - sd
  - si
  - sk
  - sl
  - so
  - sq
  - sr
  - su
  - sv
  - sw
  - ta
  - te
  - th
  - tl
  - tr
  - ug
  - uk
  - ur
  - uz
  - vi
  - xh
  - yi
  - zh
---

A quantized version of [multilingual-e5-small](https://huggingface.co/intfloat/multilingual-e5-small). Quantization was performed per layer under the same conditions as our ELSERv2 model, as described here.

Please note that the PyTorch traced model is runnable only on Linux with Intel CPUs. For typical usage through sentence-transformers, see the sketch below.
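
Since the card is tagged for sentence-transformers, encoding text should look roughly like the following minimal sketch. The model id `elastic/multilingual-e5-small-optimized` and the `query:`/`passage:` input prefixes are assumptions carried over from the base multilingual-e5-small model, not confirmed by this card.

```python
# Minimal sketch, assuming the model is published as
# "elastic/multilingual-e5-small-optimized" and keeps the E5 convention of
# prefixing inputs with "query: " or "passage: ".
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("elastic/multilingual-e5-small-optimized")

queries = ["query: how much protein should a female eat"]
passages = [
    "passage: As a general guideline, adult women need about 46 grams of protein per day."
]

# Normalized embeddings let cosine similarity reduce to a dot product.
query_emb = model.encode(queries, normalize_embeddings=True)
passage_emb = model.encode(passages, normalize_embeddings=True)
print(query_emb @ passage_emb.T)
```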

[Text Embeddings by Weakly-Supervised Contrastive Pre-training](https://arxiv.org/abs/2212.03533). Liang Wang, Nan Yang, Xiaolong Huang, Binxing Jiao, Linjun Yang, Daxin Jiang, Rangan Majumder, Furu Wei, arXiv 2022.

## Benchmarks

We ran a number of small benchmarks to assess both the change in retrieval quality and the change in inference latency relative to the original baseline model.

### Quality

Measuring NDCG@10 on the dev split of the MIRACL datasets for a selection of languages, we see a mostly marginal change in quality for the quantized model; the largest regression is on Yoruba (yo). A short sketch of the NDCG@10 metric follows the table.

| Model | de | yo | ru | ar | es | th |
| --- | --- | --- | --- | --- | --- | --- |
| multilingual-e5-small | 0.75862 | 0.56193 | 0.80309 | 0.82778 | 0.81672 | 0.85072 |
| multilingual-e5-small-optimized | 0.75992 | 0.48934 | 0.79668 | 0.82017 | 0.8135 | 0.84316 |
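
For readers unfamiliar with the metric, NDCG@10 rewards placing relevant documents near the top of a ranking and normalizes against an ideal ordering. Below is a minimal, self-contained sketch with made-up relevance judgments (this is not the MIRACL tooling itself):

```python
import math

def ndcg_at_k(relevances: list[float], k: int = 10) -> float:
    """NDCG@k for one query: DCG of the ranking over DCG of the ideal ranking."""
    def dcg(rels: list[float]) -> float:
        return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(rels[:k]))
    ideal = sorted(relevances, reverse=True)
    return dcg(relevances) / dcg(ideal) if any(ideal) else 0.0

# Hypothetical graded relevance of the top retrieved documents for one query.
print(ndcg_at_k([3, 2, 0, 1, 0, 0, 0, 0, 0, 0]))  # ~0.985: a near-ideal ranking
```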

To test English out-of-domain performance, we used the test splits of several datasets from the BEIR evaluation suite. Measuring NDCG@10, we see a larger change on SCIFACT but only marginal changes on the other datasets evaluated. A sketch of a BEIR evaluation run follows the table.

| Model | FIQA | SCIFACT | NFCorpus |
| --- | --- | --- | --- |
| multilingual-e5-small | 0.33126 | 0.677 | 0.31004 |
| multilingual-e5-small-optimized | 0.31734 | 0.65484 | 0.30126 |
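
As a rough sketch of how such numbers can be produced, the standard beir quickstart for SCIFACT is shown below. Swapping in the optimized model (and applying the E5 `query:`/`passage:` prefixes, which BEIR's default wrapper does not add) would need a small custom model class, so the baseline model id is used here as an assumption.

```python
# Sketch of a BEIR NDCG@10 evaluation, following the standard beir quickstart.
from beir import util
from beir.datasets.data_loader import GenericDataLoader
from beir.retrieval import models
from beir.retrieval.evaluation import EvaluateRetrieval
from beir.retrieval.search.dense import DenseRetrievalExactSearch as DRES

url = "https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/scifact.zip"
data_path = util.download_and_unzip(url, "datasets")
corpus, queries, qrels = GenericDataLoader(data_folder=data_path).load(split="test")

retriever = EvaluateRetrieval(
    DRES(models.SentenceBERT("intfloat/multilingual-e5-small"), batch_size=64),
    score_function="cos_sim",
)
results = retriever.retrieve(corpus, queries)
ndcg, _map, recall, precision = retriever.evaluate(qrels, results, retriever.k_values)
print(ndcg["NDCG@10"])
```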

### Performance

Using a PyTorch model traced for Linux and Intel CPUs, we benchmarked inference across a range of input lengths. Overall, the optimized model is roughly 20-54% faster, with the largest speedups on the shortest inputs. A sketch of the timing methodology follows the table.

| Input length (characters) | multilingual-e5-small | multilingual-e5-small-optimized | Speedup |
| --- | --- | --- | --- |
| 0 - 50 | 0.0181 | 0.00826 | 54.36% |
| 50 - 100 | 0.0275 | 0.0164 | 40.36% |
| 100 - 150 | 0.0366 | 0.0237 | 35.25% |
| 150 - 200 | 0.0435 | 0.0301 | 30.80% |
| 200 - 250 | 0.0514 | 0.0379 | 26.26% |
| 250 - 300 | 0.0569 | 0.043 | 24.43% |
| 300 - 350 | 0.0663 | 0.0513 | 22.62% |
| 350 - 400 | 0.0737 | 0.0576 | 21.85% |
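
One way timings like these could be collected is to bucket inputs by character length and average wall-clock time per encode call. In the sketch below, the model id, the bucket scheme, and the use of sentence-transformers (rather than the traced model used for the numbers above) are all assumptions.

```python
import time
from collections import defaultdict
from sentence_transformers import SentenceTransformer

# Hypothetical model id; repeat the run with the baseline model to compute speedups.
model = SentenceTransformer("elastic/multilingual-e5-small-optimized")

# Synthetic inputs spanning the length buckets in the table above.
texts = ["benchmark input " * n for n in range(1, 25)]

timings = defaultdict(list)
for text in texts:
    start = time.perf_counter()
    model.encode([f"passage: {text}"])
    timings[50 * (len(text) // 50)].append(time.perf_counter() - start)

for lo in sorted(timings):
    ts = timings[lo]
    print(f"{lo} - {lo + 50}: {sum(ts) / len(ts):.4f}s")
```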

## Terms of Use

Customers may add third party trained models for management in Elastic. These models are not owned by Elastic. Customers must contract separately with the third party model owner for the use of the model, and such use will be governed by the applicable terms and conditions. You understand and agree that Elastic has no control over, or liability for, the third party models.