Add new SentenceTransformer model

Files changed:
- 2_Dense/model.safetensors (+1 -1)
- 3_Dense/model.safetensors (+1 -1)
- README.md (+65 -70)
- config.json (+1 -1)
- model.safetensors (+2 -2)
2_Dense/model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:9997181ec203c76a0e08ecba57c47a10999519c2736241efc55aadbd8d389584
 size 2362528
3_Dense/model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:db470fd6a6c46fd748b3e0d97974cb3788a47741d1005aca5aff6ccc250b737c
 size 2362528
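The safetensors entries above are Git LFS pointer files: the repository tracks only a small text stub (spec version, sha256 oid, byte size) while the actual weights live in LFS storage. A minimal sketch of checking a downloaded blob against such a pointer (the blob and pointer here are made up for illustration, not the real model weights):

```python
import hashlib

def parse_lfs_pointer(text: str) -> dict:
    """Parse a git-lfs pointer stub into its key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

def verify_blob(pointer: dict, blob: bytes) -> bool:
    """Check a blob against the pointer's declared size and sha256 oid."""
    oid = pointer["oid"].removeprefix("sha256:")
    return len(blob) == int(pointer["size"]) and hashlib.sha256(blob).hexdigest() == oid

# Illustrative pointer built from a dummy blob.
blob = b"example-bytes"
pointer_text = (
    "version https://git-lfs.github.com/spec/v1\n"
    f"oid sha256:{hashlib.sha256(blob).hexdigest()}\n"
    f"size {len(blob)}\n"
)
print(verify_blob(parse_lfs_pointer(pointer_text), blob))  # True
```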
README.md
CHANGED
@@ -12,53 +12,48 @@ tags:
 - retrieval
 - reranking
 - generated_from_trainer
-- dataset_size:
 - loss:ArcFaceInBatchLoss
 base_model: Alibaba-NLP/gte-modernbert-base
 widget:
-- source_sentence:
-
   sentences:
-  -
-  -
-
-  -
-
-- source_sentence: All tracks produced by Zack Shada , Jeremy Shada , Logan Charles
-    , John Spicer and Seth Renken . All tracks are written by Zack Odom and Kenneth
-    Mount .
   sentences:
-  -
-
-  -
-
-
-
-    and Tacopaya Municipality is located in the west .
   sentences:
-  -
-
-
-
-
-
-- source_sentence: Browning is identified as married , but no wife or child is captured
-    .
   sentences:
-  -
-
-  -
-
-  -
-
   sentences:
-  -
-
-  -
-
-  -
-
 datasets:
 - redis/langcache-sentencepairs-v2
 pipeline_tag: sentence-similarity
@@ -160,9 +155,9 @@ from sentence_transformers import SentenceTransformer
 model = SentenceTransformer("redis/langcache-embed-v3")
 # Run inference
 sentences = [
-    '
-    '
-    '
 ]
 embeddings = model.encode(sentences)
 print(embeddings.shape)
@@ -171,9 +166,9 @@ print(embeddings.shape)
 # Get the similarity scores for the embeddings
 similarities = model.similarity(embeddings, embeddings)
 print(similarities)
-# tensor([[
-#         [
-#         [0.
 ```
 
 <!--
@@ -239,19 +234,19 @@ You can finetune this model on your own dataset.
 #### LangCache Sentence Pairs (all)
 
 * Dataset: [LangCache Sentence Pairs (all)](https://huggingface.co/datasets/redis/langcache-sentencepairs-v2)
-* Size:
 * Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
 * Approximate statistics based on the first 1000 samples:
-  |         | anchor
-
-  | type    | string
-  | details | <ul><li>min:
 * Samples:
-  | anchor
-
-  | <code>
-  | <code>
-  | <code>
 * Loss: <code>losses.ArcFaceInBatchLoss</code> with these parameters:
   ```json
   {
@@ -266,19 +261,19 @@ You can finetune this model on your own dataset.
 #### LangCache Sentence Pairs (all)
 
 * Dataset: [LangCache Sentence Pairs (all)](https://huggingface.co/datasets/redis/langcache-sentencepairs-v2)
-* Size:
 * Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
 * Approximate statistics based on the first 1000 samples:
-  |         | anchor
-
-  | type    | string
-  | details | <ul><li>min:
 * Samples:
-  | anchor
-
-  | <code>
-  | <code>
-  | <code>
 * Loss: <code>losses.ArcFaceInBatchLoss</code> with these parameters:
   ```json
   {
@@ -292,8 +287,8 @@ You can finetune this model on your own dataset.
 #### Non-Default Hyperparameters
 
 - `eval_strategy`: steps
-- `per_device_train_batch_size`:
-- `per_device_eval_batch_size`:
 - `gradient_accumulation_steps`: 2
 - `weight_decay`: 0.001
 - `adam_beta2`: 0.98
@@ -319,8 +314,8 @@ You can finetune this model on your own dataset.
 - `do_predict`: False
 - `eval_strategy`: steps
 - `prediction_loss_only`: True
-- `per_device_train_batch_size`:
-- `per_device_eval_batch_size`:
 - `per_gpu_train_batch_size`: None
 - `per_gpu_eval_batch_size`: None
 - `gradient_accumulation_steps`: 2
@@ -439,7 +434,7 @@ You can finetune this model on your own dataset.
 ### Training Logs
 | Epoch | Step | Validation Loss | test_cosine_ndcg@10 |
 |:-----:|:----:|:---------------:|:-------------------:|
-| 0     | 0    |
 
 
 ### Framework Versions
 - retrieval
 - reranking
 - generated_from_trainer
+- dataset_size:1460771
 - loss:ArcFaceInBatchLoss
 base_model: Alibaba-NLP/gte-modernbert-base
 widget:
+- source_sentence: '"How much would I need to narrate a ""Let''s Play"" video in order
+    to make money from it on YouTube?"'
   sentences:
+  - How much money do people make from YouTube videos with 1 million views?
+  - '"How much would I need to narrate a ""Let''s Play"" video in order to make money
+    from it on YouTube?"'
+  - '"Does the sentence, ""I expect to be disappointed,"" make sense?"'
+- source_sentence: '"I appreciate that.'
   sentences:
+  - '"How is the Mariner rewarded in ""The Rime of the Ancient Mariner"" by Samuel
+    Taylor Coleridge?"'
+  - '"I appreciate that.'
+  - I can appreciate that.
+- source_sentence: '"""It is very easy to defeat someone, but too hard to win some
+    one"". What does the previous sentence mean?"'
   sentences:
+  - '"How can you use the word ""visceral"" in a sentence?"'
+  - '"""It is very easy to defeat someone, but too hard to win some one"". What does
+    the previous sentence mean?"'
+  - '"What does ""The loudest one in the room is the weakest one in the room."" Mean?"'
+- source_sentence: '" We condemn this raid which is in our view illegal and morally
+    and politically unjustifiable , " London-based NCRI official Ali Safavi told Reuters
+    by telephone .'
   sentences:
+  - 'London-based NCRI official Ali Safavi told Reuters : " We condemn this raid ,
+    which is in our view illegal and morally and politically unjustifiable . "'
+  - The social awkwardness is complicated by the fact that Marianne is a white girl
+    living with a black family .
+  - art's cause, this in my opinion
+- source_sentence: '"If you click ""like"" on an old post that someone made on your
+    wall yet you''re no longer Facebook friends, will they still receive a notification?"'
   sentences:
+  - '"Is there is any two wheeler having a gear box which has the feature ""automatic
+    neutral"" when the engine is off while it is in gear?"'
+  - '"If you click ""like"" on an old post that someone made on your wall yet you''re
+    no longer Facebook friends, will they still receive a notification?"'
+  - '"If your teenage son posted ""La commedia e finita"" on his Facebook wall, would
+    you be concerned?"'
 datasets:
 - redis/langcache-sentencepairs-v2
 pipeline_tag: sentence-similarity
 model = SentenceTransformer("redis/langcache-embed-v3")
 # Run inference
 sentences = [
+    '"If you click ""like"" on an old post that someone made on your wall yet you\'re no longer Facebook friends, will they still receive a notification?"',
+    '"If you click ""like"" on an old post that someone made on your wall yet you\'re no longer Facebook friends, will they still receive a notification?"',
+    '"If your teenage son posted ""La commedia e finita"" on his Facebook wall, would you be concerned?"',
 ]
 embeddings = model.encode(sentences)
 print(embeddings.shape)
 # Get the similarity scores for the embeddings
 similarities = model.similarity(embeddings, embeddings)
 print(similarities)
+# tensor([[1.0000, 1.0000, 0.2617],
+#         [1.0000, 1.0000, 0.2617],
+#         [0.2617, 0.2617, 1.0000]])
 ```
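The 1.0000 off-diagonal entries in the output reflect that the first two input sentences are identical; `model.similarity` defaults to cosine similarity. A self-contained sketch of the same computation on stand-in vectors (no model download needed; the vectors are arbitrary, not real embeddings):

```python
import numpy as np

def cosine_similarity_matrix(embeddings: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity: dot products of L2-normalized rows."""
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    return normed @ normed.T

# Rows 0 and 1 identical, row 2 different, mimicking the snippet above.
emb = np.array([[1.0, 2.0, 0.0],
                [1.0, 2.0, 0.0],
                [0.0, 1.0, 3.0]])
sims = cosine_similarity_matrix(emb)
print(np.round(sims, 4))  # symmetric, ones on the diagonal, sims[0,1] == 1
```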
 
 <!--
 #### LangCache Sentence Pairs (all)
 
 * Dataset: [LangCache Sentence Pairs (all)](https://huggingface.co/datasets/redis/langcache-sentencepairs-v2)
+* Size: 132,354 training samples
 * Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
 * Approximate statistics based on the first 1000 samples:
+  |         | anchor | positive | negative |
+  |:--------|:-------|:---------|:---------|
+  | type    | string | string   | string   |
+  | details | <ul><li>min: 4 tokens</li><li>mean: 25.33 tokens</li><li>max: 100 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 24.98 tokens</li><li>max: 100 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 19.06 tokens</li><li>max: 68 tokens</li></ul> |
 * Samples:
+  | anchor | positive | negative |
+  |:-------|:---------|:---------|
+  | <code> What high potential jobs are there other than computer science?</code> | <code> What high potential jobs are there other than computer science?</code> | <code>Why IT or Computer Science jobs are being over rated than other Engineering jobs?</code> |
+  | <code> Would India ever be able to develop a missile system like S300 or S400 missile?</code> | <code> Would India ever be able to develop a missile system like S300 or S400 missile?</code> | <code>Should India buy the Russian S400 air defence missile system?</code> |
+  | <code> water from the faucet is being drunk by a yellow dog</code> | <code>A yellow dog is drinking water from the faucet</code> | <code>Childlessness is low in Eastern European countries.</code> |
 * Loss: <code>losses.ArcFaceInBatchLoss</code> with these parameters:
   ```json
   {
 #### LangCache Sentence Pairs (all)
 
 * Dataset: [LangCache Sentence Pairs (all)](https://huggingface.co/datasets/redis/langcache-sentencepairs-v2)
+* Size: 132,354 evaluation samples
 * Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
 * Approximate statistics based on the first 1000 samples:
+  |         | anchor | positive | negative |
+  |:--------|:-------|:---------|:---------|
+  | type    | string | string   | string   |
+  | details | <ul><li>min: 4 tokens</li><li>mean: 25.33 tokens</li><li>max: 100 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 24.98 tokens</li><li>max: 100 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 19.06 tokens</li><li>max: 68 tokens</li></ul> |
 * Samples:
+  | anchor | positive | negative |
+  |:-------|:---------|:---------|
+  | <code> What high potential jobs are there other than computer science?</code> | <code> What high potential jobs are there other than computer science?</code> | <code>Why IT or Computer Science jobs are being over rated than other Engineering jobs?</code> |
+  | <code> Would India ever be able to develop a missile system like S300 or S400 missile?</code> | <code> Would India ever be able to develop a missile system like S300 or S400 missile?</code> | <code>Should India buy the Russian S400 air defence missile system?</code> |
+  | <code> water from the faucet is being drunk by a yellow dog</code> | <code>A yellow dog is drinking water from the faucet</code> | <code>Childlessness is low in Eastern European countries.</code> |
 * Loss: <code>losses.ArcFaceInBatchLoss</code> with these parameters:
   ```json
   {
 #### Non-Default Hyperparameters
 
 - `eval_strategy`: steps
+- `per_device_train_batch_size`: 8192
+- `per_device_eval_batch_size`: 8192
 - `gradient_accumulation_steps`: 2
 - `weight_decay`: 0.001
 - `adam_beta2`: 0.98
 - `do_predict`: False
 - `eval_strategy`: steps
 - `prediction_loss_only`: True
+- `per_device_train_batch_size`: 8192
+- `per_device_eval_batch_size`: 8192
 - `per_gpu_train_batch_size`: None
 - `per_gpu_eval_batch_size`: None
 - `gradient_accumulation_steps`: 2
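The non-default hyperparameters above could be reproduced with roughly the following trainer configuration. This is a sketch, assuming the `sentence_transformers` trainer API (`SentenceTransformerTrainingArguments` forwards these fields to `transformers.TrainingArguments`, so the names mirror the list); the `output_dir` path is illustrative:

```python
# Sketch only: mirrors the non-default hyperparameters listed above.
from sentence_transformers.training_args import SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="langcache-embed-v3",      # illustrative output path
    eval_strategy="steps",
    per_device_train_batch_size=8192,
    per_device_eval_batch_size=8192,
    gradient_accumulation_steps=2,        # effective batch = 2 x 8192 per device
    weight_decay=0.001,
    adam_beta2=0.98,
)
```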
 ### Training Logs
 | Epoch | Step | Validation Loss | test_cosine_ndcg@10 |
 |:-----:|:----:|:---------------:|:-------------------:|
+| 0     | 0    | 2.9916          | 0.7718              |
 
 
 ### Framework Versions
config.json CHANGED
@@ -12,7 +12,7 @@
   "cls_token_id": 50281,
   "decoder_bias": true,
   "deterministic_flash_attn": false,
-  "dtype": "
+  "dtype": "float32",
   "embedding_dropout": 0.0,
   "eos_token_id": 50282,
   "global_attn_every_n_layers": 3,
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:04aa7437b7f98ed3f652e300c1d767d07c1864c10b3055ea63831997faefa8d6
+size 596070136