---
title: README
emoji: ❤️
colorFrom: red
colorTo: red
sdk: static
pinned: false
---
Sentence Transformers 🤗 is a Python framework for state-of-the-art sentence, text, and image embeddings.

Install the [Sentence Transformers](https://www.sbert.net/) library.
```bash
pip install -U sentence-transformers
```

The usage is as simple as:
```python
from sentence_transformers import SparseEncoder

# 1. Load a pretrained SparseEncoder model
model = SparseEncoder("naver/splade-cocondenser-ensembledistil")

# The sentences to encode
sentences = [
    "The weather is lovely today.",
    "It's so sunny outside!",
    "He drove to the stadium.",
]

# 2. Calculate sparse embeddings by calling model.encode()
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 30522] - sparse representation with vocabulary size dimensions

# 3. Calculate the embedding similarities
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[   35.629,     9.154,     0.098],
#         [    9.154,    27.478,     0.019],
#         [    0.098,     0.019,    29.553]])

# 4. Check sparsity stats
stats = SparseEncoder.sparsity(embeddings)
print(f"Sparsity: {stats['sparsity_ratio']:.2%}")
# Sparsity: 99.84%
```
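The sparsity statistic above can also be computed directly from the embedding tensor. The short sketch below uses plain PyTorch with tiny hypothetical vectors standing in for real `[3, 30522]` SPLADE outputs (no model download needed), to show what the ratio measures and why the dot-product similarity of two sparse vectors only depends on the dimensions that are active in both:

```python
import torch

# Toy "sparse" embeddings: 3 vectors over a 10-dimensional vocabulary,
# standing in for the [3, 30522] SPLADE outputs above (hypothetical values).
embeddings = torch.zeros(3, 10)
embeddings[0, [1, 4]] = torch.tensor([2.0, 1.0])  # active dims: 1, 4
embeddings[1, [1, 7]] = torch.tensor([3.0, 0.5])  # active dims: 1, 7
embeddings[2, [8]] = torch.tensor([4.0])          # active dim: 8

# Sparsity ratio: the fraction of entries that are exactly zero.
sparsity_ratio = (embeddings == 0).float().mean().item()
print(f"Sparsity: {sparsity_ratio:.2%}")  # Sparsity: 83.33%

# The dot product only "sees" dimensions active in both vectors:
# vectors 0 and 1 share dim 1 (2.0 * 3.0 = 6.0); 0 and 2 share nothing.
similarities = embeddings @ embeddings.T
print(similarities[0, 1].item())  # 6.0
print(similarities[0, 2].item())  # 0.0
```

This is why sparse embeddings pair so well with inverted-index retrieval: only the handful of shared active dimensions contribute to each score.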

Hugging Face makes it easy to collaboratively build and showcase your [Sentence Transformers](https://www.sbert.net/) models! You can collaborate with your organization, upload and showcase your own models in your profile ❤️

<div class="grid lg:grid-cols-3 gap-x-4 gap-y-7">
<a href="https://www.sbert.net/" class="block overflow-hidden group">
   <div
      class="w-full h-40 object-cover mb-2 rounded-lg flex items-center justify-center bg-[#FA8072]"
      >
      <img alt="" src="https://huggingface.co/spaces/sparse-encoder/README/resolve/main/sbertLogo.png" class="w-40" />
   </div>
   <div class="underline">Documentation</div>
</a>
<a
   href="https://sbert.net/docs/package_reference/SentenceTransformer.html#sentence_transformers.SentenceTransformer.push_to_hub"
   class="block overflow-hidden group"
   >
   <div
      class="w-full h-40 mb-2 bg-gray-900 group-hover:bg-gray-850 rounded-lg flex items-start justify-start overflow-hidden"
      >
      <img
         alt=""
         src="https://huggingface.co/spaces/sparse-encoder/README/resolve/main/push-to-hub.png"
         class="w-full h-40 object-cover overflow-hidden"
         />
   </div>
   <div class="underline">Push your Sentence Transformers models to the Hub ❤️ </div>
</a>
<a
   href="https://huggingface.co/models?library=sentence-transformers&other=sparse&sort=downloads"
   class="block overflow-hidden group"
   >
   <div
      class="w-full h-40 mb-2 bg-gray-900 group-hover:bg-gray-850 rounded-lg flex items-start justify-start overflow-hidden"
      >
      <img
         alt=""
         src="https://huggingface.co/spaces/sparse-encoder/README/resolve/main/sbert-hf.png"
         class="w-full h-40 object-cover overflow-hidden"
         />
   </div>
   <div class="underline">Find all SparseEncoder models on the 🤗 Hub</div>
</a>
</div>

To upload your SparseEncoder models to the Hugging Face Hub, log in with `huggingface-cli login` and use the [`push_to_hub`](https://sbert.net/docs/package_reference/sparse_encoder/SparseEncoder.html#sentence_transformers.sparse_encoder.SparseEncoder.push_to_hub) method within the Sentence Transformers library.
```python
from sentence_transformers import SparseEncoder

# Load or train a model
model = SparseEncoder(...)
# Push to Hub
model.push_to_hub("my_new_model")
```

Note that, for now, this repository only hosts example sparse encoder models from the Sentence Transformers package that can be easily reproduced with the training script examples.

See [Sparse Encoder > Training Examples](https://sbert.net/docs/sparse_encoder/training/examples.html) for the example scripts and [Sparse Encoder > Pretrained Models](https://sbert.net/docs/sparse_encoder/pretrained_models.html) for community pre-trained models, some of which can also be found in the collections below.