Feature Extraction · Transformers · Safetensors · ModularStarEncoder · custom_code
andreagurioli1995 committed · Commit b53e15a · verified · 1 parent: a63a9cf

Update README.md

Files changed (1): README.md (+3 -3)
README.md CHANGED
@@ -19,7 +19,7 @@ We built ModularStarEncoder on top of [StarCoder-2](https://huggingface.co/bigco
  The model is finetuned with [CLIP objective](https://github.com/mlfoundations/open_clip/blob/main/src/open_clip/loss.py).
  ModularStarEncoder fine-tuned works with instruction prompts; to get the most out of the model, embed the task in the input. The How to Use section below provides more details.

- - **Paper:** [One Model to Train them All: Hierarchical Self-Distillation for Enhanced Early Layer Embeddings](https://arxiv.org/abs/2503.03008)
+ - **Paper:** [MoSE: Hierarchical Self-Distillation Enhances Early Layer Embeddings](https://arxiv.org/abs/2503.03008)
  - **Languages:** English, Go, Ruby, Python, Java, C++, PHP, C, JavaScript
  - **Different sizes:** [Layer 4](https://huggingface.co/modularStarEncoder/ModularStarEncoder-finetuned-4), [Layer 9](https://huggingface.co/modularStarEncoder/ModularStarEncoder-finetuned-9), [Layer 18](https://huggingface.co/modularStarEncoder/ModularStarEncoder-finetuned-18), [Layer 27](https://huggingface.co/modularStarEncoder/ModularStarEncoder-finetuned-27), [Layer 36](https://huggingface.co/modularStarEncoder/ModularStarEncoder-finetuned)

@@ -101,8 +101,8 @@ The model is licensed under the BigCode OpenRAIL-M v1 license agreement. You can

  # Citation
  ```
- @article{gurioli2025modeltrainallhierarchical,
- title={One Model to Train them All: Hierarchical Self-Distillation for Enhanced Early Layer Embeddings},
+ @article{gurioli2025mosehierarchicalselfdistillationenhances,
+ title={MoSE: Hierarchical Self-Distillation Enhances Early Layer Embeddings},
  author={Andrea Gurioli and Federico Pennino and João Monteiro and Maurizio Gabbrielli},
  year={2025},
  eprint={2503.03008},
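
The hunk above notes that the fine-tuned model works with instruction prompts and that the task should be embedded in the input. A minimal sketch of that usage follows, assuming the standard `transformers` loading path with `trust_remote_code=True` (the repo ships custom code); the instruction wording, output attribute, and first-token pooling here are illustrative assumptions, and the model card's How to Use section remains the reference.

```python
# Minimal sketch, assuming the standard Transformers API with trust_remote_code.
# The instruction text, output attribute, and pooling below are illustrative
# assumptions, not taken from the model card.
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "modularStarEncoder/ModularStarEncoder-finetuned"  # 36-layer fine-tuned variant
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModel.from_pretrained(model_id, trust_remote_code=True)
model.eval()

# Embed the task in the input: prepend a (hypothetical) instruction to the query.
instruction = "Retrieve the code snippet matching this description: "
text = instruction + "reverse a singly linked list"

inputs = tokenizer(text, return_tensors="pt", truncation=True)
with torch.no_grad():
    outputs = model(**inputs)

# Assumed output layout: use the first token's hidden state as the embedding.
embedding = outputs.last_hidden_state[:, 0]
print(embedding.shape)
```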