Text Generation
Transformers
PyTorch
Safetensors
English
Chinese
llama
text-generation-inference
Inference Endpoints

MiniLoong-3B

📑 arXiv | 👻 GitHub | 🤗 HuggingFace-MiniMA-3B | 🤗 HuggingFace-MiniChat-3B | 🤖 ModelScope-MiniMA-3B | 🤖 ModelScope-MiniChat-3B | 🤗 HuggingFace-MiniChat-1.5-3B | 🤗 HuggingFace-MiniMA-2-3B | 🤗 HuggingFace-MiniChat-2-3B | 🤗 HuggingFace-MiniMA-2-1B | 🤗 HuggingFace-MiniLoong-3B | 🤗 HuggingFace-MiniMix-2/4x3B

❗ MiniLoong-3B is derived from LLaMA-2, so any use must comply with the LLaMA-2 LICENSE.


BibTeX

@article{zhang2023law,
    title={Towards the Law of Capacity Gap in Distilling Language Models},
    author={Zhang, Chen and Song, Dawei and Ye, Zheyu and Gao, Yan},
    year={2023},
    url={https://arxiv.org/abs/2311.07052}
}

Model size: 3.02B params (Safetensors, tensor type BF16)