DistilBERT release Collection Original DistilBERT model, with checkpoints obtained by teacher-student learning from the original BERT checkpoints. • 6 items • Updated Apr 17, 2024 • 18
DurIAN-E: Duration Informed Attention Network For Expressive Text-to-Speech Synthesis Paper • 2309.12792 • Published Sep 22, 2023 • 1
MM-LLMs: Recent Advances in MultiModal Large Language Models Paper • 2401.13601 • Published Jan 24, 2024 • 47
Zurich 1.5B (GGUF) Collection Quantized versions of the Zurich 1.5B Model Collection, compatible with llama.cpp. Quantized by mradermacher. • 12 items • Updated 5 days ago • 2
Geneva 12B (GGUF) Collection Quantized versions of the Geneva 12B Model Collection, compatible with llama.cpp. Quantized by mradermacher. • 12 items • Updated 7 days ago • 2
Zurich 14B (GGUF) Collection Quantized versions of the Zurich 14B Model Collection, compatible with llama.cpp. Quantized by mradermacher. • 12 items • Updated about 22 hours ago • 3
Zurich 7B (GGUF) Collection Quantized versions of the Zurich 7B Model Collection, compatible with llama.cpp. Quantized by mradermacher. • 12 items • Updated about 22 hours ago • 3
Zurich 1.5B Collection The Zurich 1.5B Model Collection - Fine-tuned from Qwen 2.5 1.5B Instruct with GammaCorpus v2. • 6 items • Updated 8 days ago • 2
GammaCorpus (CoT) Collection The GammaCorpus Dataset Collection for CoT (Chain of Thought) • 1 item • Updated 8 days ago • 9
Large Language Models Think Too Fast To Explore Effectively Paper • 2501.18009 • Published 13 days ago • 22
Video Generation models Collection The domain of video generation is booming. Here is a list of selected open-access video generation (T2V) models. • 14 items • Updated Aug 27, 2024 • 14
Geneva 12B Collection The Geneva Model Collection - Fine-tuned from Mistral Nemo Instruct 2407 (12B) with GammaCorpus v2. • 7 items • Updated 8 days ago • 10