Article The case for specialized pre-training: ultra-fast foundation models for dedicated tasks By Pclanglais • Aug 4, 2024 • 29
Scotch & SOTA 🥃 Pt. 7: Human Feedback Datasets 🫣 Collection The elusive “human” feedback • 1 item • Updated Sep 13, 2023 • 1
Scotch & SOTA 🥃 Pt. 6: Dialogue Tuning Datasets 💬 Collection Conversations, turn-based dialog, and things that can be turned into that. • 4 items • Updated Sep 13, 2023 • 1
Scotch & SOTA 🥃 Pt. 5: Instruction Tuning Datasets 👩‍🏫 Collection Question & answer, task completion, general SFT and otherwise finetuney data. • 7 items • Updated Sep 13, 2023 • 1
Article Can we create pedagogically valuable multi-turn synthetic datasets from Cosmopedia? By davanstrien • May 7, 2024 • 8
Article Train 400x faster Static Embedding Models with Sentence Transformers 28 days ago • 142
Tulu 3 Datasets Collection All datasets released with Tulu 3 -- state of the art open post-training recipes. • 33 items • Updated 1 day ago • 68
PixMo Collection A set of vision-language datasets built by Ai2 and used to train the Molmo family of models. Read more at https://molmo.allenai.org/blog • 9 items • Updated 1 day ago • 59
Gemma 2: Improving Open Language Models at a Practical Size Paper • 2408.00118 • Published Jul 31, 2024 • 76
Article PyTorchModelHubMixin: Bridging the Gap for Custom AI Models on Hugging Face By not-lain and 1 other • Nov 11, 2024 • 16
Qwen2.5-Coder Collection Code-specific model series based on Qwen2.5 • 40 items • Updated Nov 28, 2024 • 279
Article Recipe: Preparing Multilingual Speech Datasets for TTS Training By PHBJT and 1 other • Nov 4, 2024 • 18
MobileLLM Collection Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) https://arxiv.org/abs/2402.14905 • 9 items • Updated Nov 27, 2024 • 103