Article From Llasa to Llasagna: Finetuning LLaSA to generates Italian speech and other languages By Steveeeeeeen and 1 other • 3 days ago • 19
Article The SOTA Text-to-speech and Zero Shot Voice cloning model that no one knows about... By srinivasbilla • 25 days ago • 60
Article How biased is Whisper ? Evaluating Whisper Models for Robustness to Diverse English Accents By Steveeeeeeen • 16 days ago • 16
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published 10 days ago • 161
Article π0 and π0-FAST: Vision-Language-Action Models for General Robot Control 11 days ago • 93
Article Deploying OLMo-7B with Text Generation Inference (TGI) on Hugging Face Spaces By ariG23498 • 12 days ago • 5
Streaming DiLoCo with overlapping communication: Towards a Distributed Free Lunch Paper • 2501.18512 • Published 15 days ago • 25
Article KV Caching Explained: Optimizing Transformer Inference Efficiency By not-lain • 16 days ago • 27
Article Mastering Long Contexts in LLMs with KVPress By nvidia and 1 other • 22 days ago • 62
NeMo Audio Codecs Collection A series of Neural Audio Codecs • 5 items • Updated 28 days ago • 11
Qwen2-Audio Collection Audio-language model series based on Qwen2 • 4 items • Updated Nov 28, 2024 • 51
Article Halo: Open Source Health Tracking with Wearables By cyrilzakka • Nov 19, 2024 • 106