Gemma 3 Collection All versions of Google's new multimodal models in 1B, 4B, 12B, and 27B sizes. In GGUF, dynamic 4-bit and 16-bit formats. • 25 items • Updated about 1 hour ago • 27
DeepSeek R1 (All Versions) Collection DeepSeek R1 - the most powerful reasoning open-source model - available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 29 items • Updated 1 day ago • 209
Polypsyche Core Collection Why limit AGI to one “super brain” when you can have multiple minds in one? Psyche of Logic, Psyche of Emotion, Psyche of Improv. • 5 items • Updated Dec 16, 2024 • 1
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 16 items • Updated 21 days ago • 246
DA-Code: Agent Data Science Code Generation Benchmark for Large Language Models Paper • 2410.07331 • Published Oct 9, 2024 • 5
DSBench: How Far Are Data Science Agents to Becoming Data Science Experts? Paper • 2409.07703 • Published Sep 12, 2024 • 67
Llama 3.2 Collection This collection hosts the transformers and original repos of Llama 3.2 and Llama Guard 3 • 15 items • Updated Dec 6, 2024 • 576
KTO: Model Alignment as Prospect Theoretic Optimization Paper • 2402.01306 • Published Feb 2, 2024 • 16
Fireball-Llama-3.1 collections Collection Fine-tuned Llama 3.1 with different approaches • 3 items • Updated Aug 19, 2024 • 1
EpistemeAI's codegemma-2-9b ggufs Collection EpistemeAI's fine-tuned Gemma 2 9B in GGUF format • 4 items • Updated Aug 19, 2024 • 1
Direct Preference Optimization Datasets Collection Datasets suitable for DPO based on having 'chosen', 'rejected', and 'prompt' columns. Created using librarian-bots/dataset-column-search-api • 5007 items • Updated Feb 8 • 6
Masked Audio Generation using a Single Non-Autoregressive Transformer Paper • 2401.04577 • Published Jan 9, 2024 • 43