NemoGuard Collection Essential datasets and models for content safety, topic-following, and security guardrails • 13 items • Updated 11 days ago • 16
SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion Paper • 2503.11576 • Published Mar 14, 2025 • 125
Cerebras REAP Collection Sparse MoE models compressed using REAP (Router-weighted Expert Activation Pruning) method • 19 items • Updated 15 days ago • 77
PP-StructureV3 Collection PP-StructureV3 is a SOTA document parsing solution on OmniDocBench, supporting the conversion of PDFs and document images to Markdown and JSON. • 17 items • Updated Sep 15, 2025 • 12
Nemotron-Post-Training-v3 Collection Collection of datasets used in the post-training phase of Nemotron Nano v3. • 7 items • Updated 11 days ago • 54
Nemotron-Pre-Training-Datasets Collection Large scale pre-training datasets used in the Nemotron family of models. • 11 items • Updated 11 days ago • 85
NVIDIA Nemotron v3 Collection Open, Production-ready Enterprise Models • 6 items • Updated 4 days ago • 109
Bolmo: Byteifying the Next Generation of Language Models Paper • 2512.15586 • Published 17 days ago • 14
Mem0: Building Production-Ready AI Agents with Scalable Long-Term Memory Paper • 2504.19413 • Published Apr 28, 2025 • 36
TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times Paper • 2512.16093 • Published 17 days ago • 90
Emergent temporal abstractions in autoregressive models enable hierarchical reinforcement learning Paper • 2512.20605 • Published 11 days ago • 59