Community Blog & Articles

Community Articles

Scaling Real-Time Voice Agents with Cache-Aware Streaming ASR

The Optimal Architecture for Small Language Models

🥃 Distilling Tiny Embeddings

SmolLM-Smashed: Tiny Giants, Optimized for Speed

M2.1: Multilingual and Multi-Task Coding with Strong Generalization

Introducing Falcon H1R 7B

Small Yet Mighty: Improve Accuracy In Multimodal Search and Visual Document Retrieval with Llama Nemotron RAG Models

Uncensor any LLM with abliteration

Building Autonomous Vehicles That Reason with the NVIDIA Alpamayo Open Ecosystem

KV Caching Explained: Optimizing Transformer Inference Efficiency

Common AI Model Formats

How I Trained Action Chunking Transformer (ACT) on SO-101: My Journey, Gotchas, and Lessons

TFLOPS Gap: Why FP4 MoE Kernel Engineering Matters on Blackwell

Proof of Time: A Benchmark for Evaluating Scientific Idea Judgments

Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment

Small Language Models (SLM): A Comprehensive Overview

From Image-to-LoRA to In-Context Edit

Reviewer Two (but it's an OpenEnv)

Introduction to State Space Models (SSM)

makeMoE: Implement a Sparse Mixture of Experts Language Model from Scratch

NVIDIA Cosmos Reason 2 Brings Advanced Reasoning To Physical AI

January 5, 2026

Introducing Falcon-H1-Arabic: Pushing the Boundaries of Arabic Language AI with Hybrid Architecture

January 5, 2026

partnerships, nvidia, robotics

NVIDIA brings agents to life with DGX Spark and Reachy Mini

January 5, 2026

AprielGuard: A Guardrail for Safety and Adversarial Robustness in Modern LLM Systems

December 23, 2025

tokenizers, transformers, open-source

Tokenization in Transformers v5: Simpler, Clearer, and More Modular

December 18, 2025

The Open Evaluation Standard: Benchmarking NVIDIA Nemotron 3 Nano with NeMo Evaluator

December 17, 2025

CUGA on Hugging Face: Democratizing Configurable AI Agents

December 15, 2025

New in llama.cpp: Model Management

December 11, 2025

llm, fine-tuning, open-source

Codex is Open Sourcing AI models

December 11, 2025

swift, hub, open-source

Introducing swift-huggingface: The Complete Swift Client for Hugging Face

December 5, 2025

llm, reasoning, agents

DeepMath: A lightweight math reasoning Agent with smolagents

December 4, 2025

llm, fine-tuning, open-source

We Got Claude to Fine-Tune an Open Source LLM

December 4, 2025

transformers, v5, community

Transformers v5: Simple model definitions powering the AI ecosystem

December 1, 2025

diffusers, flux, quantization

Diffusers welcomes FLUX-2

November 25, 2025

Community Articles

NEW Articles from Team or Enterprise organizations will get promoted to the main section.
