Mechanistic Interpretability of Large-Scale Counting in LLMs through a System-2 Strategy
Abstract
A test-time strategy inspired by System-2 cognitive processes decomposes large counting tasks into smaller sub-problems, enabling large language models to overcome architectural depth limits and reach high accuracy on complex counting tasks; mechanistic analysis traces the gains to latent count computation, dedicated attention heads that transfer those counts, and a final aggregation stage.
Large language models (LLMs), despite strong performance on complex mathematical problems, exhibit systematic limitations in counting tasks. This issue arises from architectural limits of transformers, where counting is performed across layers, leading to degraded precision for larger counting problems due to depth constraints. To address this limitation, we propose a simple test-time strategy inspired by System-2 cognitive processes that decomposes large counting tasks into smaller, independent sub-problems that the model can reliably solve. We evaluate this approach using observational and causal mediation analyses to understand the underlying mechanism of this System-2-like strategy. Our mechanistic analysis identifies key components: latent counts are computed and stored in the final item representations of each part, transferred to intermediate steps via dedicated attention heads, and aggregated in the final stage to produce the total count. Experimental results demonstrate that this strategy enables LLMs to surpass architectural limitations and achieve high accuracy on large-scale counting tasks. This work provides mechanistic insight into System-2 counting in LLMs and presents a generalizable approach for improving and understanding their reasoning behavior.
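As a concrete illustration of the decomposition strategy described in the abstract, the sketch below splits a long list into small chunks, asks the model to count the target item in each chunk, and sums the partial answers to obtain the total. This is a minimal sketch, not the paper's actual implementation: the `query_llm` helper, the chunk size, and the prompt wording are hypothetical placeholders for whatever model interface and task format are used in practice.

```python
# Minimal sketch of a System-2-style decomposition for counting:
# split a large counting task into small sub-problems the model can
# solve reliably, then aggregate the partial counts at the end.
# `query_llm` is a hypothetical stand-in for any chat-completion call;
# the chunk size and prompt wording are illustrative assumptions.

from typing import Callable, List


def count_by_decomposition(
    items: List[str],
    target: str,
    query_llm: Callable[[str], str],
    chunk_size: int = 10,
) -> int:
    """Count occurrences of `target` in `items` by querying the model per chunk."""
    total = 0
    for start in range(0, len(items), chunk_size):
        chunk = items[start : start + chunk_size]
        prompt = (
            f"How many times does the word '{target}' appear in this list? "
            f"Answer with a single integer.\nList: {', '.join(chunk)}"
        )
        answer = query_llm(prompt)
        # Keep only the digits in case the model wraps the number in extra words.
        digits = "".join(ch for ch in answer if ch.isdigit())
        total += int(digits) if digits else 0
    # Final aggregation step: sum of per-chunk counts gives the total.
    return total


if __name__ == "__main__":
    # Toy usage with a fake "model" that counts exactly (for illustration only).
    def fake_llm(prompt: str) -> str:
        return str(prompt.split("List: ")[1].split(", ").count("apple"))

    words = ["apple", "pear", "apple", "plum"] * 20
    print(count_by_decomposition(words, "apple", fake_llm))  # -> 40
```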
Community
Overcoming Transformer Depth Limits in Counting Tasks
LLMs often fail at counting not because they aren't smart, but because of architectural depth constraints. We propose a simple, effective System-2 strategy that decomposes counting tasks to bypass these limits.
We also provide a full mechanistic interpretation, identifying the specific attention heads and representations responsible for transferring "latent counts" across the network.
This approach allows LLMs to achieve high accuracy on large-scale counting benchmarks where they typically fail.
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Understanding Counting Mechanisms in Large Language and Vision-Language Models (2025)
- SSR: Socratic Self-Refine for Large Language Model Reasoning (2025)
- Cognitive Mirrors: Exploring the Diverse Functional Roles of Attention Heads in LLM Reasoning (2025)
- Sequential Enumeration in Large Language Models (2025)
- Nexus: Higher-Order Attention Mechanisms in Transformers (2025)
- Can Vision-Language Models Count? A Synthetic Benchmark and Analysis of Attention-Based Interventions (2025)
- Limits and Gains of Test-Time Scaling in Vision-Language Reasoning (2025)
