Gabriele Sarti

gsarti

https://gsarti.com

AI & ML interests

Interpretability for generative language models

Recent Activity

updated a collection about 4 hours ago

🔍 Interpretability & Analysis of LMs

upvoted a paper about 4 hours ago

Archetypal SAE: Adaptive and Stable Dictionary Learning for Concept Extraction in Large Vision Models

liked a dataset 1 day ago

WestlakeNLP/DeepReview-13K

gsarti's activity

upvoted a paper about 4 hours ago

Archetypal SAE: Adaptive and Stable Dictionary Learning for Concept Extraction in Large Vision Models

Paper • 2502.12892 • Published 23 days ago • 1

upvoted an article 3 days ago

Introducing EuroBERT: A High-Performance Multilingual Encoder Model

Article • and 3 others • 3 days ago • 115

upvoted a paper 3 days ago

EuroBERT: Scaling Multilingual Encoders for European Languages

Paper • 2503.05500 • Published 6 days ago • 71

upvoted a paper 7 days ago

QE4PE: Word-level Quality Estimation for Human Post-Editing

Paper • 2503.03044 • Published 9 days ago • 6

upvoted 2 papers 8 days ago

Wikipedia in the Era of LLMs: Evolution and Risks

Paper • 2503.02879 • Published 9 days ago • 20

A Close Look at Decomposition-based XAI-Methods for Transformer Language Models

Paper • 2502.15886 • Published 20 days ago • 1

upvoted a paper 10 days ago

Position-aware Automatic Circuit Discovery

Paper • 2502.04577 • Published Feb 7 • 1

upvoted a paper 24 days ago

We Can't Understand AI Using our Existing Vocabulary

Paper • 2502.07586 • Published about 1 month ago • 10

upvoted a paper about 1 month ago

ReAct: Synergizing Reasoning and Acting in Language Models

Paper • 2210.03629 • Published Oct 6, 2022 • 24

upvoted a collection about 1 month ago

Reasoning Datasets

Collection • Distilled synthetic Reasoning datasets • 7 items • Updated Feb 2 • 56

upvoted an article about 1 month ago

Open-source DeepResearch – Freeing our search agents

Article • Feb 4 • 1.16k

upvoted 2 papers about 1 month ago

Building Bridges, Not Walls -- Advancing Interpretability by Unifying Feature, Data, and Model Component Attribution

Paper • 2501.18887 • Published Jan 31 • 1

Propositional Interpretability in Artificial Intelligence

Paper • 2501.15740 • Published Jan 27 • 1

upvoted an article about 1 month ago

Open-R1: Update #1

Article • and 7 others • Feb 2 • 295

upvoted 4 papers about 1 month ago

upvoted a collection about 2 months ago

Gemma Neogenesis 💎🌍🇮🇹

Collection • Datasets and models for Neogenesis: Post-training recipe for improving Gemma 2 for a specific language. Notebook: https://t.ly/iuKdy • 12 items • Updated 3 days ago • 5

upvoted a paper about 2 months ago

Enhancing Automated Interpretability with Output-Centric Feature Descriptions

Paper • 2501.08319 • Published Jan 14 • 10