SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models
Abstract
We introduce SelfCite, a novel self-supervised approach that aligns LLMs to generate high-quality, fine-grained, sentence-level citations for the statements in their generated responses. Instead of only relying on costly and labor-intensive annotations, SelfCite leverages a reward signal provided by the LLM itself through context ablation: If a citation is necessary, removing the cited text from the context should prevent the same response; if sufficient, retaining the cited text alone should preserve the same response. This reward can guide the inference-time best-of-N sampling strategy to improve citation quality significantly, as well as be used in preference optimization to directly fine-tune the models for generating better citations. The effectiveness of SelfCite is demonstrated by increasing citation F1 up to 5.3 points on the LongBench-Cite benchmark across five long-form question answering tasks.
Community
- We designed a self-supervised reward to align LLMs to generate better citations attributing the context when answering questions, without human supervision.
- SelfCite leverages a reward signal provided by the LLM itself through context ablation
- If a citation is necessary, removing the cited text should prevent the same response
- If a citation is sufficient, retaining the cited text alone should preserve the same response
- This reward can guide 1) best-of-N sampling and 2) fine-tuning with SimPO to generate better citations.
- Improves citation F1 by up to 5.3 points on the LongBench-Cite benchmark.
- The results are comparable to the specialized commercial "Claude Citations" API released in January 2025.
Implementation: https://github.com/voidism/SelfCite
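The context-ablation reward described above can be sketched in a few lines. This is a toy illustration, not the released implementation: `log_prob` is a hypothetical scorer standing in for the LLM's log-probability of the response given a context, and `toy_log_prob` is a word-overlap stand-in so the sketch runs end to end.

```python
import math

def ablation_reward(log_prob, response, context_sents, cited_idx):
    """Score a citation set by context ablation (toy sketch).

    Necessity ("prob-drop"): removing the cited sentences from the context
    should make the same response less likely.
    Sufficiency ("prob-hold"): keeping only the cited sentences should keep
    the response likely.
    """
    full = " ".join(context_sents)
    without = " ".join(s for i, s in enumerate(context_sents) if i not in cited_idx)
    only = " ".join(context_sents[i] for i in sorted(cited_idx))

    lp_full = log_prob(response, full)
    prob_drop = lp_full - log_prob(response, without)   # necessity
    prob_hold = log_prob(response, only) - lp_full      # sufficiency
    return prob_drop + prob_hold

def toy_log_prob(response, context):
    # Stand-in scorer: log of the fraction of response words found in the
    # context. A real implementation would query the LLM's token log-probs.
    words = response.lower().split()
    ctx = set(context.lower().split())
    hit = sum(w in ctx for w in words) / max(len(words), 1)
    return math.log(hit + 1e-9)

sents = ["The sky is blue.", "Cats sleep a lot."]
good = ablation_reward(toy_log_prob, "the sky is blue.", sents, {0})
bad = ablation_reward(toy_log_prob, "the sky is blue.", sents, {1})
print(good > bad)  # the correct citation scores higher
```

With this reward in hand, best-of-N sampling simply generates N candidate citation sets and keeps the highest-scoring one; the same scores can also rank candidate pairs as preference data for fine-tuning.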
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Think&Cite: Improving Attributed Text Generation with Self-Guided Tree Search and Progress Reward Modeling (2024)
- Factuality or Fiction? Benchmarking Modern LLMs on Ambiguous QA with Citations (2024)
- ChainRank-DPO: Chain Rank Direct Preference Optimization for LLM Rankers (2024)
- RobustFT: Robust Supervised Fine-tuning for Large Language Models under Noisy Response (2024)
- Selective Self-to-Supervised Fine-Tuning for Generalization in Large Language Models (2025)
- CiteBART: Learning to Generate Citations for Local Citation Recommendation (2024)
- Redefining Simplicity: Benchmarking Large Language Models from Lexical to Document Simplification (2025)