UniSumEval: Towards Unified, Fine-Grained, Multi-Dimensional Summarization Evaluation for LLMs Paper • 2409.19898 • Published Sep 30, 2024 • 1
ReFeed: Multi-dimensional Summarization Refinement with Reflective Reasoning on Feedback Paper • 2503.21332 • Published Mar 27 • 23
Lingshu: A Generalist Foundation Model for Unified Multimodal Medical Understanding and Reasoning Paper • 2506.07044 • Published Jun 8 • 110
How Far Are We from Believable AI Agents? A Framework for Evaluating the Believability of Human Behavior Simulation Paper • 2312.17115 • Published Dec 28, 2023 • 2
Towards Dynamic Theory of Mind: Evaluating LLM Adaptation to Temporal Evolution of Human States Paper • 2505.17663 • Published May 23 • 15 • 2
LIMOPro: Reasoning Refinement for Efficient and Effective Test-time Scaling Paper • 2505.19187 • Published May 25 • 13 • 3
Towards Dynamic Theory of Mind: Evaluating LLM Adaptation to Temporal Evolution of Human States Paper • 2505.17663 • Published May 23 • 15
LIMOPro: Reasoning Refinement for Efficient and Effective Test-time Scaling Paper • 2505.19187 • Published May 25 • 13
LIMOPro: Reasoning Refinement for Efficient and Effective Test-time Scaling Paper • 2505.19187 • Published May 25 • 13
Towards Dynamic Theory of Mind: Evaluating LLM Adaptation to Temporal Evolution of Human States Paper • 2505.17663 • Published May 23 • 15