EvasionBench: Detecting Evasive Answers in Financial Q&A via Multi-Model Consensus and LLM-as-Judge Paper • 2601.09142 • Published 1 day ago • 1
DramaBench: A Six-Dimensional Evaluation Framework for Drama Script Continuation Paper • 2512.19012 • Published 25 days ago • 16