arxiv:2502.09082

CoSER: Coordinating LLM-Based Persona Simulation of Established Roles

Published on Feb 13
· Submitted by Neph0s on Feb 14

Abstract

Role-playing language agents (RPLAs) have emerged as promising applications of large language models (LLMs). However, simulating established characters presents a challenging task for RPLAs, due to the lack of authentic character datasets and nuanced evaluation methods using such data. In this paper, we present CoSER, a collection of a high-quality dataset, open models, and an evaluation protocol towards effective RPLAs of established characters. The CoSER dataset covers 17,966 characters from 771 renowned books. It provides authentic dialogues with real-world intricacies, as well as diverse data types such as conversation setups, character experiences and internal thoughts. Drawing from acting methodology, we introduce given-circumstance acting for training and evaluating role-playing LLMs, where LLMs sequentially portray multiple characters in book scenes. Using our dataset, we develop CoSER 8B and CoSER 70B, i.e., advanced open role-playing LLMs built on LLaMA-3.1 models. Extensive experiments demonstrate the value of the CoSER dataset for RPLA training, evaluation and retrieval. Moreover, CoSER 70B exhibits state-of-the-art performance surpassing or matching GPT-4o on our evaluation and three existing benchmarks, i.e., achieving 75.80% and 93.47% accuracy on the InCharacter and LifeChoice benchmarks respectively.
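As a rough illustration of the idea that LLMs sequentially portray multiple characters in a book scene, the sketch below runs a toy multi-character simulation. It is not the CoSER implementation: the `Scene` class, the `simulate` function, the round-robin speaker order, and the `echo_actor` stub (which stands in for a role-playing LLM) are all hypothetical names and assumptions introduced here for illustration only.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Scene:
    """Hypothetical container for one book scene (not the CoSER schema)."""
    setup: str                       # conversation setup, i.e. the given circumstances
    characters: List[str]            # characters taking part in the scene
    transcript: List[Tuple[str, str]] = field(default_factory=list)


# An actor maps (character name, current scene) to that character's next line.
# In practice this would call a role-playing LLM; here it is just a callable.
Actor = Callable[[str, Scene], str]


def simulate(scene: Scene, actor: Actor, turns: int) -> Scene:
    """Let the actor portray each character in turn (round-robin is an assumption)."""
    for t in range(turns):
        speaker = scene.characters[t % len(scene.characters)]
        line = actor(speaker, scene)
        scene.transcript.append((speaker, line))
    return scene


def echo_actor(character: str, scene: Scene) -> str:
    """Stub actor that stands in for an LLM call."""
    return f"[{character} responds at turn {len(scene.transcript) + 1}]"


if __name__ == "__main__":
    demo = Scene(setup="Two characters meet at a ball.", characters=["A", "B"])
    for speaker, line in simulate(demo, echo_actor, turns=4).transcript:
        print(f"{speaker}: {line}")
```

Keeping the actor as a plain callable means the same loop could drive different models on the same scene, which is one plausible way to compare them under identical circumstances.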

Community

Paper submitter

📢 Introducing CoSER: Advancing AI Character Role-Playing with High-Quality Data from Best-Ever Books

We're excited to present CoSER (Coordinating LLM-Based Persona Simulation of Established Roles), a collection of a high-quality dataset, open models, and a novel evaluation protocol for more authentic AI character role-playing!

🔑 Key Features:
📚 17,966 characters and 29,798 authentic conversations from 771 renowned books (top-rated on the Best-Books-Ever List)
📊 Additional data types, such as conversation setups, character experiences, and internal thoughts
🏆 State-of-the-art role-playing models: CoSER-8B and CoSER-70B (built on LLaMA 3.1)
🎭 A novel methodology for training and evaluating role-playing LLMs: given-circumstance acting

📄 Paper: https://arxiv.org/pdf/2502.09082
💻 Source code: https://github.com/Neph0s/COSER
