Update README.md
README.md CHANGED
@@ -1,4 +1,10 @@
-#
+# Automatic Evaluation Model for RAIDEN Benchmark
 
+This repository contains the automated evaluation model trained as part of the research presented in the paper "RAIDEN Benchmark: Evaluating Role-playing Conversational Agents with Measurement-Driven Custom Dialogues".
 
-
+The model is designed to compare the quality of two different responses in a given dialogue turn and produce one of three evaluation outcomes: `win`, `tie`, or `lose`.
+
+For more detailed information, please refer to our paper and code:
+
+Paper: https://aclanthology.org/2025.coling-main.735.pdf
+GitHub repo: https://github.com/FrontierLabs/RAIDEN
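For illustration, the sketch below shows how a pairwise win/tie/lose evaluator of this kind might be queried with the Hugging Face `transformers` library, assuming the evaluator is released as a causal language model that emits its verdict as text. The repository id, prompt template, and label parsing here are assumptions made for the example, not the model's documented interface; see the paper and GitHub repo above for the actual evaluation protocol.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repository id: substitute the actual model id from this model card.
MODEL_ID = "your-org/raiden-auto-evaluator"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

# Assumed prompt template: the dialogue turn plus the two candidate responses.
# The template actually used in training may differ; see the RAIDEN paper and repo.
prompt = (
    "Dialogue turn:\n{turn}\n\n"
    "Response A:\n{a}\n\n"
    "Response B:\n{b}\n\n"
    "From Response A's perspective, answer with exactly one of: win, tie, lose.\n"
    "Answer:"
).format(
    turn="User: Stay in character as a ship's navigator and report our heading.",
    a="Holding steady at bearing 040, captain, just clear of the reef.",
    b="I don't know, I'm just an AI language model.",
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=8, do_sample=False)

# Decode only the newly generated tokens and map them onto the three outcomes.
completion = tokenizer.decode(
    output_ids[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
).strip().lower()
verdict = next((label for label in ("win", "tie", "lose") if label in completion), None)
print(verdict)  # "win", "tie", or "lose" (None if the output does not parse)
```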