Update README.md
README.md CHANGED
@@ -1,4 +1,10 @@
-#
+# Automatic Evaluation Model for RAIDEN Benchmark
 
+This repository contains the automated evaluation model trained as part of the research presented in the paper "RAIDEN Benchmark: Evaluating Role-playing Conversational Agents with Measurement-Driven Custom Dialogues".
 
-
+The model is designed to compare the quality of two different responses in a given dialogue turn and produce one of three evaluation outcomes: `win`, `tie`, or `lose`.
+
+For more detailed information, please refer to our paper and code:
+
+Paper: https://aclanthology.org/2025.coling-main.735.pdf
+GitHub repo: https://github.com/FrontierLabs/RAIDEN
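For illustration, the sketch below shows how a pairwise win/tie/lose evaluator of this kind might be queried with the Hugging Face `transformers` library, assuming the evaluator is released as a causal language model that emits its verdict as text. The repository id, prompt template, and label parsing here are assumptions made for the example, not the model's documented interface; see the paper and GitHub repo above for the actual evaluation protocol.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repository id: substitute the actual model id from this model card.
MODEL_ID = "your-org/raiden-auto-evaluator"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

# Assumed prompt template: the dialogue turn plus the two candidate responses.
# The template actually used in training may differ; see the RAIDEN paper and repo.
prompt = (
    "Dialogue turn:\n{turn}\n\n"
    "Response A:\n{a}\n\n"
    "Response B:\n{b}\n\n"
    "From Response A's perspective, answer with exactly one of: win, tie, lose.\n"
    "Answer:"
).format(
    turn="User: Stay in character as a ship's navigator and report our heading.",
    a="Holding steady at bearing 040, captain, just clear of the reef.",
    b="I don't know, I'm just an AI language model.",
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=8, do_sample=False)

# Decode only the newly generated tokens and map them onto the three outcomes.
completion = tokenizer.decode(
    output_ids[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
).strip().lower()
verdict = next((label for label in ("win", "tie", "lose") if label in completion), None)
print(verdict)  # "win", "tie", or "lose" (None if the output does not parse)
```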