daiweichen committed on
Commit e0ba201 · verified · 1 Parent(s): 69e6e40

Update README.md

  1. README.md +4 -2
README.md CHANGED
@@ -12,9 +12,11 @@ base_model:
  # Model Card for Model ID

  <!-- Provide a quick summary of what the model is/does. -->
- This model is a personalized reward model for pluralistic alignment and also a demonstration for our [paper](https://pal-alignment.github.io/).
- We train the PAL-B-Large model on a variant of Reddit TL;DR summary dataset with 10 users who provide the most amount of feedbacks and achieve higher performance compared with vanilla homogeneous reward model.
+ This model is a personalized reward model for pluralistic alignment and serves as a demonstration for our [paper](https://pal-alignment.github.io/).

+ We train the PAL-B-Large model on a variant of Reddit TL;DR summary dataset, incorporating feedback from the 10 most active users.
+
+ Our approach outperforms the standard homogeneous reward model, demonstrating improved performance with our proposed Pluralistic Alignment method.

  ## Model Details
