daiweichen/pal-b-large-opt-350m

feature-extraction


daiweichen committed about 1 month ago

Commit e0ba201 · verified · 1 Parent(s): 69e6e40

Update README.md

Files changed (1)

README.md +4 -2


@@ -12,9 +12,11 @@ base_model:
 # Model Card for Model ID
 <!-- Provide a quick summary of what the model is/does. -->
-This model is a personalized reward model for pluralistic alignment and also a demonstration for our [paper](https://pal-alignment.github.io/).
-We train the PAL-B-Large model on a variant of Reddit TL;DR summary dataset with 10 users who provide the most amount of feedbacks and achieve higher performance compared with vanilla homogeneous reward model.
+This model is a personalized reward model for pluralistic alignment and serves as a demonstration for our [paper](https://pal-alignment.github.io/).
+We train the PAL-B-Large model on a variant of Reddit TL;DR summary dataset, incorporating feedback from the 10 most active users.
+Our approach outperforms the standard homogeneous reward model, demonstrating improved performance with our proposed Pluralistic Alignment method.
 ## Model Details