Commit 3bb25f7 by CodeGoat24 · verified · 1 parent: 77c5232

Update README.md

  1. README.md +7 -2
README.md CHANGED
@@ -11,7 +11,7 @@ base_model:
  This model is trained on LLaVA-Video-7B-Qwen2 based on DPO preference data constructed by our [UnifiedReward-7B](https://huggingface.co/CodeGoat24/UnifiedReward-7b) for enhanced video understanding ability.

  For further details, please refer to the following resources:
- - 📰 Paper:
+ - 📰 Paper: https://arxiv.org/pdf/2503.05236
  - 🪐 Project Page: https://codegoat24.github.io/UnifiedReward/
  - 🤗 Model Collections: https://huggingface.co/collections/CodeGoat24/unifiedreward-models-67c3008148c3a380d15ac63a
  - 🤗 Dataset Collections: https://huggingface.co/collections/CodeGoat24/unifiedreward-training-data-67c300d4fd5eff00fa7f1ede
@@ -87,5 +87,10 @@ print(text_outputs)
  ## Citation

  ```
-
+ @article{UnifiedReward,
+ title={Unified Reward Model for Multimodal Understanding and Generation.},
+ author={Wang, Yibin and Zang, Yuhang and Li, Hao and Jin, Cheng and Wang, Jiaqi},
+ journal={arXiv preprint arXiv:2503.05236},
+ year={2025}
+ }
  ```