Text Generation
Transformers
Safetensors
English
omnilmm
conversational
Inference Endpoints
HaoyeZhang committed
Commit 0f6f44a · verified · 1 Parent(s): a006514

Update README.md

Files changed (1)
  1. README.md +11 -5
README.md CHANGED
@@ -16,7 +16,7 @@ paper:
  We utilize a novel framework, [RLAIF-V](https://github.com/RLHF-V/RLAIF-V), which **aligns MLLMs in a fully open-source paradigm**. This framework maximally exploits the [open-source feedback](https://huggingface.co/datasets/HaoyeZhang/RLAIF-V-Dataset) from two key perspectives, including **high-quality feedback data** and an **online feedback learning algorithm**.
 
  <p align="center">
- <img src="https://cdn-uploads.huggingface.co/production/uploads/6566e0c493e30c8a60048eb3/T4hALrgNdXKHnkvb-27bA.png" alt="fig1" width="85%"/>
+ <img src="https://cdn-uploads.huggingface.co/production/uploads/6566e0c493e30c8a60048eb3/T4hALrgNdXKHnkvb-27bA.png" alt="fig1-1" width="85%"/>
  </p>
 
  ## Model Details
@@ -27,9 +27,15 @@ We utilize a novel framework, [RLAIF-V](https://github.com/RLHF-V/RLAIF-V), whic
  * 💪 **Maintaining Well Performance on General Abilities**: On benchmarks tested with the general abilities (e.g. LLaVA Bench, MMStar), RLAIF-V-12B also exhibits good performance.
 
  <p align="center">
- <img src="https://cdn-uploads.huggingface.co/production/uploads/6566e0c493e30c8a60048eb3/ypXZxb4HE-jDPJU9115bi.png" alt="fig1" width="90%"/>
+ <img src="https://cdn-uploads.huggingface.co/production/uploads/6566e0c493e30c8a60048eb3/dhsi5_okbtlBp2pfYOkFK.png" alt="fig1-2" width="90%"/>
  </p>
- <!-- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6566e0c493e30c8a60048eb3/ypXZxb4HE-jDPJU9115bi.png) -->
+
+ * 🚀 **Inference-time Scaling by RLAIF-V Reward**: Using RLAIF-V 12B as a reward model can further improve model performance on multiple benchmarks with best-of-N selection. It also consistently improves the trustworthiness on different MLLMs.
+
+ <p align="center">
+ <img src="https://cdn-uploads.huggingface.co/production/uploads/6566e0c493e30c8a60048eb3/QB_plzz-wRmyDcr81BXum.png" alt="fig1-3" width="50%"/>
+ </p>
+
 
  ### Examples
  <p align="center">
@@ -59,8 +65,8 @@ If you find our model/code/paper helpful, please consider cite our papers 📝:
  }
 
  @article{yu2024rlaifv,
- title={RLAIF-V: Aligning MLLMs through Open-Source AI Feedback for Super GPT-4V Trustworthiness},
- author={Yu, Tianyu and Zhang, Haoye and Yao, Yuan and Dang, Yunkai and Chen, Da and Lu, Xiaoman and Cui, Ganqu and He, Taiwen and Liu, Zhiyuan and Chua, Tat-Seng and Sun, Maosong},
+ title={RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness},
+ author={Tianyu Yu and Haoye Zhang and Qiming Li and Qixin Xu and Yuan Yao and Da Chen and Xiaoman Lu and Ganqu Cui and Yunkai Dang and Taiwen He and Xiaocheng Feng and Jun Song and Bo Zheng and Zhiyuan Liu and Tat-Seng Chua and Maosong Sun},
  journal={arXiv preprint arXiv:2405.17220},
  year={2024},
  }
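
Note on the added "Inference-time Scaling" bullet: best-of-N selection samples several candidate responses and keeps the one the reward model scores highest. The sketch below illustrates only that selection step. The names `best_of_n`, `toy_reward`, and `GROUNDED` are hypothetical and not part of the RLAIF-V code base; the diff does not show how RLAIF-V 12B computes its reward, so a toy word-overlap score stands in for it here.

```python
from typing import Callable, Sequence


def best_of_n(
    prompt: str,
    candidates: Sequence[str],
    reward_fn: Callable[[str, str], float],
) -> str:
    """Return the candidate with the highest reward for the given prompt."""
    if not candidates:
        raise ValueError("best_of_n needs at least one candidate")
    # max() keeps the first candidate on ties, so the result is deterministic.
    return max(candidates, key=lambda response: reward_fn(prompt, response))


# Hypothetical stand-in for a reward model: the fraction of response words
# found in a small set of "grounded" words. A real setup would instead score
# each (prompt, response) pair with the reward model.
GROUNDED = {"a", "dog", "sits", "on", "the", "grass"}


def toy_reward(prompt: str, response: str) -> float:
    words = response.lower().split()
    if not words:
        return 0.0
    return sum(word in GROUNDED for word in words) / len(words)


candidates = [
    "a dog sits on the grass",
    "a dog and a cat sit on the grass",
    "a cat sleeps indoors",
]
best = best_of_n("Describe the image.", candidates, toy_reward)
print(best)
```

Passing the scorer as a callable keeps the selection logic independent of any particular reward model, so the toy function can be swapped for a model-backed one without touching `best_of_n`.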