---
title: README
emoji: π
colorFrom: green
colorTo: pink
sdk: static
pinned: false
---
# MJ-Bench Team: Align
## [**MJ-Video**: Fine-Grained Benchmarking and Rewarding Video Preferences in Video Generation](https://aiming-lab.github.io/MJ-VIDEO.github.io/)
We release MJ-Bench-Video, a comprehensive fine-grained video preference benchmark, and MJ-Video, a powerful MoE-based multi-dimensional video reward model!
## 👩‍⚖️ [**MJ-Bench**: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?](https://mj-bench.github.io/)
Project page: https://mj-bench.github.io/
Code repository: https://github.com/MJ-Bench/MJ-Bench
While text-to-image models like DALLE-3 and Stable Diffusion are rapidly proliferating, they often suffer from hallucination, bias, and unsafe, low-quality output. To address these issues effectively, such models must be aligned with desired behaviors based on feedback from a multimodal judge. Despite their importance, current multimodal judges are rarely evaluated rigorously for their own capabilities and limitations, which can lead to misalignment and unsafe fine-tuning outcomes.
To address this issue, we introduce MJ-Bench, a novel benchmark that incorporates a comprehensive preference dataset to evaluate multimodal judges in providing feedback for image generation models across four key perspectives: **alignment**, **safety**, **image quality**, and **bias**.
Specifically, we evaluate a large variety of multimodal judges, including:
- 6 smaller-sized CLIP-based scoring models
- 11 open-source VLMs (e.g., the LLaVA family)
- 4 closed-source VLMs (e.g., GPT-4o, Claude 3)
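To make the setup concrete, here is a minimal sketch of how a CLIP-based scoring model can act as a judge on a single preference pair, using the Hugging Face `transformers` CLIP API. This is illustrative only, not the official MJ-Bench evaluation code: the checkpoint, prompt, and image file names are placeholders.

```python
# Minimal sketch: a CLIP-based scoring model judging one preference pair.
# Illustrative only -- not the official MJ-Bench evaluation code; the
# checkpoint, prompt, and image file names below are placeholders.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

prompt = "a photo of a red apple on a wooden table"  # hypothetical prompt
image_a = Image.open("image_chosen.png")             # human-preferred image
image_b = Image.open("image_rejected.png")           # human-rejected image

inputs = processor(text=[prompt], images=[image_a, image_b],
                   return_tensors="pt", padding=True)
with torch.no_grad():
    # logits_per_image has shape (2, 1): one image-text similarity per image.
    scores = model(**inputs).logits_per_image.squeeze(-1)

# The judge "prefers" the image with the higher similarity; benchmark accuracy
# is then the fraction of pairs where this matches the human preference label.
score_a, score_b = scores.tolist()
print(f"score_a={score_a:.3f} score_b={score_b:.3f} prefers_a={score_a > score_b}")
```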
🔥🔥 We are actively updating the [leaderboard](https://mj-bench.github.io/), and we welcome you to submit the evaluation results of your multimodal judge on [our dataset](https://huggingface.co/datasets/MJ-Bench/MJ-Bench) to the [Hugging Face leaderboard](https://huggingface.co/spaces/MJ-Bench/MJ-Bench-Leaderboard).
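As a starting point for a submission, the preference data can be pulled directly from the Hub with the `datasets` library. The sketch below first lists the available configurations rather than hard-coding them, since the exact subset and split names should be taken from the dataset card.

```python
# Minimal sketch: loading the MJ-Bench preference data from the Hugging Face
# Hub. Configuration names are discovered at runtime; consult the dataset card
# at https://huggingface.co/datasets/MJ-Bench/MJ-Bench for the authoritative
# subset and split layout.
from datasets import get_dataset_config_names, load_dataset

configs = get_dataset_config_names("MJ-Bench/MJ-Bench")
print("available subsets:", configs)

ds = load_dataset("MJ-Bench/MJ-Bench", configs[0])
print(ds)  # inspect splits and columns before running your judge over the pairs
```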