hexuan21 committed · verified
Commit 8642203 · Parent(s): 9c8d019

Update README.md

Files changed (1): README.md (+4 −4)
README.md CHANGED
@@ -11,7 +11,7 @@ pipeline_tag: visual-question-answering
 ---
 
 
-[📃Paper](https://arxiv.org/abs/2406.15252) | [🌐Website](https://tiger-ai-lab.github.io/VideoScore/) | [💻Github](https://github.com/TIGER-AI-Lab/VideoScore) | [🛢️Datasets](https://huggingface.co/datasets/TIGER-Lab/VideoFeedback) | [🤗Model (VideoScore)](https://huggingface.co/TIGER-Lab/VideoScore) | [🤗Model (VideoScore-anno-only)](https://huggingface.co/TIGER-Lab/VideoScore-anno-only) | [🤗Model (VideoScore-v1.1)](https://huggingface.co/TIGER-Lab/VideoScore-v1.1)| [🤗Demo](https://huggingface.co/spaces/TIGER-Lab/VideoScore) | [📉Wandb (VideoScore-anno-only)](https://api.wandb.ai/links/xuanhe/4vs5k0cq)
+[📃Paper](https://arxiv.org/abs/2406.15252) | [🌐Website](https://tiger-ai-lab.github.io/VideoScore/) | [💻Github](https://github.com/TIGER-AI-Lab/VideoScore) | [🛢️Datasets](https://huggingface.co/datasets/TIGER-Lab/VideoFeedback) | [🤗Model (VideoScore)](https://huggingface.co/TIGER-Lab/VideoScore) | [🤗Demo](https://huggingface.co/spaces/TIGER-Lab/VideoScore) | [📉Wandb (VideoScore-anno-only)](https://api.wandb.ai/links/xuanhe/4vs5k0cq)
 
 
 ![VideoScore](https://tiger-ai-lab.github.io/VideoScore/static/images/teaser.png)
@@ -20,7 +20,7 @@ pipeline_tag: visual-question-answering
 - 🧐🧐[VideoScore-anno-only](https://huggingface.co/TIGER-Lab/VideoScore-anno-only) is a variant from [VideoScore](https://huggingface.co/TIGER-Lab/VideoScore), trained on VideoFeedback dataset
 excluding the real videos.
 
-- [VideoScore](https://huggingface.co/TIGER-Lab/VideoScore) series is a video quality evaluation model series, taking [Mantis-8B-Idefics2](https://huggingface.co/TIGER-Lab/Mantis-8B-Idefics2) as base-model
+- [VideoScore](https://huggingface.co/TIGER-Lab/VideoScore) series is a video quality evaluation model series, taking [Mantis-8B-Idefics2](https://huggingface.co/TIGER-Lab/Mantis-8B-Idefics2) or [Qwen/Qwen2-VL](https://huggingface.co/Qwen/Qwen2-VL-7B-Instruct) as base-model,
 and trained on [VideoFeedback](https://huggingface.co/datasets/TIGER-Lab/VideoFeedback),
 a large video evaluation dataset with multi-aspect human scores.
 
@@ -49,8 +49,8 @@ The evaluation results are shown below:
 
 | metric | Final Avg Score | VideoFeedback-test | EvalCrafter | GenAI-Bench | VBench |
 |:-----------------:|:--------------:|:--------------:|:-----------:|:-----------:|:----------:|
-| VideoScore (reg) | **69.6** | 75.7 | **51.1** | **78.5** | **73.0** |
-| VideoScore (gen) | 55.6 | **77.1** | 27.6 | 59.0 | 58.7 |
+| VideoScore (reg) | **69.6** | 75.7 | **51.1** | **78.5** | **73.0** |
+| VideoScore (gen) | 55.6 | **77.1** | 27.6 | 59.0 | 58.7 |
 | Gemini-1.5-Pro | <u>39.7</u> | 22.1 | 22.9 | 60.9 | 52.9 |
 | Gemini-1.5-Flash | 39.4 | 20.8 | 17.3 | <u>67.1</u> | 52.3 |
 | GPT-4o | 38.9 | <u>23.1</u> | 28.7 | 52.0 | 51.7 |
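
The Final Avg Score column in the table above is consistent with a plain unweighted mean of the four benchmark columns (VideoFeedback-test, EvalCrafter, GenAI-Bench, VBench). A quick sketch to verify that, with the scores transcribed from the table; the averaging rule and the helper name `final_avg` are our assumption, not something the README states:

```python
def final_avg(benchmarks):
    """Unweighted mean over the four benchmark scores, rounded to one decimal.

    Assumption: the README's 'Final Avg Score' is a simple mean; the
    reproduced values below match the table to rounding.
    """
    return round(sum(benchmarks) / len(benchmarks), 1)

# (VideoFeedback-test, EvalCrafter, GenAI-Bench, VBench), from the table above.
scores = {
    "VideoScore (reg)":  (75.7, 51.1, 78.5, 73.0),
    "VideoScore (gen)":  (77.1, 27.6, 59.0, 58.7),
    "Gemini-1.5-Pro":    (22.1, 22.9, 60.9, 52.9),
    "Gemini-1.5-Flash":  (20.8, 17.3, 67.1, 52.3),
    "GPT-4o":            (23.1, 28.7, 52.0, 51.7),
}

for model, b in scores.items():
    print(f"{model}: {final_avg(b)}")
```

Running this reproduces the Final Avg Score column (69.6, 55.6, 39.7, 39.4, 38.9), which is how the note diff leaves VideoScore (reg) as the top scorer overall.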