update README.md
README.md
CHANGED
@@ -38,7 +38,7 @@ We evaluate Tarsier2-Recap-7b on DREAM-1K, a detailed video description benchmar
 _Note: The results of Tarsier2-Recap-7b are different from the results we reported in Table 11 of the [Tarsier2 technical report](https://arxiv.org/abs/2501.07888), as Tarsier2-Recap-7b is more fully trained (2 epochs vs 1 epoch)._
 
 ### Video Question-Answering
-We evaluate Tarsier2-Recap-7b on TVBench. As Tarsier2-Recap-7b is trained only on video caption data, it needs an additional prompt to induce it to perform multiple-choice question-answering tasks; see the [TVBench](https://github.com/bytedance/tarsier/blob/tarsier2/data/annotations/TVBench.jsonl) samples for an example. Here is the evaluation result:
+We evaluate Tarsier2-Recap-7b on [TVBench](https://paperswithcode.com/sota/video-question-answering-on-tvbench), a novel multiple-choice question-answering benchmark that requires a high level of temporal understanding. As Tarsier2-Recap-7b is trained only on video caption data, it needs an additional prompt to induce it to perform multiple-choice question-answering tasks; see the [TVBench](https://github.com/bytedance/tarsier/blob/tarsier2/data/annotations/TVBench.jsonl) samples for an example. Here is the evaluation result:
 
 | Task | Tarsier2-Recap-7b | Tarsier2-7b |
 | ------- | :--------: | :-------: |