0nejiawei committed on
Commit
df460d0
·
1 Parent(s): f548402

update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -38,7 +38,7 @@ We evaluate Tarsier2-Recap-7b on DREAM-1K, a detailed video description benchmar
  _Note: The results of Tarsier2-Recap-7b differ from those we reported in Table 11 of the [Tarsier2 technical report](https://arxiv.org/abs/2501.07888), as Tarsier2-Recap-7b is more fully trained (2 epochs vs 1 epoch)._
 
  ### Video Question-Answering
- We evaluate Tarsier2-Recap-7b on TVBench. Because Tarsier2-Recap-7b is trained only on video caption data, it needs an additional prompt to induce it to perform multiple-choice question-answering tasks; see the [TVBench](https://github.com/bytedance/tarsier/blob/tarsier2/data/annotations/TVBench.jsonl) samples for an example. Here are the evaluation results:
+ We evaluate Tarsier2-Recap-7b on [TVBench](https://paperswithcode.com/sota/video-question-answering-on-tvbench), a novel multiple-choice question-answering benchmark that requires a high level of temporal understanding. Because Tarsier2-Recap-7b is trained only on video caption data, it needs an additional prompt to induce it to perform multiple-choice question-answering tasks; see the [TVBench](https://github.com/bytedance/tarsier/blob/tarsier2/data/annotations/TVBench.jsonl) samples for an example. Here are the evaluation results:
 
  | Task | Tarsier2-Recap-7b | Tarsier2-7b |
  | ------- | :--------: | :-------: |