V-STaR committed on
Commit 0df35f9 (verified) · 1 Parent(s): ff9e632

Update constants.py

Files changed (1)
  1. constants.py +1 -0
constants.py CHANGED
@@ -189,6 +189,7 @@ LEADERBORAD_INTRODUCTION = """# V-STaR Leaderboard
 
  - **Comprehensive Dimensions:** We evaluate Video-LLM’s spatio-temporal reasoning ability in answering questions explicitly in the context of “when”, “where”, and “what”.
  - **Human Alignment:** We conducted extensive experiments and human annotations to validate robustness of V-STaR.
+ - **New Metrics:** We proposed to use Arithmetic Mean (AM) and modified logarithmic Geometric Mean (LGM) to measure the spatio-temporal reasoning capability of Video-LLMs. We calculate AM and LGM from the "Accuracy" of VQA, "m_tIoU" of Temporal grounding and "m_vIoU" of Spatial Grounding, and we get the mean AM (mAM) and mean LGM (mLGM) from the results of our proposed 2 RSTR question chains.
  - **Valuable Insights:** V-STaR reveals a fundamental weakness in existing Video-LLMs regarding causal spatio-temporal reasoning.
 
  **Join Leaderboard**: Please contact us to update your results.
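
The added "New Metrics" bullet names AM, LGM, mAM and mLGM but does not give formulas. The sketch below shows one way these could be computed from the three per-chain scores, assuming all scores lie in [0, 1]. The function names are hypothetical and do not come from `constants.py`. The LGM form used here (the negative mean of `ln(1 - score + eps)`) and the `eps` value are assumptions, since the commit only calls it a "modified logarithmic Geometric Mean".

```python
import math


def arithmetic_mean(acc, m_tiou, m_viou):
    """AM: plain average of VQA accuracy, m_tIoU and m_vIoU."""
    return (acc + m_tiou + m_viou) / 3.0


def log_geometric_mean(acc, m_tiou, m_viou, eps=1e-6):
    """LGM under an ASSUMED form: -(1/3) * sum(ln(1 - score + eps)).

    The commit does not define the formula; eps is a hypothetical
    smoothing term that keeps the logarithm finite when a score is 1.
    """
    scores = (acc, m_tiou, m_viou)
    return -sum(math.log(1.0 - s + eps) for s in scores) / 3.0


def mean_over_chains(chains):
    """mAM and mLGM: average AM and LGM over the RSTR question chains.

    `chains` is a list of (acc, m_tiou, m_viou) tuples, one per chain.
    """
    ams = [arithmetic_mean(*c) for c in chains]
    lgms = [log_geometric_mean(*c) for c in chains]
    return sum(ams) / len(ams), sum(lgms) / len(lgms)


if __name__ == "__main__":
    # Illustrative numbers only, not results from the leaderboard.
    two_chains = [(0.6, 0.3, 0.3), (0.2, 0.2, 0.2)]
    m_am, m_lgm = mean_over_chains(two_chains)
    print(f"mAM={m_am:.4f} mLGM={m_lgm:.4f}")
```

Under this assumed form, LGM grows faster than AM as any single score approaches 1, so it rewards models that are strong on all three sub-tasks rather than on just one.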