---
license: apache-2.0
base_model:
- Chat-UniVi/Chat-UniVi
pipeline_tag: image-segmentation
---

<div align="center">
<br>
<h3>The Devil is in Temporal Token: High Quality Video Reasoning Segmentation</h3>

[Sitong Gong](https://github.com/SitongGong) <sup>1</sup>&nbsp;
[Yunzhi Zhuge](https://scholar.google.com.hk/citations?hl=zh-CN&user=-37EfvgAAAAJ) <sup>1</sup>&nbsp;
[Lu Zhang](https://scholar.google.com.hk/citations?hl=zh-CN&user=bUtRE5UAAAAJ) <sup>1</sup>&nbsp;
[Zongxin Yang](https://scholar.google.com.hk/citations?user=8IE0CfwAAAAJ&hl=zh-CN&oi=ao) <sup>2</sup>&nbsp;
[Pingping Zhang](https://scholar.google.com.hk/citations?hl=zh-CN&user=MfbIbuEAAAAJ) <sup>1</sup>&nbsp;
[Huchuan Lu](https://scholar.google.com.hk/citations?user=D3nE0agAAAAJ&hl=zh-CN) <sup>1</sup>&nbsp;

CVPR 2025

<sup>1</sup> Dalian University of Technology &nbsp; <sup>2</sup> Harvard University&nbsp;
 
[![arXiv](https://img.shields.io/badge/arXiv-2501.08549-b31b1b.svg)](https://arxiv.org/pdf/2501.08549)

You can find the code at: https://github.com/SitongGong/VRS-HQ