---
license: cc-by-4.0
---
# ZeroI2V: Zero-Cost Adaptation of Pre-trained Transformers from Image to Video (ECCV 2024)

This repo hosts the official model checkpoints for ["ZeroI2V: Zero-Cost Adaptation of Pre-trained Transformers from Image to Video"](https://arxiv.org/abs/2310.01324) (ECCV 2024).


## Models

We provide the checkpoints before reparameterization. To reparameterize the weights, refer to `tools/weight_reparam.py` in our [code](https://github.com/MCG-NJU/ZeroI2V/blob/main/tools/weight_reparam.py).
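
For intuition only, here is a minimal sketch of what reparameterization generally means for a residual linear adapter placed before a linear layer: because both operations are linear, they can be folded into a single weight and bias, which is consistent with the 0 new parameters listed in the tables. This is not the logic of `tools/weight_reparam.py`. The function `merge_linear_adapter`, the variable names, and the adapter placement are assumptions made for illustration, and the sketch uses NumPy rather than the repo's framework.

```python
import numpy as np


def merge_linear_adapter(w, b, w_adapter, b_adapter):
    """Fold a residual linear adapter into the linear layer that follows it.

    Hypothetical helper for illustration, not taken from the ZeroI2V code.
    Before merging: y = w @ (x + w_adapter @ x + b_adapter) + b
    After merging:  y = w_merged @ x + b_merged
    """
    identity = np.eye(w_adapter.shape[0], dtype=w.dtype)
    w_merged = w @ (identity + w_adapter)
    b_merged = w @ b_adapter + b
    return w_merged, b_merged


# Toy shapes, chosen arbitrarily for the demonstration.
rng = np.random.default_rng(0)
dim_in, dim_out = 8, 4
w = rng.standard_normal((dim_out, dim_in))
b = rng.standard_normal(dim_out)
w_adapter = 0.01 * rng.standard_normal((dim_in, dim_in))
b_adapter = 0.01 * rng.standard_normal(dim_in)
x = rng.standard_normal(dim_in)

# Output with the adapter applied as a separate step.
y_before = w @ (x + w_adapter @ x + b_adapter) + b

# Output after folding the adapter into the layer.
w_merged, b_merged = merge_linear_adapter(w, b, w_adapter, b_adapter)
y_after = w_merged @ x + b_merged

outputs_match = np.allclose(y_before, y_after)
```

The merged layer has the same shape as the original one, so in this toy setting the adapter adds no parameters or compute at inference time. For the released checkpoints, use the script linked above.
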
### Kinetics 400

| Backbone | Pretrain | GFLOPs | Param | New Param (M) | acc@1 | Views |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| ViT-B/16 | CLIP | 422 | 86 | 0 | 83.0 | 8x1x3 |
| ViT-L/14 | CLIP | 1946 | 304 | 0 | 86.3 | 8x1x3 |
| ViT-L/14 | CLIP | 7783 | 304 | 0 | 87.2 | 32x1x3 |

### Something Something V2

| Backbone |  Pretrain   | GFLOPs | Param | New Param (M) | acc@1 | Views |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | 
| ViT-L/14 | CLIP | 7783 | 304 | 0 | 72.2 | 32x3x1 |



If you find our work useful in your research, please cite:
```
@article{li2023zeroi2v,
  title={ZeroI2V: Zero-Cost Adaptation of Pre-trained Transformers from Image to Video},
  author={Li, Xinhao and Zhu, Yuhan and Wang, Limin},
  journal={arXiv preprint arXiv:2310.01324},
  year={2023}
}
```