Added initial commit
README.md
CHANGED
@@ -1,3 +1,233 @@
----
-license: mit
-
+---
+license: mit
+language:
+- en
+---
+
+# Model Details
+Since the introduction of the Vision Transformer (ViT), researchers have sought to make ViTs more efficient by removing redundant information in the processed tokens. While different methods have been explored to achieve this goal, we still lack understanding of the resulting reduction patterns and how those patterns differ across token reduction methods and datasets. To close this gap, we set out to understand the reduction patterns of 10 different token reduction methods using four image classification datasets: ImageNet, NABirds, COCO, and NUS-WIDE.
+
+We provide DeiT checkpoints (Tiny, Small, and Base) at four keep rates (0.9, 0.7, 0.5, and 0.25) for four classification datasets: ImageNet-1K, NABirds, COCO 2014, and NUS-WIDE.
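As a rough illustration of the checkpoint grid described above, the following Python sketch enumerates the combinations of model size, dataset, and keep rate that would be available for a single token reduction method. The names `SIZES`, `KEEP_RATES`, `DATASETS`, and `checkpoint_grid` are hypothetical and are not part of the repository.

```python
from itertools import product

# Values taken from the description above; the names are illustrative only.
SIZES = ("tiny", "small", "base")
KEEP_RATES = (0.9, 0.7, 0.5, 0.25)
DATASETS = ("ImageNet-1K", "NABirds", "COCO 2014", "NUS-WIDE")


def checkpoint_grid():
    """Return every (size, dataset, keep_rate) combination for one method."""
    return list(product(SIZES, DATASETS, KEEP_RATES))


grid = checkpoint_grid()
# 3 sizes x 4 datasets x 4 keep rates
print(len(grid))
```

Under these assumptions, each token reduction method would come with 48 checkpoints in total.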
+
+### Model Description
+- **Developed by:** Joakim Bruslund Haurum, Sergio Escalera, Graham W. Taylor, and Thomas B. Moeslund
+- **Model type:** Vision Transformers
+- **License:** MIT License
+
+### More Resources
+- **Repository:** [https://github.com/JoakimHaurum/TokenReduction](https://github.com/JoakimHaurum/TokenReduction)
+- **Paper:** [https://arxiv.org/abs/2308.04657](https://arxiv.org/abs/2308.04657)
+- **Project Page:** [https://vap.aau.dk/tokens/](https://vap.aau.dk/tokens/)
+- **HuggingFace Collection:** [https://huggingface.co/collections/joakimbh/which-tokens-to-use-66e94fc5e4f545d575dfcce1](https://huggingface.co/collections/joakimbh/which-tokens-to-use-66e94fc5e4f545d575dfcce1)
+
+
+## Model Zoo
+**Note: This repository does not host any checkpoints but contains links to all the model repositories. Each token reduction method repository contains the checkpoints for the four considered keep rates.**
+
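The model names listed in this Model Zoo appear to follow a `<method>_<size>-<dataset>` pattern. The sketch below is a minimal illustration, assuming that pattern holds for every entry, of how the full list of names could be generated. The helper names are hypothetical, and whether these strings map directly to repository IDs is also an assumption, since the links are not filled in.

```python
# Method prefixes, sizes, and dataset suffixes as they appear in the tables.
METHODS = (
    "topk", "evit", "dyvit", "ats", "l1", "l2", "linf",
    "tome", "kmedoids", "dpcknn", "sit", "patchmerger", "sinkhorn",
)
SIZES = ("base", "small", "tiny")
DATASET_SUFFIXES = {
    "ImageNet-1K": "im1k",
    "NABirds": "nab",
    "COCO 2014": "coco",
    "NUS-WIDE": "nus",
}


def model_name(method, size, dataset):
    """Build a model name following the assumed <method>_<size>-<dataset> pattern."""
    return f"{method}_{size}-{DATASET_SUFFIXES[dataset]}"


def all_model_names():
    """List the names in table order: per method, per dataset, then per size."""
    return [
        model_name(method, size, dataset)
        for method in METHODS
        for dataset in DATASET_SUFFIXES
        for size in SIZES
    ]


names = all_model_names()
print(names[0])
print(len(names))
```

For example, `model_name("evit", "tiny", "NABirds")` yields `evit_tiny-nab` under the assumed pattern.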
+### Top-K
+| Model Name | Dataset | Weights |
+|:-----------|:--------:|:-----------:|
+| topk_base-im1k | ImageNet-1K | [link]() |
+| topk_small-im1k | ImageNet-1K | [link]() |
+| topk_tiny-im1k | ImageNet-1K | [link]() |
+| topk_base-nab | NABirds | [link]() |
+| topk_small-nab | NABirds | [link]() |
+| topk_tiny-nab | NABirds | [link]() |
+| topk_base-coco | COCO 2014 | [link]() |
+| topk_small-coco | COCO 2014 | [link]() |
+| topk_tiny-coco | COCO 2014 | [link]() |
+| topk_base-nus | NUS-WIDE | [link]() |
+| topk_small-nus | NUS-WIDE | [link]() |
+| topk_tiny-nus | NUS-WIDE | [link]() |
+
+### EViT
+| Model Name | Dataset | Weights |
+|:-----------|:--------:|:-----------:|
+| evit_base-im1k | ImageNet-1K | [link]() |
+| evit_small-im1k | ImageNet-1K | [link]() |
+| evit_tiny-im1k | ImageNet-1K | [link]() |
+| evit_base-nab | NABirds | [link]() |
+| evit_small-nab | NABirds | [link]() |
+| evit_tiny-nab | NABirds | [link]() |
+| evit_base-coco | COCO 2014 | [link]() |
+| evit_small-coco | COCO 2014 | [link]() |
+| evit_tiny-coco | COCO 2014 | [link]() |
+| evit_base-nus | NUS-WIDE | [link]() |
+| evit_small-nus | NUS-WIDE | [link]() |
+| evit_tiny-nus | NUS-WIDE | [link]() |
+
+### DynamicViT
+| Model Name | Dataset | Weights |
+|:-----------|:--------:|:-----------:|
+| dyvit_base-im1k | ImageNet-1K | [link]() |
+| dyvit_small-im1k | ImageNet-1K | [link]() |
+| dyvit_tiny-im1k | ImageNet-1K | [link]() |
+| dyvit_base-nab | NABirds | [link]() |
+| dyvit_small-nab | NABirds | [link]() |
+| dyvit_tiny-nab | NABirds | [link]() |
+| dyvit_base-coco | COCO 2014 | [link]() |
+| dyvit_small-coco | COCO 2014 | [link]() |
+| dyvit_tiny-coco | COCO 2014 | [link]() |
+| dyvit_base-nus | NUS-WIDE | [link]() |
+| dyvit_small-nus | NUS-WIDE | [link]() |
+| dyvit_tiny-nus | NUS-WIDE | [link]() |
+
+### ATS
+| Model Name | Dataset | Weights |
+|:-----------|:--------:|:-----------:|
+| ats_base-im1k | ImageNet-1K | [link]() |
+| ats_small-im1k | ImageNet-1K | [link]() |
+| ats_tiny-im1k | ImageNet-1K | [link]() |
+| ats_base-nab | NABirds | [link]() |
+| ats_small-nab | NABirds | [link]() |
+| ats_tiny-nab | NABirds | [link]() |
+| ats_base-coco | COCO 2014 | [link]() |
+| ats_small-coco | COCO 2014 | [link]() |
+| ats_tiny-coco | COCO 2014 | [link]() |
+| ats_base-nus | NUS-WIDE | [link]() |
+| ats_small-nus | NUS-WIDE | [link]() |
+| ats_tiny-nus | NUS-WIDE | [link]() |
+
+### L1
+| Model Name | Dataset | Weights |
+|:-----------|:--------:|:-----------:|
+| l1_base-im1k | ImageNet-1K | [link]() |
+| l1_small-im1k | ImageNet-1K | [link]() |
+| l1_tiny-im1k | ImageNet-1K | [link]() |
+| l1_base-nab | NABirds | [link]() |
+| l1_small-nab | NABirds | [link]() |
+| l1_tiny-nab | NABirds | [link]() |
+| l1_base-coco | COCO 2014 | [link]() |
+| l1_small-coco | COCO 2014 | [link]() |
+| l1_tiny-coco | COCO 2014 | [link]() |
+| l1_base-nus | NUS-WIDE | [link]() |
+| l1_small-nus | NUS-WIDE | [link]() |
+| l1_tiny-nus | NUS-WIDE | [link]() |
+
+### L2
+| Model Name | Dataset | Weights |
+|:-----------|:--------:|:-----------:|
+| l2_base-im1k | ImageNet-1K | [link]() |
+| l2_small-im1k | ImageNet-1K | [link]() |
+| l2_tiny-im1k | ImageNet-1K | [link]() |
+| l2_base-nab | NABirds | [link]() |
+| l2_small-nab | NABirds | [link]() |
+| l2_tiny-nab | NABirds | [link]() |
+| l2_base-coco | COCO 2014 | [link]() |
+| l2_small-coco | COCO 2014 | [link]() |
+| l2_tiny-coco | COCO 2014 | [link]() |
+| l2_base-nus | NUS-WIDE | [link]() |
+| l2_small-nus | NUS-WIDE | [link]() |
+| l2_tiny-nus | NUS-WIDE | [link]() |
+
+### L-Infinity
+| Model Name | Dataset | Weights |
+|:-----------|:--------:|:-----------:|
+| linf_base-im1k | ImageNet-1K | [link]() |
+| linf_small-im1k | ImageNet-1K | [link]() |
+| linf_tiny-im1k | ImageNet-1K | [link]() |
+| linf_base-nab | NABirds | [link]() |
+| linf_small-nab | NABirds | [link]() |
+| linf_tiny-nab | NABirds | [link]() |
+| linf_base-coco | COCO 2014 | [link]() |
+| linf_small-coco | COCO 2014 | [link]() |
+| linf_tiny-coco | COCO 2014 | [link]() |
+| linf_base-nus | NUS-WIDE | [link]() |
+| linf_small-nus | NUS-WIDE | [link]() |
+| linf_tiny-nus | NUS-WIDE | [link]() |
+
+### ToMe
+| Model Name | Dataset | Weights |
+|:-----------|:--------:|:-----------:|
+| tome_base-im1k | ImageNet-1K | [link]() |
+| tome_small-im1k | ImageNet-1K | [link]() |
+| tome_tiny-im1k | ImageNet-1K | [link]() |
+| tome_base-nab | NABirds | [link]() |
+| tome_small-nab | NABirds | [link]() |
+| tome_tiny-nab | NABirds | [link]() |
+| tome_base-coco | COCO 2014 | [link]() |
+| tome_small-coco | COCO 2014 | [link]() |
+| tome_tiny-coco | COCO 2014 | [link]() |
+| tome_base-nus | NUS-WIDE | [link]() |
+| tome_small-nus | NUS-WIDE | [link]() |
+| tome_tiny-nus | NUS-WIDE | [link]() |
+
+### K-Medoids
+| Model Name | Dataset | Weights |
+|:-----------|:--------:|:-----------:|
+| kmedoids_base-im1k | ImageNet-1K | [link]() |
+| kmedoids_small-im1k | ImageNet-1K | [link]() |
+| kmedoids_tiny-im1k | ImageNet-1K | [link]() |
+| kmedoids_base-nab | NABirds | [link]() |
+| kmedoids_small-nab | NABirds | [link]() |
+| kmedoids_tiny-nab | NABirds | [link]() |
+| kmedoids_base-coco | COCO 2014 | [link]() |
+| kmedoids_small-coco | COCO 2014 | [link]() |
+| kmedoids_tiny-coco | COCO 2014 | [link]() |
+| kmedoids_base-nus | NUS-WIDE | [link]() |
+| kmedoids_small-nus | NUS-WIDE | [link]() |
+| kmedoids_tiny-nus | NUS-WIDE | [link]() |
+
+### DPC-KNN
+| Model Name | Dataset | Weights |
+|:-----------|:--------:|:-----------:|
+| dpcknn_base-im1k | ImageNet-1K | [link]() |
+| dpcknn_small-im1k | ImageNet-1K | [link]() |
+| dpcknn_tiny-im1k | ImageNet-1K | [link]() |
+| dpcknn_base-nab | NABirds | [link]() |
+| dpcknn_small-nab | NABirds | [link]() |
+| dpcknn_tiny-nab | NABirds | [link]() |
+| dpcknn_base-coco | COCO 2014 | [link]() |
+| dpcknn_small-coco | COCO 2014 | [link]() |
+| dpcknn_tiny-coco | COCO 2014 | [link]() |
+| dpcknn_base-nus | NUS-WIDE | [link]() |
+| dpcknn_small-nus | NUS-WIDE | [link]() |
+| dpcknn_tiny-nus | NUS-WIDE | [link]() |
+
+### SiT
+| Model Name | Dataset | Weights |
+|:-----------|:--------:|:-----------:|
+| sit_base-im1k | ImageNet-1K | [link]() |
+| sit_small-im1k | ImageNet-1K | [link]() |
+| sit_tiny-im1k | ImageNet-1K | [link]() |
+| sit_base-nab | NABirds | [link]() |
+| sit_small-nab | NABirds | [link]() |
+| sit_tiny-nab | NABirds | [link]() |
+| sit_base-coco | COCO 2014 | [link]() |
+| sit_small-coco | COCO 2014 | [link]() |
+| sit_tiny-coco | COCO 2014 | [link]() |
+| sit_base-nus | NUS-WIDE | [link]() |
+| sit_small-nus | NUS-WIDE | [link]() |
+| sit_tiny-nus | NUS-WIDE | [link]() |
+
+### PatchMerger
+| Model Name | Dataset | Weights |
+|:-----------|:--------:|:-----------:|
+| patchmerger_base-im1k | ImageNet-1K | [link]() |
+| patchmerger_small-im1k | ImageNet-1K | [link]() |
+| patchmerger_tiny-im1k | ImageNet-1K | [link]() |
+| patchmerger_base-nab | NABirds | [link]() |
+| patchmerger_small-nab | NABirds | [link]() |
+| patchmerger_tiny-nab | NABirds | [link]() |
+| patchmerger_base-coco | COCO 2014 | [link]() |
+| patchmerger_small-coco | COCO 2014 | [link]() |
+| patchmerger_tiny-coco | COCO 2014 | [link]() |
+| patchmerger_base-nus | NUS-WIDE | [link]() |
+| patchmerger_small-nus | NUS-WIDE | [link]() |
+| patchmerger_tiny-nus | NUS-WIDE | [link]() |
+
+### Sinkhorn
+| Model Name | Dataset | Weights |
+|:-----------|:--------:|:-----------:|
+| sinkhorn_base-im1k | ImageNet-1K | [link]() |
+| sinkhorn_small-im1k | ImageNet-1K | [link]() |
+| sinkhorn_tiny-im1k | ImageNet-1K | [link]() |
+| sinkhorn_base-nab | NABirds | [link]() |
+| sinkhorn_small-nab | NABirds | [link]() |
+| sinkhorn_tiny-nab | NABirds | [link]() |
+| sinkhorn_base-coco | COCO 2014 | [link]() |
+| sinkhorn_small-coco | COCO 2014 | [link]() |
+| sinkhorn_tiny-coco | COCO 2014 | [link]() |
+| sinkhorn_base-nus | NUS-WIDE | [link]() |
+| sinkhorn_small-nus | NUS-WIDE | [link]() |
+| sinkhorn_tiny-nus | NUS-WIDE | [link]() |