---
license: apache-2.0
pipeline_tag: any-to-any
---

This repository contains the model checkpoints for the paper [Generalized Decoding for Pixel, Image, and Language](https://huggingface.co/papers/2212.11270).

GitHub: https://github.com/microsoft/X-Decoder

***Click to Download!***

## -> Models

*Focal-T:* <br/>
[xdecoder_focalt_last_novg.pt](https://huggingface.co/xdecoder/X-Decoder/resolve/main/xdecoder_focalt_last_novg.pt) <br/>
[xdecoder_focalt_last.pt](https://huggingface.co/xdecoder/X-Decoder/resolve/main/xdecoder_focalt_last.pt) <br/>
[xdecoder_focalt_best_openseg.pt](https://huggingface.co/xdecoder/X-Decoder/resolve/main/xdecoder_focalt_best_openseg.pt) <br/>

*Focal-L:* <br/>
[xdecoder_focall_last.pt](https://huggingface.co/xdecoder/X-Decoder/resolve/main/xdecoder_focall_last.pt) <br/>
[xdecoder_focall_bestseg.pt](https://huggingface.co/xdecoder/X-Decoder/resolve/main/xdecoder_focall_bestseg.pt) <br/>
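
For scripted setups, the checkpoints listed above can also be fetched without a browser. The following is a minimal sketch using only the Python standard library; the helper names `file_url` and `download` and the default `checkpoints` directory are illustrative assumptions, not part of the X-Decoder codebase. The URL pattern is copied from the links in this README.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Repository id and URL pattern taken from the download links in this README.
REPO_ID = "xdecoder/X-Decoder"
BASE_URL = f"https://huggingface.co/{REPO_ID}/resolve/main"


def file_url(filename: str) -> str:
    """Build the direct-download URL for a file in this repository."""
    return f"{BASE_URL}/{filename}"


def download(filename: str, dest_dir: str = "checkpoints") -> Path:
    """Download one file into dest_dir (hypothetical default), skipping it if present."""
    target = Path(dest_dir) / filename
    target.parent.mkdir(parents=True, exist_ok=True)
    if not target.exists():
        # urlretrieve streams the response body to the given local path.
        urlretrieve(file_url(filename), str(target))
    return target
```

Calling `download("xdecoder_focalt_last.pt")` would save that checkpoint under `checkpoints/` and return its local path. The same helper should work for the dataset and evaluation files listed on this page, since they share the same URL prefix.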

## -> Datasets

[caption_class_similarity.pth](https://huggingface.co/xdecoder/X-Decoder/resolve/main/caption_class_similarity.pth) <br/>
[captions_train2017_filtrefgumdval_filtvlp.json](https://huggingface.co/xdecoder/X-Decoder/resolve/main/captions_train2017_filtrefgumdval_filtvlp.json) <br/>
[grounding_train2017_filtrefgumdval_filtvlp.json](https://huggingface.co/xdecoder/X-Decoder/resolve/main/grounding_train2017_filtrefgumdval_filtvlp.json) <br/>
[panoptic_train2017_filtrefgumdval_filtvlp.json](https://huggingface.co/xdecoder/X-Decoder/resolve/main/panoptic_train2017_filtrefgumdval_filtvlp.json) <br/>
[refcocog_umd_val.json](https://huggingface.co/xdecoder/X-Decoder/resolve/main/refcocog_umd_val.json) <br/>
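
The layout of the JSON annotation files is not documented on this page, so it can help to check their top-level structure after downloading. The sketch below makes no assumption about the schema; `summarize_json` is a hypothetical helper that only reports each top-level key with its type and length.

```python
import json
from pathlib import Path


def describe(value) -> str:
    """Describe a JSON value by type, adding its length for sized containers."""
    name = type(value).__name__
    if isinstance(value, (list, dict, str)):
        return f"{name} (len {len(value)})"
    return name


def summarize_json(path: str) -> dict:
    """Return a {key: description} summary of the top level of a JSON file."""
    with Path(path).open("r", encoding="utf-8") as f:
        data = json.load(f)
    if isinstance(data, dict):
        return {key: describe(value) for key, value in data.items()}
    # Non-object roots (for example a bare list) are reported under one key.
    return {"<root>": describe(data)}
```

For example, `summarize_json("refcocog_umd_val.json")` would list whatever top-level keys that file defines. Note that the function loads the whole file into memory.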

## -> Evaluations

[coco_caption.zip](https://huggingface.co/xdecoder/X-Decoder/resolve/main/coco_caption.zip) <br/>