Commit cba5c1e (verified) · committed by machuofan · Parent: 6125d7f

Update README.md

Files changed (1): README.md (+3, −4)
README.md CHANGED

@@ -7,12 +7,11 @@ pipeline_tag: feature-extraction
 
 This repository contains UniTok, a unified visual tokenizer for both image generation and understanding tasks, as presented in [UniTok: A Unified Tokenizer for Visual Generation and Understanding](https://hf.co/papers/2502.20321).
 
-Project Page: https://foundationvision.github.io/UniTok/
-
+Project Page: https://foundationvision.github.io/UniTok/ <br>
 Code: https://github.com/FoundationVision/UniTok
 
 <p align="center">
-<img src="https://github.com/FoundationVision/UniTok/blob/main/assets/teaser.png" width=93%>
+<img src="https://github.com/FoundationVision/UniTok/blob/main/assets/teaser.png?raw=true" width=93%>
 </p>
 
 UniTok encodes fine-grained details for generation and captures high-level semantics for understanding. It's compatible with autoregressive generative models (e.g., LlamaGen), multimodal understanding models (e.g., LLaVA), and unified MLLMs (e.g., Chameleon and Liquid).
@@ -20,7 +19,7 @@ UniTok encodes fine-grained details for generation and captures high-level seman
 Built upon UniTok, we construct an MLLM capable of both multimodal generation and understanding, which sets a new state-of-the-art among unified autoregressive MLLMs. The weights of our MLLM will be released soon.
 
 <p align="center">
-<img src="https://github.com/FoundationVision/UniTok/blob/main/assets/samples.png" width=93%>
+<img src="https://github.com/FoundationVision/UniTok/blob/main/assets/samples.png?raw=true" width=93%>
 </p>
 
 ## Performance
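Since the model card is tagged `feature-extraction`, a short usage sketch may help readers connect this README to the checkpoint. The snippet below is a minimal, illustrative sketch only: the repo id, the checkpoint filename, and the `UniTok` class with its `img_to_idx`/`encode` methods are placeholders modeled on the official GitHub repository (https://github.com/FoundationVision/UniTok), not a confirmed API; consult that repo for the actual loading code.

```python
# Illustrative sketch: fetch the UniTok checkpoint and prepare an image.
# ASSUMPTIONS (not a confirmed API): the repo id, the checkpoint filename,
# and the UniTok class with img_to_idx/encode are placeholders; the real
# names live in the GitHub repo linked above.
import torch
from PIL import Image
from torchvision import transforms
from huggingface_hub import hf_hub_download

# Download the checkpoint from the Hub (repo id and filename assumed).
ckpt_path = hf_hub_download(
    repo_id="FoundationVision/unitok_tokenizer",  # placeholder repo id
    filename="unitok_tokenizer.pth",              # placeholder filename
)
ckpt = torch.load(ckpt_path, map_location="cpu")

# Generic 256x256 preprocessing; the real resolution and normalization
# should be taken from the checkpoint's config in the official repo.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(256),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5] * 3, std=[0.5] * 3),
])
pixels = preprocess(Image.open("example.jpg").convert("RGB")).unsqueeze(0)

# Hypothetical model construction and use (class/method names assumed):
# from unitok import UniTok
# model = UniTok(**ckpt["args"]).eval()
# model.load_state_dict(ckpt["model"])
# with torch.no_grad():
#     codes = model.img_to_idx(pixels)  # discrete codes for generation
#     feats = model.encode(pixels)      # continuous features for understanding
```

Under these assumptions, the discrete codes would feed an autoregressive generator such as LlamaGen, while the continuous features would serve understanding models such as LLaVA, matching the generation/understanding split described in the README above.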