Xiaomabufei committed on
Commit ac43c9f · verified · 1 Parent(s): e0b11bf

Update README.md

Files changed (1)
  1. README.md +3 -4
README.md CHANGED
@@ -37,14 +37,13 @@ We further demonstrate the superiority of I2I priors over T2I priors on some tex
 
  Source code is available at https://github.com/xiaomabufei/lumos.
 
- ### Model Description
+ ## 📋 Model Description
 
  - **Developed by:** Lumos
  - **Model type:** Diffusion-Transformer-based generative model
  - **License:** [CreativeML Open RAIL++-M License](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/LICENSE.md)
- - **Model Description:** **Lumos-I2I** Lumos-I2I is a model designed for generating images based on image prompts. It utilizes a [Transformer Latent Diffusion architecture](https://arxiv.org/abs/2310.00426) and incorporates a fixed, pretrained vision encoder ([DINO](
+ - **Model Description:** **Lumos-I2I** is a model designed for generating images based on image prompts. It utilizes a [Transformer Latent Diffusion architecture](https://arxiv.org/abs/2310.00426) and incorporates a fixed, pretrained vision encoder ([DINO](
  https://dl.fbaipublicfiles.com/dino/dino_vitbase16_pretrain/dino_vitbase16_pretrain.pth)). **Lumos-T2I** is a model that can be used to generate images based on text prompts.
  It is a [Transformer Latent Diffusion Model](https://arxiv.org/abs/2310.00426) that uses one fixed, pretrained text encoders ([T5](
  https://huggingface.co/DeepFloyd/t5-v1_1-xxl)).
- - **Resources for more information:** Check out our [GitHub Repository](https://github.com/xiaomabufei/lumos) and the [Lumos report on arXiv](https://arxiv.org/pdf/2412.07767).
-
+ - **Resources for more information:** Check out our [GitHub Repository](https://github.com/xiaomabufei/lumos) and the [Lumos report on arXiv](https://arxiv.org/pdf/2412.07767).