fedorajuandy committed on
Commit
0311782
1 Parent(s): 140cc8d

update readme

Files changed (1)
  1. README.md +150 -7
README.md CHANGED
@@ -1,13 +1,156 @@
  ---
- title: Text To Image Generation
- emoji: 👁
- colorFrom: yellow
- colorTo: blue
  sdk: gradio
- sdk_version: 3.41.2
  app_file: app.py
  pinned: false
- license: apache-2.0
  ---
 
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
  ---
+ title: ImageGenerator
+ emoji: 😄
+ colorFrom: indigo
+ colorTo: gray
  sdk: gradio
+ sdk_version: 3.17.0
  app_file: app.py
  pinned: false
  ---
 
+ # Image Generator
+
+ ## Description
+
+ ### About
+
+ A simple application that generates images within the limits of its algorithm, dataset, hardware specification, chosen configuration, and other variables.
+
+ ![UI](/additional/interface.png)
+
+ 1. "Text prompt" textbox: enter a text description of the desired image
+ 2. "Run" button (or the Enter key): confirm the input
+ 3. Result placeholder: where the generated image is shown
+
+ ### Notes
+
+ - Optimisation uses only Distributed Shampoo
+ - The dataset is limited to CelebA-HQ
+ - Training is tracked only by epoch
+ - Only the learning rate and loss are logged
+ - Encoding, training, and inference can be run on the free plans of Colab, Kaggle, and Gradient
+
+ ---
+
+ ## Guide
+
+ ### How to use
+
+ Enter free-form text in the textbox, or choose one or more attribute options using the radio buttons and checkboxes, then press the Run button to confirm. Wait until the image is generated and shown in the previously empty placeholder.
+
+ ---
+
+ ## References
+
+ ### Papers
+
+ ```text
+ @misc{
+ title={Zero-Shot Text-to-Image Generation},
+ author={Aditya Ramesh and Mikhail Pavlov and Gabriel Goh and Scott Gray and Chelsea Voss and Alec Radford and Mark Chen and Ilya Sutskever},
+ year={2021},
+ eprint={2102.12092},
+ archivePrefix={arXiv},
+ primaryClass={cs.CV},
+ link={[]()}
+ }
+ ```
+
+ ### Datasets
+
+ ```text
+ @inproceedings{CelebAMask-HQ,
+ title={MaskGAN: Towards Diverse and Interactive Facial Image Manipulation},
+ author={Lee, Cheng-Han and Liu, Ziwei and Wu, Lingyun and Luo, Ping},
+ booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
+ year={2020}
+ }
+ ```
+
+ ```text
+ @inproceedings{xia2021tedigan,
+ title={TediGAN: Text-Guided Diverse Face Image Generation and Manipulation},
+ author={Xia, Weihao and Yang, Yujiu and Xue, Jing-Hao and Wu, Baoyuan},
+ booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
+ year={2021}
+ }
+
+ @article{xia2021open,
+ title={Towards Open-World Text-Guided Face Image Generation and Manipulation},
+ author={Xia, Weihao and Yang, Yujiu and Xue, Jing-Hao and Wu, Baoyuan},
+ journal={arXiv preprint arXiv:2104.08910},
+ year={2021}
+ }
+
+ @inproceedings{karras2017progressive,
+ title={Progressive growing of gans for improved quality, stability, and variation},
+ author={Karras, Tero and Aila, Timo and Laine, Samuli and Lehtinen, Jaakko},
+ booktitle={International Conference on Learning Representations (ICLR)},
+ year={2018}
+ }
+
+ @inproceedings{liu2015faceattributes,
+ title = {Deep Learning Face Attributes in the Wild},
+ author = {Liu, Ziwei and Luo, Ping and Wang, Xiaogang and Tang, Xiaoou},
+ booktitle = {Proceedings of International Conference on Computer Vision (ICCV)},
+ year = {2015}
+ }
+ ```
+
+ ### Code and Libraries
+
+ ```text
+ @misc{Dayma_DALL·E_Mini_2021,
+ author = {Dayma, Boris and Patil, Suraj and Cuenca, Pedro and Saifullah, Khalid and Abraham, Tanishq and Lê Khắc, Phúc and Melas, Luke and Ghosh, Ritobrata},
+ doi = {10.5281/zenodo.5146400},
+ month = {7},
+ title = {DALL·E Mini},
+ url = {https://github.com/borisdayma/dalle-mini},
+ year = {2021}
+ }
+ ```
+
+ ```text
+ @software{jax2018github,
+ author = {James Bradbury and Roy Frostig and Peter Hawkins and Matthew James Johnson and Chris Leary and Dougal Maclaurin and George Necula and Adam Paszke and Jake Vander{P}las and Skye Wanderman-{M}ilne and Qiao Zhang},
+ title = {{JAX}: composable transformations of {P}ython+{N}um{P}y programs},
+ url = {http://github.com/google/jax},
+ version = {0.3.13},
+ year = {2018},
+ }
+ ```
+
+ ```text
+ @misc{esser2020taming,
+ title={Taming Transformers for High-Resolution Image Synthesis},
+ author={Patrick Esser and Robin Rombach and Björn Ommer},
+ year={2020},
+ eprint={2012.09841},
+ archivePrefix={arXiv},
+ primaryClass={cs.CV}
+ }
+ ```
+
+ ### Others
+
+ - [Gradio documentation](https://gradio.app/docs)
+
+ ### Extra Explanation
+
+ - [DALL-E Mini: Powerful image generation in a tiny model](https://blog.paperspace.com/dalle-mini/)
+ - [How DALL-E Mini Works](https://towardsdatascience.com/understanding-how-dall-e-mini-works-114048912b3b)
+ - [Fine-tuning DALL·E Mini (Craiyon) to Generate Blogpost Images](https://medium.com/@turc.raluca/fine-tuning-dall-e-mini-craiyon-to-generate-blogpost-images-32903cc7aa52)
+ - [Talks S2E1: DALL·E mini - Generate images from a text prompt](https://www.youtube.com/watch?v=-tMnGA4x3kA)
+ - [DALL-E mini explained | min(DALL-E) | Craiyon | ML Coding Series](https://www.youtube.com/watch?v=x_8uHX5KngE)
+
+ ---
+
+ ## Tools Used
+
+ - [Google Colab](https://colab.research.google.com/)
+ - [Paperspace Gradient](https://www.paperspace.com/gradient)
+ - [Kaggle](https://www.kaggle.com/)
+ - [Figma](https://www.figma.com/)
+ - [Visual Studio Code Space in GitHub](https://github.com/)
+ - [Weights & Biases](https://wandb.ai/home)