---
license: mit
library_name: pytorch
tags:
- image-generation
- gan
- stylegan
- stylegan3
- nvidia
base_model:
- stylegan3-t-ffhqu-256x256
---
|
# Male Faces Generator (StyleGAN3 by NVIDIA) |
|
|
|
## Now packed with 50% more bros! |
|
|
|
![sampleimages](SampleImages.jpg) |
|
|
|
This is a [StyleGAN3 PyTorch](https://github.com/NVlabs/stylegan3) model trained on 50k faces of men scraped from Pinterest, along with whatever biases come with that source.
|
|
|
### Usage |
|
|
|
If you want to generate your own images, follow the steps in the [StyleGAN3 PyTorch](https://github.com/NVlabs/stylegan3) README under "Getting started", then run `gen_images.py` with your chosen seeds and truncation.
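
A rough example invocation, run from inside the cloned `stylegan3` repo (the `network.pkl` name below is just a placeholder for wherever you saved this model's pickle):

```bash
# Generate faces for seeds 0-7 at truncation 0.7 into ./out
# network.pkl is a placeholder for this model's downloaded pickle
python gen_images.py --outdir=out --seeds=0-7 --trunc=0.7 --network=network.pkl
```

Lower `--trunc` values stay closer to the average face (cleaner but less varied), while values near `1.0` disable truncation and give more variety at the cost of more artifacts.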
|
|
|
#### Sample Images |
|
Sample images were generated directly from the model with no post-processing, using a truncation of `0.7`.
|
I recommend using [CodeFormer](https://github.com/sczhou/CodeFormer) to restore faces and upscale to your desired resolution (I like upscaling 2x to 512x512).
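
As a sketch of that post-processing step (the flag names come from my reading of the CodeFormer README and may differ between versions, so double-check them there):

```bash
# Restore faces and upscale 2x (256x256 -> 512x512); -w trades fidelity vs. quality
python inference_codeformer.py -w 0.7 --input_path out --upscale 2 \
    --bg_upsampler realesrgan --face_upsample
```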
|
|
|
### Dataset & Model Details |
|
|
|
The dataset was scraped from Pinterest and cropped using dlib, with further symmetry and rotation filtering applied via U2Net and MTCNN.
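
The scraping and filtering scripts aren't part of this release, but once you have a folder of cleaned crops, packaging them for StyleGAN3 training uses the repo's own `dataset_tool.py` (the paths below are placeholders):

```bash
# Package the cropped, filtered face images into a 256x256 StyleGAN3 training set
python dataset_tool.py --source=./faces_cropped --dest=./datasets/brogan-256x256.zip \
    --resolution=256x256
```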
|
|
|
Training was done locally with StyleGAN3 on an RTX 4090 under CUDA 11.8. It took roughly 100 hours, with hyperparameters tuned (and training restarted) many times from about 3/4 of the way through.
|
|
|
You have two versions to choose from, depending on your preference. |
|
|
|
#### Main Version (v2.0.0) |
|
- Configuration: `stylegan3-t` |
|
- GPUs: `1` |
|
- Batch Size: `32` |
|
- Gamma: `6`, then gradually lowered to `0.25` after ~5000 kimg
|
- Final tick: `~500` |
|
- Image Resolution: `256x256`
|
- Final fid50k_full value (this pickle): `10.161478712247181` |
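
For reference, the configuration above corresponds roughly to a `train.py` invocation like the one below; the dataset path and resume pickle are placeholders, and gamma was lowered in later resumes as described under Technical Stuff.

```bash
# Initial v2.0.0 run: stylegan3-t, 1 GPU, batch 32, gamma 6, mirroring off,
# resuming from NVIDIA's pretrained FFHQ-U 256x256 pickle (placeholder path)
python train.py --outdir=training-runs --cfg=stylegan3-t \
    --data=datasets/brogan-256x256.zip --gpus=1 --batch=32 --gamma=6 --mirror=0 \
    --resume=stylegan3-t-ffhqu-256x256.pkl
```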
|
|
|
#### Alternate Version (v2.1.0_alt) |
|
- Dataset was reduced to 60% of its original size, keeping the higher-quality images
|
- Trained for another 2000 kimg |
|
- Final fid50k_full value (this pickle): `5.079479576662786` |
|
|
|
#### Differences between main and alternate version |
|
- Main Version: Better variety and realism, but higher chance of generating bad "outlier" faces |
|
- Alternate Version: Better overall quality but less variety in faces (lower chance of generating bad faces)
|
|
|
#### Differences from v1.0.0 (10 months in the making) |
|
- Noticeable improvements in quality and detail |
|
- More variety in hairstyles, hair colors, eye colors, and even accessories (earrings and sunglasses) |
|
- Expressive Bros! These bros are not ashamed to smile! |
|
- Improved Teeth! Gone are the days of throwing away gens due to crooked teeth. These bros are ready to show their pearly whites. |
|
- IDK they just look more real, trust me bro |
|
|
|
#### Technical Stuff
|
- Training was restarted from scratch (from the pretrained `stylegan3-t-ffhqu-256x256` pickle) rather than continuing from v1.0.0
|
- I trained on a subset of the original BroGAN dataset filtered for quality (30k images) |
|
- I did not reduce the dataset halfway through training, as I wanted to keep the facial variety
|
- Training from scratch using the pretrained FFHQ-U model improved face and teeth shape and color dramatically
|
- Mirroring (xflips) was turned off, which improved quality due to the inherent asymmetry in the dataset
|
- I started off with a gamma of 6, then stepped it down to 2, then 1, 0.5, and finally 0.25
|
- I used the default stylegan3-t config until about halfway through, where I reduced the learning rates to `glr=0.0005` and `dlr=0.0001`
|
- The last couple of FID points (fid50k_full: 12 → 10) take the longest to train but are worth it for the improvement in quality
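
Concretely, those later-stage adjustments just mean resuming from your latest snapshot with a lower gamma and lower learning rates, along these lines (the snapshot path is a placeholder):

```bash
# Later-stage resume: reduced R1 gamma and reduced G/D learning rates
python train.py --outdir=training-runs --cfg=stylegan3-t \
    --data=datasets/brogan-256x256.zip --gpus=1 --batch=32 --mirror=0 \
    --gamma=0.25 --glr=0.0005 --dlr=0.0001 \
    --resume=training-runs/<run-dir>/network-snapshot-<kimg>.pkl
```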
|
|
|
### Credits |
|
|
|
Please don't forget to give credit if you decide to share/distribute this model. Training these takes a lot of time and effort :)
|
Thanks to all the people who offered their feedback on the model. `Your feedback matters!` |