---
license: mit
library_name: pytorch
tags:
- image-generation
- gan
- stylegan
- stylegan3
- nvidia
base_model:
- stylegan3-t-ffhqu-256x256
---
|
# Male Faces Generator (StyleGAN3 by NVIDIA) |
|
|
|
## Now packed with 50% more bros! |
|
|
|
![sampleimages](SampleImages.jpg) |
|
|
|
This is a [StyleGAN3 PyTorch](https://github.com/NVlabs/stylegan3) model trained on 50k faces of men scraped from Pinterest, along with whatever biases come with that source.
|
|
|
### Usage |
|
|
|
If you want to generate your own images, follow the steps in the [StyleGAN3 PyTorch](https://github.com/NVlabs/stylegan3) README under "Getting started", then run `gen_images.py` with your chosen seeds and truncation.
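
A rough example invocation, run from inside the cloned `stylegan3` repo (the `network.pkl` name below is just a placeholder for wherever you saved this model's pickle):

```bash
# Generate faces for seeds 0-7 at truncation 0.7 into ./out
# network.pkl is a placeholder for this model's downloaded pickle
python gen_images.py --outdir=out --seeds=0-7 --trunc=0.7 --network=network.pkl
```

Lower `--trunc` values stay closer to the average face (cleaner but less varied), while values near `1.0` disable truncation and give more variety at the cost of more artifacts.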
|
|
|
#### Sample Images |
|
Sample images were generated directly from the model with no post-processing, using a truncation of `0.7`.
|
I recommend using [CodeFormer](https://github.com/sczhou/CodeFormer) to restore faces and upscale to your desired resolution (I like upscaling 2x to 512x512).
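
As a sketch of that post-processing step (the flag names come from my reading of the CodeFormer README and may differ between versions, so double-check them there):

```bash
# Restore faces and upscale 2x (256x256 -> 512x512); -w trades fidelity vs. quality
python inference_codeformer.py -w 0.7 --input_path out --upscale 2 \
    --bg_upsampler realesrgan --face_upsample
```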
|
|
|
### Dataset & Model Details |
|
|
|
The dataset was scraped from Pinterest and cropped using dlib, with further symmetry and rotation filtering applied via U2Net and MTCNN.
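
The scraping and filtering scripts aren't part of this release, but once you have a folder of cleaned crops, packaging them for StyleGAN3 training uses the repo's own `dataset_tool.py` (the paths below are placeholders):

```bash
# Package the cropped, filtered face images into a 256x256 StyleGAN3 training set
python dataset_tool.py --source=./faces_cropped --dest=./datasets/brogan-256x256.zip \
    --resolution=256x256
```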
|
|
|
Training was done locally with StyleGAN3 on an RTX 4090 under CUDA 11.8. It took roughly 100 hours, with hyperparameters tuned (and training restarted) many times from about 3/4 of the way through.
|
|
|
You have two versions to choose from, depending on your preference. |
|
|
|
#### Main Version (v2.0.0) |
|
- Configuration: `stylegan3-t` |
|
- GPUs: `1` |
|
- Batch Size: `32` |
|
- Gamma: `6`, then gradually lowered to `0.25` after ~5000 kimg
|
- Final tick: `~500` |
|
- Image Resolution: `256x256`
|
- Final fid50k_full value (this pickle): `10.161478712247181` |
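
For reference, the configuration above corresponds roughly to a `train.py` invocation like the one below; the dataset path and resume pickle are placeholders, and gamma was lowered in later resumes as described under Technical Stuff.

```bash
# Initial v2.0.0 run: stylegan3-t, 1 GPU, batch 32, gamma 6, mirroring off,
# resuming from NVIDIA's pretrained FFHQ-U 256x256 pickle (placeholder path)
python train.py --outdir=training-runs --cfg=stylegan3-t \
    --data=datasets/brogan-256x256.zip --gpus=1 --batch=32 --gamma=6 --mirror=0 \
    --resume=stylegan3-t-ffhqu-256x256.pkl
```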
|
|
|
#### Alternate Version (v2.1.0_alt) |
|
- Dataset was reduced to 60% of its original size, keeping the higher-quality images
|
- Trained for another 2000 kimg |
|
- Final fid50k_full value (this pickle): `5.079479576662786` |
|
|
|
#### Differences between main and alternate version |
|
- Main Version: Better variety and realism, but higher chance of generating bad "outlier" faces |
|
- Alternate Version: Better overall quality but less variety in faces (lower chance of generating bad faces)
|
|
|
#### Differences from v1.0.0 (10 months in the making) |
|
- Noticeable improvements in quality and detail |
|
- More variety in hairstyles, hair colors, eye colors, and even accessories (earrings and sunglasses) |
|
- Expressive Bros! These bros are not ashamed to smile! |
|
- Improved Teeth! Gone are the days of throwing away gens due to crooked teeth. These bros are ready to show their pearly whites. |
|
- IDK they just look more real, trust me bro |
|
|
|
#### Technical Stuff
|
- Training was restarted from scratch (from the pretrained `stylegan3-t-ffhqu-256x256` pickle) rather than continuing from v1.0.0
|
- I trained on a subset of the original BroGAN dataset filtered for quality (30k images) |
|
- I did not reduce the dataset halfway through training, as I wanted to keep the facial variety
|
- Training from scratch using the pretrained FFHQ-U model improved face and teeth shape and color dramatically
|
- Mirroring (xflips) was turned off, which improved quality due to the inherent asymmetry in the dataset
|
- I started off with a gamma of 6, then stepped it down to 2, then 1, 0.5, and finally 0.25
|
- I used the default stylegan3-t config until about halfway through, where I reduced the learning rates to `glr=0.0005` and `dlr=0.0001`
|
- The last couple of FID points (fid50k_full: 12 → 10) take the longest to train but are worth it for the improvement in quality
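
Concretely, those later-stage adjustments just mean resuming from your latest snapshot with a lower gamma and lower learning rates, along these lines (the snapshot path is a placeholder):

```bash
# Later-stage resume: reduced R1 gamma and reduced G/D learning rates
python train.py --outdir=training-runs --cfg=stylegan3-t \
    --data=datasets/brogan-256x256.zip --gpus=1 --batch=32 --mirror=0 \
    --gamma=0.25 --glr=0.0005 --dlr=0.0001 \
    --resume=training-runs/<run-dir>/network-snapshot-<kimg>.pkl
```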
|
|
|
### Credits |
|
|
|
Please don't forget to give credit if you decide to share/distribute this model. Training these takes a lot of time and effort :)
|
Thanks to all the people who offered their feedback on the model. `Your feedback matters!` |