(WIP)
This is a fine-tune of FLUX.1-dev. The model aims to generate multiple views of an object from a single input view.
The dataset was curated by re-rendering and cleaning up the Objaverse dataset.
Example image from the dataset:
I tried multiple training runs with different hyperparameters, and it seems that the model learns the output structure only at a very high level and does not learn the details.
Here are some example outputs when the model is asked to generate multiple views of a bird:
Loss graphs for the two best training runs:
Misc Details
The training method here is similar to FLUX Depth Control and FLUX Canny Control. The conditioning image is added as extra channels in the input, and the model is trained to denoise the noisy channels.
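The channel-concatenation idea can be sketched roughly as follows. This is a minimal illustration in NumPy, not the actual training code for this model: the function name `build_control_input`, the tensor shapes, and the linear noise interpolation (the usual rectified-flow formulation) are all assumptions made for the example.

```python
import numpy as np


def build_control_input(target_latents, cond_latents, t, rng):
    """Build one training example for channel-concatenated conditioning.

    Hypothetical helper for illustration only.
    Both inputs have shape (batch, channels, height, width).
    """
    # Sample Gaussian noise with the same shape as the target latents.
    noise = rng.standard_normal(target_latents.shape)
    # Assumed noising rule: linear interpolation between clean latents and noise.
    noisy = (1.0 - t) * target_latents + t * noise
    # The conditioning latents stay clean and are appended as extra channels.
    model_input = np.concatenate([noisy, cond_latents], axis=1)
    # Assumed regression target: the velocity from clean latents to noise.
    velocity_target = noise - target_latents
    return model_input, velocity_target


rng = np.random.default_rng(0)
# Illustrative shapes: 2 samples, 16 latent channels, 8x8 latent grid.
target = rng.standard_normal((2, 16, 8, 8))  # latents of the multi-view image
cond = rng.standard_normal((2, 16, 8, 8))    # latents of the single input view
model_input, velocity = build_control_input(target, cond, t=0.3, rng=rng)
print(model_input.shape, velocity.shape)
```

In this sketch only the first half of the channels carries noise. The second half passes the conditioning view through unchanged, so the network sees the input view at every noise level, and the loss would be computed only against the noisy channels.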
Thanks to Modal for sponsoring the compute for this.
Model tree for thesantatitan/flux-control-orbit
Base model
black-forest-labs/FLUX.1-dev