---
license: apache-2.0
datasets:
- thesantatitan/objaverse_orbit_renders
base_model:
- black-forest-labs/FLUX.1-dev
---

# (WIP)

This is a fine-tune of FLUX.1-dev that aims to generate multiple views of an object from a single input view.

The dataset was curated by re-rendering and cleaning up the Objaverse dataset.

Example image from the dataset:



I tried multiple training runs with different hyperparameters, and it seems that the model learns only the high-level output structure, not the fine details.

Here are some example outputs when the model is asked to generate multiple views of a bird:





Loss graphs for the two best training runs:



## Misc Details

The training method is similar to that of FLUX Depth Control and FLUX Canny Control.

The conditioning image is added as extra channels in the input, and the model is asked to denoise the noisy channels.

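
To make the channel-wise conditioning concrete, here is a minimal NumPy sketch of how such an input could be assembled. It is not the training code used for this model: the helper names (`add_noise`, `build_model_input`), the 16-channel latent size, the 8x8 spatial size, and the linear noising schedule are all assumptions chosen for illustration.

```python
import numpy as np

LATENT_CHANNELS = 16  # assumed latent channel count, for illustration only


def add_noise(clean, noise, t):
    """Linearly interpolate between clean latents (t=0) and pure noise (t=1)."""
    return (1.0 - t) * clean + t * noise


def build_model_input(noisy, cond):
    """Stack the clean conditioning latents onto the noisy latents along the channel axis."""
    if noisy.shape != cond.shape:
        raise ValueError("noisy and conditioning latents must have the same shape")
    return np.concatenate([noisy, cond], axis=1)


rng = np.random.default_rng(0)
# Hypothetical latents with layout (batch, channels, height, width).
target_latents = rng.standard_normal((1, LATENT_CHANNELS, 8, 8))  # multi-view target
cond_latents = rng.standard_normal((1, LATENT_CHANNELS, 8, 8))    # single input view
noise = rng.standard_normal(target_latents.shape)

# Only the target half is noised; the conditioning half stays clean.
noisy_latents = add_noise(target_latents, noise, t=0.7)
model_input = build_model_input(noisy_latents, cond_latents)

print(model_input.shape)  # (1, 32, 8, 8)
```

In this setup the model sees twice as many input channels as a plain text-to-image model, but it is only trained to predict the denoising target for the first half; the conditioning channels pass through unchanged as context.
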
Thanks to [Modal](https://modal.com) for sponsoring the compute for this.