---
license: apache-2.0
datasets:
- thesantatitan/objaverse_orbit_renders
base_model:
- black-forest-labs/FLUX.1-dev
---

# (WIP)

This is a finetune of FLUX.1-dev that aims to generate multiple views of an object given a single view. The dataset was curated by re-rendering and cleaning up the Objaverse dataset.

Example image from the dataset:

![Dataset Example Image](https://huggingface.co/thesantatitan/flux-control-orbit/resolve/main/AGV_vUfbAOm3VgZZlfedG82Nx5HgSwpE9SflOcJMaA5iUJlgqsNmyWeYGrwgQwb1iqwm99NijPQnFF4urUzxFp1j6KtZDqDJp1E7s1IW4gv5D5Fd_a-EIiksnVor.png)

I tried multiple training runs with different hyperparameters, and it seems that the model learns the output structure only at a very high level; it does not learn the details.

Here are some example outputs when asked to generate multiple views of a bird:

![Bird Multiple views 1](https://huggingface.co/thesantatitan/flux-control-orbit/resolve/main/AGV_vUcYWYEdGywSzW22SdSv3gsHU_eJWSBZfsPNjkj3vOvkz3hNAyofYvyuFK4tea42abeExtPd7jnNF6zmkB5-4hv3111E-_omVOWGR89Pz-_7dAEFOxYxTPvK.png)

![Bird Multiple views 2](https://huggingface.co/thesantatitan/flux-control-orbit/resolve/main/validation_2199_e036f00f7f3f74d9b21e-2.png)

Loss graphs for the two best training runs:

![Loss](https://huggingface.co/thesantatitan/flux-control-orbit/resolve/main/Pasted%20Graphic.png)

## Misc Details

The training method here is similar to FLUX Depth Control and FLUX Canny Control: the conditioning image is added as extra channels in the input, and the model is asked to denoise the noisy channels.

Thanks to [Modal](https://modal.com) for sponsoring the compute for this.
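The channel-wise conditioning setup described above can be sketched as follows. This is a minimal illustration, not the actual training code from this repo: the function name, latent shapes, and channel counts are hypothetical stand-ins for whatever the real pipeline uses.

```python
import numpy as np

def make_control_input(noisy_latents, cond_latents):
    """Build the model input for FLUX Control-style conditioning.

    Hypothetical sketch: the clean conditioning latents (from the single
    input view) are concatenated with the noisy target latents along the
    channel axis, doubling the channels the transformer sees. The loss is
    then taken only on the denoised target channels.
    Assumed layout: (batch, channels, height, width).
    """
    assert noisy_latents.shape == cond_latents.shape
    return np.concatenate([noisy_latents, cond_latents], axis=1)

# Example shapes (illustrative, not the real FLUX latent dimensions):
noisy = np.random.randn(1, 16, 64, 64).astype(np.float32)
cond = np.random.randn(1, 16, 64, 64).astype(np.float32)
model_in = make_control_input(noisy, cond)
print(model_in.shape)  # (1, 32, 64, 64)
```

The key design point is that the conditioning view is never noised; only the target-view channels carry noise, so the model learns to denoise them while attending to the clean reference.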