<!-- Copyright 2024 NVIDIA CORPORATION & AFFILIATES
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
SPDX-License-Identifier: Apache-2.0 -->

## 🔥 ControlNet

We incorporate a [ControlNet](https://github.com/lllyasviel/ControlNet)-like module that enables fine-grained control over text-to-image diffusion models. We implement a ControlNet-Transformer architecture, tailored specifically to Transformers, that achieves explicit controllability alongside high-quality image generation.

<p align="center">
<img src="https://raw.githubusercontent.com/NVlabs/Sana/refs/heads/page/asset/content/controlnet/sana_controlnet.jpg" height=480>
</p>
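
To make the idea of a ControlNet-like module concrete, here is a minimal toy sketch in plain Python. It assumes the zero-initialized projection design from the original ControlNet, where a trainable branch sees the control signal and is added back to the frozen backbone through a projection that starts at zero. Every name here (`frozen_block`, `control_block`, `zero_linear`, `controlled_forward`) is hypothetical and is not part of the Sana codebase; the actual ControlNet-Transformer module may differ.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[Vector]


def linear(weight: Matrix, bias: Vector, x: Vector) -> Vector:
    """Apply y = W x + b to a single feature vector."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weight, bias)]


def zero_linear(dim: int) -> Tuple[Matrix, Vector]:
    """Zero-initialized projection: all weights and biases start at 0."""
    return [[0.0] * dim for _ in range(dim)], [0.0] * dim


def frozen_block(x: Vector) -> Vector:
    """Toy stand-in for a frozen backbone block (hypothetical)."""
    return [2.0 * xi + 1.0 for xi in x]


def control_block(x: Vector, cond: Vector) -> Vector:
    """Toy stand-in for the trainable copy, which also sees the control signal."""
    return frozen_block([xi + ci for xi, ci in zip(x, cond)])


def controlled_forward(x: Vector, cond: Vector, zero_w: Matrix, zero_b: Vector) -> Vector:
    """Backbone output plus the control branch routed through the projection."""
    base = frozen_block(x)
    residual = linear(zero_w, zero_b, control_block(x, cond))
    return [b + r for b, r in zip(base, residual)]


hidden = [0.5, -1.0, 2.0]
edge_features = [1.0, 0.0, -1.0]  # e.g. features derived from an edge map
w, b = zero_linear(len(hidden))

# At initialization the control branch contributes nothing,
# so the output matches the frozen backbone exactly.
assert controlled_forward(hidden, edge_features, w, b) == frozen_block(hidden)

# Once training moves the projection away from zero, the condition takes effect.
w[0][0] = 0.5
out = controlled_forward(hidden, edge_features, w, b)
assert out != frozen_block(hidden)
```

Starting the projection at zero means the combined model initially reproduces the pretrained backbone, so fine-tuning for controllability does not disturb image quality at the start of training.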

## Inference of `Sana + ControlNet`

### 1). Gradio Interface

```bash
python app/app_sana_controlnet_hed.py \
        --config configs/sana_controlnet_config/Sana_1600M_1024px_controlnet_bf16.yaml \
        --model_path hf://Efficient-Large-Model/Sana_1600M_1024px_BF16_ControlNet_HED/checkpoints/Sana_1600M_1024px_BF16_ControlNet_HED.pth
```

<p align="center" border-raduis="10px">
<img src="https://nvlabs.github.io/Sana/asset/content/controlnet/controlnet_app.jpg" width="90%" alt="teaser_page2"/>
</p>

### 2). Inference with JSON file

```bash
python tools/controlnet/inference_controlnet.py \
        --config configs/sana_controlnet_config/Sana_1600M_1024px_controlnet_bf16.yaml \
        --model_path hf://Efficient-Large-Model/Sana_1600M_1024px_BF16_ControlNet_HED/checkpoints/Sana_1600M_1024px_BF16_ControlNet_HED.pth \
        --json_file asset/controlnet/samples_controlnet.json
```

### 3). Inference code snippet

```python
import torch
from PIL import Image
from app.sana_controlnet_pipeline import SanaControlNetPipeline

device = "cuda" if torch.cuda.is_available() else "cpu"

# Build the pipeline from the config, then load the ControlNet (HED) checkpoint.
pipe = SanaControlNetPipeline("configs/sana_controlnet_config/Sana_1600M_1024px_controlnet_bf16.yaml")
pipe.from_pretrained("hf://Efficient-Large-Model/Sana_1600M_1024px_BF16_ControlNet_HED/checkpoints/Sana_1600M_1024px_BF16_ControlNet_HED.pth")

# Reference image used as the control input, plus the text prompt.
ref_image = Image.open("asset/controlnet/ref_images/A transparent sculpture of a duck made out of glass. The sculpture is in front of a painting of a la.jpg")
prompt = "A transparent sculpture of a duck made out of glass. The sculpture is in front of a painting of a landscape."

images = pipe(
    prompt=prompt,
    ref_image=ref_image,
    guidance_scale=4.5,
    num_inference_steps=10,
    sketch_thickness=2,
    # Fixed seed for reproducible output.
    generator=torch.Generator(device=device).manual_seed(0),
)
```
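
The reference image filename in the snippet is exactly the first 100 characters of the prompt followed by `.jpg`. If you want to pair your own prompts with reference images in the same way, a small helper like the one below may be convenient. This is a sketch based on that single example: the 100-character limit and the helper name `ref_image_path` are assumptions, not a documented convention of the repository.

```python
from pathlib import Path

# Directory taken from the inference snippet.
REF_IMAGE_DIR = Path("asset/controlnet/ref_images")
# Assumption: inferred from the one example path, not documented anywhere.
MAX_STEM_CHARS = 100


def ref_image_path(prompt: str) -> Path:
    """Hypothetical helper: map a prompt to its reference image path."""
    return REF_IMAGE_DIR / f"{prompt[:MAX_STEM_CHARS]}.jpg"


prompt = (
    "A transparent sculpture of a duck made out of glass. "
    "The sculpture is in front of a painting of a landscape."
)
path = ref_image_path(prompt)
```

Check that the file actually exists (for example with `path.exists()`) before passing it to `Image.open`, since other samples may not follow this naming pattern.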

## Training of `Sana + ControlNet`

### Coming soon