<!--Copyright 2023 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
<p align="center">
    <br>
    <img src="https://raw.githubusercontent.com/huggingface/diffusers/77aadfee6a891ab9fcfb780f87c693f7a5beeb8e/docs/source/imgs/diffusers_library.jpg" width="400"/>
    <br>
</p>

# Diffusers
🤗 Diffusers is the go-to library for state-of-the-art pretrained diffusion models for generating images, audio, and even 3D structures of molecules. Whether you're looking for a simple inference solution or want to train your own diffusion model, 🤗 Diffusers is a modular toolbox that supports both. Our library is designed with a focus on [usability over performance](conceptual/philosophy#usability-over-performance), [simple over easy](conceptual/philosophy#simple-over-easy), and [customizability over abstractions](conceptual/philosophy#tweakable-contributorfriendly-over-abstraction).
The library has three main components:

- State-of-the-art [diffusion pipelines](api/pipelines/overview) for inference with just a few lines of code.
- Interchangeable [noise schedulers](api/schedulers/overview) that can be swapped in to balance generation speed against output quality.
- Pretrained [models](api/models) that can be used as building blocks and combined with schedulers to create your own end-to-end diffusion systems.
<div class="mt-10">
  <div class="w-full flex flex-col space-y-4 md:space-y-0 md:grid md:grid-cols-2 md:gap-y-4 md:gap-x-5">
    <a class="!no-underline border dark:border-gray-700 p-5 rounded-lg shadow hover:shadow-lg" href="./tutorials/tutorial_overview"
      ><div class="w-full text-center bg-gradient-to-br from-blue-400 to-blue-500 rounded-lg py-1.5 font-semibold mb-5 text-white text-lg leading-relaxed">Tutorials</div>
      <p class="text-gray-700">Learn the fundamental skills you need to start generating outputs, build your own diffusion system, and train a diffusion model. We recommend starting here if you're using 🤗 Diffusers for the first time!</p>
    </a>
    <a class="!no-underline border dark:border-gray-700 p-5 rounded-lg shadow hover:shadow-lg" href="./using-diffusers/loading_overview"
      ><div class="w-full text-center bg-gradient-to-br from-indigo-400 to-indigo-500 rounded-lg py-1.5 font-semibold mb-5 text-white text-lg leading-relaxed">How-to guides</div>
      <p class="text-gray-700">Practical guides to help you load pipelines, models, and schedulers. You'll also learn how to use pipelines for specific tasks, control how outputs are generated, optimize for inference speed, and apply different training techniques.</p>
    </a>
    <a class="!no-underline border dark:border-gray-700 p-5 rounded-lg shadow hover:shadow-lg" href="./conceptual/philosophy"
      ><div class="w-full text-center bg-gradient-to-br from-pink-400 to-pink-500 rounded-lg py-1.5 font-semibold mb-5 text-white text-lg leading-relaxed">Conceptual guides</div>
      <p class="text-gray-700">Understand why the library was designed the way it was, and learn more about the ethical guidelines and safety implementations for using the library.</p>
    </a>
    <a class="!no-underline border dark:border-gray-700 p-5 rounded-lg shadow hover:shadow-lg" href="./api/models"
      ><div class="w-full text-center bg-gradient-to-br from-purple-400 to-purple-500 rounded-lg py-1.5 font-semibold mb-5 text-white text-lg leading-relaxed">Reference</div>
      <p class="text-gray-700">Technical descriptions of how 🤗 Diffusers classes and methods work.</p>
    </a>
  </div>
</div>
## Supported pipelines

| Pipeline | Paper/Repository | Tasks |
|---|---|:---:|
| [alt_diffusion](./api/pipelines/alt_diffusion) | [AltCLIP: Altering the Language Encoder in CLIP for Extended Language Capabilities](https://arxiv.org/abs/2211.06679) | Image-to-Image Text-Guided Generation |
| [audio_diffusion](./api/pipelines/audio_diffusion) | [Audio Diffusion](https://github.com/teticio/audio-diffusion.git) | Unconditional Audio Generation |
| [controlnet](./api/pipelines/stable_diffusion/controlnet) | [Adding Conditional Control to Text-to-Image Diffusion Models](https://arxiv.org/abs/2302.05543) | Image-to-Image Text-Guided Generation |
| [cycle_diffusion](./api/pipelines/cycle_diffusion) | [Unifying Diffusion Models' Latent Space, with Applications to CycleDiffusion and Guidance](https://arxiv.org/abs/2210.05559) | Image-to-Image Text-Guided Generation |
| [dance_diffusion](./api/pipelines/dance_diffusion) | [Dance Diffusion](https://github.com/williamberman/diffusers.git) | Unconditional Audio Generation |
| [ddpm](./api/pipelines/ddpm) | [Denoising Diffusion Probabilistic Models](https://arxiv.org/abs/2006.11239) | Unconditional Image Generation |
| [ddim](./api/pipelines/ddim) | [Denoising Diffusion Implicit Models](https://arxiv.org/abs/2010.02502) | Unconditional Image Generation |
| [if](./if) | [**IF**](./api/pipelines/if) | Image Generation |
| [if_img2img](./if) | [**IF**](./api/pipelines/if) | Image-to-Image Generation |
| [if_inpainting](./if) | [**IF**](./api/pipelines/if) | Image-to-Image Generation |
| [latent_diffusion](./api/pipelines/latent_diffusion) | [High-Resolution Image Synthesis with Latent Diffusion Models](https://arxiv.org/abs/2112.10752) | Text-to-Image Generation |
| [latent_diffusion](./api/pipelines/latent_diffusion) | [High-Resolution Image Synthesis with Latent Diffusion Models](https://arxiv.org/abs/2112.10752) | Super Resolution Image-to-Image |
| [latent_diffusion_uncond](./api/pipelines/latent_diffusion_uncond) | [High-Resolution Image Synthesis with Latent Diffusion Models](https://arxiv.org/abs/2112.10752) | Unconditional Image Generation |
| [paint_by_example](./api/pipelines/paint_by_example) | [Paint by Example: Exemplar-based Image Editing with Diffusion Models](https://arxiv.org/abs/2211.13227) | Image-Guided Image Inpainting |
| [pndm](./api/pipelines/pndm) | [Pseudo Numerical Methods for Diffusion Models on Manifolds](https://arxiv.org/abs/2202.09778) | Unconditional Image Generation |
| [score_sde_ve](./api/pipelines/score_sde_ve) | [Score-Based Generative Modeling through Stochastic Differential Equations](https://openreview.net/forum?id=PxTIG12RRHS) | Unconditional Image Generation |
| [score_sde_vp](./api/pipelines/score_sde_vp) | [Score-Based Generative Modeling through Stochastic Differential Equations](https://openreview.net/forum?id=PxTIG12RRHS) | Unconditional Image Generation |
| [semantic_stable_diffusion](./api/pipelines/semantic_stable_diffusion) | [Semantic Guidance](https://arxiv.org/abs/2301.12247) | Text-Guided Generation |
| [stable_diffusion_text2img](./api/pipelines/stable_diffusion/text2img) | [Stable Diffusion](https://stability.ai/blog/stable-diffusion-public-release) | Text-to-Image Generation |
| [stable_diffusion_img2img](./api/pipelines/stable_diffusion/img2img) | [Stable Diffusion](https://stability.ai/blog/stable-diffusion-public-release) | Image-to-Image Text-Guided Generation |
| [stable_diffusion_inpaint](./api/pipelines/stable_diffusion/inpaint) | [Stable Diffusion](https://stability.ai/blog/stable-diffusion-public-release) | Text-Guided Image Inpainting |
| [stable_diffusion_panorama](./api/pipelines/stable_diffusion/panorama) | [MultiDiffusion](https://multidiffusion.github.io/) | Text-to-Panorama Generation |
| [stable_diffusion_pix2pix](./api/pipelines/stable_diffusion/pix2pix) | [InstructPix2Pix: Learning to Follow Image Editing Instructions](https://arxiv.org/abs/2211.09800) | Text-Guided Image Editing |
| [stable_diffusion_pix2pix_zero](./api/pipelines/stable_diffusion/pix2pix_zero) | [Zero-shot Image-to-Image Translation](https://pix2pixzero.github.io/) | Text-Guided Image Editing |
| [stable_diffusion_attend_and_excite](./api/pipelines/stable_diffusion/attend_and_excite) | [Attend-and-Excite: Attention-Based Semantic Guidance for Text-to-Image Diffusion Models](https://arxiv.org/abs/2301.13826) | Text-to-Image Generation |
| [stable_diffusion_self_attention_guidance](./api/pipelines/stable_diffusion/self_attention_guidance) | [Improving Sample Quality of Diffusion Models Using Self-Attention Guidance](https://arxiv.org/abs/2210.00939) | Text-to-Image Generation, Unconditional Image Generation |
| [stable_diffusion_image_variation](./stable_diffusion/image_variation) | [Stable Diffusion Image Variations](https://github.com/LambdaLabsML/lambda-diffusers#stable-diffusion-image-variations) | Image-to-Image Generation |
| [stable_diffusion_latent_upscale](./stable_diffusion/latent_upscale) | [Stable Diffusion Latent Upscaler](https://twitter.com/StabilityAI/status/1590531958815064065) | Text-Guided Super Resolution Image-to-Image |
| [stable_diffusion_model_editing](./api/pipelines/stable_diffusion/model_editing) | [Editing Implicit Assumptions in Text-to-Image Diffusion Models](https://time-diffusion.github.io/) | Text-to-Image Model Editing |
| [stable_diffusion_2](./api/pipelines/stable_diffusion_2) | [Stable Diffusion 2](https://stability.ai/blog/stable-diffusion-v2-release) | Text-to-Image Generation |
| [stable_diffusion_2](./api/pipelines/stable_diffusion_2) | [Stable Diffusion 2](https://stability.ai/blog/stable-diffusion-v2-release) | Text-Guided Image Inpainting |
| [stable_diffusion_2](./api/pipelines/stable_diffusion_2) | [Depth-Conditional Stable Diffusion](https://github.com/Stability-AI/stablediffusion#depth-conditional-stable-diffusion) | Depth-to-Image Generation |
| [stable_diffusion_2](./api/pipelines/stable_diffusion_2) | [Stable Diffusion 2](https://stability.ai/blog/stable-diffusion-v2-release) | Text-Guided Super Resolution Image-to-Image |
| [stable_diffusion_safe](./api/pipelines/stable_diffusion_safe) | [Safe Stable Diffusion](https://arxiv.org/abs/2211.05105) | Text-Guided Generation |
| [stable_unclip](./stable_unclip) | Stable unCLIP | Text-to-Image Generation |
| [stable_unclip](./stable_unclip) | Stable unCLIP | Image-to-Image Text-Guided Generation |
| [stochastic_karras_ve](./api/pipelines/stochastic_karras_ve) | [Elucidating the Design Space of Diffusion-Based Generative Models](https://arxiv.org/abs/2206.00364) | Unconditional Image Generation |
| [text_to_video_sd](./api/pipelines/text_to_video) | [Modelscope's Text-to-video-synthesis Model in Open Domain](https://modelscope.cn/models/damo/text-to-video-synthesis/summary) | Text-to-Video Generation |
| [unclip](./api/pipelines/unclip) | [Hierarchical Text-Conditional Image Generation with CLIP Latents](https://arxiv.org/abs/2204.06125) (implementation by [kakaobrain](https://github.com/kakaobrain/karlo)) | Text-to-Image Generation |
| [versatile_diffusion](./api/pipelines/versatile_diffusion) | [Versatile Diffusion: Text, Images and Variations All in One Diffusion Model](https://arxiv.org/abs/2211.08332) | Text-to-Image Generation |
| [versatile_diffusion](./api/pipelines/versatile_diffusion) | [Versatile Diffusion: Text, Images and Variations All in One Diffusion Model](https://arxiv.org/abs/2211.08332) | Image Variations Generation |
| [versatile_diffusion](./api/pipelines/versatile_diffusion) | [Versatile Diffusion: Text, Images and Variations All in One Diffusion Model](https://arxiv.org/abs/2211.08332) | Dual Image and Text Guided Generation |
| [vq_diffusion](./api/pipelines/vq_diffusion) | [Vector Quantized Diffusion Model for Text-to-Image Synthesis](https://arxiv.org/abs/2111.14822) | Text-to-Image Generation |