Denoising Diffusion Probabilistic Models (DDPM)
Overview
Denoising Diffusion Probabilistic Models (DDPM) by Jonathan Ho, Ajay Jain and Pieter Abbeel proposes the diffusion-based model of the same name. In the context of the 🤗 Diffusers library, however, DDPM refers to the discrete denoising scheduler from the paper as well as to the pipeline.
The abstract of the paper is the following:
We present high quality image synthesis results using diffusion probabilistic models, a class of latent variable models inspired by considerations from nonequilibrium thermodynamics. Our best results are obtained by training on a weighted variational bound designed according to a novel connection between diffusion probabilistic models and denoising score matching with Langevin dynamics, and our models naturally admit a progressive lossy decompression scheme that can be interpreted as a generalization of autoregressive decoding. On the unconditional CIFAR10 dataset, we obtain an Inception score of 9.46 and a state-of-the-art FID score of 3.17. On 256x256 LSUN, we obtain sample quality similar to ProgressiveGAN.
The original paper can be found at https://arxiv.org/abs/2006.11239.
DDPMScheduler
class diffusers.DDPMScheduler( num_train_timesteps: int = 1000, beta_start: float = 0.0001, beta_end: float = 0.02, beta_schedule: str = 'linear', trained_betas: typing.Union[numpy.ndarray, typing.List[float], NoneType] = None, variance_type: str = 'fixed_small', clip_sample: bool = True, prediction_type: str = 'epsilon', thresholding: bool = False, dynamic_thresholding_ratio: float = 0.995, clip_sample_range: float = 1.0, sample_max_value: float = 1.0 )
Parameters
- num_train_timesteps (int) — number of diffusion steps used to train the model.
- beta_start (float) — the starting beta value of inference.
- beta_end (float) — the final beta value.
- beta_schedule (str) — the beta schedule, a mapping from a beta range to a sequence of betas for stepping the model. Choose from linear, scaled_linear, or squaredcos_cap_v2.
- trained_betas (np.ndarray, optional) — option to pass an array of betas directly to the constructor to bypass beta_start, beta_end etc.
- variance_type (str) — options to clip the variance used when adding noise to the denoised sample. Choose from fixed_small, fixed_small_log, fixed_large, fixed_large_log, learned or learned_range.
- clip_sample (bool, default True) — option to clip the predicted sample for numerical stability.
- clip_sample_range (float, default 1.0) — the maximum magnitude for sample clipping. Valid only when clip_sample=True.
- prediction_type (str, default epsilon, optional) — prediction type of the scheduler function, one of epsilon (predicting the noise of the diffusion process), sample (directly predicting the noisy sample) or v_prediction (see section 2.4 of https://imagen.research.google/video/paper.pdf).
- thresholding (bool, default False) — whether to use the “dynamic thresholding” method (introduced by Imagen, https://arxiv.org/abs/2205.11487). Note that the thresholding method is unsuitable for latent-space diffusion models (such as stable-diffusion).
- dynamic_thresholding_ratio (float, default 0.995) — the ratio for the dynamic thresholding method. Default is 0.995, the same as Imagen (https://arxiv.org/abs/2205.11487). Valid only when thresholding=True.
- sample_max_value (float, default 1.0) — the threshold value for dynamic thresholding. Valid only when thresholding=True.
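To make the beta_schedule options more concrete, the following is a minimal pure-Python sketch of how the three named schedules are commonly computed from beta_start, beta_end and num_train_timesteps. The helper names make_betas and cumulative_alphas are hypothetical and are not part of the library; the constants 0.008 and 0.999 in the cosine branch are the usual choices for that schedule and are assumed here rather than taken from this page.

```python
import math


def make_betas(schedule, num_train_timesteps=1000, beta_start=0.0001, beta_end=0.02):
    """Return a list of betas for the named schedule (illustrative sketch only)."""
    n = num_train_timesteps

    def linspace(lo, hi):
        # n evenly spaced values from lo to hi, both ends included
        if n == 1:
            return [lo]
        return [lo + (hi - lo) * i / (n - 1) for i in range(n)]

    if schedule == "linear":
        return linspace(beta_start, beta_end)
    if schedule == "scaled_linear":
        # evenly spaced in sqrt(beta), then squared
        return [b * b for b in linspace(math.sqrt(beta_start), math.sqrt(beta_end))]
    if schedule == "squaredcos_cap_v2":
        # cosine schedule: beta_i = 1 - abar((i + 1) / n) / abar(i / n), capped at 0.999
        def alpha_bar(t):
            return math.cos((t + 0.008) / 1.008 * math.pi / 2) ** 2

        return [min(1 - alpha_bar((i + 1) / n) / alpha_bar(i / n), 0.999) for i in range(n)]
    raise ValueError(f"unknown beta schedule: {schedule}")


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out


betas = make_betas("linear")
alphas_cumprod = cumulative_alphas(betas)
print(betas[0], betas[-1])   # first and last beta of the linear schedule
print(alphas_cumprod[-1])    # close to 0: the last timestep is almost pure noise
```

Passing trained_betas would skip this computation entirely and use the supplied values as the betas.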
Denoising diffusion probabilistic models (DDPMs) explore the connections between denoising score matching and Langevin dynamics sampling.
ConfigMixin takes care of storing all config attributes that are passed in the scheduler’s __init__ function, such as num_train_timesteps. They can be accessed via scheduler.config.num_train_timesteps.
SchedulerMixin provides general loading and saving functionality via the SchedulerMixin.save_pretrained() and from_pretrained() functions.
For more details, see the original paper: https://arxiv.org/abs/2006.11239
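The clip_sample and thresholding options in the parameter list both bound the predicted clean sample, but in different ways. The sketch below contrasts static clipping with dynamic thresholding as described in the Imagen paper, using plain Python lists instead of tensors. The function names are hypothetical, and treating sample_max_value as a lower bound on the threshold is an assumption of this sketch; the library's handling of that parameter may differ between versions.

```python
def clip_sample(values, clip_range=1.0):
    """Static clipping: clamp every value to [-clip_range, clip_range]."""
    return [max(-clip_range, min(clip_range, v)) for v in values]


def quantile(values, q):
    """Linear-interpolation quantile of a non-empty list, with 0 <= q <= 1."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def dynamic_threshold(values, ratio=0.995, sample_max_value=1.0):
    """Clamp to [-s, s] and divide by s, where s is a quantile of |values|."""
    s = quantile([abs(v) for v in values], ratio)
    # Assumption: s is never allowed to fall below sample_max_value.
    s = max(s, sample_max_value)
    return [max(-s, min(s, v)) / s for v in values]


x0_pred = [-3.0, -0.5, 0.0, 0.25, 2.0]
print(clip_sample(x0_pred))                    # [-1.0, -0.5, 0.0, 0.25, 1.0]
print(dynamic_threshold(x0_pred, ratio=0.75))  # [-1.0, -0.25, 0.0, 0.125, 1.0]
```

Static clipping leaves in-range values untouched, while dynamic thresholding rescales the whole sample, which is one reason it is described as unsuitable for latent-space models.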
scale_model_input( sample: FloatTensor, timestep: typing.Optional[int] = None ) → torch.FloatTensor
Ensures interchangeability with schedulers that need to scale the denoising model input depending on the current timestep.
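The point of this method is easier to see with two toy schedulers side by side. In the sketch below, both classes are hypothetical stand-ins rather than library classes: one returns the sample unchanged, which is the behaviour expected of a scheduler such as DDPM that needs no input scaling, and the other divides by sqrt(sigma^2 + 1), a scaling used by some sigma-based samplers. The calling code is identical in both cases.

```python
import math


class IdentityScalingScheduler:
    """Toy stand-in for a scheduler that needs no input scaling."""

    def scale_model_input(self, sample, timestep=None):
        return sample  # no-op, kept only so callers can treat all schedulers alike


class SigmaScalingScheduler:
    """Hypothetical scheduler that rescales the input by its noise level."""

    def __init__(self, sigmas):
        self.sigmas = sigmas  # maps timestep -> noise level sigma

    def scale_model_input(self, sample, timestep=None):
        scale = math.sqrt(self.sigmas[timestep] ** 2 + 1.0)
        return [v / scale for v in sample]


def prepare_input(scheduler, sample, timestep):
    # The caller does not need to know which kind of scheduler it was given.
    return scheduler.scale_model_input(sample, timestep)


sample = [0.5, -1.0]
unchanged = prepare_input(IdentityScalingScheduler(), sample, 10)
scaled = prepare_input(SigmaScalingScheduler({10: 0.75}), sample, 10)
print(unchanged)
print(scaled)
```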
set_timesteps( num_inference_steps: int, device: typing.Union[str, torch.device] = None )
Sets the discrete timesteps used for the diffusion chain. Supporting function to be run before inference.
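As an illustration of what "setting the discrete timesteps" can mean, the sketch below picks num_inference_steps evenly spaced timesteps out of the num_train_timesteps used in training and returns them in descending order, since sampling runs from noisy to clean. The function name and the exact spacing rule are assumptions for illustration; other spacing conventions exist.

```python
def inference_timesteps(num_train_timesteps, num_inference_steps):
    """Evenly spaced training timesteps, highest (noisiest) first."""
    if not 0 < num_inference_steps <= num_train_timesteps:
        raise ValueError("num_inference_steps must be in [1, num_train_timesteps]")
    step_ratio = num_train_timesteps // num_inference_steps
    return [i * step_ratio for i in range(num_inference_steps)][::-1]


print(inference_timesteps(1000, 10))
# [900, 800, 700, 600, 500, 400, 300, 200, 100, 0]
```

With num_inference_steps equal to num_train_timesteps, this reduces to every timestep from num_train_timesteps - 1 down to 0.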
step( model_output: FloatTensor, timestep: int, sample: FloatTensor, generator = None, return_dict: bool = True ) → schedulers.scheduling_utils.DDPMSchedulerOutput or tuple
Parameters
- model_output (torch.FloatTensor) — direct output from the learned diffusion model.
- timestep (int) — current discrete timestep in the diffusion chain.
- sample (torch.FloatTensor) — current instance of the sample being created by the diffusion process.
- generator — random number generator.
- return_dict (bool) — option for returning a tuple rather than a DDPMSchedulerOutput class.
Returns
schedulers.scheduling_utils.DDPMSchedulerOutput or tuple
schedulers.scheduling_utils.DDPMSchedulerOutput if return_dict is True, otherwise a tuple. When returning a tuple, the first element is the sample tensor.
Predict the sample at the previous timestep by reversing the SDE. Core function to propagate the diffusion process from the learned model outputs (most often the predicted noise).
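The update behind this description can be sketched for a single scalar sample. The code below follows the posterior mean and variance from the DDPM paper for the epsilon prediction type with the fixed_small variance, and assumes the full chain of consecutive timesteps is used (num_inference_steps equal to num_train_timesteps). The names ddpm_step and oracle_eps are hypothetical: oracle_eps stands in for a trained model by returning the exact noise for a dataset whose only point is 0.5. This is not the library's implementation.

```python
import math
import random


def ddpm_step(eps_pred, t, x_t, betas, alphas_cumprod, rng, clip_range=1.0):
    """One reverse step x_t -> x_{t-1} for a scalar sample (epsilon prediction)."""
    abar_t = alphas_cumprod[t]
    abar_prev = alphas_cumprod[t - 1] if t > 0 else 1.0
    beta_t = betas[t]
    alpha_t = 1.0 - beta_t

    # 1. Estimate the clean sample x_0 from the predicted noise.
    x0_pred = (x_t - math.sqrt(1.0 - abar_t) * eps_pred) / math.sqrt(abar_t)
    # 2. Clip it for numerical stability (the clip_sample option).
    x0_pred = max(-clip_range, min(clip_range, x0_pred))
    # 3. Mean of the posterior q(x_{t-1} | x_t, x_0).
    coef_x0 = math.sqrt(abar_prev) * beta_t / (1.0 - abar_t)
    coef_xt = math.sqrt(alpha_t) * (1.0 - abar_prev) / (1.0 - abar_t)
    mean = coef_x0 * x0_pred + coef_xt * x_t
    # 4. Add noise at every step except the last one (t == 0).
    if t == 0:
        return mean
    variance = (1.0 - abar_prev) / (1.0 - abar_t) * beta_t  # "fixed_small"
    return mean + math.sqrt(variance) * rng.gauss(0.0, 1.0)


# Linear beta schedule with the default values from the constructor signature.
betas = [0.0001 + (0.02 - 0.0001) * i / 999 for i in range(1000)]
alphas_cumprod, prod = [], 1.0
for b in betas:
    prod *= 1.0 - b
    alphas_cumprod.append(prod)


def oracle_eps(x_t, t, x0=0.5):
    # Stand-in for a trained model: the exact noise when every data point equals x0.
    return (x_t - math.sqrt(alphas_cumprod[t]) * x0) / math.sqrt(1.0 - alphas_cumprod[t])


rng = random.Random(0)
x = rng.gauss(0.0, 1.0)  # start from pure noise
for t in reversed(range(1000)):
    x = ddpm_step(oracle_eps(x, t), t, x, betas, alphas_cumprod, rng)
print(x)  # close to 0.5, the only point in the toy dataset
```

Passing a seeded random number generator, as the generator argument allows, makes the sampled trajectory reproducible.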