Diffusers documentation
Inverse Multistep DPM-Solver (DPMSolverMultistepInverse)
Overview
This scheduler is the inverted scheduler of DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps and DPM-Solver++: Fast Solver for Guided Sampling of Diffusion Probabilistic Models by Cheng Lu, Yuhao Zhou, Fan Bao, Jianfei Chen, Chongxuan Li, and Jun Zhu. The implementation is mostly based on the DDIM inversion definition of Null-text Inversion for Editing Real Images using Guided Diffusion Models and the ad-hoc notebook implementation for DiffEdit latent inversion here.
DPMSolverMultistepInverseScheduler
class diffusers.DPMSolverMultistepInverseScheduler
< source >( num_train_timesteps: int = 1000 beta_start: float = 0.0001 beta_end: float = 0.02 beta_schedule: str = 'linear' trained_betas: typing.Union[numpy.ndarray, typing.List[float], NoneType] = None solver_order: int = 2 prediction_type: str = 'epsilon' thresholding: bool = False dynamic_thresholding_ratio: float = 0.995 sample_max_value: float = 1.0 algorithm_type: str = 'dpmsolver++' solver_type: str = 'midpoint' lower_order_final: bool = True use_karras_sigmas: typing.Optional[bool] = False lambda_min_clipped: float = -inf variance_type: typing.Optional[str] = None timestep_spacing: str = 'linspace' steps_offset: int = 0 )
Parameters
- num_train_timesteps (int, defaults to 1000) — the number of diffusion steps used to train the model.
- beta_start (float, defaults to 0.0001) — the starting beta value of inference.
- beta_end (float, defaults to 0.02) — the final beta value.
- beta_schedule (str, defaults to "linear") — the beta schedule, a mapping from a beta range to a sequence of betas for stepping the model. Choose from linear, scaled_linear, or squaredcos_cap_v2.
- trained_betas (np.ndarray, optional) — an array of betas to pass directly to the constructor, bypassing beta_start, beta_end, etc.
- solver_order (int, defaults to 2) — the order of DPM-Solver; can be 1, 2, or 3. We recommend solver_order=2 for guided sampling and solver_order=3 for unconditional sampling.
- prediction_type (str, defaults to "epsilon", optional) — prediction type of the scheduler function; one of epsilon (predicting the noise of the diffusion process), sample (directly predicting the noisy sample), or v_prediction (see section 2.4 of https://imagen.research.google/video/paper.pdf).
- thresholding (bool, defaults to False) — whether to use the "dynamic thresholding" method (introduced by Imagen, https://arxiv.org/abs/2205.11487). For pixel-space diffusion models, you can set both algorithm_type="dpmsolver++" and thresholding=True to use dynamic thresholding. Note that the thresholding method is unsuitable for latent-space diffusion models (such as Stable Diffusion).
- dynamic_thresholding_ratio (float, defaults to 0.995) — the ratio for the dynamic thresholding method. The default of 0.995 is the same as Imagen (https://arxiv.org/abs/2205.11487).
- sample_max_value (float, defaults to 1.0) — the threshold value for dynamic thresholding. Valid only when thresholding=True and algorithm_type="dpmsolver++".
- algorithm_type (str, defaults to "dpmsolver++") — the algorithm type for the solver; one of dpmsolver, dpmsolver++, sde-dpmsolver, or sde-dpmsolver++. The dpmsolver type implements the algorithms in https://arxiv.org/abs/2206.00927, and the dpmsolver++ type implements the algorithms in https://arxiv.org/abs/2211.01095. We recommend dpmsolver++ or sde-dpmsolver++ with solver_order=2 for guided sampling (e.g. Stable Diffusion).
- solver_type (str, defaults to "midpoint") — the solver type for the second-order solver; either midpoint or heun. The solver type slightly affects the sample quality, especially for a small number of steps. We empirically find midpoint solvers slightly better, so we recommend the midpoint type.
- lower_order_final (bool, defaults to True) — whether to use lower-order solvers in the final steps. Only valid for fewer than 15 inference steps. We empirically find this trick stabilizes the sampling of DPM-Solver for steps < 15, especially for steps <= 10.
- use_karras_sigmas (bool, optional, defaults to False) — whether to use Karras sigmas (the Karras et al. (2022) scheme) for the step sizes in the noise schedule during sampling. If True, the sigmas are determined according to a sequence of noise levels {σi} as defined in Equation (5) of https://arxiv.org/pdf/2206.00364.pdf.
- lambda_min_clipped (float, defaults to -inf) — the clipping threshold for the minimum value of lambda(t) for numerical stability. This is critical for the cosine (squaredcos_cap_v2) noise schedule.
- variance_type (str, optional) — set to "learned" or "learned_range" for diffusion models that predict variance. For example, OpenAI's guided-diffusion (https://github.com/openai/guided-diffusion) predicts both the mean and the variance of the Gaussian distribution in the model's output. DPM-Solver only needs the "mean" output because it is based on diffusion ODEs.
- timestep_spacing (str, defaults to "linspace") — the way the timesteps should be scaled. Refer to Table 2 of Common Diffusion Noise Schedules and Sample Steps are Flawed (https://arxiv.org/abs/2305.08891) for more information.
- steps_offset (int, defaults to 0) — an offset added to the inference steps. You can use a combination of offset=1 and set_alpha_to_one=False to make the last step use step 0 for the previous alpha product, as done in Stable Diffusion.
DPMSolverMultistepInverseScheduler is the reverse scheduler of DPMSolverMultistepScheduler.
We also support the "dynamic thresholding" method from Imagen (https://arxiv.org/abs/2205.11487). For pixel-space diffusion models, you can set both algorithm_type="dpmsolver++" and thresholding=True to use dynamic thresholding. Note that the thresholding method is unsuitable for latent-space diffusion models (such as Stable Diffusion).
~ConfigMixin takes care of storing all config attributes that are passed in the scheduler's __init__
function, such as num_train_timesteps. They can be accessed via scheduler.config.num_train_timesteps.
SchedulerMixin provides general loading and saving functionality via the SchedulerMixin.save_pretrained() and
from_pretrained() functions.
convert_model_output
< source >(
model_output: FloatTensor
timestep: int
sample: FloatTensor
)
→
torch.FloatTensor
Parameters
- model_output (torch.FloatTensor) — direct output from the learned diffusion model.
- timestep (int) — the current discrete timestep in the diffusion chain.
- sample (torch.FloatTensor) — the current instance of the sample being created by the diffusion process.
Returns
torch.FloatTensor
the converted model output.
Convert the model output to the corresponding type that the algorithm (DPM-Solver / DPM-Solver++) needs.
DPM-Solver is designed to discretize an integral of the noise prediction model, and DPM-Solver++ is designed to discretize an integral of the data prediction model. So we need to first convert the model output to the corresponding type to match the algorithm.
Note that the algorithm type and the model type are decoupled: we can use either DPM-Solver or DPM-Solver++ with both noise prediction models and data prediction models.
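For prediction_type="epsilon", the conversion to the data prediction that DPM-Solver++ integrates follows directly from the standard DDPM parameterization. A minimal sketch using the usual alpha-bar notation (the function name is illustrative, not the library's API):

```python
import numpy as np

def epsilon_to_x0(model_output, sample, alpha_bar_t):
    """Convert a noise (epsilon) prediction to a data (x0) prediction.

    Under the usual DDPM parameterization
        x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps,
    solving for x0 gives the conversion that DPM-Solver++ needs.
    """
    alpha_t = np.sqrt(alpha_bar_t)        # signal scale at timestep t
    sigma_t = np.sqrt(1.0 - alpha_bar_t)  # noise scale at timestep t
    return (sample - sigma_t * model_output) / alpha_t
```

Applying this to a sample that was formed from a known x0 and eps recovers x0 exactly, which is a convenient sanity check.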
dpm_solver_first_order_update
< source >(
model_output: FloatTensor
timestep: int
prev_timestep: int
sample: FloatTensor
noise: typing.Optional[torch.FloatTensor] = None
)
→
torch.FloatTensor
Parameters
- model_output (torch.FloatTensor) — direct output from the learned diffusion model.
- timestep (int) — the current discrete timestep in the diffusion chain.
- prev_timestep (int) — the previous discrete timestep in the diffusion chain.
- sample (torch.FloatTensor) — the current instance of the sample being created by the diffusion process.
Returns
torch.FloatTensor
the sample tensor at the previous timestep.
One step for the first-order DPM-Solver (equivalent to DDIM).
See https://arxiv.org/abs/2206.00927 for the detailed derivation.
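Under the half-log-SNR notation λ = log(α/σ), the first-order DPM-Solver++ update from time s to time t can be sketched as below. This is a NumPy sketch of the data-prediction variant under the usual alpha-bar conventions; as the text notes, it is algebraically identical to a DDIM step.

```python
import numpy as np

def first_order_update(x0_pred, sample, alpha_bar_s, alpha_bar_t):
    """One first-order DPM-Solver++ step from time s to time t (a sketch).

    lambda = log(alpha / sigma) is the half log-SNR; h = lambda_t - lambda_s.
    With a data prediction x0_pred, the update is
        x_t = (sigma_t / sigma_s) * x_s - alpha_t * (exp(-h) - 1) * x0_pred,
    which reduces to a DDIM step.
    """
    alpha_s, sigma_s = np.sqrt(alpha_bar_s), np.sqrt(1 - alpha_bar_s)
    alpha_t, sigma_t = np.sqrt(alpha_bar_t), np.sqrt(1 - alpha_bar_t)
    h = np.log(alpha_t / sigma_t) - np.log(alpha_s / sigma_s)
    return (sigma_t / sigma_s) * sample - alpha_t * np.expm1(-h) * x0_pred
```

Feeding it an exact x0 prediction reproduces the familiar DDIM map x_t = α_t x0 + σ_t ε.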
multistep_dpm_solver_second_order_update
< source >(
model_output_list: typing.List[torch.FloatTensor]
timestep_list: typing.List[int]
prev_timestep: int
sample: FloatTensor
noise: typing.Optional[torch.FloatTensor] = None
)
→
torch.FloatTensor
Parameters
- model_output_list (List[torch.FloatTensor]) — direct outputs from the learned diffusion model at the current and latter timesteps.
- timestep_list (List[int]) — the current and latter discrete timesteps in the diffusion chain.
- prev_timestep (int) — the previous discrete timestep in the diffusion chain.
- sample (torch.FloatTensor) — the current instance of the sample being created by the diffusion process.
Returns
torch.FloatTensor
the sample tensor at the previous timestep.
One step for the second-order multistep DPM-Solver.
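The midpoint variant of the second-order multistep update reuses the data prediction from the previous timestep as a finite-difference correction. A hedged NumPy sketch, assuming data predictions (DPM-Solver++) and illustrative function and argument names:

```python
import numpy as np

def half_log_snr(alpha_bar):
    """lambda = log(alpha / sigma) under the alpha-bar convention."""
    return np.log(np.sqrt(alpha_bar)) - np.log(np.sqrt(1 - alpha_bar))

def second_order_update(x0_list, abar_list, sample, alpha_bar_t):
    """One second-order multistep DPM-Solver++ step (midpoint variant, sketch).

    x0_list = [x0(t0), x0(t1)] are data predictions at the current (t0) and
    previous (t1) timesteps; abar_list holds the matching alpha-bar values.
    """
    lam_t = half_log_snr(alpha_bar_t)
    lam_s0, lam_s1 = half_log_snr(abar_list[0]), half_log_snr(abar_list[1])
    h, h_0 = lam_t - lam_s0, lam_s0 - lam_s1
    r0 = h_0 / h
    D0 = x0_list[0]
    D1 = (x0_list[0] - x0_list[1]) / r0   # finite-difference correction term
    alpha_t = np.sqrt(alpha_bar_t)
    sigma_t, sigma_s0 = np.sqrt(1 - alpha_bar_t), np.sqrt(1 - abar_list[0])
    return ((sigma_t / sigma_s0) * sample
            - alpha_t * np.expm1(-h) * D0
            - 0.5 * alpha_t * np.expm1(-h) * D1)
```

When the two data predictions coincide, D1 vanishes and the step collapses to the first-order update, which is one reason the multistep form is cheap: it needs no extra model evaluations per step.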
multistep_dpm_solver_third_order_update
< source >(
model_output_list: typing.List[torch.FloatTensor]
timestep_list: typing.List[int]
prev_timestep: int
sample: FloatTensor
)
→
torch.FloatTensor
Parameters
- model_output_list (List[torch.FloatTensor]) — direct outputs from the learned diffusion model at the current and latter timesteps.
- timestep_list (List[int]) — the current and latter discrete timesteps in the diffusion chain.
- prev_timestep (int) — the previous discrete timestep in the diffusion chain.
- sample (torch.FloatTensor) — the current instance of the sample being created by the diffusion process.
Returns
torch.FloatTensor
the sample tensor at the previous timestep.
One step for the third-order multistep DPM-Solver.
scale_model_input
< source >(
sample: FloatTensor
*args
**kwargs
)
→
torch.FloatTensor
Ensures interchangeability with schedulers that need to scale the denoising model input depending on the current timestep.
set_timesteps
< source >( num_inference_steps: int = None device: typing.Union[str, torch.device] = None )
Sets the timesteps used for the diffusion chain. Supporting function to be run before inference.
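For timestep_spacing="linspace", the inverse scheduler walks the diffusion chain in ascending order (from the clean image toward noise), the opposite of the forward sampler. A rough sketch of how such a schedule can be built (not the exact library code; the function name is illustrative):

```python
import numpy as np

def inverse_linspace_timesteps(num_train_timesteps=1000, num_inference_steps=50,
                               steps_offset=0):
    """Sketch of an ascending "linspace" schedule for the inverse scheduler.

    Returns num_inference_steps + 1 monotonically increasing integer
    timesteps spanning [0, num_train_timesteps - 1], shifted by steps_offset.
    """
    timesteps = np.linspace(0, num_train_timesteps - 1, num_inference_steps + 1)
    return np.round(timesteps).astype(np.int64) + steps_offset
```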
step
< source >(
model_output: FloatTensor
timestep: int
sample: FloatTensor
generator = None
return_dict: bool = True
)
→
~scheduling_utils.SchedulerOutput or tuple
Parameters
- model_output (torch.FloatTensor) — direct output from the learned diffusion model.
- timestep (int) — the current discrete timestep in the diffusion chain.
- sample (torch.FloatTensor) — the current instance of the sample being created by the diffusion process.
- return_dict (bool) — whether to return a SchedulerOutput class instead of a plain tuple.
Returns
~scheduling_utils.SchedulerOutput or tuple
~scheduling_utils.SchedulerOutput if return_dict is True, otherwise a tuple. When returning a tuple, the first element is the sample tensor.
Step function propagating the sample with the multistep DPM-Solver.
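Putting the pieces together, a toy first-order inversion loop (equivalent to DDIM inversion) looks like the following. This is a self-contained NumPy sketch with an illustrative eps_model callable standing in for the denoising network; it is not the scheduler's actual step() method.

```python
import numpy as np

def invert_first_order(image, alpha_bars, eps_model):
    """Toy first-order DDIM/DPM-Solver inversion loop (a sketch).

    Walks the alpha-bar schedule from clean (abar near 1) to noisy
    (abar near 0). At each step, the model's epsilon prediction is
    converted to x0 and the first-order DPM-Solver++ update is applied
    in the noising direction.
    """
    x = image
    for abar_s, abar_t in zip(alpha_bars[:-1], alpha_bars[1:]):
        alpha_s, sigma_s = np.sqrt(abar_s), np.sqrt(1 - abar_s)
        alpha_t, sigma_t = np.sqrt(abar_t), np.sqrt(1 - abar_t)
        eps = eps_model(x, abar_s)                  # model call at current step
        x0 = (x - sigma_s * eps) / alpha_s          # epsilon -> x0 conversion
        h = np.log(alpha_t / sigma_t) - np.log(alpha_s / sigma_s)
        x = (sigma_t / sigma_s) * x - alpha_t * np.expm1(-h) * x0
    return x
```

A useful property for sanity checking: if the model predicts zero noise at every step, the ratio x / alpha_t is invariant, so the loop simply rescales the input along the schedule.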