Paper Reading Notes: Denoising Diffusion Implicit Models (1)

0. Quick Access

Paper Reading Notes: Denoising Diffusion Implicit Models (1)
Paper Reading Notes: Denoising Diffusion Implicit Models (2)
Paper Reading Notes: Denoising Diffusion Implicit Models (3)
Paper Reading Notes: Denoising Diffusion Implicit Models (4)
Paper Reading Notes: Denoising Diffusion Implicit Models (5)

1. References

Paper: "Denoising Diffusion Implicit Models"
Venue: ICLR 2021
https://iclr.cc/virtual/2021/poster/2804
Paper link: https://arxiv.org/abs/2010.02502
Code link: https://github.com/ermongroup/ddim

2. How the notation change in the DDIM paper alters some formulas

In the DDPM paper, "Denoising Diffusion Probabilistic Models", the posterior of the forward process is $q(x_{t-1}|x_t,x_0)\sim N\big(x_{t-1};\widetilde{\mu}_t(x_t,x_0),\sigma_t^2 I\big)$, where $\widetilde{\mu}_t(x_t,x_0)$ and $\sigma_t$ are given in equation (1).
$$
\begin{equation}
\begin{split}
\sigma_t&=\sqrt{\frac{\beta_t\cdot (1-\bar\alpha_{t-1})}{1-\bar\alpha_t}}\\
\widetilde{\mu}_t(x_t,x_0)&=\frac{\sqrt{\alpha_t}\cdot(1-\bar\alpha_{t-1})}{1-\bar\alpha_t}\cdot x_t+\frac{\beta_t\cdot \sqrt{\bar\alpha_{t-1}}}{1-\bar\alpha_t} \cdot x_0
\end{split}
\end{equation}
$$
The DDIM paper, "Denoising Diffusion Implicit Models", redefines the notation. Specifically, it writes $\alpha_t$ for what DDPM calls $\bar\alpha_t$, where in DDPM
$$
\begin{equation}
\begin{split}
\bar \alpha_t=\prod_{i=1}^{t}\alpha_i
\end{split}
\end{equation}
$$
As a result, several expressions change form in DDIM. For example, $\beta_t$ becomes the expression shown in equation (3).
$$
\begin{equation}
\begin{split}
\beta_t&=1-\alpha_t \quad (\text{DDPM})\\
&=1-\frac{\alpha_t}{\alpha_{t-1}} \quad (\text{DDIM})
\end{split}
\end{equation}
$$
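The notation change can be checked numerically. The sketch below is only an illustration: the linear $\beta$ schedule, its endpoints, and the helper names `ddpm_betas` and `ddim_alphas` are assumptions made here, not something taken from the paper notes. It builds DDIM's $\alpha_t$ as the cumulative product that DDPM calls $\bar\alpha_t$, then recovers $\beta_t$ through the DDIM line of equation (3).

```python
def ddpm_betas(T=1000, beta_start=1e-4, beta_end=0.02):
    """Hypothetical linear beta schedule (assumed for illustration only)."""
    step = (beta_end - beta_start) / (T - 1)
    return [beta_start + i * step for i in range(T)]


def ddim_alphas(betas):
    """DDIM alpha_t, i.e. DDPM's alpha-bar_t = prod_{i=1..t} (1 - beta_i).

    Index 0 holds alpha_0 = 1, so alphas[t] matches timestep t directly.
    """
    alphas = [1.0]
    for b in betas:
        alphas.append(alphas[-1] * (1.0 - b))
    return alphas


betas = ddpm_betas()          # betas[t - 1] is beta_t in DDPM notation
alphas = ddim_alphas(betas)   # alphas[t] is alpha_t in DDIM notation

t = 500
# DDIM line of equation (3): beta_t = 1 - alpha_t / alpha_{t-1}
beta_from_ddim = 1.0 - alphas[t] / alphas[t - 1]
```

Up to floating-point rounding, `beta_from_ddim` should agree with `betas[t - 1]`, which is the DDPM line of equation (3).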
The variance and mean of the posterior $q(x_{t-1}|x_t,x_0)$ of the forward noising process are given in equations (4) and (5), respectively.
$$
\begin{equation}
\begin{split}
\sigma_t^2&=\frac{1-\bar\alpha_{t-1}}{1-\bar\alpha_t}\cdot \beta_t \quad (\text{DDPM})\\
&=\frac{1-\alpha_{t-1}}{1-\alpha_t}\cdot \Big(1-\frac{\alpha_t}{\alpha_{t-1}}\Big) \quad (\text{DDIM})
\end{split}
\end{equation}
$$
$$
\begin{equation}
\begin{split}
\widetilde{\mu}_t(x_t,x_0)&=\frac{\sqrt{\alpha_t}\cdot(1-\bar\alpha_{t-1})}{1-\bar\alpha_t}\cdot x_t+\frac{\beta_t\cdot \sqrt{\bar\alpha_{t-1}}}{1-\bar\alpha_t} \cdot x_0 \quad (\text{DDPM})\\
&=\frac{\sqrt{\alpha_t}\cdot(1-\alpha_{t-1})}{\sqrt{\alpha_{t-1}}\cdot(1-\alpha_t)}\cdot x_t+\Big(1-\frac{\alpha_t}{\alpha_{t-1}}\Big)\cdot\frac{\sqrt{\alpha_{t-1}}}{1-\alpha_t}\cdot x_0 \quad (\text{DDIM})\\
&=\sqrt{\frac{\alpha_t\cdot (1-\alpha_{t-1})^2}{\alpha_{t-1} \cdot (1-\alpha_t)^2}}\cdot x_t+\frac{\alpha_{t-1}-\alpha_t}{\alpha_{t-1}}\cdot\frac{\sqrt{\alpha_{t-1}}}{1-\alpha_t}\cdot x_0 \\
&=\sqrt{\frac{1-\alpha_{t-1}}{1-\alpha_{t}}\cdot \frac{\alpha_t-\alpha_t \cdot \alpha_{t-1}}{\alpha_{t-1}-\alpha_{t-1}\cdot \alpha_{t}}} \cdot x_t+\frac{\alpha_{t-1}-\alpha_t}{\sqrt{\alpha_{t-1}}\cdot (1-\alpha_t)}\cdot x_0 \\
&=\sqrt{\frac{1-\alpha_{t-1}}{1-\alpha_{t}}\cdot \frac{\alpha_t+\alpha_{t-1}-\alpha_{t-1}-\alpha_t \cdot \alpha_{t-1}}{\alpha_{t-1}-\alpha_{t-1}\cdot \alpha_{t}}} \cdot x_t+\frac{\alpha_{t-1}-\alpha_t\cdot \alpha_{t-1}+\alpha_t\cdot \alpha_{t-1}-\alpha_t}{\sqrt{\alpha_{t-1}}\cdot (1-\alpha_t)}\cdot x_0 \\
&=\sqrt{\frac{1-\alpha_{t-1}}{1-\alpha_{t}}\cdot \Big(1+\frac{\alpha_t-\alpha_{t-1}}{\alpha_{t-1}-\alpha_{t-1}\cdot \alpha_{t}}\Big)}\cdot x_t+\frac{\alpha_{t-1}\cdot (1-\alpha_t)-\alpha_t\cdot (1-\alpha_{t-1})}{\sqrt{\alpha_{t-1}}\cdot (1-\alpha_t)}\cdot x_0 \\
&=\sqrt{\frac{1-\alpha_{t-1}}{1-\alpha_{t}}\cdot \Big(1-\frac{\alpha_{t-1}-\alpha_t}{\alpha_{t-1}-\alpha_{t-1}\cdot \alpha_{t}}\Big)}\cdot x_t+ \bigg[ \sqrt{\alpha_{t-1}}-\frac{\alpha_t\cdot (1-\alpha_{t-1})}{\sqrt{\alpha_{t-1}}\cdot (1-\alpha_t)} \bigg] \cdot x_0 \\
&=\sqrt{\frac{1}{1-\alpha_{t}}\cdot \Big(1-\alpha_{t-1}-\frac{(\alpha_{t-1}-\alpha_t)\cdot (1-\alpha_{t-1})}{\alpha_{t-1}-\alpha_{t-1}\cdot \alpha_{t}}\Big)}\cdot x_t+ \bigg[ \sqrt{\alpha_{t-1}}-\frac{\sqrt{\alpha_t^2\cdot (1-\alpha_{t-1})^2}}{\sqrt{\alpha_{t-1}}\cdot (1-\alpha_t)} \bigg] \cdot x_0 \\
&=\sqrt{\frac{1}{1-\alpha_{t}}\cdot \Big(1-\alpha_{t-1}-\underbrace{\frac{(\alpha_{t-1}-\alpha_t)\cdot (1-\alpha_{t-1})}{\alpha_{t-1}\cdot (1- \alpha_{t})}}_{=\sigma_t^2}\Big)}\cdot x_t+ \bigg[ \sqrt{\alpha_{t-1}}-\frac{\sqrt{\alpha_t\cdot (1-\alpha_{t-1})\cdot(\alpha_t-\alpha_t\cdot \alpha_{t-1})}}{\sqrt{\alpha_{t-1}}\cdot (1-\alpha_t)} \bigg] \cdot x_0 \\
&=\sqrt{\frac{1}{1-\alpha_{t}}\cdot \Big(1-\alpha_{t-1}-\sigma_t^2 \Big)}\cdot x_t+ \bigg[ \sqrt{\alpha_{t-1}}-\frac{\sqrt{\alpha_t\cdot (1-\alpha_{t-1})\cdot(\alpha_t + \alpha_{t-1} -\alpha_{t-1}-\alpha_t\cdot \alpha_{t-1})}}{\sqrt{\alpha_{t-1}\cdot(1-\alpha_t)}\cdot \sqrt{1-\alpha_t}} \bigg] \cdot x_0 \\
&=\sqrt{\frac{1-\alpha_{t-1}-\sigma_t^2}{1-\alpha_{t}}}\cdot x_t+ \bigg[\sqrt{\alpha_{t-1}}- \frac{\sqrt{1-\alpha_{t-1}}}{\sqrt{1-\alpha_t}} \cdot \frac{ \sqrt{ \alpha_t \cdot \big(\alpha_t-\alpha_{t-1}+\alpha_{t-1}\cdot(1-\alpha_t)\big)}}{\sqrt{ \alpha_{t-1}\cdot(1-\alpha_t)}} \bigg] \cdot x_0 \\
&=\sqrt{\frac{1-\alpha_{t-1}-\sigma_t^2}{1-\alpha_{t}}}\cdot x_t+ \bigg[\sqrt{\alpha_{t-1}}- \frac{\sqrt{1-\alpha_{t-1}}}{\sqrt{1-\alpha_t}} \cdot \sqrt{ \frac{ \alpha_t \cdot \big(\alpha_t-\alpha_{t-1}+\alpha_{t-1}\cdot(1-\alpha_t)\big)}{\alpha_{t-1}\cdot(1-\alpha_t)}} \bigg] \cdot x_0 \\
&=\sqrt{\frac{1-\alpha_{t-1}-\sigma_t^2}{1-\alpha_{t}}}\cdot x_t+ \bigg[\sqrt{\alpha_{t-1}}- \frac{\sqrt{1-\alpha_{t-1}}}{\sqrt{1-\alpha_t}} \cdot \sqrt{ \alpha_t\cdot \Big(1+\frac{ \alpha_t-\alpha_{t-1}}{\alpha_{t-1}\cdot(1-\alpha_t)}\Big)} \bigg] \cdot x_0 \\
&=\sqrt{\frac{1-\alpha_{t-1}-\sigma_t^2}{1-\alpha_{t}}}\cdot x_t+ \bigg[\sqrt{\alpha_{t-1}}- \frac{1}{\sqrt{1-\alpha_t}} \cdot \sqrt{ (1-\alpha_{t-1}) \cdot \alpha_t\cdot \Big(1-\frac{\alpha_{t-1} - \alpha_t}{\alpha_{t-1}\cdot(1-\alpha_t)}\Big)} \bigg] \cdot x_0 \\
&=\sqrt{\frac{1-\alpha_{t-1}-\sigma_t^2}{1-\alpha_{t}}}\cdot x_t+ \bigg[\sqrt{\alpha_{t-1}}- \frac{1}{\sqrt{1-\alpha_t}} \cdot \sqrt{ \alpha_t\cdot \Big(1-\alpha_{t-1}-\underbrace{ \frac{(\alpha_{t-1} - \alpha_t)\cdot (1-\alpha_{t-1})}{\alpha_{t-1}\cdot(1-\alpha_t)}}_{=\sigma_t^2}\Big)} \bigg] \cdot x_0 \\
&=\sqrt{\frac{1-\alpha_{t-1}-\sigma_t^2}{1-\alpha_{t}}}\cdot x_t+ \bigg[\sqrt{\alpha_{t-1}}- \frac{1}{\sqrt{1-\alpha_t}} \cdot \sqrt{ \alpha_t\cdot (1-\alpha_{t-1}-\sigma_t^2)} \bigg] \cdot x_0 \\
&=\sqrt{\frac{1-\alpha_{t-1}-\sigma_t^2}{1-\alpha_{t}}}\cdot x_t+ \bigg[\sqrt{\alpha_{t-1}}- \frac{\sqrt{ \alpha_t\cdot (1-\alpha_{t-1}-\sigma_t^2)}}{\sqrt{1-\alpha_t}} \bigg] \cdot x_0
\end{split}
\end{equation}
$$

Therefore, in DDIM notation the posterior of the forward process is

$$
q(x_{t-1}|x_t,x_0)\sim N\bigg(x_{t-1};\sqrt{\frac{1-\alpha_{t-1}-\sigma_t^2}{1-\alpha_{t}}}\cdot x_t+ \bigg[\sqrt{\alpha_{t-1}}- \frac{\sqrt{ \alpha_t\cdot (1-\alpha_{t-1}-\sigma_t^2)}}{\sqrt{1-\alpha_t}} \bigg] \cdot x_0,\ \sigma_t^2 I\bigg)
$$
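The long chain of equalities in equation (5) is easy to get wrong by hand, so a quick numerical check can help. The sketch below is an illustration under assumptions: the function names and the scalar test values are made up here. It compares the first DDIM line of equation (5) with the final line for one pair $(\alpha_{t-1}, \alpha_t)$.

```python
import math


def posterior_mean_start(x_t, x_0, a_prev, a_t):
    """First DDIM line of equation (5): the DDPM mean rewritten with DDIM's alpha."""
    beta_t = 1.0 - a_t / a_prev  # equation (3), DDIM line
    coef_xt = math.sqrt(a_t / a_prev) * (1.0 - a_prev) / (1.0 - a_t)
    coef_x0 = beta_t * math.sqrt(a_prev) / (1.0 - a_t)
    return coef_xt * x_t + coef_x0 * x_0


def posterior_mean_final(x_t, x_0, a_prev, a_t):
    """Last line of equation (5), written in terms of sigma_t^2 from equation (4)."""
    sigma2 = (1.0 - a_prev) / (1.0 - a_t) * (1.0 - a_t / a_prev)
    coef_xt = math.sqrt((1.0 - a_prev - sigma2) / (1.0 - a_t))
    coef_x0 = math.sqrt(a_prev) - math.sqrt(a_t * (1.0 - a_prev - sigma2)) / math.sqrt(1.0 - a_t)
    return coef_xt * x_t + coef_x0 * x_0


# Arbitrary test values (assumptions); alpha must decrease with t, so a_t < a_prev < 1.
a_prev, a_t = 0.9, 0.8
x_t, x_0 = 0.3, -1.2

mean_start = posterior_mean_start(x_t, x_0, a_prev, a_t)
mean_final = posterior_mean_final(x_t, x_0, a_prev, a_t)
gap = abs(mean_start - mean_final)
```

If the derivation holds, `gap` should be at the level of floating-point rounding for any valid choice of `a_prev` and `a_t`.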

Since $x_t=\sqrt{\alpha_t}\cdot x_0+\sqrt{1-\alpha_t}\cdot z_t$, we have $x_0=\frac{x_t-\sqrt{1-\alpha_t}\cdot z_t}{\sqrt{\alpha_t}}$. Substituting this for $x_0$ in equation (5) gives

$$
\begin{equation}
\begin{split}
\widetilde{\mu}_t(x_t,x_0)&=\sqrt{\frac{1-\alpha_{t-1}-\sigma_t^2}{1-\alpha_{t}}}\cdot x_t+ \bigg[\sqrt{\alpha_{t-1}}- \frac{\sqrt{ \alpha_t\cdot (1-\alpha_{t-1}-\sigma_t^2)}}{\sqrt{1-\alpha_t}} \bigg] \cdot \underbrace{x_0}_{\frac{x_t-\sqrt{1-\alpha_t}\cdot z_t}{\sqrt{\alpha_t}}} \\
&=\sqrt{\frac{1-\alpha_{t-1}-\sigma_t^2}{1-\alpha_{t}}}\cdot x_t+ \bigg[\sqrt{\alpha_{t-1}}- \frac{\sqrt{ \alpha_t\cdot (1-\alpha_{t-1}-\sigma_t^2)}}{\sqrt{1-\alpha_t}} \bigg] \cdot \frac{x_t-\sqrt{1-\alpha_t}\cdot z_t}{\sqrt{\alpha_t}} \\
&=\sqrt{\frac{1-\alpha_{t-1}-\sigma_t^2}{1-\alpha_{t}}}\cdot x_t+\sqrt{\alpha_{t-1}}\cdot \frac{x_t-\sqrt{1-\alpha_t}\cdot z_t}{\sqrt{\alpha_t}}-\frac{\sqrt{ \alpha_t\cdot (1-\alpha_{t-1}-\sigma_t^2)}}{\sqrt{1-\alpha_t}}\cdot \frac{x_t-\sqrt{1-\alpha_t}\cdot z_t}{\sqrt{\alpha_t}} \\
&=\sqrt{\alpha_{t-1}}\cdot \frac{x_t-\sqrt{1-\alpha_t}\cdot z_t}{\sqrt{\alpha_t}}+\sqrt{\frac{1-\alpha_{t-1}-\sigma_t^2}{1-\alpha_{t}}}\cdot x_t -\frac{\sqrt{1-\alpha_{t-1}-\sigma_t^2}\cdot(x_t-\sqrt{1-\alpha_t}\cdot z_t)}{\sqrt{1-\alpha_t}} \\
&=\sqrt{\alpha_{t-1}}\cdot \frac{x_t-\sqrt{1-\alpha_t}\cdot z_t}{\sqrt{\alpha_t}} + \frac{\sqrt{1-\alpha_{t-1}-\sigma_t^2}\cdot x_t-\sqrt{1-\alpha_{t-1}-\sigma_t^2}\cdot x_t+\sqrt{1-\alpha_{t-1}-\sigma_t^2}\cdot \sqrt{1-\alpha_t}\cdot z_t}{\sqrt{1-\alpha_t}} \\
&=\sqrt{\alpha_{t-1}}\cdot \frac{x_t-\sqrt{1-\alpha_t}\cdot z_t}{\sqrt{\alpha_t}} +\sqrt{1-\alpha_{t-1}-\sigma_t^2}\cdot z_t
\end{split}
\end{equation}
$$
Combining equations (6) and (4), the mean and variance of the reverse denoising process $p(x_{t-1}|x_t,x_0)$ are summarized in equation (7).
$$
\begin{equation}
\begin{split}
\widetilde{\mu}_t(x_t,x_0)&=\sqrt{\alpha_{t-1}}\cdot \frac{x_t-\sqrt{1-\alpha_t}\cdot z_t}{\sqrt{\alpha_t}} +\sqrt{1-\alpha_{t-1}-\sigma_t^2}\cdot z_t \\
\sigma_t^2&=\frac{1-\alpha_{t-1}}{1-\alpha_t}\cdot \Big(1-\frac{\alpha_t}{\alpha_{t-1}}\Big)
\end{split}
\end{equation}
$$

Therefore,
$$
\begin{equation}
\begin{split}
x_{t-1}&=\widetilde{\mu}_t(x_t,x_0)+\sigma_t\cdot \epsilon \\
&=\sqrt{\alpha_{t-1}}\cdot \frac{x_t-\sqrt{1-\alpha_t}\cdot z_t}{\sqrt{\alpha_t}} +\sqrt{1-\alpha_{t-1}-\sigma_t^2}\cdot z_t +\sigma_t\cdot \epsilon
\end{split}
\end{equation}
$$
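Here $z_t$ is the noise added when going from $x_0$ to $x_t$. It follows a standard Gaussian distribution, and it is the quantity that the model with parameters $\theta$ has to predict. Because $x_{t-1}$ follows a Gaussian distribution that is not standard (mean $\widetilde{\mu}_t$, standard deviation $\sigma_t$), its expression needs the extra term $\sigma_t\cdot\epsilon$, where $\epsilon$ is drawn from a standard Gaussian. Note that equation (8) is exactly equation (12) in the DDIM paper.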
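Equation (8) can be read directly as one sampling step. The sketch below is a scalar illustration under assumptions: the function names are hypothetical, the test values are arbitrary, and the true noise `z` stands in for the output of the trained noise-prediction model, which is not part of these notes.

```python
import math
import random


def ddim_step(x_t, z_t, a_prev, a_t, sigma_t, eps):
    """One reverse step x_t -> x_{t-1} following equation (8), in DDIM notation."""
    # Estimate of x_0 implied by x_t and the (predicted) noise z_t.
    x0_pred = (x_t - math.sqrt(1.0 - a_t) * z_t) / math.sqrt(a_t)
    # Term that re-adds the predicted noise at the level of step t-1.
    direction = math.sqrt(1.0 - a_prev - sigma_t ** 2) * z_t
    return math.sqrt(a_prev) * x0_pred + direction + sigma_t * eps


def posterior_sigma(a_prev, a_t):
    """sigma_t from the DDIM line of equation (4)."""
    return math.sqrt((1.0 - a_prev) / (1.0 - a_t) * (1.0 - a_t / a_prev))


# Arbitrary test values (assumptions): build x_t from a known x_0 and noise z.
a_prev, a_t = 0.9, 0.8
x_0, z = -1.2, 0.5
x_t = math.sqrt(a_t) * x_0 + math.sqrt(1.0 - a_t) * z

# With sigma_t = 0 the random term vanishes and the step is deterministic.
x_prev_det = ddim_step(x_t, z, a_prev, a_t, sigma_t=0.0, eps=0.0)

# With sigma_t from equation (4), fresh Gaussian noise eps enters the step.
rng = random.Random(0)
x_prev_rand = ddim_step(x_t, z, a_prev, a_t, posterior_sigma(a_prev, a_t), rng.gauss(0.0, 1.0))
```

When the noise prediction is exact and $\sigma_t=0$, the step lands on $\sqrt{\alpha_{t-1}}\cdot x_0+\sqrt{1-\alpha_{t-1}}\cdot z$, which is the same $x_0$ and noise re-mixed at the noise level of step $t-1$.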
