Understanding Diffusion Models (Part 2)

5. Three Equivalent Interpretations

As proven previously, a Variational Diffusion Model (VDM) can be trained by simply learning a neural network to predict the original natural image $x_0$ from an arbitrarily noisified version $x_t$ and its time index $t$. However, $x_0$ has two other equivalent parameterizations, which leads to two further interpretations of a VDM.

First, we can utilize the reparameterization trick. From the form of $q(x_t \mid x_0)$ we derived earlier, we can rearrange Equation 69 to show that:

$$x_0 = \frac{x_t - \sqrt{1-\bar\alpha_t}\,\epsilon_0}{\sqrt{\bar\alpha_t}} \qquad (115)$$
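As a quick numerical sanity check, the sketch below (with a single hypothetical value for $\bar\alpha_t$ and a toy vector standing in for an image) draws $x_t$ via the forward reparameterization of Equation 69 and confirms that Equation 115 recovers $x_0$:

```python
import numpy as np

rng = np.random.default_rng(0)

alpha_bar_t = 0.5              # hypothetical example value of ᾱ_t
x0 = rng.normal(size=4)        # toy stand-in for an image x₀
eps0 = rng.normal(size=4)      # source noise ε₀ ~ N(0, I)

# Forward reparameterization (Eq. 69): x_t = √ᾱ_t·x₀ + √(1−ᾱ_t)·ε₀
xt = np.sqrt(alpha_bar_t) * x0 + np.sqrt(1 - alpha_bar_t) * eps0

# Eq. 115: recover x₀ from x_t and the source noise ε₀
x0_rec = (xt - np.sqrt(1 - alpha_bar_t) * eps0) / np.sqrt(alpha_bar_t)
```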
Plugging this into our previously derived mean of the ground-truth denoising transition step $\mu_q(x_t, x_0)$, we can rederive it as:

$$\mu_q(x_t, x_0) = \frac{1}{\sqrt{\alpha_t}}\, x_t - \frac{1-\alpha_t}{\sqrt{1-\bar\alpha_t}\,\sqrt{\alpha_t}}\, \epsilon_0$$

Therefore, we can set the mean of our approximate denoising transition step $\mu_\theta(x_t, t)$ as:

$$\mu_\theta(x_t, t) = \frac{1}{\sqrt{\alpha_t}}\, x_t - \frac{1-\alpha_t}{\sqrt{1-\bar\alpha_t}\,\sqrt{\alpha_t}}\, \hat\epsilon_\theta(x_t, t)$$

The corresponding optimization problem then becomes:

$$\underset{\theta}{\arg\min}\; \frac{1}{2\sigma_q^2(t)} \frac{(1-\alpha_t)^2}{(1-\bar\alpha_t)\,\alpha_t} \left\lVert \epsilon_0 - \hat\epsilon_\theta(x_t, t) \right\rVert_2^2$$

Here, $\hat\epsilon_\theta(x_t, t)$ is a neural network that learns to predict the source noise $\epsilon_0 \sim \mathcal N(\epsilon; 0, I)$ that determines $x_t$ from $x_0$. We have thus shown that learning a VDM by predicting the original image $x_0$ is equivalent to learning to predict the noise; empirically, however, some works have found that predicting the noise results in better performance. Note that $\epsilon_0$ is the noise that was first added to the original image $x_0$.
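A minimal sketch of this noise-prediction objective, using hypothetical example schedule values and a trivial linear map `predict_eps` standing in for the network $\hat\epsilon_\theta(x_t, t)$ (none of these are from the text; a real implementation would use a U-Net or similar):

```python
import numpy as np

rng = np.random.default_rng(1)

def predict_eps(xt, t, w):
    """Hypothetical stand-in for the noise-prediction network ε̂_θ(x_t, t)."""
    return w * xt

# Example schedule values (assumed, not from the text)
alpha_t, alpha_bar_t, sigma2_q = 0.9, 0.5, 0.1

x0 = rng.normal(size=8)
eps0 = rng.normal(size=8)
xt = np.sqrt(alpha_bar_t) * x0 + np.sqrt(1 - alpha_bar_t) * eps0

# Weighted squared error between the true source noise and the prediction
weight = (1 - alpha_t) ** 2 / (2 * sigma2_q * (1 - alpha_bar_t) * alpha_t)
loss = weight * np.sum((eps0 - predict_eps(xt, t=1, w=0.0)) ** 2)
```

In practice the weight term is often dropped ("simple" loss), and the expectation is taken over random $t$, $x_0$, and $\epsilon_0$ via minibatches.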

To derive the third common interpretation of a Variational Diffusion Model, we appeal to Tweedie's formula. It states that the true mean of an exponential-family distribution, given samples drawn from it, can be estimated by the maximum-likelihood estimate of the samples (i.e., the empirical mean) plus a correction term involving the score of the estimate. In the case of a single observed sample, the empirical mean is just the sample itself. Tweedie's formula is commonly used to mitigate sample bias: if the observed samples all lie on one tail of the underlying distribution, the negative score becomes large and corrects the naive maximum-likelihood estimate toward the true mean.

Mathematically, for a Gaussian variable $z \sim \mathcal N(z; \mu_z, \Sigma_z)$, Tweedie's formula states that:

$$\mathbb E[\mu_z \mid z] = z + \Sigma_z \nabla_z \log p(z)$$
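For a one-dimensional Gaussian the score is $\nabla_z \log p(z) = -(z-\mu_z)/\sigma^2$, so the correction term cancels the deviation of the observation and recovers $\mu_z$ exactly. A small self-contained check, with example values for $\mu_z$ and $\sigma^2$:

```python
# Tweedie's formula for z ~ N(μ_z, σ²): E[μ_z | z] = z + σ²·∇_z log p(z)
mu_z, sigma2 = 3.0, 2.0  # example parameters

def score(z):
    # Score of a 1-D Gaussian: ∇_z log p(z) = −(z − μ_z)/σ²
    return -(z - mu_z) / sigma2

for z in [-1.0, 0.5, 7.3]:
    corrected = z + sigma2 * score(z)
    # z − (z − μ_z) = μ_z, regardless of where z was observed
    assert abs(corrected - mu_z) < 1e-12
```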
In this case, we apply it to predict the true posterior mean of $x_t$ given a sample of it. From Equation 70, we know that:

$$q(x_t \mid x_0) = \mathcal N\!\left(x_t;\, \sqrt{\bar\alpha_t}\, x_0,\, (1-\bar\alpha_t) I\right)$$
Then, by Tweedie's formula, we have:

$$\mathbb E[\mu_{x_t} \mid x_t] = x_t + (1-\bar\alpha_t) \nabla_{x_t} \log p(x_t) \qquad (131)$$
For notational simplicity, we abbreviate $\nabla_{x_t} \log p(x_t)$, the score function of $x_t$, as $\nabla \log p(x_t)$.
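Since $\mu_{x_t} = \sqrt{\bar\alpha_t}\, x_0$ by Equation 70, comparing Equation 131 with Equation 115 implies the score can be written in terms of the source noise as $\nabla \log p(x_t) = -\epsilon_0 / \sqrt{1-\bar\alpha_t}$. A quick numerical check of this equivalence (with a hypothetical $\bar\alpha_t$ and toy vectors):

```python
import numpy as np

rng = np.random.default_rng(2)
alpha_bar_t = 0.5  # hypothetical example value of ᾱ_t

x0 = rng.normal(size=4)
eps0 = rng.normal(size=4)
xt = np.sqrt(alpha_bar_t) * x0 + np.sqrt(1 - alpha_bar_t) * eps0

# Score expressed via the source noise: ∇ log p(x_t) = −ε₀ / √(1 − ᾱ_t)
score = -eps0 / np.sqrt(1 - alpha_bar_t)

# Tweedie (Eq. 131) solved for x₀: (x_t + (1 − ᾱ_t)·∇ log p(x_t)) / √ᾱ_t
x0_from_score = (xt + (1 - alpha_bar_t) * score) / np.sqrt(alpha_bar_t)
```

Both routes, denoising via the predicted noise (Eq. 115) and via the score, recover the same $x_0$, which is the basis of the score-based interpretation.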
