

KL divergence: the larger the KL value, the greater the difference between the two distributions; the smaller the value, the smaller the difference.
KL divergence:
$$
D_{KL}(P||Q) = \sum^N_{i=1} P(x_i)\log\frac{P(x_i)}{Q(x_i)}
$$
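As a quick numerical illustration, here is a minimal NumPy sketch of the discrete KL divergence above; the distributions `p` and `q` are made-up example values.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """Discrete KL divergence D_KL(P||Q) = sum_i P(x_i) * log(P(x_i) / Q(x_i))."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return np.sum(p * np.log((p + eps) / (q + eps)))

p = np.array([0.4, 0.4, 0.2])
q = np.array([0.3, 0.3, 0.4])
print(kl_divergence(p, q))  # small positive value: p and q are similar but not equal
print(kl_divergence(p, p))  # 0.0: identical distributions have zero divergence
```

Note that KL is not symmetric: `kl_divergence(p, q)` generally differs from `kl_divergence(q, p)`, which is one motivation for the symmetric JS divergence below.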
JS divergence:
$$
JSD(P||Q) = \frac{1}{2}D(P||M) + \frac{1}{2}D(Q||M), \qquad M = \frac{1}{2}(P+Q)
$$
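Reusing the `kl_divergence` sketch above, the JS divergence can be computed directly from its definition (again with the made-up `p` and `q`):

```python
def js_divergence(p, q):
    """JSD(P||Q) = 0.5 * D_KL(P||M) + 0.5 * D_KL(Q||M), with M = (P + Q) / 2."""
    m = 0.5 * (np.asarray(p, dtype=float) + np.asarray(q, dtype=float))
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

print(js_divergence(p, q))  # symmetric and bounded by log(2)
print(js_divergence(q, p))  # same value as above
```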
The generator maps prior noise to samples, $x = G(z)$, and these samples implicitly define a distribution $P_G(x)$. The optimal generator minimizes the divergence between $P_G$ and the data distribution:

$$
G^* = \arg\min_G Div(P_G, P_{data})
$$
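$P_G$ is never written down in closed form; it is only accessible through sampling. A minimal sketch with a made-up linear generator (a stand-in for a neural network) shows how $P_G$ arises implicitly:

```python
import numpy as np

rng = np.random.default_rng(0)

def generator(z):
    """Toy generator G: maps prior noise z to x = G(z).
    The induced distribution P_G is only accessible through these samples."""
    return 2.0 * z + 1.0  # hypothetical G; a real GAN uses a neural network here

z = rng.standard_normal(10_000)  # z drawn from the prior (standard normal here)
x = generator(z)                 # x ~ P_G, implicitly defined by G
print(x.mean(), x.std())         # roughly 1.0 and 2.0 for this toy G
```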

$$
\begin{aligned}
D^* &= \arg\max_D V(D,G) \\
&= \frac{P_{data}(x)}{P_{data}(x) + P_G(x)}
\end{aligned}
$$
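This closed form follows from maximizing the integrand of $V(D,G)$ pointwise: for a fixed $x$, the function $a\log D + b\log(1-D)$ with $a = P_{data}(x)$, $b = P_G(x)$ peaks at $D = a/(a+b)$. A short numerical check with made-up density values:

```python
import numpy as np

# Pointwise objective a*log(D) + b*log(1 - D), with a = P_data(x), b = P_G(x) at a fixed x.
a, b = 0.7, 0.3  # made-up density values
d_grid = np.linspace(1e-4, 1 - 1e-4, 100_001)
values = a * np.log(d_grid) + b * np.log(1 - d_grid)
print(d_grid[np.argmax(values)])  # ~0.7
print(a / (a + b))                # 0.7: the maximizer is P_data(x) / (P_data(x) + P_G(x))
```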
When G is fixed, plugging the optimal discriminator $D^*$ back into $V$ expresses the maximum in terms of the JS divergence:

$$
\begin{aligned}
\max_D V(G, D) &= V(G, D^*) \\
&= E_{x \sim P_{data}}[\log D^*(x)] + E_{x \sim P_G}[\log(1 - D^*(x))] \\
&= -2\log 2 + 2\,JSD(P_{data}||P_G)
\end{aligned}
$$
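This identity can be checked numerically on a finite support, treating the earlier made-up `p` as $P_{data}$ and `q` as $P_G$ and reusing the `js_divergence` sketch:

```python
# Treat p as P_data and q as P_G over a finite support.
d_star = p / (p + q)                             # optimal discriminator per outcome
v_max = np.sum(p * np.log(d_star)) + np.sum(q * np.log(1 - d_star))
print(v_max)                                     # matches the line below
print(-2 * np.log(2) + 2 * js_divergence(p, q))
```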
- Since the Jensen–Shannon divergence between two distributions is always non-negative and zero only when they are equal, we have shown that $C^* = -\log(4)$ is the global minimum of $C(G)$ and that the only solution is $p_g = p_{data}$, i.e., the generative model perfectly replicating the data generating process.
- This shows that $V(D,G)$ and $Div(P_{data}, P_G)$ are directly related, so the generator objective can be rewritten as a minimax game:
$$
\begin{aligned}
G^* &= \arg\min_G Div(P_G, P_{data}) \\
&= \arg\min_G \max_D V(G,D)
\end{aligned}
$$
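In practice the inner $\max_D$ and outer $\min_G$ are approximated by alternating gradient steps. A PyTorch-style skeleton of this loop is sketched below; the toy architectures, data, and hyperparameters are illustrative assumptions, and the generator step uses the common non-saturating heuristic rather than literally minimizing $\log(1-D(G(z)))$.

```python
import torch
import torch.nn as nn

# Toy illustrative networks; real architectures depend on the data.
G = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 2))
D = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 1), nn.Sigmoid())
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

def sample_data(n):
    """Placeholder for samples from P_data (a made-up Gaussian)."""
    return torch.randn(n, 2) * 0.5 + 1.0

for step in range(1000):
    # Discriminator step: approximate the inner max_D V(G, D).
    x_real = sample_data(64)
    x_fake = G(torch.randn(64, 16)).detach()
    loss_d = bce(D(x_real), torch.ones(64, 1)) + bce(D(x_fake), torch.zeros(64, 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Generator step: approximate the outer min_G, pushing D(G(z)) toward 1.
    x_fake = G(torch.randn(64, 16))
    loss_g = bce(D(x_fake), torch.ones(64, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```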

Theoretical Results







