A Comprehensive Introductory Summary of GANs

License: CC 4.0 BY-SA

Tags: #deep-learning #machine-learning #algorithms

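This article is a broad introductory summary of GANs (generative adversarial networks), covering their motivation, basic principles, evaluation methods, development history, and applications. Starting from the original GAN and moving through improved variants such as DCGAN, WGAN, and SN-GAN, it looks at GAN training techniques, the mode collapse problem, and its remedies. It also discusses conditional generation such as CycleGAN and unsupervised feature extraction methods such as InfoGAN.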

Reading note: the original post mixes Chinese and English.

1. Introduction to GAN

1.1 Motivation

Generative models:

explicit models: likelihood-based models (autoregressive models, flows, VAEs)
implicit models: sample z → sample x; a deep neural network learns to map noise to data without explicit density estimation

1.2 GAN (original GAN) [Goodfellow, NIPS, 2014]

G captures the data distribution, while D estimates the divergence between $p_{data}$ and $p_G$.
$\min_G \max_D V(G,D)$
$V(G,D) = \mathbb{E}_{x\sim p_{data}}[\log D(x)] + \mathbb{E}_{z\sim p_{z}}[\log (1-D(G(z)))]$
D tries to tell real data from generated data, which makes it a binary classifier; G supplies D with negative samples and tries to fool D into misclassifying them.

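import torch.nn as nn

# The last layer of netD is nn.Sigmoid(), so BCELoss applies to its output directly
criterion = nn.BCELoss()
# r_l / f_l: label tensors for real / fake samples (presumably ones / zeros)
# Discriminator: detach fake_img so that no gradient flows back into G
f_loss = criterion(netD(fake_img.detach()), f_l)
r_loss = criterion(netD(real_img), r_l)
D_loss = (f_loss + r_loss) / 2
# Generator: note that the fakes are scored against the *real* labels
G_loss = criterion(netD(fake_img), r_l)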
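
To connect the snippet to the value function, here is a minimal NumPy sketch (NumPy stands in for PyTorch so it runs without a model). It assumes `r_l` is all ones and `f_l` is all zeros; `d_real` and `d_fake` are hypothetical discriminator outputs, not values from the article. Under those assumptions the averaged discriminator loss is $-V(G,D)/2$ estimated on the batch, and the generator loss with real labels is the non-saturating form $-\mathbb{E}[\log D(G(z))]$ rather than the literal $\mathbb{E}[\log(1-D(G(z)))]$ term.

```python
import numpy as np

def bce(pred, target):
    """Mean binary cross-entropy between predictions in (0, 1) and 0/1 targets."""
    pred = np.clip(pred, 1e-12, 1 - 1e-12)
    return float(-np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred)))

rng = np.random.default_rng(0)
d_real = rng.uniform(0.6, 0.9, size=8)  # hypothetical D(x) on a real batch
d_fake = rng.uniform(0.1, 0.4, size=8)  # hypothetical D(G(z)) on a fake batch
ones, zeros = np.ones(8), np.zeros(8)   # assumed r_l and f_l

# Monte Carlo estimate of V(G, D) from the two batches
v_hat = float(np.mean(np.log(d_real)) + np.mean(np.log(1 - d_fake)))

# Discriminator loss as in the snippet: minimizing it maximizes V
d_loss = (bce(d_real, ones) + bce(d_fake, zeros)) / 2

# Generator loss as in the snippet (fake samples, real labels): -E[log D(G(z))]
g_loss_nonsat = bce(d_fake, ones)
# The literal generator term of V, which G would minimize in the minimax game
g_loss_minimax = float(np.mean(np.log(1 - d_fake)))

print(d_loss, -v_hat / 2)
print(g_loss_nonsat, g_loss_minimax)
```

Both generator objectives push $D(G(z))$ toward 1; the non-saturating form is commonly preferred because it gives larger gradients when D confidently rejects the fakes.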

Limitations:

unstable convergence
vanishing gradient
mode collapse

1.3 Evaluation

Parzen-Window density estimator (Kernel density estimator)

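only applicable to low-dimensional data
can be unreliable (the estimate is sensitive to the kernel bandwidth and the number of samples)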
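
As a rough illustration of the Parzen-window estimate, the sketch below fits an isotropic Gaussian kernel on generated samples and reports the mean log-density of held-out data. The function name `parzen_log_likelihood`, the bandwidth `sigma=0.3`, and the 2-D Gaussian toy data are assumptions made for the example; in practice the bandwidth is usually tuned on a validation set.

```python
import numpy as np

def parzen_log_likelihood(test_x, gen_samples, sigma):
    """Mean log-density of test_x under a Gaussian KDE centred on gen_samples."""
    n, d = gen_samples.shape
    # Pairwise squared distances, shape (num_test, n)
    diff = test_x[:, None, :] - gen_samples[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    log_kernel = -sq_dist / (2.0 * sigma ** 2)
    # Numerically stable log-mean-exp over the n kernels
    m = log_kernel.max(axis=1, keepdims=True)
    log_mean = m[:, 0] + np.log(np.mean(np.exp(log_kernel - m), axis=1))
    # Normalizing constant of a d-dimensional isotropic Gaussian
    log_norm = 0.5 * d * np.log(2.0 * np.pi * sigma ** 2)
    return float(np.mean(log_mean - log_norm))

rng = np.random.default_rng(0)
gen = rng.normal(size=(2000, 2))               # stand-in for generator samples
test_good = rng.normal(size=(500, 2))          # held-out data, same distribution
test_bad = rng.normal(loc=3.0, size=(500, 2))  # held-out data, shifted distribution

ll_good = parzen_log_likelihood(test_good, gen, sigma=0.3)
ll_bad = parzen_log_likelihood(test_bad, gen, sigma=0.3)
print(ll_good, ll_bad)
```

The pairwise-distance matrix grows with both sample counts and the kernel needs many samples to cover a high-dimensional space, which is one way to see why the estimator is limited to low dimensions.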

Inception Score (IS)

$\exp(H(y)-H(y|x))$

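Higher IS is better: we want $H(y)$ to be large, meaning the generated images cover many classes, and we want $H(y|x)$ to be small, meaning that once an image $x$ is generated its class is unambiguous, i.e. the generator produces images that can be classified like real ones.
IS does not fully measure diversity.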
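
The formula $\exp(H(y)-H(y|x))$ can be sketched directly from a matrix of class probabilities. In practice these probabilities come from a pretrained Inception classifier; here `probs` is filled with hand-made toy values (an assumption for the example), and `inception_score` is a hypothetical helper name.

```python
import numpy as np

def inception_score(probs, eps=1e-12):
    """probs[i, k] = p(y=k | x_i) for generated image x_i; each row sums to 1."""
    p_y = probs.mean(axis=0)                 # marginal class distribution p(y)
    h_y = -np.sum(p_y * np.log(p_y + eps))   # H(y)
    # H(y|x): entropy of each row, averaged over the generated images
    h_y_given_x = -np.mean(np.sum(probs * np.log(probs + eps), axis=1))
    return float(np.exp(h_y - h_y_given_x))

k = 10
confident_diverse = np.eye(k)                         # sharp predictions, all classes covered
confident_collapsed = np.tile(np.eye(k)[0], (k, 1))   # sharp predictions, only class 0
uncertain = np.full((k, k), 1.0 / k)                  # classifier cannot decide

print(inception_score(confident_diverse))    # close to k
print(inception_score(confident_collapsed))  # close to 1
print(inception_score(uncertain))            # close to 1
```

The score depends only on predicted class probabilities, so a generator that emits a single image per class would still reach the maximum here, which illustrates why IS is an incomplete measure of diversity.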

Frechet Inception Distance (FID)

Lower FID is better.
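
FID fits a Gaussian to the features of real images and another to the features of generated images, then takes the Fréchet distance between the two: $\|\mu_r-\mu_g\|^2 + \mathrm{Tr}(\Sigma_r+\Sigma_g-2(\Sigma_r\Sigma_g)^{1/2})$. The sketch below is a NumPy-only version under stated assumptions: the features are random toy vectors rather than Inception activations, `frechet_distance` is a hypothetical helper name, and the matrix square root's trace is obtained from eigenvalues instead of a dedicated matrix-square-root routine.

```python
import numpy as np

def frechet_distance(feat_real, feat_gen):
    """Frechet distance between Gaussians fitted to two feature sets (rows = samples)."""
    mu_r, mu_g = feat_real.mean(axis=0), feat_gen.mean(axis=0)
    cov_r = np.cov(feat_real, rowvar=False)
    cov_g = np.cov(feat_gen, rowvar=False)
    # Tr((cov_r cov_g)^{1/2}) equals the sum of square roots of the eigenvalues of
    # cov_r @ cov_g, which are real and non-negative for PSD covariance matrices.
    eigvals = np.linalg.eigvals(cov_r @ cov_g)
    tr_sqrt = np.sum(np.sqrt(np.clip(eigvals.real, 0.0, None)))
    mean_term = np.sum((mu_r - mu_g) ** 2)
    return float(mean_term + np.trace(cov_r) + np.trace(cov_g) - 2.0 * tr_sqrt)

rng = np.random.default_rng(0)
feat = rng.normal(size=(500, 4))            # toy "real" features
same = frechet_distance(feat, feat)         # identical sets
shifted = frechet_distance(feat, feat + 2.0)  # same covariance, mean shifted by 2 per dim
print(same, shifted)
```

Identical feature sets give a distance of about 0, and shifting every feature by 2 in 4 dimensions leaves only the mean term, $4 \times 2^2 = 16$.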

1.4 GAN theory

1.4.1 Bayes-Optimal Discriminator

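D is used to measure the divergence: a small divergence means the discriminator can hardly tell the two distributions apart. The divergence does not need to be written down explicitly; it is implemented by a neural network, and that network is D.
$D^* = \arg\max_D V(G,D)$
$D^*(x) = \frac{p_{data}(x)}{p_{data}(x)+p_{G}(x)}$
$\max_D V(G,D) = V(G,D^*) = -2\log 2 + 2\,JSD(p_{data}\,\|\,p_G)$
$G^* = \arg\min_G \max_D V(G,D) = \arg\min_G Div(p_G, p_{data})$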
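
These identities can be checked numerically on a small discrete example. For each $x$, maximizing $a\log D + b\log(1-D)$ with $a=p_{data}(x)$ and $b=p_G(x)$ gives $D=a/(a+b)$, which is where $D^*$ comes from. The sketch below uses two made-up three-point distributions (an assumption for illustration) and compares $V(G,D^*)$ with $-2\log 2 + 2\,JSD$.

```python
import numpy as np

p_data = np.array([0.5, 0.3, 0.2])  # toy discrete "data" distribution
p_g = np.array([0.2, 0.3, 0.5])     # toy discrete "generator" distribution

def value(d):
    """V(G, D) for a discriminator given as a vector d[x] = D(x)."""
    return float(np.sum(p_data * np.log(d)) + np.sum(p_g * np.log(1.0 - d)))

def kl(p, q):
    """KL divergence between two discrete distributions with full support."""
    return float(np.sum(p * np.log(p / q)))

d_star = p_data / (p_data + p_g)              # Bayes-optimal discriminator
mix = 0.5 * (p_data + p_g)
jsd = 0.5 * kl(p_data, mix) + 0.5 * kl(p_g, mix)

v_star = value(d_star)
closed_form = -2.0 * np.log(2.0) + 2.0 * jsd

# Randomly perturbed discriminators should never score higher than D*
rng = np.random.default_rng(0)
best_random = max(
    value(np.clip(d_star + rng.normal(scale=0.05, size=3), 1e-6, 1 - 1e-6))
    for _ in range(1000)
)
print(v_star, closed_form, best_random)
```

When $p_G = p_{data}$ the JSD term vanishes and $V(G,D^*)$ reaches its minimum over G, $-2\log 2$, with $D^*(x)=1/2$ everywhere.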