CS231n
Lecture 13: Generative Models
Unsupervised Learning
In contrast to supervised learning, which learns a mapping $f: x \mapsto y$ from labeled training data, unsupervised learning aims to learn the hidden structure of unlabeled data; examples include clustering (K-means), dimensionality reduction (PCA), feature learning (autoencoders), and density estimation.
Generative Models: Given training data, generate new samples from same distribution
This is really a density estimation problem: learn $p_{model}(x)$ similar to $p_{data}(x)$. There are two flavors:
- Explicit: explicitly define and solve for $p_{model}(x)$
- Implicit: learn a model that can sample from $p_{model}(x)$ without explicitly defining it
Applications
- Artwork, super-resolution, colorization, etc.
- Generative models of time-series data can be used for simulation and planning (reinforcement learning applications!)
- Training generative models can also enable inference of latent representations that can be useful as general features (a nice idea)
Taxonomy
PixelRNN
Starting from the top-left corner, sweep over all pixels of the image sequentially, using an RNN to model the dependencies among them.
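The underlying explicit density uses the chain rule to decompose the image likelihood into per-pixel conditionals, each modeled by the network:

$$p(x) = \prod_{i=1}^{n} p(x_i \mid x_1, \ldots, x_{i-1})$$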
PixelCNN
Same as PixelRNN, except that a CNN over the context region of already-generated pixels models the dependencies.
Training is faster than PixelRNN
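A minimal sketch of the masked convolution that makes PixelCNN causal, assuming a PyTorch implementation; the layer sizes and the shallow stack are illustrative, not the paper's architecture:

```python
import torch
import torch.nn as nn

class MaskedConv2d(nn.Conv2d):
    """Conv layer whose kernel is masked so each pixel only sees pixels
    above it and to its left (type 'A' also hides the center pixel)."""
    def __init__(self, mask_type, *args, **kwargs):
        super().__init__(*args, **kwargs)
        assert mask_type in ('A', 'B')
        self.register_buffer('mask', torch.ones_like(self.weight))
        _, _, kH, kW = self.weight.shape
        # zero out "future" positions: right of center on the center row,
        # and every row below the center
        self.mask[:, :, kH // 2, kW // 2 + (mask_type == 'B'):] = 0
        self.mask[:, :, kH // 2 + 1:, :] = 0

    def forward(self, x):
        self.weight.data *= self.mask  # enforce the causal mask
        return super().forward(x)

# illustrative stack: first layer uses mask 'A', later layers mask 'B';
# the final 1x1 conv produces a 256-way softmax over pixel intensities
model = nn.Sequential(
    MaskedConv2d('A', 1, 64, kernel_size=7, padding=3), nn.ReLU(),
    MaskedConv2d('B', 64, 64, kernel_size=7, padding=3), nn.ReLU(),
    nn.Conv2d(64, 256, kernel_size=1),
)
```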
Advantages of explicit modeling
- Can explicitly compute the likelihood $p(x)$
- Explicit likelihood of training data gives a good evaluation metric
- Good samples
Drawback: sequential generation ⇒ slow
Variational Auto-Encoder
Autoencoder: $x \xrightarrow{\text{encoder}} z \xrightarrow{\text{decoder}} \hat{x}$, with loss $L(x) = \lVert x - \hat{x} \rVert^2$; it learns a lower-dimensional feature representation $z$ from unlabeled training data. After training, throw away the decoder; the encoder can be used to initialize a supervised model.
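A minimal sketch of such an autoencoder, assuming PyTorch; `in_dim`, `z_dim`, and the hidden sizes are illustrative choices:

```python
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, in_dim=784, z_dim=32):
        super().__init__()
        # encoder: x -> lower-dimensional feature z
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(), nn.Linear(256, z_dim))
        # decoder: z -> reconstruction x_hat
        self.decoder = nn.Sequential(
            nn.Linear(z_dim, 256), nn.ReLU(), nn.Linear(256, in_dim))

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z)

model = Autoencoder()
x = torch.rand(16, 784)              # a dummy batch of flattened images
loss = ((x - model(x)) ** 2).mean()  # L(x) = ||x - x_hat||^2
loss.backward()
```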
Try generating new images from an autoencoder ⇒ VAE
- Choose the prior $p(z)$ to be simple, e.g. Gaussian: $p(z) = \mathcal{N}(0, 1)$
- The conditional $p(x|z)$ is complex (it generates the image) ⇒ represent it with a neural network (see the sampling sketch below)
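Once trained, generation is ancestral sampling: draw $z$ from the prior and push it through the decoder network. A sketch, assuming PyTorch; the decoder shown is a stand-in with untrained weights:

```python
import torch
import torch.nn as nn

# a decoder network standing in for p(x|z); weights would come from VAE training
decoder = nn.Sequential(nn.Linear(32, 256), nn.ReLU(), nn.Linear(256, 784))

with torch.no_grad():
    z = torch.randn(16, 32)  # sample z ~ N(0, 1) from the simple prior
    x_new = decoder(z)       # the complex conditional p(x|z) generates images
```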
Train
In theory, $p_\theta(x) = \int p_\theta(z)\, p_\theta(x|z)\, dz$, but this integral is intractable: we cannot evaluate $p_\theta(x|z)$ for every $z$, and even using Bayes' rule to rewrite it as a posterior does not make it solvable.
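Concretely, the posterior from Bayes' rule still contains the intractable evidence $p_\theta(x)$ in its denominator:

$$p_\theta(z|x) = \frac{p_\theta(x|z)\, p_\theta(z)}{p_\theta(x)}$$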
Solution: alongside the VAE decoder network $p_\theta(x|z)$, define an additional encoder $q_\phi(z|x)$ to approximate the intractable posterior $p_\theta(z|x)$.
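Jointly training the two networks then maximizes the variational lower bound (ELBO) $\mathbb{E}_{q_\phi(z|x)}[\log p_\theta(x|z)] - D_{KL}(q_\phi(z|x) \,\|\, p(z))$, which lower-bounds $\log p_\theta(x)$. A minimal training-step sketch, assuming PyTorch with a Gaussian encoder; the MSE reconstruction term and all dimensions are illustrative choices:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    def __init__(self, in_dim=784, z_dim=32):
        super().__init__()
        self.enc = nn.Linear(in_dim, 256)
        self.mu = nn.Linear(256, z_dim)      # mean of q_phi(z|x)
        self.logvar = nn.Linear(256, z_dim)  # log-variance of q_phi(z|x)
        self.dec = nn.Sequential(
            nn.Linear(z_dim, 256), nn.ReLU(), nn.Linear(256, in_dim))

    def forward(self, x):
        h = F.relu(self.enc(x))
        mu, logvar = self.mu(h), self.logvar(h)
        # reparameterization trick: z = mu + sigma * eps keeps sampling differentiable
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        return self.dec(z), mu, logvar

def elbo_loss(x, x_hat, mu, logvar):
    recon = F.mse_loss(x_hat, x, reduction='sum')  # -E[log p(x|z)] up to constants
    # closed-form KL(q_phi(z|x) || N(0, I)) for a diagonal Gaussian encoder
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl  # minimizing this maximizes the ELBO

model = VAE()
x = torch.rand(16, 784)
x_hat, mu, logvar = model(x)
elbo_loss(x, x_hat, mu, logvar).backward()
```

The reparameterization $z = \mu + \sigma \epsilon$ is what lets gradients flow through the sampling step into the encoder parameters $\phi$.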