CS231n
Lecture 13: Generative Models
Unsupervised Learning
In contrast to supervised learning, which learns a mapping $f: x \mapsto y$ from labeled training data, unsupervised learning aims to learn the hidden structure of unlabeled data; examples include clustering (K-means), dimensionality reduction (PCA), feature learning (autoencoders), and density estimation.
Generative Models: Given training data, generate new samples from same distribution
This is really a density estimation problem: learn $p_{model}(x)$ similar to $p_{data}(x)$. There are two flavors:
- Explicit: explicitly define and solve for $p_{model}(x)$
- Implicit: learn a model that can sample from $p_{model}(x)$ without explicitly defining it
Applications
- Artwork, super-resolution, colorization, etc.
- Generative models of time-series data can be used for simulation and planning (reinforcement learning applications!)
- Training generative models can also enable inference of latent representations that can be useful as general features (a nice idea)
Taxonomy
PixelRNN
Starting from the top-left corner, sweep over all pixels of the image sequentially, using an RNN to model the dependencies among them.
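The underlying explicit density uses the chain rule to decompose the image likelihood into per-pixel conditionals, each modeled by the network:

$$p(x) = \prod_{i=1}^{n} p(x_i \mid x_1, \ldots, x_{i-1})$$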
PixelCNN
Same as PixelRNN, except that a CNN over the context region of already-generated pixels models the dependencies.
Training is faster than PixelRNN
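A minimal sketch of the masked convolution that makes PixelCNN causal, assuming a PyTorch implementation; the layer sizes and the shallow stack are illustrative, not the paper's architecture:

```python
import torch
import torch.nn as nn

class MaskedConv2d(nn.Conv2d):
    """Conv layer whose kernel is masked so each pixel only sees pixels
    above it and to its left (type 'A' also hides the center pixel)."""
    def __init__(self, mask_type, *args, **kwargs):
        super().__init__(*args, **kwargs)
        assert mask_type in ('A', 'B')
        self.register_buffer('mask', torch.ones_like(self.weight))
        _, _, kH, kW = self.weight.shape
        # zero out "future" positions: right of center on the center row,
        # and every row below the center
        self.mask[:, :, kH // 2, kW // 2 + (mask_type == 'B'):] = 0
        self.mask[:, :, kH // 2 + 1:, :] = 0

    def forward(self, x):
        self.weight.data *= self.mask  # enforce the causal mask
        return super().forward(x)

# illustrative stack: first layer uses mask 'A', later layers mask 'B';
# the final 1x1 conv produces a 256-way softmax over pixel intensities
model = nn.Sequential(
    MaskedConv2d('A', 1, 64, kernel_size=7, padding=3), nn.ReLU(),
    MaskedConv2d('B', 64, 64, kernel_size=7, padding=3), nn.ReLU(),
    nn.Conv2d(64, 256, kernel_size=1),
)
```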
Advantages of explicit modeling
- Can explicitly compute the likelihood $p(x)$
- Explicit likelihood of training data gives a good evaluation metric
- Good samples
Drawback: sequential generation ⇒ slow
Variational Auto-Encoder
Autoencoder: $x \xrightarrow{\text{encoder}} z \xrightarrow{\text{decoder}} \hat{x}$, with loss $L(x) = \lVert x - \hat{x} \rVert^2$; it learns a lower-dimensional feature representation $z$ from unlabeled training data. After training, throw away the decoder; the encoder can be used to initialize a supervised model.
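A minimal sketch of such an autoencoder, assuming PyTorch; `in_dim`, `z_dim`, and the hidden sizes are illustrative choices:

```python
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, in_dim=784, z_dim=32):
        super().__init__()
        # encoder: x -> lower-dimensional feature z
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(), nn.Linear(256, z_dim))
        # decoder: z -> reconstruction x_hat
        self.decoder = nn.Sequential(
            nn.Linear(z_dim, 256), nn.ReLU(), nn.Linear(256, in_dim))

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z)

model = Autoencoder()
x = torch.rand(16, 784)              # a dummy batch of flattened images
loss = ((x - model(x)) ** 2).mean()  # L(x) = ||x - x_hat||^2
loss.backward()
```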
Try generating new images from an autoencoder ⇒ VAE
- Choose the prior $p(z)$ to be simple, e.g. Gaussian: $p(z) = \mathcal{N}(0, 1)$
- The conditional $p(x|z)$ is complex (it generates the image) ⇒ represent it with a neural network (see the sampling sketch below)
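Once trained, generation is ancestral sampling: draw $z$ from the prior and push it through the decoder network. A sketch, assuming PyTorch; the decoder shown is a stand-in with untrained weights:

```python
import torch
import torch.nn as nn

# a decoder network standing in for p(x|z); weights would come from VAE training
decoder = nn.Sequential(nn.Linear(32, 256), nn.ReLU(), nn.Linear(256, 784))

with torch.no_grad():
    z = torch.randn(16, 32)  # sample z ~ N(0, 1) from the simple prior
    x_new = decoder(z)       # the complex conditional p(x|z) generates images
```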
Train
In theory, $p_\theta(x) = \int p_\theta(z)\, p_\theta(x|z)\, dz$, but this integral is intractable: we cannot evaluate $p_\theta(x|z)$ for every $z$, and even using Bayes' rule to rewrite it as a posterior does not make it solvable.
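Concretely, the posterior from Bayes' rule still contains the intractable evidence $p_\theta(x)$ in its denominator:

$$p_\theta(z|x) = \frac{p_\theta(x|z)\, p_\theta(z)}{p_\theta(x)}$$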
Solution: alongside the VAE decoder network $p_\theta(x|z)$, define an additional encoder $q_\phi(z|x)$ to approximate the intractable posterior $p_\theta(z|x)$.
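Jointly training the two networks then maximizes the variational lower bound (ELBO) $\mathbb{E}_{q_\phi(z|x)}[\log p_\theta(x|z)] - D_{KL}(q_\phi(z|x) \,\|\, p(z))$, which lower-bounds $\log p_\theta(x)$. A minimal training-step sketch, assuming PyTorch with a Gaussian encoder; the MSE reconstruction term and all dimensions are illustrative choices:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    def __init__(self, in_dim=784, z_dim=32):
        super().__init__()
        self.enc = nn.Linear(in_dim, 256)
        self.mu = nn.Linear(256, z_dim)      # mean of q_phi(z|x)
        self.logvar = nn.Linear(256, z_dim)  # log-variance of q_phi(z|x)
        self.dec = nn.Sequential(
            nn.Linear(z_dim, 256), nn.ReLU(), nn.Linear(256, in_dim))

    def forward(self, x):
        h = F.relu(self.enc(x))
        mu, logvar = self.mu(h), self.logvar(h)
        # reparameterization trick: z = mu + sigma * eps keeps sampling differentiable
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        return self.dec(z), mu, logvar

def elbo_loss(x, x_hat, mu, logvar):
    recon = F.mse_loss(x_hat, x, reduction='sum')  # -E[log p(x|z)] up to constants
    # closed-form KL(q_phi(z|x) || N(0, I)) for a diagonal Gaussian encoder
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl  # minimizing this maximizes the ELBO

model = VAE()
x = torch.rand(16, 784)
x_hat, mu, logvar = model(x)
elbo_loss(x, x_hat, mu, logvar).backward()
```

The reparameterization $z = \mu + \sigma \epsilon$ is what lets gradients flow through the sampling step into the encoder parameters $\phi$.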