Welcome to fork my GitHub repository: https://github.com/zhaoyu611/DeepLearningTutorialForChinese
I have been learning Git recently, so this is a good opportunity to put that knowledge into practice. After reading about the principles of deep learning I have a general understanding, but as for the Theano code, working through it myself leaves a much deeper impression.
Note: this section assumes the reader has already read the earlier chapters Classifying MNIST digits using Logistic Regression and Multilayer Perceptron. In addition, it uses the following Theano functions and concepts: T.tanh, shared variables, basic arithmetic ops, T.grad, Random numbers, floatX. If you want to run the code on a GPU, see the GPU section.
Note: the code for this section is available for download at http://deeplearning.net/tutorial/code/SdA.py
The Stacked Denoising Autoencoder (SdA) is an extension of the stacked autoencoder, and it was first introduced by Vincent et al.
This tutorial builds on the previous chapter on denoising autoencoders. If you are not familiar with autoencoders, we recommend reading that chapter first.
Stacked Autoencoders
Denoising autoencoders can be stacked to form a deep network by feeding the output of the layer below as the input to the layer above. Unsupervised pre-training of such an architecture is done one layer at a time. Each layer is trained as a denoising autoencoder, minimizing the reconstruction error of its input (which is the output of the layer below). Once the first k layers are trained, the (k+1)-th layer can be trained, because its input, the code computed by the layer below, is now available. A minimal sketch of this layer-wise loop is shown below.
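The following is a minimal sketch of the greedy layer-wise pre-training loop, assuming the dA class from the previous chapter (with its get_cost_updates method), a symbolic input x that the stack was built on, and a shared variable train_set_x holding the training data; the function name pretrain and the hyper-parameter values are illustrative, not the tutorial's exact API.

import numpy
import theano
import theano.tensor as T

def pretrain(dA_layers, x, train_set_x, corruption_levels,
             learning_rate=0.1, epochs=15, batch_size=20):
    # x is the symbolic input the stack was built on; each dA's input
    # is chained to it through the sigmoid layers below it
    index = T.lscalar('index')  # mini-batch index
    n_batches = train_set_x.get_value(borrow=True).shape[0] // batch_size
    for i, da in enumerate(dA_layers):
        # cost and updates for this layer's denoising autoencoder
        cost, updates = da.get_cost_updates(corruption_levels[i],
                                            learning_rate)
        train_layer = theano.function(
            [index], cost, updates=updates,
            givens={x: train_set_x[index * batch_size:
                                   (index + 1) * batch_size]}
        )
        for epoch in range(epochs):
            costs = [train_layer(b) for b in range(n_batches)]
            print('layer %i, epoch %i, cost %f'
                  % (i, epoch, numpy.mean(costs)))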
After all layers are pre-trained, the network goes through a second stage of training called fine-tuning. Here we use supervised fine-tuning, since we want to minimize prediction error on a supervised task. First, a logistic regression layer is added on top of the network (more precisely, on top of the output code of the last layer). Then the whole network is trained just as we would train a multilayer perceptron. At this point, we only consider the encoding part of each autoencoder. This stage is supervised, since we use the target class during training. (See the Multilayer Perceptron chapter for more details.) A sketch of this stage is given below.
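Below is a minimal sketch of how the fine-tuning cost and updates could be assembled, assuming the LogisticRegression class from the logistic regression chapter (the module name logistic_sgd is assumed) and a list sigmoid_layers whose elements expose .output and .params; the function name build_finetune and its parameters are illustrative.

import theano.tensor as T
from logistic_sgd import LogisticRegression  # assumed module from the earlier chapter

def build_finetune(sigmoid_layers, n_hidden_top, n_outs, y,
                   learning_rate=0.1):
    # add a logistic regression layer on top of the code of the last layer
    log_layer = LogisticRegression(input=sigmoid_layers[-1].output,
                                   n_in=n_hidden_top, n_out=n_outs)
    # only the encoding halves (the sigmoid layers) plus the new
    # logistic regression layer are trained during fine-tuning
    params = sum([layer.params for layer in sigmoid_layers], []) \
        + log_layer.params
    cost = log_layer.negative_log_likelihood(y)  # y: T.ivector of target classes
    gparams = T.grad(cost, params)
    updates = [(p, p - learning_rate * g) for p, g in zip(params, gparams)]
    return cost, updates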
This is easy to implement in Theano, reusing the denoising autoencoder class defined earlier. The stacked denoising autoencoder can be viewed as having two facades: a list of autoencoders, and a multilayer perceptron (MLP). During pre-training we use the first facade: the model is treated as a list of autoencoders, and each autoencoder is trained separately. During the second stage of training, we use the second facade. The two facades are linked because:
- the autoencoders and the sigmoid layers of the MLP share parameters, and
- the latent representations computed by the intermediate layers of the MLP serve as input to the autoencoders (see the sketch after this list).
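The sketch below shows the parameter sharing that ties the two facades together, assuming the HiddenLayer class from the MLP chapter and the dA class from the denoising autoencoder chapter (the module names mlp and dA are assumed): because the dA is constructed with the sigmoid layer's W and b, pre-training the dA directly initializes the MLP layer.

import numpy
import theano.tensor as T
from theano.tensor.shared_randomstreams import RandomStreams
from mlp import HiddenLayer  # assumed module names from the earlier chapters
from dA import dA

numpy_rng = numpy.random.RandomState(123)
theano_rng = RandomStreams(numpy_rng.randint(2 ** 30))
x = T.matrix('x')  # symbolic input of the stack

# a sigmoid layer of the MLP facade
sigmoid_layer = HiddenLayer(rng=numpy_rng, input=x,
                            n_in=784, n_out=500,
                            activation=T.nnet.sigmoid)
# the dA facade reuses the sigmoid layer's weights and hidden biases,
# so pre-training the dA initializes the MLP layer's parameters
da = dA(numpy_rng=numpy_rng, theano_rng=theano_rng, input=x,
        n_visible=784, n_hidden=500,
        W=sigmoid_layer.W, bhid=sigmoid_layer.b)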
class SdA(object):
    """Stacked denoising auto-encoder class (SdA)

    A stacked denoising autoencoder model is obtained by stacking several
    dAs. The hidden layer of the dA at layer `i` becomes the input of
    the dA at layer `i+1`. The first layer dA gets as input the input of
    the SdA, and the hidden layer of the last dA represents the output.
    Note that after pretraining, the SdA is dealt with as a normal MLP,
    the dAs are only used to initialize the weights.
    """

    def __init__(
        self,
        numpy_rng,
        theano_rng=None,
        n_ins=784,
        hidden_layers_sizes=[500, 500],
        n_outs=10,
        corruption_levels=[0.1, 0.1]
    ):
        """ This class is made to support a variable number of layers.

        :type numpy_rng: numpy.random.RandomState
        :param numpy_rng: numpy random number generator used to draw initial
                          weights

        :type theano_rng: theano.tensor.shared_randomstreams.RandomStreams
        :param theano_rng: Theano random generator; if None is given one is
                           generated based on a seed drawn from `rng`

        :type n_ins: int
        :param n_ins: dimension of the input to the sdA

        :type hidden_layers_sizes: list of ints
        :param hidden_layers_sizes: intermediate layers size, must contain
                                    at least one value

        :type n_outs: int
        :param n_outs: dimension of the output of the network

        :type corruption_levels: list of float
        :param corruption_levels: amount of corruption to use for each
                                  layer
        """

        self.sigmoid_layers = []
        self.dA_layers = []
        self.params = []
        self.n_layers = len(hidden_layers_sizes)

        assert self.n_layers > 0
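Once the rest of the constructor (included in the full SdA.py linked above) has built the layers, the class could be instantiated as in the sketch below; the hyper-parameter values are illustrative only.

import numpy

numpy_rng = numpy.random.RandomState(89677)
sda = SdA(
    numpy_rng=numpy_rng,
    n_ins=28 * 28,                   # flattened MNIST images
    hidden_layers_sizes=[500, 500],  # two hidden layers of 500 units
    n_outs=10,                       # ten digit classes
    corruption_levels=[0.1, 0.2]     # per-layer corruption levels
)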