Implementing an Autoencoder: Autoencoder Design

Article link: https://blog.youkuaiyun.com/weixin_42522635/article/details/95724131

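This article introduces the autoencoder: a neural network whose input and target output are both x and whose hidden (latent) dimension is smaller than the input dimension. It is usually classed as unsupervised learning. An autoencoder learns from data samples without labels, but it performs poorly as a general-purpose image compressor. Its architecture consists of an encoder and a decoder, and its applications include data denoising, dimensionality reduction for visualization, and image compression.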

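Autoencoders
In its basic form, an autoencoder is a neural network with a single hidden layer whose input and target output are both x, and whose hidden layer has fewer dimensions than the input. It is usually classed as unsupervised learning. Trained with backpropagation so that its output matches its input, it first compresses the input into a latent-space representation and then reconstructs the output from that representation.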
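To make the idea concrete, here is a minimal NumPy sketch of a one-hidden-layer autoencoder trained with hand-written backpropagation. It is only an illustration under stated assumptions: the toy data, the layer sizes, the learning rate and the helper name `train_tiny_autoencoder` are all made up for this example and are not part of the article's TensorFlow code.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_tiny_autoencoder(x, n_hidden=2, learning_rate=1.0, steps=500, seed=0):
    """Train a one-hidden-layer autoencoder with plain gradient descent.

    Returns the list of mean squared reconstruction errors, one per step.
    All hyperparameters here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n_input = x.shape[1]
    w_enc = rng.normal(scale=0.1, size=(n_input, n_hidden))
    b_enc = np.zeros(n_hidden)
    w_dec = rng.normal(scale=0.1, size=(n_hidden, n_input))
    b_dec = np.zeros(n_input)

    losses = []
    for _ in range(steps):
        # Forward pass: compress to the latent code h, then reconstruct r.
        h = sigmoid(x @ w_enc + b_enc)
        r = sigmoid(h @ w_dec + b_dec)
        losses.append(float(np.mean((r - x) ** 2)))

        # Backward pass: gradients of the mean squared error
        # with respect to the pre-activations of each layer.
        grad_z_dec = 2.0 * (r - x) / x.size * r * (1.0 - r)
        grad_z_enc = (grad_z_dec @ w_dec.T) * h * (1.0 - h)

        # Gradient descent update; the training target is the input x itself.
        w_dec -= learning_rate * (h.T @ grad_z_dec)
        b_dec -= learning_rate * grad_z_dec.sum(axis=0)
        w_enc -= learning_rate * (x.T @ grad_z_enc)
        b_enc -= learning_rate * grad_z_enc.sum(axis=0)
    return losses


# Toy data: 64 samples with 8 features, values in [0, 0.5).
data = 0.5 * np.random.default_rng(1).random((64, 8))
losses = train_tiny_autoencoder(data)
print("first loss:", losses[0], "last loss:", losses[-1])
```

The hidden layer has 2 units against 8 inputs, so the network has to squeeze each sample through a narrower representation before reconstructing it.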
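Understanding autoencoders
An autoencoder learns from data samples without labels. This means the algorithm can be applied to a given dataset and achieve good performance without any new feature engineering; it only needs suitable training data.
However, autoencoders do not perform well at image compression. Because an autoencoder is trained on one particular dataset, it achieves reasonable compression on data similar to its training set but performs poorly on images that differ substantially from it. For general-purpose image compression, a technique such as JPEG does better.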
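The data-specific nature of a learned compressor can be illustrated with a small experiment. The sketch below does not train a neural network: as a deliberately simple stand-in it fits a linear projection with SVD (PCA), which plays the role of a linear encoder and decoder. The synthetic data and the helper names `fit_linear_codec` and `reconstruct` are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)


def fit_linear_codec(train, k):
    """Fit a k-dimensional linear code to the training data using SVD (PCA)."""
    mean = train.mean(axis=0)
    _, _, vt = np.linalg.svd(train - mean, full_matrices=False)
    return mean, vt[:k]  # basis has shape (k, n_features)


def reconstruct(x, mean, basis):
    code = (x - mean) @ basis.T   # "encode": project onto k directions
    return code @ basis + mean    # "decode": map back to the input space


def mse(a, b):
    return float(np.mean((a - b) ** 2))


# Training data lies close to a 2-dimensional subspace of a 10-dimensional space.
mixing = rng.normal(size=(2, 10))
train = rng.normal(size=(500, 2)) @ mixing + 0.01 * rng.normal(size=(500, 10))
# "Similar" data comes from the same subspace; "different" data does not.
similar = rng.normal(size=(100, 2)) @ mixing + 0.01 * rng.normal(size=(100, 10))
different = rng.normal(size=(100, 10))

mean, basis = fit_linear_codec(train, k=2)
err_similar = mse(similar, reconstruct(similar, mean, basis))
err_different = mse(different, reconstruct(different, mean, basis))
print("similar data error:", err_similar)
print("different data error:", err_different)
```

The codec reconstructs data resembling its training set far better than unrelated data, which is the same limitation described above for autoencoders used as image compressors.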

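An autoencoder can be trained so that the input retains as much information as possible after passing through the encoder and decoder, but it can also be trained so that the new representation has various other properties. Different types of autoencoder aim at different properties.
Each type obtains its properties from the constraints imposed on it, such as shrinking the hidden layer's dimension or adding a penalty term to the loss. One of the main reasons autoencoders attracted so much research and attention is that they were long regarded as a possible route to unsupervised learning: the hope was that an autoencoder could learn useful representations of data without labels.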
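As one possible example of a penalty term, the sketch below adds an L1 penalty on the hidden activations to the reconstruction error, which is the usual recipe for a sparse autoencoder. The function name `sparse_autoencoder_loss` and the `penalty_weight` parameter are assumptions for illustration; the article does not specify which penalty it has in mind.

```python
import numpy as np


def sparse_autoencoder_loss(x, reconstruction, hidden, penalty_weight=1e-3):
    """Reconstruction error plus an L1 sparsity penalty on the hidden code.

    penalty_weight is a hypothetical hyperparameter that trades
    reconstruction quality against sparsity of the code.
    """
    reconstruction_error = np.mean((x - reconstruction) ** 2)
    sparsity_penalty = np.mean(np.abs(hidden))
    return float(reconstruction_error + penalty_weight * sparsity_penalty)


x = np.array([[1.0, 0.0]])
reconstruction = np.array([[0.5, 0.0]])
hidden = np.array([[2.0, -2.0]])
loss = sparse_autoencoder_loss(x, reconstruction, hidden, penalty_weight=0.1)
print(loss)
```

With this kind of loss, the hidden layer does not have to be smaller than the input: the penalty itself discourages the network from simply copying the input through.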
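Strictly speaking, though, an autoencoder is not a truly unsupervised algorithm but a self-supervised one. Self-supervised learning is an instance of supervised learning in which the labels are generated from the input data. To obtain a self-supervised model, you need to come up with a sensible target and a loss function, and here lies the problem: simply setting the target to reconstructing the input may not be the right choice.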

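Basically, requiring a model to reconstruct its input exactly at the pixel level is not what machine learning is interested in; learning high-level abstract features is.
In fact, when the main task is something like classification or localization, the features that work best for that task are generally among the worst for reconstructing the input.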
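Autoencoder architecture
An autoencoder consists of two parts:
1) Encoder: compresses the input into a latent-space representation; it can be written as an encoding function h=f(x).
2) Decoder: reconstructs the input from the latent-space representation; it can be written as a decoding function r=g(h).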
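Putting the two parts together, the training objective can be written out as follows. This is a sketch that assumes a single sigmoid layer for each part and a squared-error loss; the symbols W_e, b_e, W_d, b_d and the sigmoid σ are introduced here for illustration and do not appear in the original text.

```latex
\begin{aligned}
h &= f(x) = \sigma(W_e x + b_e) \\
r &= g(h) = \sigma(W_d h + b_d) \\
\mathcal{L}(x) &= \lVert x - r \rVert^2 = \lVert x - g(f(x)) \rVert^2
\end{aligned}
```

Training adjusts the encoder and decoder parameters to make this loss small on average over the training samples.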
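Autoencoder applications
First, data denoising.
Second, dimensionality reduction for visualization.
Third, image compression.
Fourth, traditional autoencoders are used for dimensionality reduction or feature learning.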
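For the denoising use case, the usual trick is to feed the network a corrupted input while keeping the clean input as the target. The sketch below only prepares such training pairs; the helper name `make_denoising_pairs`, the Gaussian noise model and the `noise_std` value are assumptions for illustration.

```python
import numpy as np


def make_denoising_pairs(clean, noise_std=0.3, seed=0):
    """Return (noisy_input, clean_target) for training a denoising autoencoder.

    Gaussian noise is one common choice; noise_std is a hypothetical setting.
    """
    rng = np.random.default_rng(seed)
    noisy = clean + noise_std * rng.normal(size=clean.shape)
    # Keep pixel values in the same [0, 1] range as the clean images.
    return np.clip(noisy, 0.0, 1.0), clean


# Stand-in for a batch of 4 flattened 28x28 images with values in [0, 1).
clean_batch = np.random.default_rng(1).random((4, 784))
noisy_batch, target_batch = make_denoising_pairs(clean_batch)
print(noisy_batch.shape, target_batch.shape)
```

In a training loop, the noisy batch would go into the encoder and the cost would be measured against the clean batch, which requires a separate input for the target instead of reusing the network input.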
Code implementation

# Requires TensorFlow 1.x: tf.placeholder, tf.Session and the
# tensorflow.examples.tutorials.mnist module belong to the 1.x API.
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import matplotlib.pyplot as plt
import numpy as np
 
 
# Read the MNIST data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)
 
 
# Parameters
learning_rate = 0.01
training_epochs = 100  # number of training epochs
batch_size = 256
display_step = 1  # print the cost every display_step epochs
examples_to_show = 10  # number of test images to display
 
# Network Parameters
n_input = 784  # MNIST data input (img shape: 28*28)
 
# Placeholder for a batch of flattened input images
X = tf.placeholder(tf.float32, [None, n_input])
 
 
# Hidden layer settings
n_hidden_1 = 256  # number of units in the 1st hidden layer
n_hidden_2 = 128  # number of units in the 2nd hidden layer
weights = {
    'encoder_h1':tf.Variable(tf.random_normal([n_input,n_hidden_1])),
    'encoder_h2':tf.Variable(tf.random_normal([n_hidden_1,n_hidden_2])),
    'decoder_h1':tf.Variable(tf.random_normal([n_hidden_2,n_hidden_1])),
    'decoder_h2':tf.Variable(tf.random_normal([n_hidden_1,n_input])),
    }
biases = {
    'encoder_b1':tf.Variable(tf.random_normal([n_hidden_1])),
    'encoder_b2':tf.Variable(tf.random_normal([n_hidden_2])),
    'decoder_b1':tf.Variable(tf.random_normal([n_hidden_1])),
    'decoder_b2':tf.Variable(tf.random_normal([n_input])),
    }
# Next, define the encoder and decoder. The activation function is sigmoid,
# so the encoded values lie in the range (0, 1).
# The decoder usually uses the same activation function as the encoder.
def encoder(x):
    layer_1 = tf.nn.sigmoid(tf.add(tf.matmul(x,weights['encoder_h1']),biases['encoder_b1']))
    layer_2 = tf.nn.sigmoid(tf.add(tf.matmul(layer_1,weights['encoder_h2']),
                                   biases['encoder_b2']))
    return layer_2
def decoder(x):
    layer_1 = tf.nn.sigmoid(tf.add(tf.matmul(x,weights['decoder_h1']),biases['decoder_b1']))
    layer_2 = tf.nn.sigmoid(tf.add(tf.matmul(layer_1,weights['decoder_h2']),
                                   biases['decoder_b2']))
    return layer_2
 
 
# Build the encoder and decoder outputs:
encoder_op = encoder(X)
decoder_op = decoder(encoder_op)
# Verification
y_ver = decoder_op  # the reconstruction produced by the decoder
y = X  # the expected output is the input X itself, since no labels are used
 
# Compute the cost between 'y_ver' and 'y' as the mean squared error.
cost = tf.losses.mean_squared_error(y, y_ver)
# Written by hand this would be: cost = tf.reduce_mean(tf.pow(y - y_ver, 2))
# An earlier attempt, tf.pow(y, y_ver, 2), was wrong: tf.pow takes a base and an exponent.
optimizer = tf.train.AdamOptimizer(learning_rate).minimize(cost)
 
# Finally, display the results with Matplotlib's pyplot module.
# Note that the MNIST pixel values loaded here are scaled so that the maximum of x is 1, not 255:
 
# Launch the graph
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())  # initialize all variables
    total_batch = int(mnist.train.num_examples / batch_size)  # number of batches per epoch
    # Training cycle
    for epoch in range(training_epochs):
        # Loop over all batches
        for i in range(total_batch):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)  # next mini-batch; the labels are unused
            # Run one optimization step and fetch the cost in the same call
            _, c = sess.run([optimizer, cost], feed_dict={X: batch_xs})  # 'c' is the cost of this batch
        # Display logs per epoch step
        if epoch % display_step == 0:
            print("Epoch:", '%04d' % (epoch + 1), "cost=", "{:.9f}".format(c))
 
    print("Optimization Finished!")
 
    # Apply the encoder and decoder to the test set
    encode_decode = sess.run(y_ver, feed_dict={X: mnist.test.images[:examples_to_show]})
    y_m = sess.run(encoder_op, feed_dict={X: mnist.test.images[:examples_to_show]})  # latent codes
    
    # Compare the original images with their codes and reconstructions
    f, a = plt.subplots(3, examples_to_show, figsize=(10, 2))  # figsize is (width, height) in inches
    for i in range(examples_to_show):
        a[0][i].imshow(np.reshape(mnist.test.images[i], (28, 28)))  # original image
        a[1][i].imshow(np.reshape(y_m[i], (8, 16)))  # 128-dimensional code shown as 8x16
        a[2][i].imshow(np.reshape(encode_decode[i], (28, 28)))  # reconstruction
    plt.show()
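
The plot compares originals and reconstructions by eye. If a number is preferred, one option is to compute a reconstruction error per image with NumPy. The sketch below is an assumption-laden illustration: `per_image_mse` is a hypothetical helper, and the random arrays merely stand in for `mnist.test.images[:examples_to_show]` and `encode_decode` so that the snippet runs on its own.

```python
import numpy as np


def per_image_mse(originals, reconstructions):
    """Return one mean squared error per image (one image per row)."""
    originals = np.asarray(originals, dtype=np.float64)
    reconstructions = np.asarray(reconstructions, dtype=np.float64)
    return np.mean((originals - reconstructions) ** 2, axis=1)


# Stand-in arrays with the same layout as the test images and their reconstructions.
rng = np.random.default_rng(0)
fake_originals = rng.random((10, 784))
fake_reconstructions = np.clip(
    fake_originals + 0.05 * rng.normal(size=(10, 784)), 0.0, 1.0
)
errors = per_image_mse(fake_originals, fake_reconstructions)
print(errors.shape)
```

Inside the session, the same function could be called on the real test images and `encode_decode` to see which digits are reconstructed worst.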