Learning TensorFlow: Implementing a Multilayer Perceptron

Latest recommended article published on 2023-11-24 14:57:10

丶Minskyli

Licensed under CC 4.0 BY-SA

Category: Deep Learning. Tags: deep learning

Article link: https://blog.youkuaiyun.com/qq_31531635/article/details/76168014

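This article shows how to build a multilayer perceptron with TensorFlow. It uses Dropout to reduce overfitting and the ReLU activation function to mitigate vanishing gradients, and reaches 97.8% accuracy on the MNIST dataset.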

Deep Learning: Implementing a Multilayer Perceptron with TensorFlow
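The number of hidden nodes needed to fit a complex function tends to fall roughly exponentially as the number of hidden layers increases. In other words, the more layers a network has, the fewer hidden nodes it needs.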
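Dropout is a common remedy for overfitting: during training, a random subset of the outputs of a layer is discarded. In effect this creates many new random samples, so it guards against overfitting by enlarging the sample set and reducing the number of features.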
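To make the Dropout idea concrete, here is a minimal pure-Python sketch of "inverted" dropout. It mirrors the documented behaviour of `tf.nn.dropout` in TensorFlow 1.x, which scales the kept elements by `1 / keep_prob`, but it is only an illustration: the function name `dropout`, the `rng` argument, and the toy input are assumptions made for this example, not code from the original script.

```python
import random

def dropout(values, keep_prob, rng):
    """Inverted dropout (illustrative helper, not the TensorFlow implementation).

    Each value is kept with probability keep_prob and scaled by 1 / keep_prob,
    otherwise it is set to 0, so the expected sum of the layer is unchanged.
    """
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]

rng = random.Random(0)            # fixed seed so the run is repeatable
hidden = [1.0] * 10000            # toy stand-in for a hidden layer's outputs
dropped = dropout(hidden, 0.75, rng)

kept_fraction = sum(1 for v in dropped if v != 0.0) / len(dropped)
mean_after = sum(dropped) / len(dropped)
print(kept_fraction, mean_after)  # roughly 0.75 and roughly 1.0
```

With `keep_prob` equal to 1.0 the function returns its input unchanged, which is why the script feeds 1.0 at evaluation time.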
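Vanishing gradients are another problem that affects the training of deep neural networks. The ReLU activation function largely alleviates it, and its behavior closely resembles the threshold response of neurons in the human brain.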
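A rough way to see why ReLU helps is to compare how the local derivative of each activation compounds across layers. The sketch below is a simplification that ignores the weight matrices entirely and multiplies the same derivative `depth` times; the helper names and the choice of 10 layers are assumptions made for illustration.

```python
import math

def sigmoid_grad(z):
    """Derivative of the sigmoid; its maximum is 0.25, reached at z = 0."""
    s = 1.0 / (1.0 + math.exp(-z))
    return s * (1.0 - s)

def relu_grad(z):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if z > 0 else 0.0

def chained_grad(grad_fn, z, depth):
    """Product of `depth` identical local derivatives (weights are ignored)."""
    g = 1.0
    for _ in range(depth):
        g *= grad_fn(z)
    return g

depth = 10
sigmoid_chain = chained_grad(sigmoid_grad, 0.0, depth)  # best case for sigmoid
relu_chain = chained_grad(relu_grad, 1.0, depth)        # an active ReLU path
print(sigmoid_chain, relu_chain)
```

Even in the sigmoid's best case the product shrinks to 0.25 ** 10, while an active ReLU path keeps it at 1. The flip side is that a ReLU unit with a negative input passes no gradient at all, which is one reason the script initializes W1 with small random noise.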

from tensorflow.examples.tutorials.mnist import input_data
import tensorflow as tf
mnist=input_data.read_data_sets("MNIST_data/",one_hot=True)
sess=tf.InteractiveSession()
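# Create and initialize the Variables for the hidden layer. in_units is the number
# of input nodes; h1_units, the number of hidden-layer output nodes, is set to 300.
# Because the model uses ReLU, W1 is initialized with small noise drawn from a
# truncated normal distribution to break symmetry and avoid zero gradients.
# The output layer's W2 and b2 are simply initialized to zero.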
in_units=784
h1_units=300
W1=tf.Variable(tf.truncated_normal([in_units,h1_units],stddev=0.1))
b1=tf.Variable(tf.zeros([h1_units]))
W2=tf.Variable(tf.zeros([h1_units,10]))
b2=tf.Variable(tf.zeros([10]))
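# Define the placeholder for the input x. The Dropout ratio keep_prob (the
# probability of keeping a node) differs between training and prediction:
# it is less than 1 during training and equal to 1 during prediction.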
x=tf.placeholder(tf.float32,[None,in_units])
keep_prob=tf.placeholder(tf.float32)
# Define the model: a hidden layer named hidden1 with a ReLU activation,
# followed by Dropout, and finally a softmax output layer.
hidden1=tf.nn.relu(tf.matmul(x,W1) + b1)
hidden1_drop=tf.nn.dropout(hidden1,keep_prob)
y=tf.nn.softmax(tf.matmul(hidden1_drop,W2)+b2)
# Define the loss and choose an optimizer to minimize it. The loss is again
# cross-entropy; the optimizer is the adaptive Adagrad with a learning rate of 0.3.
y_=tf.placeholder(tf.float32,[None,10])
cross_entropy=tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y),reduction_indices=[1]))
train_step=tf.train.AdagradOptimizer(0.3).minimize(cross_entropy)
# Training: 3000 steps with mini-batches of 100 samples, keeping 75% of the hidden nodes.
tf.global_variables_initializer().run()
for i in range(3000):
    batch_xs,batch_ys=mnist.train.next_batch(100)
    train_step.run({x:batch_xs,y_:batch_ys,keep_prob:0.75})
# Evaluate the model's accuracy on the test set (Dropout disabled via keep_prob=1.0).
correct_prediction=tf.equal(tf.argmax(y,1),tf.argmax(y_,1))
accuracy=tf.reduce_mean(tf.cast(correct_prediction,tf.float32))
print(accuracy.eval({x:mnist.test.images,y_:mnist.test.labels,keep_prob:1.0}))
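The loss in the script above is the mean over the batch of -sum(y_ * log(y)), where y is the softmax output. The pure-Python sketch below reproduces that computation on a toy batch. The helper names, the max-subtraction inside `softmax`, and the small `eps` added inside the logarithm are assumptions of this example; the original script calls `tf.log(y)` directly, which may produce NaN if a predicted probability ever reaches exactly 0.

```python
import math

def softmax(logits):
    """Softmax with the maximum subtracted first for numerical stability."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(one_hot, probs, eps=1e-12):
    """-sum(t * log(p)); eps is an added safeguard against log(0)."""
    return -sum(t * math.log(p + eps) for t, p in zip(one_hot, probs))

def mean_cross_entropy(labels, batch_logits):
    """Average the per-sample cross-entropy over the batch."""
    losses = [cross_entropy(t, softmax(z)) for t, z in zip(labels, batch_logits)]
    return sum(losses) / len(losses)

# Toy batch: two samples, three classes, all logits equal (a uniform prediction).
labels = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
logits = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
loss = mean_cross_entropy(labels, logits)
print(loss)  # close to log(3), the loss of a uniform guess over 3 classes
```

An untrained 10-class model that predicts uniformly would start near log(10) by the same reasoning, which gives a quick sanity check for the first few training steps.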

Result:

/usr/local/Cellar/anaconda/bin/python /Users/new/Documents/JLIFE/procedure/python_tr/py_train/train1.py
Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
2017-07-27 00:42:42.802100: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-27 00:42:42.802121: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-07-27 00:42:42.802126: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-07-27 00:42:42.802132: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
0.978

Process finished with exit code 0
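For reference, the training script relies on `tf.train.AdagradOptimizer` with a learning rate of 0.3. The sketch below shows the textbook Adagrad update on a one-dimensional toy problem: each step divides the gradient by the square root of the accumulated squared gradients. It is a simplified illustration, not TensorFlow's implementation, which differs in details such as how the accumulator is initialized; the function name, the `eps` term, and the quadratic objective are assumptions of this example.

```python
import math

def adagrad_minimize(grad_fn, w0, learning_rate=0.3, steps=500, eps=1e-8):
    """Textbook Adagrad on a single scalar parameter (illustrative only)."""
    w = w0
    accum = 0.0                      # running sum of squared gradients
    for _ in range(steps):
        g = grad_fn(w)
        accum += g * g
        # The effective step size shrinks as squared gradients accumulate.
        w -= learning_rate * g / (math.sqrt(accum) + eps)
    return w

def toy_grad(w):
    """Gradient of the toy objective f(w) = (w - 3)^2."""
    return 2.0 * (w - 3.0)

w_final = adagrad_minimize(toy_grad, w0=0.0)
print(w_final)  # approaches the minimizer at w = 3
```

Because the very first update has magnitude equal to the learning rate regardless of the gradient's scale, Adagrad tolerates a fairly large rate such as 0.3, and later updates shrink automatically.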