A Practical Example of a Convolutional Neural Network (CNN)

yz930618

License: CC 4.0 BY-SA

Article link: https://blog.youkuaiyun.com/yz930618/article/details/76610227

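This post walks through a hands-on CNN example to build a better understanding of how CNNs work. It uses the MNIST dataset and trains a CNN to predict handwritten digits.

First, load the raw MNIST data:

from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data',one_hot=True)

The downloaded dataset contains 60,000 training images and 10,000 test images (mnist.test). With its default settings, read_data_sets holds out 5,000 of the training images as a validation set (mnist.validation), so mnist.train contains 55,000 rows. Each image is 28 x 28 pixels. The argument one_hot=True converts the labels to one-hot vectors. For example, with n = 4 classes, label 2 becomes 0.0 0.0 1.0 0.0.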
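To make the one-hot transformation concrete, here is a minimal NumPy sketch. The helper name one_hot is an assumption made for illustration; it is not part of the MNIST loader, which performs this conversion internally when one_hot=True.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a float32 array with one row per label and a single 1.0 per row."""
    labels = np.asarray(labels, dtype=np.int64)
    encoded = np.zeros((labels.size, num_classes), dtype=np.float32)
    # Set the column given by each label to 1.0, one row at a time
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

# Label 2 with n = 4 classes, as in the example above
print(one_hot([2], 4).tolist())  # [[0.0, 0.0, 1.0, 0.0]]
```

Taking np.argmax along axis 1 recovers the original integer labels, which is how predictions and labels are compared later in the post.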

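Next, define the input data with placeholders:

import tensorflow as tf

# 784 is one flattened image (28*28=784); None means the number of input images is not fixed.
xs = tf.placeholder(tf.float32,[None,784]) #28*28

# The digit classes are 0-9, 10 classes in total.
ys = tf.placeholder(tf.float32,[None,10])

# keep_prob is the keep probability used by dropout, i.e. the fraction of ReLU outputs to retain
keep_prob = tf.placeholder(tf.float32)

# Reshape to [number of images, 28 rows, 28 columns, 1 color channel]; -1 means the number of samples is inferred
x_image = tf.reshape(xs,[-1,28,28,1])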
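The reshape step can be checked without TensorFlow. The sketch below uses NumPy, which follows the same row-major layout, on a made-up batch of three flattened images; the array contents are placeholders, not real MNIST pixels.

```python
import numpy as np

# Three fake flattened images, 784 values each (stand-ins for MNIST rows)
batch = np.arange(3 * 784, dtype=np.float32).reshape(3, 784)

# -1 lets the batch dimension be inferred from the total number of elements
images = batch.reshape(-1, 28, 28, 1)
print(images.shape)  # (3, 28, 28, 1)

# The pixel at row r, column c of image i is flat entry r * 28 + c of row i
r, c = 2, 5
same_pixel = images[1, r, c, 0] == batch[1, r * 28 + c]
```

The trailing dimension of size 1 is the single grayscale channel that the convolution expects.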

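Then define helper functions for creating weights, creating biases, convolution, and pooling:

# Create a weight variable
def weight_variable(shape):
    initial = tf.truncated_normal(shape,stddev=0.1) # mean 0, standard deviation 0.1; values more than 2 standard deviations from the mean are dropped and re-drawn
    return tf.Variable(initial)

# Create a bias variable
def bias_variable(shape):
    initial = tf.constant(0.1,shape=shape) # every element is 0.1
    return tf.Variable(initial)

# Convolution
def conv2d(x, W):
    # strides[0] and strides[3] must be 1; the middle two values are the strides along height and width
    return tf.nn.conv2d(x,W,strides=[1,1,1,1],padding='SAME')

# Pooling layer
def max_pool_2x2(x):
    # 2x2 max pooling; ksize is the size of the pooling window, and a stride of 2 halves the height and the width
    return tf.nn.max_pool(x,ksize=[1,2,2,1],strides=[1,2,2,1], padding='SAME')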
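With padding='SAME', the output height and width depend only on the input size and the stride: the output size is ceil(input_size / stride). The following sketch applies that rule to the layer sequence used in this post (two stride-1 convolutions, each followed by a stride-2 pooling). The function names are assumptions made for illustration.

```python
import math

def same_padding_output_size(input_size, stride):
    """Spatial output size of a conv or pool layer that uses SAME padding."""
    return math.ceil(input_size / stride)

def trace_sizes(input_size, strides):
    """Apply the layers in order and return the size after each one."""
    sizes = []
    size = input_size
    for stride in strides:
        size = same_padding_output_size(size, stride)
        sizes.append(size)
    return sizes

# conv (stride 1), pool (stride 2), conv (stride 1), pool (stride 2)
print(trace_sizes(28, [1, 2, 1, 2]))  # [28, 14, 14, 7]
```

This is why the first fully connected layer later receives 7*7*64 inputs: a 7 x 7 map for each of the 64 channels.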

Then define the CNN:

# First convolution + pooling layer
# The first two values are the kernel (patch) size, the third is the number of input channels, and the fourth is the number of kernels, i.e. how many feature maps are produced
W_conv1 = weight_variable([5,5,1,32])# patch 5*5, in size 1, out size 32
b_conv1 = bias_variable([32])
# First convolutional layer, with ReLU activation
h_conv1 = tf.nn.relu(conv2d(x_image,W_conv1) + b_conv1) # output size 28*28*32
h_pool1 = max_pool_2x2(h_conv1)                         # output size 14*14*32

# Second convolution + pooling layer
W_conv2 = weight_variable([5,5,32,64])                  # 32 input channels produce 64 output channels
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1,W_conv2) + b_conv2) # output size 14*14*64
h_pool2 = max_pool_2x2(h_conv2)                         # output size 7*7*64

# The input image is 28*28; after the first layer there are 32 feature maps of 14*14, and after the second layer 64 feature maps of 7*7

# Fully connected layer parameters
W_fc1 = weight_variable([7*7*64,1024])
b_fc1 = bias_variable([1024])

# Flatten the 64-channel feature maps so they can feed the fully connected layer
h_pool2_flat = tf.reshape(h_pool2,[-1,7*7*64]) # the first dimension is the number of samples; -1 means it is inferred

# Fully connected layer with ReLU activation
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat,W_fc1)+b_fc1)
# Dropout randomly zeroes part of the activations at every training step, which helps prevent overfitting
h_fc1_drop = tf.nn.dropout(h_fc1,keep_prob)


# Softmax layer parameters
W_fc2 = weight_variable([1024,10])
b_fc2 = bias_variable([10])

# Softmax layer
prediction = tf.nn.softmax(tf.matmul(h_fc1_drop,W_fc2)+b_fc2)
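The dropout call in the network above keeps each activation with probability keep_prob and scales the kept values by 1/keep_prob, so the expected sum of the activations is unchanged. The NumPy sketch below imitates that behavior; it is an illustration under that assumption, not TensorFlow's implementation, and the function name and random seed are made up.

```python
import numpy as np

def dropout(activations, keep_prob, rng):
    """Zero each element with probability 1 - keep_prob and rescale the rest."""
    keep_mask = rng.random(activations.shape) < keep_prob
    # Kept values are divided by keep_prob so the expected value stays the same
    return np.where(keep_mask, activations / keep_prob, 0.0)

rng = np.random.default_rng(0)
h = np.ones((1000, 100))          # stand-in for the fully connected activations
dropped = dropout(h, 0.5, rng)    # training: about half the entries become 0, the rest 2.0
unchanged = dropout(h, 1.0, rng)  # evaluation: keep_prob = 1 leaves the input untouched
```

This is why the training loop feeds keep_prob = 0.5 while the accuracy function feeds keep_prob = 1.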

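Finally, compute the loss and minimize it.

# Use cross-entropy as the loss function
cross_entropy = tf.reduce_mean(-tf.reduce_sum(ys * tf.log(prediction),reduction_indices=[1]))

# Minimize the loss with the Adam optimizer (learning rate 1e-4)
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)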
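The loss above is the mean over the batch of -sum(y * log(p)). One caveat: if a predicted probability is exactly 0, log returns -inf and the product 0 * -inf is NaN. The NumPy sketch below reproduces the formula and adds a clipping step to guard against that. The clipping and the eps value are assumptions added here, not part of the original code; TensorFlow 1.x also offers tf.nn.softmax_cross_entropy_with_logits, which works on the pre-softmax values, as another way to avoid the problem.

```python
import numpy as np

def cross_entropy(one_hot_labels, probs, eps=1e-10):
    """Mean over the batch of -sum(y * log(p)), with p clipped away from 0."""
    clipped = np.clip(probs, eps, 1.0)
    per_sample = -np.sum(one_hot_labels * np.log(clipped), axis=1)
    return float(np.mean(per_sample))

labels = np.array([[0.0, 0.0, 1.0, 0.0]])
probs = np.array([[0.1, 0.2, 0.6, 0.1]])
loss = cross_entropy(labels, probs)  # equals -ln(0.6) for this single sample

# A zero probability for the true class stays finite thanks to the clipping
worst_case = cross_entropy(labels, np.array([[0.5, 0.5, 0.0, 0.0]]))
```

Only the probability assigned to the true class contributes to each sample's loss, because every other entry of the one-hot label is 0.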

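The results (shown as a screenshot in the original post) indicate that CNN-based MNIST digit recognition reaches an accuracy of 97%, considerably better than a conventional neural network.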

Complete code:

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import matplotlib.pyplot as plt
import numpy as np


# Compute the accuracy on the given images and labels
def compute_accuracy(v_xs,v_ys):
    # Dropout is disabled during evaluation (keep_prob = 1)
    y_pre = sess.run(prediction,feed_dict={xs:v_xs, keep_prob: 1})
    # Compare predicted and true classes in NumPy so that no new ops are added to the graph on every call
    correct_prediction = np.equal(np.argmax(y_pre,1),np.argmax(v_ys,1))
    return np.mean(correct_prediction)

# Create a weight variable
def weight_variable(shape):
    initial = tf.truncated_normal(shape,stddev=0.1)
    return tf.Variable(initial)

# Create a bias variable
def bias_variable(shape):
    initial = tf.constant(0.1,shape=shape)
    return tf.Variable(initial)

# Convolution layer
def conv2d(x, W):
    return tf.nn.conv2d(x,W,strides=[1,1,1,1],padding='SAME')

# Pooling layer
def max_pool_2x2(x):
    return tf.nn.max_pool(x,ksize=[1,2,2,1],strides=[1,2,2,1], padding='SAME')


# Load the data
mnist = input_data.read_data_sets('MNIST_data',one_hot=True)

# Display one image together with its label
plt.imshow(mnist.train.images[1].reshape(28,28))
plt.title('the picture is %i' %np.argmax(mnist.train.labels[1]),fontdict={'size':16,'color':'c'})
plt.show()


# Shapes of the inputs
xs = tf.placeholder(tf.float32,[None,784]) #28*28
ys = tf.placeholder(tf.float32,[None,10])
# keep_prob is the keep probability used by dropout, i.e. the fraction of ReLU outputs to retain
keep_prob = tf.placeholder(tf.float32)
# Image shape expected by the convolutional network
x_image = tf.reshape(xs,[-1,28,28,1]) # [number of images, 28 rows, 28 columns, 1 color channel]


# First convolution + pooling layer
W_conv1 = weight_variable([5,5,1,32])                   # patch 5*5, in size 1, out size 32
b_conv1 = bias_variable([32])
h_conv1 = tf.nn.relu(conv2d(x_image,W_conv1) + b_conv1) # output size 28*28*32
h_pool1 = max_pool_2x2(h_conv1)                         # output size 14*14*32

# Second convolution + pooling layer
W_conv2 = weight_variable([5,5,32,64])                  # patch 5*5, in size 32, out size 64
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1,W_conv2) + b_conv2) # output size 14*14*64
h_pool2 = max_pool_2x2(h_conv2)                         # output size 7*7*64

# First fully connected layer
W_fc1 = weight_variable([7*7*64,1024])
b_fc1 = bias_variable([1024])
h_pool2_flat = tf.reshape(h_pool2,[-1,7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat,W_fc1)+b_fc1)
h_fc1_drop = tf.nn.dropout(h_fc1,keep_prob)


# Second fully connected layer (softmax output)
W_fc2 = weight_variable([1024,10])
b_fc2 = bias_variable([10])
prediction = tf.nn.softmax(tf.matmul(h_fc1_drop,W_fc2)+b_fc2)

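# Use cross-entropy as the loss function
cross_entropy = tf.reduce_mean(-tf.reduce_sum(ys * tf.log(prediction),reduction_indices=[1]))

# Minimize the loss with the Adam optimizer
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)

sess = tf.Session()

# tf.initialize_all_variables() is deprecated; tf.global_variables_initializer() replaces it
sess.run(tf.global_variables_initializer())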

for i in range(1000):
    batch_xs,batch_ys = mnist.train.next_batch(100)
    # train_step and the loss are both built on placeholders, so their values must be supplied through feed_dict
    sess.run(train_step, feed_dict={xs:batch_xs, ys:batch_ys,keep_prob: 0.5})  # one training step
    # Print the current test accuracy every 50 training steps
    if i % 50 == 0:
        print(compute_accuracy(
            mnist.test.images,mnist.test.labels
        ))
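The loop above evaluates all 10,000 test images in a single forward pass, which can use a lot of memory on small GPUs. One possible alternative, sketched below, is to evaluate in chunks and count correct predictions across them. The names batched_accuracy, predict_fn, and fake_predict are hypothetical, and the toy data only demonstrates the bookkeeping; it is not MNIST.

```python
import numpy as np

def batched_accuracy(predict_fn, images, one_hot_labels, batch_size=1000):
    """Accuracy over all samples, calling predict_fn on one chunk at a time."""
    correct = 0
    for start in range(0, len(images), batch_size):
        probs = predict_fn(images[start:start + batch_size])
        predicted = np.argmax(probs, axis=1)
        actual = np.argmax(one_hot_labels[start:start + batch_size], axis=1)
        correct += int(np.sum(predicted == actual))
    return correct / len(images)

# Toy stand-in for a model: predicts class = (first feature) mod 3
def fake_predict(batch):
    classes = batch[:, 0].astype(int) % 3
    return np.eye(3)[classes]

toy_images = np.arange(10).reshape(10, 1)
true_classes = np.arange(10) % 3
true_classes[0] = 1   # make two labels disagree with the toy model
true_classes[5] = 0
toy_labels = np.eye(3)[true_classes]

toy_accuracy = batched_accuracy(fake_predict, toy_images, toy_labels, batch_size=4)
print(toy_accuracy)  # 0.8
```

With the model from this post, predict_fn could be something like lambda b: sess.run(prediction, feed_dict={xs: b, keep_prob: 1}), assuming the session and graph defined above.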