Hands-On Handwritten Digit Recognition with a CNN - CSDN Blog

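I've been studying CNNs lately, and my instructor asked for an image recognition example. This write-up draws on many blog posts found online plus my own experiments.

Written in Google Colaboratory.

I recommend one blogger's post, which explains convolution, pooling, multiple kernels, and multiple channels very thoroughly: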

try:
  # Colab only
  %tensorflow_version 2.x
except Exception:
    pass

import numpy as np
import tensorflow as tf
import keras
from keras.datasets import mnist # load the MNIST dataset from keras.datasets
from keras.utils import to_categorical # imported directly; np_utils is absent from newer Keras releases
from keras.models import Sequential 
from keras.layers import Dense 
from keras.layers import Activation
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dropout 
from keras.layers import Conv2D
import matplotlib.pyplot as plt
from keras.optimizers import Adam

(X_train,y_train),(X_test,y_test) = mnist.load_data()
#  load the dataset

#  preview the first nine training images with their labels
fig = plt.figure()
for i in range(9):
  plt.subplot(3,3,i+1)
  plt.tight_layout()
  plt.imshow(X_train[i], cmap='gray', interpolation='none')
  plt.title("Digit: {}".format(y_train[i]))
  plt.xticks([])
  plt.yticks([])

# add a trailing channel axis: (count, 28, 28) -> (count, 28, 28, 1)
X_train = X_train.reshape(X_train.shape[0], 28, 28, 1)
X_test = X_test.reshape(X_test.shape[0], 28, 28, 1)
# one-hot encode the integer labels 0..9
y_train = to_categorical(y_train, num_classes=10)
y_test = to_categorical(y_test, num_classes=10)

X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255

model = Sequential()
#convolutional layer with rectified linear unit activation
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                #  padding='same',  # this argument was called border_mode in Keras 1
                 input_shape=(28,28,1)))

model.add(Conv2D(64, (3, 3), activation='relu', strides=2)) # padding='same', 

#choose the best features via pooling
model.add(MaxPooling2D(pool_size=(2, 2)))

#randomly drop units during training to reduce overfitting
model.add(Dropout(0.25))

#flatten since too many dimensions, we only want a classification output
model.add(Flatten())

#fully connected to get all relevant data
model.add(Dense(128, activation='relu'))

#one more dropout to curb overfitting :)
model.add(Dropout(0.25)) 

#output a softmax to squash the matrix into output probabilities
model.add(Dense(10, activation='softmax'))

model.summary()

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])

batch_size = 128
print("\n---------------------------------------Training---------------------------------------")
model_log = model.fit(X_train, y_train,
          batch_size=batch_size,
          epochs=1,
          verbose=1)

print("\n---------------------------------------Training-Result---------------------------------------")
trainLoss,trainAccuracy = model.evaluate(X_train,y_train)
print('\ntrain loss:', trainLoss)
print('\ntrain accuracy:', trainAccuracy)

print("\n---------------------------------------Testing-Result---------------------------------------")
loss,accuracy = model.evaluate(X_test,y_test)
print('\ntest loss:', loss)
print('\ntest accuracy:', accuracy)

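A quick walkthrough of the model above. It uses the MNIST handwritten digit dataset that ships with Keras: 60,000 training images and 10,000 test images. The training and test images (X) are reshaped to (count, 28, 28, 1), where count is the number of images in the array, 28*28 is the image height and width in pixels, and 1 is the number of channels. The pixel values are then converted to float32 and divided by 255 so they fall in [0, 1]. The y values are one-hot encoded with a utility function: for example, if an image's label is 1, its encoded value is (0 1 0 0 0 0 0 0 0 0).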
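As a small illustration of the reshaping, scaling, and label encoding described above, here is a NumPy-only sketch. The `one_hot` helper and the `fake_images` / `fake_labels` arrays are made up for this example (they stand in for `to_categorical` and the real MNIST arrays), so treat it as a sketch under those assumptions rather than what Keras does internally.

```python
import numpy as np

def one_hot(labels, num_classes=10):
    """Hypothetical helper: one row per label, with a 1 at the label's index."""
    labels = np.asarray(labels, dtype=np.int64)
    encoded = np.zeros((labels.shape[0], num_classes), dtype=np.float32)
    encoded[np.arange(labels.shape[0]), labels] = 1.0
    return encoded

# Stand-in for a few MNIST images: 4 grayscale images, 28x28 pixels, values 0..255.
fake_images = np.random.randint(0, 256, size=(4, 28, 28), dtype=np.uint8)
fake_labels = [1, 8, 0, 9]

# Add the trailing channel axis and scale pixels to [0, 1], as the script does.
x = fake_images.reshape(fake_images.shape[0], 28, 28, 1).astype("float32") / 255

y = one_hot(fake_labels)
print(x.shape)  # (4, 28, 28, 1)
print(y[0])     # label 1 -> [0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
```

The position of the 1 is the label itself, counting from index 0, so label 8 puts the 1 in the ninth slot.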

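Now the main part: building the model's layers. (If the material below is unfamiliar, see the blog recommended at the top of the page; it explains things well.)

1. Build the first convolutional layer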

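This layer has 32 kernels of size 3*3, and the activation function is 'relu'. input_shape is the shape of the images fed into this layer; ours are 28*28*1. The border_mode argument (commented out in the code; Keras 2 calls it padding) determines, put simply, the image size after the layer. With the default (no zero padding), the output is 26*26, computed as: (input length - kernel length + 2 * number of zeros padded on each side) / strides + 1; with the defaults, (28 - 3 + 2*0)/1 + 1 = 26, and the width works the same way. strides defaults to 1 here; for details, see the blog recommended at the top. Setting this argument to 'same' leaves the output image size unchanged (for strides of 1). The output of this layer is therefore (26, 26, 32): passing through 32 kernels turns the image into 32 feature maps (kernel weights are trainable and generally all different). So the number of trainable parameters in this layer is 32*9 + 32 = 320. I worked this formula out backwards from the summary output, so it may be wrong; my understanding is: 32 kernels * (3*3) kernel area + 32 biases. It matches here because the input has a single channel; in general each kernel also spans every input channel, so the count is kernels * (3*3 * input channels) + kernels.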
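The size and parameter arithmetic above can be checked with a few lines of plain Python. This is only a sketch: `conv_output_size` and `conv2d_param_count` are helper names invented for this example, not Keras functions, and the division is floored, which is how a kernel that does not fit evenly is handled when no padding is used.

```python
def conv_output_size(input_size, kernel_size, padding=0, stride=1):
    """Output length along one axis: (input - kernel + 2*padding) // stride + 1."""
    return (input_size - kernel_size + 2 * padding) // stride + 1

def conv2d_param_count(filters, kernel_h, kernel_w, in_channels):
    """Trainable weights: one kernel_h*kernel_w*in_channels kernel per filter, plus one bias each."""
    return filters * (kernel_h * kernel_w * in_channels) + filters

# First layer: 28x28x1 input, 32 kernels of 3x3, no padding, stride 1.
first_side = conv_output_size(28, 3, padding=0, stride=1)
first_params = conv2d_param_count(32, 3, 3, in_channels=1)
print(first_side, first_params)    # 26 320

# One ring of zero padding with a 3x3 kernel keeps the size, which is what 'same' achieves here.
same_side = conv_output_size(28, 3, padding=1, stride=1)
print(same_side)                   # 28

# Second layer in the script: 26x26x32 input, 64 kernels of 3x3, stride 2.
second_side = conv_output_size(first_side, 3, padding=0, stride=2)
second_params = conv2d_param_count(64, 3, 3, in_channels=32)
print(second_side, second_params)  # 12 18496
```

The second layer shows why the input channel count belongs in the formula: with 32 input channels, each of the 64 kernels holds 3*3*32 weights. These numbers can be compared against the output of model.summary().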

2. Build the second convolutional layer