A Simple Neural Network on MNIST

License: CC 4.0 BY-SA

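This article builds a neural network model with TensorFlow 2 for the Fashion-MNIST classification task. It walks through data loading, preprocessing, model training, and evaluation, and analyzes model performance with a confusion matrix and related metrics. The outputs shown below come from the MNIST digit data loaded from CSV files.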

Import the packages used by the model

Example source: Classifying Fashion MNIST with TF2

import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow import keras
from sklearn import metrics

import matplotlib.pyplot as plt

Load the data

Load Fashion-MNIST

# fashion_mnist
fashion_mnist = keras.datasets.fashion_mnist
 
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',  'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

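Load MNIST

# MNIST from local CSV files. This overwrites the Fashion-MNIST arrays loaded above;
# the rest of the article uses the MNIST digits.
# Note: read_csv treats the first row as a header by default. These files appear to have
# no header row, so one sample is consumed as column names (hence 59999 and 9999 rows below).
# Passing header=None would keep every row.
train_data = pd.read_csv('./mnist/mnist_train.csv')
test_data = pd.read_csv('./mnist/mnist_test.csv')

# The first column is the label; the remaining 784 columns are pixel values
train_images = train_data.iloc[:, 1:]
train_labels = train_data.iloc[:, :1]

test_images = test_data.iloc[:, 1:]
test_labels = test_data.iloc[:, :1]
print(train_images.shape)
print(train_labels.shape)
print(test_images.shape)
print(test_labels.shape)

# Reshape each flat 784-pixel row into a 28x28 image
train_images = train_images.values.reshape(-1, 28, 28)
test_images = test_images.values.reshape(-1, 28, 28)

# Labels stay as column vectors of shape (n, 1)
train_labels = train_labels.values
test_labels = test_labels.values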

(59999, 784)
(59999, 1)
(9999, 784)
(9999, 1)
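The shapes above are one row short of the usual 60000 and 10000 samples, which is consistent with `read_csv` using the first data row as a header. The following is a minimal sketch of that behavior on a made-up three-row CSV (the data is hypothetical; only the `header` parameter of `pd.read_csv` is the point):

```python
import io
import pandas as pd

# Hypothetical 3-row CSV in the MNIST layout (label first, then pixels), with no header row.
csv_text = "5,0,0\n0,0,255\n4,128,0\n"

# Default behavior: the first row is consumed as column names.
with_default = pd.read_csv(io.StringIO(csv_text))
# header=None: every row is treated as data.
with_no_header = pd.read_csv(io.StringIO(csv_text), header=None)

print(with_default.shape)    # (2, 3)
print(with_no_header.shape)  # (3, 3)
```

If the real files do contain a header row, the default call is already correct and no change is needed.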

train_labels[:5]

	5
0	0
1	4
2	1
3	9
4	2

plt.imshow(train_images[0], cmap=plt.cm.binary);

[Image: output_9_0.png, the first training image]

Data preprocessing

train_images = train_images / 255.0
 
test_images = test_images / 255.0

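Build the model

# Model 1
model = keras.Sequential([
    # Flatten layer; this is the network's input layer
    keras.layers.Flatten(input_shape=(28, 28)),
    # Fully connected layer with 128 units
    keras.layers.Dense(128, activation='relu'),
    # Output layer: 10 classes, so 10 output units (raw logits, no activation)
    keras.layers.Dense(10)
])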

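# Model 2 (reassigns `model`, so it replaces Model 1 if both cells are run)
model = keras.Sequential()
# For 1-D structured input, start instead with:
# model.add(keras.layers.Dense(64, activation='relu', input_dim=20))
model.add(keras.layers.Flatten(input_shape=(28, 28)))
model.add(keras.layers.Dense(128, activation='relu'))
model.add(keras.layers.Dense(64, activation='relu'))
model.add(keras.layers.Dense(32, activation='relu'))
# Output raw logits: the loss below uses from_logits=True and Softmax is added at prediction time
model.add(keras.layers.Dense(10))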
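To get a feel for the size of the two architectures, the number of trainable parameters can be worked out by hand: a fully connected layer with `n_in` inputs and `n_out` units has `n_in * n_out` weights plus `n_out` biases, and the Flatten layer has none. A small sketch (the helper names `dense_params` and `count_params` are made up for illustration):

```python
def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out

def count_params(layer_sizes):
    """Total parameters for a chain of Dense layers with the given widths."""
    return sum(dense_params(a, b) for a, b in zip(layer_sizes[:-1], layer_sizes[1:]))

# 28*28 = 784 inputs after flattening
model1_params = count_params([28 * 28, 128, 10])
model2_params = count_params([28 * 28, 128, 64, 32, 10])

print(model1_params)  # 101770
print(model2_params)  # 111146
```

These totals should match the "Total params" line reported by `model.summary()` for each model.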

Compile the model

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
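With `from_logits=True` the loss applies softmax internally, so the model's last layer is expected to output raw scores rather than probabilities. The NumPy sketch below is an illustration of that idea, not the Keras implementation; the logits and labels are made up. It shows what happens to the loss if softmax is applied twice:

```python
import numpy as np

def softmax(z):
    # Subtract the row maximum for numerical stability
    z = z - np.max(z, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=-1, keepdims=True)

def sparse_ce_from_logits(logits, labels):
    # Softmax, then the mean negative log-probability of the true class
    probs = softmax(logits)
    picked = probs[np.arange(len(labels)), labels]
    return float(np.mean(-np.log(picked)))

# Hypothetical logits for two samples and three classes
logits = np.array([[2.0, 0.5, -1.0],
                   [0.1, 0.2, 3.0]])
labels = np.array([0, 2])

loss_correct = sparse_ce_from_logits(logits, labels)
# Feeding probabilities where logits are expected applies softmax twice
loss_double = sparse_ce_from_logits(softmax(logits), labels)

print(loss_correct, loss_double)
```

The double-softmax loss is larger even though the predictions are identical, because the second softmax flattens the distribution. This is why a model ending in `activation='softmax'` should be paired with `from_logits=False`.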

Train the model

train_images.shape

(59999, 28, 28)

model.fit(train_images, train_labels, epochs=10)

Epoch 1/10
1875/1875 [==============================] - 2s 998us/step - loss: 0.4298 - accuracy: 0.8793
Epoch 2/10
1875/1875 [==============================] - 2s 954us/step - loss: 0.1213 - accuracy: 0.9643
Epoch 3/10
1875/1875 [==============================] - 2s 991us/step - loss: 0.0742 - accuracy: 0.9766
Epoch 4/10
1875/1875 [==============================] - 2s 938us/step - loss: 0.0547 - accuracy: 0.9840
Epoch 5/10
1875/1875 [==============================] - 2s 943us/step - loss: 0.0388 - accuracy: 0.9882
Epoch 6/10
1875/1875 [==============================] - 2s 963us/step - loss: 0.0309 - accuracy: 0.9904
Epoch 7/10
1875/1875 [==============================] - 2s 989us/step - loss: 0.0248 - accuracy: 0.9928
Epoch 8/10
1875/1875 [==============================] - 2s 968us/step - loss: 0.0189 - accuracy: 0.9942
Epoch 9/10
1875/1875 [==============================] - 2s 985us/step - loss: 0.0149 - accuracy: 0.9956
Epoch 10/10
1875/1875 [==============================] - 2s 927us/step - loss: 0.0138 - accuracy: 0.9960





<tensorflow.python.keras.callbacks.History at 0x228098c23c8>
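The `1875` in the progress bar is the number of batches per epoch, not the number of samples. `model.fit` uses a batch size of 32 when none is given, so the count follows from the dataset size. A quick check (the helper name `steps_per_epoch` is made up for illustration):

```python
import math

def steps_per_epoch(n_samples, batch_size=32):
    # The last batch may be smaller, so round up
    return math.ceil(n_samples / batch_size)

train_steps = steps_per_epoch(59999)
test_steps = steps_per_epoch(9999)

print(train_steps)  # 1875
print(test_steps)   # 313
```

The second value corresponds to the `313/313` shown by `model.evaluate` on the 9999 test samples.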

Evaluate accuracy

test_loss, test_acc = model.evaluate(test_images,  test_labels, verbose=2)
 
print('\nTest accuracy:', test_acc)
# accuracy: 0.9783

313/313 - 0s - loss: 0.0936 - accuracy: 0.9783

Test accuracy: 0.9782978296279907

Make predictions

probability_model = tf.keras.Sequential([model, 
                                         tf.keras.layers.Softmax()])
predictions = probability_model.predict(test_images)

predictions[0]

array([3.7147963e-08, 2.2065356e-07, 9.9999964e-01, 1.5497315e-07,
       4.7679337e-21, 1.7254685e-09, 7.1944735e-09, 2.6051557e-15,
       5.2889224e-08, 9.9834775e-19], dtype=float32)

np.argmax(predictions[0])

plt.imshow(test_images[0], cmap=plt.cm.binary);

[Image: output_26_0.png, the first test image]

Verify predictions

class_names_test = [1,1,2,3,2]

def plot_image(i, predictions_array, true_label, img):
    # Labels are stored as a column vector, so take the scalar label of sample i
    true_label, img = true_label[i][0], img[i]
    plt.grid(False)
    plt.xticks([])
    plt.yticks([])

    plt.imshow(img, cmap=plt.cm.binary)

    predicted_label = np.argmax(predictions_array)
    # Blue for a correct prediction, red for a wrong one
    color = 'blue' if predicted_label == true_label else 'red'

    # For Fashion-MNIST, class_names can be used to show readable labels instead:
    # plt.xlabel("{} {:2.0f}% ({})".format(class_names[predicted_label],
    #                                      100*np.max(predictions_array),
    #                                      class_names[true_label]),
    #            color=color)
    plt.xlabel("predicted_label is {}, probability is {:2.0f}%,\ntrue_label is {}".format(
                   predicted_label, 100*np.max(predictions_array), true_label),
               color=color)

def plot_value_array(i, predictions_array, true_label):
    true_label = true_label[i][0]
    plt.grid(False)
    plt.xticks(range(10))
    plt.yticks([])
    # One gray bar per class, showing the predicted probability
    thisplot = plt.bar(range(10), predictions_array, color="#777777")
    plt.ylim([0, 1])
    predicted_label = np.argmax(predictions_array)
    # Predicted class in red, true class in blue (blue wins when they coincide)
    thisplot[predicted_label].set_color('red')
    thisplot[true_label].set_color('blue')

i = 0
plt.figure(figsize=(6,3))
plt.subplot(1,2,1)
plot_image(i, predictions[i], test_labels, test_images)
plt.subplot(1,2,2)
plot_value_array(i, predictions[i],  test_labels)
plt.show()

[Image: output_30_0.png, the first test image with its predicted label and the per-class probability bars]

Multi-class evaluation

Multi-class confusion matrix

Binary confusion matrix

true_label_test = [0,1,1,1,1,0,0,0,1,1]
pred_label_test = [0,0,1,0,1,0,1,0,1,1]
cfm_test = metrics.confusion_matrix(true_label_test, pred_label_test)
display (cfm_test)
tn, fp, fn, tp = cfm_test.ravel()
print ('matrix    predict0  predict1')
print ('label0    {:<6d}    {:<6d}'.format(int(tn),int(fp)))
print ('label1    {:<6d}    {:<6d}'.format(int(fn),int(tp)))

array([[3, 1],
       [2, 4]], dtype=int64)


matrix    predict0  predict1
label0    3         1     
label1    2         4
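The four cells of the binary matrix are enough to derive the usual metrics. As a sketch, the counts can be reproduced in plain Python from the same two label lists and turned into precision, recall, and accuracy:

```python
true_label_test = [0, 1, 1, 1, 1, 0, 0, 0, 1, 1]
pred_label_test = [0, 0, 1, 0, 1, 0, 1, 0, 1, 1]

pairs = list(zip(true_label_test, pred_label_test))
tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives

precision = tp / (tp + fp)          # of the samples predicted 1, how many are really 1
recall = tp / (tp + fn)             # of the samples that are really 1, how many were found
accuracy = (tp + tn) / len(pairs)   # overall fraction of correct predictions

print(tn, fp, fn, tp)               # 3 1 2 4
print(precision, recall, accuracy)
```

The multi-class metrics later in the article apply the same ratios to each class in turn, using one row or one column of the larger matrix.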

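# Rows of the confusion matrix are actual classes; columns are predicted classes
# test_labels has shape (n, 1), so flatten it to 1-D first
cfm = metrics.confusion_matrix(test_labels.ravel(), np.argmax(predictions, axis=1))
display(cfm)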

array([[ 973,    0,    0,    1,    0,    0,    1,    1,    4,    0],
       [   0, 1116,    4,    1,    0,    1,    2,    1,   10,    0],
       [   3,    0, 1002,    8,    1,    0,    2,    5,    9,    2],
       [   0,    0,    1, 1000,    0,    2,    0,    2,    2,    3],
       [   1,    0,    2,    1,  961,    0,    6,    1,    1,    9],
       [   1,    0,    0,   22,    1,  859,    7,    0,    2,    0],
       [   3,    4,    1,    0,    2,    2,  946,    0,    0,    0],
       [   2,    2,    8,   10,    0,    0,    0,  999,    2,    4],
       [   3,    0,    2,   10,    3,    3,    1,    1,  947,    4],
       [   2,    2,    0,    6,    7,    3,    0,    5,    1,  983]],
      dtype=int64)

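Multi-class recall

# Rows of the confusion matrix are actual classes; columns are predicted classes
# Recall for a class is the diagonal entry (correct predictions) divided by the sum of that row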
for i in range(len(cfm)):
    print ('Number %d recall is '%i,cfm[i,i]/np.sum(cfm[i,:]))

Number 0 recall is  0.9928571428571429
Number 1 recall is  0.9832599118942731
Number 2 recall is  0.9709302325581395
Number 3 recall is  0.9900990099009901
Number 4 recall is  0.9786150712830958
Number 5 recall is  0.9630044843049327
Number 6 recall is  0.9874739039665971
Number 7 recall is  0.9727361246348588
Number 8 recall is  0.9722792607802875
Number 9 recall is  0.9742319127849356

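Multi-class precision

# Rows of the confusion matrix are actual classes; columns are predicted classes
# Precision for a class is the diagonal entry (correct predictions) divided by the sum of that column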
for i in range(len(cfm)):
    print ('Number %d precision is '%i,cfm[i,i]/np.sum(cfm[:,i]))

Number 0 precision is  0.9848178137651822
Number 1 precision is  0.9928825622775801
Number 2 precision is  0.9823529411764705
Number 3 precision is  0.9442870632672332
Number 4 precision is  0.9856410256410256
Number 5 precision is  0.9873563218390805
Number 6 precision is  0.9803108808290155
Number 7 precision is  0.9842364532019704
Number 8 precision is  0.9683026584867076
Number 9 precision is  0.9781094527363184

Multi-class accuracy

print (np.sum(np.diagonal(cfm)))
print (np.sum(cfm))
print('Test accuracy:', round(np.sum(np.diagonal(cfm))/np.sum(cfm),4))

9786
9999
Test accuracy: 0.9787

Map the counts to grayscale values

# cmap is the color map; gray maps the values to grayscale intensities
plt.matshow(cfm,cmap=plt.cm.gray);

[Image: output_43_0.png, the confusion matrix rendered in grayscale]

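Remove the correctly predicted diagonal entries to inspect the other values in the confusion matrix

The figure below shows not only where the most errors occur but also what kind of errors they are. For example, a true digit 5 is easily predicted as 3.

# keepdims=True keeps shape (10, 1), so each row is divided by its own total
row_sum = np.sum(cfm, axis=1, keepdims=True)
err_matrix = cfm / row_sum
np.fill_diagonal(err_matrix, 0)  # fill the diagonal with 0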

plt.matshow(err_matrix,cmap=plt.cm.gray);

[Image: output_45_0.png, the row-normalized error matrix with the diagonal set to 0]
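When normalizing a confusion matrix by row totals, NumPy broadcasting decides which axis is actually divided. A 1-D vector of row sums lines up with the columns, while a column vector of shape `(n, 1)` lines up with the rows. A small sketch on the 2x2 matrix from the binary example:

```python
import numpy as np

cfm_small = np.array([[3, 1],
                      [2, 4]])

row_sum_1d = cfm_small.sum(axis=1)                 # shape (2,), values [4, 6]
row_sum_2d = cfm_small.sum(axis=1, keepdims=True)  # shape (2, 1)

# A 1-D divisor broadcasts across columns: column j is divided by row_sum_1d[j]
by_column = cfm_small / row_sum_1d
# A (2, 1) divisor broadcasts across rows: row i is divided by its own total
by_row = cfm_small / row_sum_2d

print(by_column)
print(by_row)
print(by_row.sum(axis=1))  # each row sums to 1
```

Only the second form gives per-class error rates, where each row sums to 1 before the diagonal is zeroed.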