Building a Model with TensorFlow: A Detailed Walkthrough of the Softmax Model (Hands-On Example)

AI算法联盟

Category column: TensorFlow 2.0应用与实战 (TensorFlow 2.0 Applications and Practice)

Article link: https://blog.youkuaiyun.com/weixin_40922285/article/details/102495134

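This post walks through the softmax model in detail. It starts with the MNIST dataset, describing its characteristics and one-hot encoding. It then explains how the softmax model works, how features are turned into probabilities, and how to build the model with TensorFlow. Finally, it shows that the model reaches 92% accuracy on the test set, which a follow-up post will try to improve.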

I. About the MNIST dataset

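1. The MNIST dataset contains images of handwritten digits. Each image is 28*28 pixels and grayscale (image shape (28,28,1)), and comes with a label stating which digit it shows. Pixel values lie between 0 and 1. For example, the four sample images shown have the labels 5, 0, 4 and 1.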

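2. MNIST provides 55,000 training images, 10,000 test images and 5,000 validation images. Each image is stored as a flat vector of 784 values, which is the 28*28 pixel grid unrolled row by row.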

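# The tensorflow.examples.tutorials module ships with TensorFlow 1.x only
from tensorflow.examples.tutorials.mnist import input_data
# Downloads the data into mnistdata/ on first use; labels are returned one-hot encoded
mnist = input_data.read_data_sets('mnistdata/', one_hot=True)
print(mnist.train.images.shape, mnist.train.labels.shape)            # (55000, 784) (55000, 10)
print(mnist.test.images.shape, mnist.test.labels.shape)              # (10000, 784) (10000, 10)
print(mnist.validation.images.shape, mnist.validation.labels.shape)  # (5000, 784) (5000, 10)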

3. The labels are already one-hot encoded: a one-hot vector is 1 at the position of the digit it represents and 0 everywhere else. The digit 5, for example, is written as [0,0,0,0,0,1,0,0,0,0].
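To make the encoding concrete, here is a minimal sketch in plain Python (no TensorFlow needed). The helper names `one_hot` and `decode` are made up for this illustration and are not part of the MNIST loader.

```python
def one_hot(label, num_classes=10):
    """Return a list of length num_classes with a 1 at index `label`."""
    if not 0 <= label < num_classes:
        raise ValueError("label out of range")
    return [1 if i == label else 0 for i in range(num_classes)]


def decode(vector):
    """Recover the digit: it is the index of the single 1."""
    return vector.index(1)


five = one_hot(5)
print(five)          # [0, 0, 0, 0, 0, 1, 0, 0, 0, 0]
print(decode(five))  # 5
```

Decoding is just "find the position of the 1", which is what `tf.argmax(y_, 1)` does for a whole batch of labels in the training script.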

II. Understanding the softmax model

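One concept first: probability. When we recognize which digit an image shows, the result is expressed as probabilities. If an image is judged to be a 0 with probability 70%, a 6 with probability 20% and a 9 with probability 10%, we decide that the digit is 0. That is the idea behind the softmax model.

Next, how are those probabilities computed? Every image has shape (784,), and each pixel is one feature, so there are 784 features. Each image is predicted on its own, and there are ten classes, the digits 0~9. A linear model sums up the features into a piece of evidence; informally, this is a score for how strongly the image supports each digit class. The formula is:

$$\text{evidence}_i = \sum_j W_{i,j}\, x_j + b_i$$

Here $x_j$ is the value of pixel $j$ and $W_{i,j}$ is the weight that pixel $j$ carries for class $i$.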

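i ranges over 0~9 and indexes the digit class being considered; $b_i$ is the bias for class i. The evidence is then converted into probabilities:

$$y = \text{softmax}(\text{evidence})$$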

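The softmax function combines all the evidence values and produces a probability for each possible prediction; the probabilities of all predictions add up to 1. Written out, softmax is:

$$\text{softmax}(x)_i = \frac{\exp(x_i)}{\sum_j \exp(x_j)}$$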
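As a rough illustration of the softmax function itself, here is a sketch in plain Python. The evidence values are invented, and subtracting the maximum before exponentiating is a common numerical safeguard that I have added; it does not change the result.

```python
import math


def softmax(evidence):
    """Map a list of real-valued scores to probabilities that sum to 1."""
    # Shifting by the maximum leaves the ratios unchanged but keeps exp() from overflowing.
    shift = max(evidence)
    exps = [math.exp(e - shift) for e in evidence]
    total = sum(exps)
    return [v / total for v in exps]


scores = [2.0, 1.0, 0.1]  # made-up evidence for three classes
probs = softmax(scores)
print(probs)       # largest score gets the largest probability
print(sum(probs))  # adds up to 1, apart from floating-point rounding
```

Every output is strictly positive and the ordering of the scores is preserved, which are the two properties of the exponential that the text relies on.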

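The input $x_i$ of the softmax function is the evidence. Exponentiation is involved, and it does two things: larger evidence leads to a larger probability for the corresponding hypothesis, and no probability can be negative or zero. Combining the two formulas above gives:

$$y = \text{softmax}(Wx + b)$$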
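Putting the pieces together, the sketch below computes the linear evidence for one image and passes it through softmax. It uses a toy setup of 3 classes and 4 "pixels" instead of 10 and 784, and all numbers and the function name `predict` are hypothetical, chosen only so the arithmetic is easy to follow.

```python
import math


def predict(W, b, x):
    """Return class probabilities softmax(Wx + b) for one flattened image x."""
    # evidence_i = sum_j W[i][j] * x[j] + b[i]
    evidence = [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
                for row, b_i in zip(W, b)]
    shift = max(evidence)  # numerical safeguard, does not change the result
    exps = [math.exp(e - shift) for e in evidence]
    total = sum(exps)
    return [v / total for v in exps]


# Hypothetical weights: one row per class, one column per pixel.
W = [[0.5, -0.2, 0.0, 0.1],
     [-0.3, 0.8, 0.2, 0.0],
     [0.0, 0.1, -0.5, 0.4]]
b = [0.0, 0.1, -0.1]
x = [0.0, 1.0, 0.5, 0.0]

probs = predict(W, b, x)
predicted_class = probs.index(max(probs))  # pick the most probable class
print(probs, predicted_class)
```

With all weights and biases set to zero, every class receives the same probability, so a model initialized with zeros starts out guessing uniformly.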

III. Building the model with TensorFlow

Straight to the code; the comments explain each step.

# This script uses the TensorFlow 1.x graph API (tf.placeholder, tf.Session)
from tensorflow.examples.tutorials.mnist import input_data
import tensorflow as tf
import pandas as pd
# Download MNIST: 55,000 training images, 10,000 test images and 5,000 validation images
mnist = input_data.read_data_sets('mnistdata/', one_hot=True)
'''
Inspecting the dataset in an interactive session:
print(mnist.train.images.shape, mnist.train.labels.shape)
print(mnist.test.images.shape, mnist.test.labels.shape)
print(mnist.validation.images.shape, mnist.validation.labels.shape)
labels = pd.DataFrame(mnist.train.labels)
labels.apply(lambda x: x.sum())  # column sums = number of training images per digit
'''
# Batch size: feed 100 images at a time
batch_size = 100
# Number of batches per epoch
n_batch = mnist.train.num_examples // batch_size
# Placeholders; the data is fed in later through feed_dict
x = tf.placeholder(tf.float32, [None, 784])
y_ = tf.placeholder(tf.float32, [None, 10])  # y_ holds the true labels
# Variables: tensors that are updated as the model trains
W = tf.Variable(tf.zeros([784, 10]))  # weights
b = tf.Variable(tf.zeros([10]))       # biases
# Softmax regression model; x holds one image per row, hence matmul(x, W) with W of shape (784, 10)
y = tf.nn.softmax(tf.matmul(x, W) + b)
# Cross-entropy loss, summed over the batch
cross_entropy = -tf.reduce_sum(y_ * tf.log(y))
# Gradient descent optimizer with learning rate 0.01
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)
# Variable initialization (replaces the deprecated tf.initialize_all_variables)
init = tf.global_variables_initializer()
# Accuracy: fraction of images whose most probable class matches the label
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

with tf.Session() as sess:
    sess.run(init)
    for epoch in range(1, 101):
        # One epoch: run through all training batches once
        for j in range(n_batch):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
        # Report test accuracy every 10 epochs
        if epoch % 10 == 0:
            acc = sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels})
            print("Iter " + str(epoch) + ", Testing Accuracy: " + str(acc))
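To see what the `cross_entropy` and `accuracy` lines in the script compute, here is a small sketch in plain Python on a made-up batch of two examples with three classes. The function names and numbers are illustrative only. Unlike the TensorFlow expression, this version skips terms whose label is 0 instead of multiplying by zero.

```python
import math


def cross_entropy(y_true, y_pred):
    """Summed cross-entropy -sum(y_true * log(y_pred)) over a batch."""
    return -sum(t * math.log(p)
                for true_row, pred_row in zip(y_true, y_pred)
                for t, p in zip(true_row, pred_row)
                if t > 0)  # terms with t == 0 contribute nothing


def accuracy(y_true, y_pred):
    """Fraction of rows where the argmax of the prediction matches the label."""
    hits = sum(t.index(max(t)) == p.index(max(p))
               for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


y_true = [[1, 0, 0], [0, 1, 0]]              # one-hot labels: class 0, class 1
y_pred = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1]]  # made-up softmax outputs

print(cross_entropy(y_true, y_pred))  # -(log 0.7 + log 0.3)
print(accuracy(y_true, y_pred))       # 0.5
```

Only the probability assigned to the true class enters the loss, so the second example, where the true class gets just 0.3, dominates it.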

The final model reaches about 92% accuracy on the test set, which is on the low side; the next post will work on improving it.

Output of running the script: