Deep Learning with Python Notes (1): Deep Learning Fundamentals

License: CC 4.0 BY-SA

A first look at a neural network

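Let's look at a first concrete example of a neural network, which uses the Python library Keras to learn to classify handwritten digits.
MNIST is a dataset of 28 * 28 grayscale images of handwritten digits in 10 classes. "Solving" MNIST can be seen as the "Hello World" of deep learning: it is what you do to verify that your algorithms work as expected.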

Loading the MNIST dataset in Keras

from keras.datasets import mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

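train_images and train_labels form the "training set", the data the model will learn from. The model will then be tested on the "test set", test_images and test_labels. Our images are encoded as NumPy arrays, and the labels are simply an array of digits ranging from 0 to 9. There is a one-to-one correspondence between the images and the labels.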

The training data


>>> train_images.shape
(60000, 28, 28)
>>> len(train_labels)
60000
>>> train_labels
array([5, 0, 4, ..., 5, 6, 8], dtype=uint8)

The test data

>>> test_images.shape
(10000, 28, 28)
>>> len(test_labels)
10000
>>> test_labels
array([7, 2, 1, ..., 4, 5, 6], dtype=uint8)

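Our workflow is as follows. First, we feed the training data, train_images and train_labels, to our neural network. The network then learns to associate images and labels. Finally, we ask the network to produce predictions for test_images, and we verify whether these predictions match the labels in test_labels.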

Network architecture

from keras import models
from keras import layers

network = models.Sequential()
# Fully connected layer: 512 units, relu activation, input of size 28 * 28
network.add(layers.Dense(512, activation='relu', input_shape=(28 * 28,)))
# Output layer: returns the probabilities of the 10 classes
network.add(layers.Dense(10, activation='softmax'))

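Here, our network consists of two Dense layers, which are densely connected (also called fully connected) neural layers. The second (and last) layer is a 10-way "softmax" layer, which means it returns an array of 10 probability scores that sum to 1.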
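
To make the "10 probabilities that sum to 1" property concrete, here is a minimal NumPy sketch of a softmax function. It is an illustration only, not the Keras implementation; the function name `softmax` and the raw scores are made up for this example.

```python
import numpy as np

def softmax(logits):
    """Turn a vector of raw scores into probabilities that sum to 1."""
    # Subtracting the max leaves the result unchanged but avoids overflow in exp.
    shifted = logits - np.max(logits)
    exp_scores = np.exp(shifted)
    return exp_scores / exp_scores.sum()

# Hypothetical raw scores for the 10 digit classes (illustrative values only).
scores = np.array([1.0, 2.0, 0.5, 0.1, 3.0, 0.2, 0.3, 0.0, 1.5, 0.7])
probs = softmax(scores)

print(probs.shape)                        # (10,)
print(bool(np.isclose(probs.sum(), 1.0)))  # True
print(int(np.argmax(probs)))              # 4, the index of the largest score
```

The predicted class is the index with the highest probability, which is why the largest raw score (index 4 here) wins.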

To make our network ready for training, we need to pick three more things as part of the "compilation" step.

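**1. Loss function:** how the network measures its performance on the training data, and therefore how it steers itself in the right direction.
**2. Optimizer:** the mechanism through which the network updates itself based on the data it sees and its loss function, for example SGD or RMSprop.
**3. Metrics:** what to monitor during training and testing, for example accuracy.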
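
For the loss function in particular, a small worked example may help. The sketch below computes a categorical cross-entropy by hand in NumPy, assuming one-hot targets and predicted probabilities. The function name and the sample values are invented for illustration, and this is not the code Keras runs internally.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean cross-entropy between one-hot targets and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0)  # avoid log(0)
    return float(-np.mean(np.sum(y_true * np.log(y_pred), axis=1)))

# Two made-up samples with 3 classes each (illustrative values only).
y_true = np.array([[0.0, 1.0, 0.0],
                   [1.0, 0.0, 0.0]])
confident = np.array([[0.05, 0.90, 0.05],
                      [0.80, 0.10, 0.10]])
unsure = np.array([[0.30, 0.40, 0.30],
                   [0.40, 0.30, 0.30]])

loss_confident = categorical_crossentropy(y_true, confident)
loss_unsure = categorical_crossentropy(y_true, unsure)
print(loss_confident < loss_unsure)  # True
```

Both sets of predictions pick the correct class, but the loss is lower when more probability is placed on it. That is the signal the optimizer uses to update the weights.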

Compiling the network

network.compile(optimizer='rmsprop',
                loss='categorical_crossentropy',
                metrics=['accuracy'])

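Before training, we preprocess the data by reshaping it into the shape the network expects and scaling it so that all values lie in the [0, 1] interval.
Before preprocessing, our training images are stored in an array of shape (60000, 28, 28) and type uint8, with values in the [0, 255] interval. We transform it into a float32 array of shape (60000, 28 * 28) with values between 0 and 1.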

Preparing the image data

train_images = train_images.reshape((60000, 28 * 28))
train_images = train_images.astype('float32') / 255
test_images = test_images.reshape((10000, 28 * 28))
test_images = test_images.astype('float32') / 255

We also need to categorically encode the labels, that is, convert each label to a one-hot vector.

from keras.utils import to_categorical
train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)
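
As a rough illustration of what this encoding produces, here is a minimal NumPy sketch of one-hot encoding. The helper `one_hot` is a hypothetical stand-in written for this note, not the Keras function itself.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) float32 array with a single 1 per row."""
    encoded = np.zeros((len(labels), num_classes), dtype="float32")
    encoded[np.arange(len(labels)), labels] = 1.0
    return encoded

labels = np.array([5, 0, 4])   # the first three training labels shown earlier
encoded = one_hot(labels, num_classes=10)

print(encoded.shape)           # (3, 10)
print(encoded[0].tolist())     # [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
```

Each row has ten entries, one per class, which matches the 10-way softmax output of the network and the categorical_crossentropy loss.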

We are now ready to train our network, which in Keras is done via a call to the network's fit method: we "fit" the model to its training data.

Training the network


>>> network.fit(train_images, train_labels, epochs=5, batch_size=128)
Epoch 1/5
60000/60000 [==============================] - 9s - loss: 0.2524 - acc: 0.9273
Epoch 2/5
51328/60000 [========================>.....] - ETA: 1s - loss: 0.1035 - acc: 0.9692

We quickly reach an accuracy of 0.989 (that is, 98.9%) on the training data.
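
The epochs and batch_size arguments of the fit call determine how many weight updates the network performs. The arithmetic below is a small sketch, assuming one update per mini-batch and that the final, smaller batch of each epoch is kept rather than dropped.

```python
import math

num_samples = 60000   # size of the training set
batch_size = 128      # as passed to fit
epochs = 5            # as passed to fit

steps_per_epoch = math.ceil(num_samples / batch_size)
last_batch_size = num_samples - (steps_per_epoch - 1) * batch_size
total_updates = steps_per_epoch * epochs

print(steps_per_epoch)   # 469
print(last_batch_size)   # 96
print(total_updates)     # 2345
```

This also explains the progress counter in the log: 51328 is exactly 401 batches of 128 samples.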

Evaluating the network

test_loss, test_acc = network.evaluate(test_images, test_labels)
print('test_acc:', test_acc)
# Output: test_acc: 0.9785

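Our test-set accuracy is 97.8%, about one percentage point below the training accuracy. This gap between training accuracy and test accuracy is an example of "overfitting": machine learning models tend to perform worse on new data than on their training data.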

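Tensors

Scalars (0D tensors)

A tensor that contains only one number is called a "scalar" (or "scalar tensor", 0-dimensional tensor, or 0D tensor). In NumPy, a float32 or float64 number is a scalar tensor (or scalar array). The ndim attribute shows the number of axes of a NumPy tensor; a scalar tensor has 0 axes (ndim == 0). The number of axes of a tensor is also called its rank.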


>>> import numpy as np
>>> x = np.array(12)
>>> x
array(12)
>>> x.ndim
0

Vectors (1D tensors)

An array of numbers is called a vector, or 1D tensor. A 1D tensor is said to have exactly one axis.


>>> x = np.array([12, 3, 6, 14])
>>> x
array([12, 3, 6, 14])
>>> x.ndim
1

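This vector has 4 entries, so it is called a "4-dimensional vector". Do not confuse a 4D vector with a 4D tensor! A 4D vector has only one axis and 4 dimensions along that axis, whereas a 4D tensor has 4 axes (and may have any number of dimensions along each axis).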
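
The difference between the number of entries along an axis and the number of axes can be checked directly in NumPy. This is a minimal sketch that reuses the vector from the example above; the shape of the second array is an arbitrary choice for illustration.

```python
import numpy as np

vector = np.array([12, 3, 6, 14])   # one axis holding four entries
tensor = np.zeros((2, 3, 2, 3))     # four axes; the sizes are arbitrary

print(vector.ndim, vector.shape)    # 1 (4,)
print(tensor.ndim, tensor.shape)    # 4 (2, 3, 2, 3)
```

The length of the shape tuple gives the number of axes, while each entry of the tuple gives the size along one axis.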

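Matrices (2D tensors)

An array of vectors is a matrix, or 2D tensor. A matrix has two axes (usually referred to as "rows" and "columns"). You can visually interpret a matrix as a rectangular grid of numbers.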


>>> x = np.array([[5, 78, 2, 34, 0],
...               [6, 79, 3, 35, 1],
...               [7, 80, 4, 36, 2]])
>>> x.ndim
2

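The entries along the first axis are called "rows", and the entries along the second axis are called "columns". In the example above, [5, 78, 2, 34, 0] is the first row and [5, 6, 7] is the first column.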
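
The first row and first column named above can be pulled out with ordinary NumPy indexing. A minimal sketch on the same matrix:

```python
import numpy as np

x = np.array([[5, 78, 2, 34, 0],
              [6, 79, 3, 35, 1],
              [7, 80, 4, 36, 2]])

first_row = x[0]         # index 0 along the first axis
first_column = x[:, 0]   # every row, index 0 along the second axis

print(first_row.tolist())      # [5, 78, 2, 34, 0]
print(first_column.tolist())   # [5, 6, 7]
```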

3D tensors and higher-dimensional tensors


>>> x = np.array([[[5, 78, 2, 34, 0],
...                [6, 79, 3, 35, 1],
...                [7, 80, 4, 36, 2]],
...               [[5, 78, 2, 34, 0],
...                [6, 79, 3, 35, 1],
...                [7, 80, 4, 36, 2]],
...               [[5, 78, 2, 34, 0],
...                [6, 79, 3, 35, 1],
...                [7, 80, 4, 36, 2]]])
>>> x.ndim
3

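By packing 3D tensors in an array, you can create a 4D tensor, and so on. In deep learning, you will generally manipulate tensors that are 0D to 4D, although you may go up to 5D if you process video data.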
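
Packing tensors into a higher-dimensional one can be sketched with np.stack, which adds a new leading axis. The axis layout used for the video array below (samples, frames, height, width, channels) is an assumption made for illustration, and all sizes are arbitrary.

```python
import numpy as np

matrix = np.zeros((3, 5))                   # 2D tensor
cube = np.stack([matrix, matrix, matrix])   # 3D tensor
batch = np.stack([cube, cube])              # 4D tensor

print(cube.ndim, cube.shape)     # 3 (3, 3, 5)
print(batch.ndim, batch.shape)   # 4 (2, 3, 3, 5)

# Hypothetical video batch: (samples, frames, height, width, channels).
videos = np.zeros((2, 4, 8, 8, 3), dtype="uint8")
print(videos.ndim)               # 5
```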

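Key attributes of a tensor

A tensor is defined by three key attributes:
**1. Number of axes (rank).** For example, a 3D tensor has 3 axes, and a matrix has 2 axes. This is also called the tensor's ndim in Python libraries such as NumPy.
**2. Shape.** This is a tuple of integers that describes how many dimensions the tensor has along each axis. For example, the matrix example above has shape (3, 5), and our 3D tensor example has shape (3, 3, 5). A vector has a shape with a single element, such as (5,), whereas a scalar has an empty shape, ().
**3. Data type.** Usually called dtype in Python libraries. This is the type of the data contained in the tensor, for example float32, uint8, or float64.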
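
The three attributes can be read off any NumPy array. The sketch below builds one small array per rank; the dictionary name and the chosen dtypes are arbitrary illustrations, and the default integer dtype printed for the scalar and the vector depends on the platform.

```python
import numpy as np

examples = {
    "scalar": np.array(12),
    "vector": np.array([12, 3, 6, 14]),
    "matrix": np.zeros((3, 5), dtype="float32"),
    "3D tensor": np.zeros((3, 3, 5), dtype="uint8"),
}

# Print rank, shape, and data type for each example.
for name, t in examples.items():
    print(name, t.ndim, t.shape, t.dtype)
```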

To make this more concrete, let's look back at the data we processed in our MNIST example:

>>> from keras.datasets import mnist
>>> (train_images, train_labels), (test_images, test_labels) = mnist.load_data()
>>> print(train_images.ndim)
3
>>> print(train_images.shape)
(60000, 28, 28)
>>> print(train_images.dtype)
uint8

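What we have here is a 3D tensor of 8-bit integers. More precisely, it is an array of 60,000 matrices of 28 x 28 integers. Each such matrix is a grayscale image, with coefficients between 0 and 255.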

Let's use the Matplotlib library (part of the standard scientific Python suite) to display the fifth digit (index 4) in this 3D tensor:

digit = train_images[4]
import matplotlib.pyplot as plt
plt.imshow(digit, cmap=plt.cm.binary)
plt.show()
