Deep Learning Day 9 - T9 Cat and Dog Classification (2)


Foreword

This post implements cat-vs-dog classification with a CNN. It briefly walks through the implementation code and its output, and touches on the knowledge points involved.
Keywords: subplot and figure titles in matplotlib.pyplot; setting Chinese fonts in matplotlib.pyplot.

I. My Environment

  • OS: Windows 11
  • Language: Python 3.8.6
  • IDE: PyCharm 2020.2.3
  • Deep learning framework: TensorFlow 2.10.1
  • GPU: NVIDIA GeForce RTX 4070

II. Code Implementation and Results

1. Import Libraries

from PIL import Image
import numpy as np
from pathlib import Path
import tensorflow as tf
from tensorflow.keras import layers, models, Input
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dense, Flatten, Dropout, BatchNormalization
from tqdm import tqdm
import tensorflow.keras.backend as K
import matplotlib.pyplot as plt
import warnings

# Note: VGG16 is defined from scratch below, so tensorflow.keras.applications is not imported
warnings.filterwarnings('ignore')  # suppress warnings that need not be printed

2. Configure the GPU (skip this step if you are using a CPU)

'''Setup: configure the GPU (skip this step if you are using a CPU)'''
# Check whether TensorFlow was built with CUDA, and list the available GPUs
print(tf.test.is_built_with_cuda())
gpus = tf.config.list_physical_devices("GPU")
print(gpus)
if gpus:
    gpu0 = gpus[0]  # if there are multiple GPUs, use only GPU 0
    tf.config.experimental.set_memory_growth(gpu0, True)  # allocate GPU memory on demand
    tf.config.set_visible_devices([gpu0], "GPU")

Output:

True
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
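
If the machine has more than one GPU and you want them all visible, a minimal variant (a sketch, assuming on-demand memory growth is desired on every device) is:

# Enable on-demand memory growth on every visible GPU
for gpu in tf.config.list_physical_devices("GPU"):
    tf.config.experimental.set_memory_growth(gpu, True)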

3. Import Data

'''Setup: import the data'''
data_dir = r"D:\DeepLearning\data\CatDog"
data_dir = Path(data_dir)

4. Inspect the Data

'''Setup: inspect the data'''
image_count = len(list(data_dir.glob('*/*.png')))
print("Total number of images:", image_count)
image_list = list(data_dir.glob('cat/*.png'))
image = Image.open(str(image_list[1]))
# Inspect the image object's attributes
print(image.format, image.size, image.mode)
plt.imshow(image)
plt.show()

Output:

Total number of images: 3400
JPEG (512, 512) RGB

(Note: the files carry a .png extension, yet PIL reports JPEG-encoded data.)

[Figure: a sample image from the cat class]

5. Load the Data

'''Preprocessing: load the data'''
batch_size = 64
img_height = 224
img_width = 224
"""
For a detailed introduction to image_dataset_from_directory(), see: https://mtyjkh.blog.youkuaiyun.com/article/details/117018789
"""
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="training",
    seed=12,
    image_size=(img_height, img_width),
    batch_size=batch_size)
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="validation",
    seed=12,
    image_size=(img_height, img_width),
    batch_size=batch_size)
# class_names lists the dataset's labels; labels correspond to the directory names in alphabetical order.
class_names = train_ds.class_names
print(class_names)

Output:

Found 3400 files belonging to 2 classes.
Using 2720 files for training.
Found 3400 files belonging to 2 classes.
Using 680 files for validation.
['cat', 'dog']
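
As a quick sanity check on the alphabetical label mapping (cat -> 0, dog -> 1), you can count the files per class directory; a small sketch, assuming the same .png layout used above:

# Count images per class to confirm the directory-to-label mapping
for name in class_names:
    print(name, len(list(data_dir.glob(f'{name}/*.png'))))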

6. Check the Data Again

'''Preprocessing: check the data again'''
# image_batch is a tensor of shape (64, 224, 224, 3): a batch of 64 images of size 224x224x3 (the last dimension is the RGB color channels).
# labels_batch is a tensor of shape (64,): the labels corresponding to the 64 images.
for image_batch, labels_batch in train_ds:
    print(image_batch.shape)
    print(labels_batch.shape)
    break

Output:

(64, 224, 224, 3)
(64,)

7. Configure the Dataset

AUTOTUNE = tf.data.AUTOTUNE

def preprocess_image(image, label):
    return (image / 255.0, label)

# Normalize pixel values to [0, 1]
train_ds = train_ds.map(preprocess_image, num_parallel_calls=AUTOTUNE)
val_ds   = val_ds.map(preprocess_image, num_parallel_calls=AUTOTUNE)

train_ds = train_ds.cache().shuffle(1000).prefetch(buffer_size=AUTOTUNE)
val_ds   = val_ds.cache().prefetch(buffer_size=AUTOTUNE)

If you see AttributeError: module 'tensorflow._api.v2.data' has no attribute 'AUTOTUNE', replace AUTOTUNE = tf.data.AUTOTUNE with AUTOTUNE = tf.data.experimental.AUTOTUNE; the error is caused by a TensorFlow version difference. A version-agnostic fallback is sketched below.
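
# Version-agnostic sketch: try the newer attribute first, fall back to the older one
try:
    AUTOTUNE = tf.data.AUTOTUNE
except AttributeError:
    AUTOTUNE = tf.data.experimental.AUTOTUNE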

8. Visualize the Data

'''Preprocessing: visualize the data'''
plt.figure(figsize=(18, 3))
for images, labels in train_ds.take(2):
    for i in range(8):
        ax = plt.subplot(1, 8, i + 1)
        # After the normalization above the pixels are floats in [0, 1],
        # which imshow accepts directly (casting to uint8 would blank the images)
        plt.imshow(images[i].numpy())
        plt.title(class_names[labels[i]], fontsize=40)
        plt.axis("off")
# Show the figure
plt.show()

[Figure: a row of 8 sample training images with their class labels]

9. Build the CNN

'''Build the CNN (VGG-16)'''
def VGG16(nb_classes, input_shape):
    input_tensor = Input(shape=input_shape)
    # 1st block
    x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv1')(input_tensor)
    x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv2')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool')(x)
    # 2nd block
    x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv1')(x)
    x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv2')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool')(x)
    # 3rd block
    x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv1')(x)
    x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv2')(x)
    x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv3')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool')(x)
    # 4th block
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv1')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv2')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv3')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool')(x)
    # 5th block
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv1')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv2')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv3')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool')(x)
    # fully connected layers
    x = Flatten()(x)
    x = Dense(4096, activation='relu', name='fc1')(x)
    x = Dense(4096, activation='relu', name='fc2')(x)
    output_tensor = Dense(nb_classes, activation='softmax', name='predictions')(x)

    model = Model(input_tensor, output_tensor)
    return model
model = VGG16(len(class_names), (img_width, img_height, 3))  # 2 classes (cat, dog), matching the summary below
model.summary()

The network structure is as follows:

Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, 224, 224, 3)]     0         
                                                                 
 block1_conv1 (Conv2D)       (None, 224, 224, 64)      1792      
                                                                 
 block1_conv2 (Conv2D)       (None, 224, 224, 64)      36928     
                                                                 
 block1_pool (MaxPooling2D)  (None, 112, 112, 64)      0         
                                                                 
 block2_conv1 (Conv2D)       (None, 112, 112, 128)     73856     
                                                                 
 block2_conv2 (Conv2D)       (None, 112, 112, 128)     147584    
                                                                 
 block2_pool (MaxPooling2D)  (None, 56, 56, 128)       0         
                                                                 
 block3_conv1 (Conv2D)       (None, 56, 56, 256)       295168    
                                                                 
 block3_conv2 (Conv2D)       (None, 56, 56, 256)       590080    
                                                                 
 block3_conv3 (Conv2D)       (None, 56, 56, 256)       590080    
                                                                 
 block3_pool (MaxPooling2D)  (None, 28, 28, 256)       0         
                                                                 
 block4_conv1 (Conv2D)       (None, 28, 28, 512)       1180160   
                                                                 
 block4_conv2 (Conv2D)       (None, 28, 28, 512)       2359808   
                                                                 
 block4_conv3 (Conv2D)       (None, 28, 28, 512)       2359808   
                                                                 
 block4_pool (MaxPooling2D)  (None, 14, 14, 512)       0         
                                                                 
 block5_conv1 (Conv2D)       (None, 14, 14, 512)       2359808   
                                                                 
 block5_conv2 (Conv2D)       (None, 14, 14, 512)       2359808   
                                                                 
 block5_conv3 (Conv2D)       (None, 14, 14, 512)       2359808   
                                                                 
 block5_pool (MaxPooling2D)  (None, 7, 7, 512)         0         
                                                                 
 flatten (Flatten)           (None, 25088)             0         
                                                                 
 fc1 (Dense)                 (None, 4096)              102764544 
                                                                 
 fc2 (Dense)                 (None, 4096)              16781312  
                                                                 
 predictions (Dense)         (None, 2)                 8194      
                                                                 
=================================================================
Total params: 134,268,738
Trainable params: 134,268,738
Non-trainable params: 0
_________________________________________________________________
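
The parameter counts in the summary can be checked by hand: a Conv2D layer has (kernel_height x kernel_width x input_channels + 1) x filters parameters (the +1 is the bias), and a Dense layer has (inputs + 1) x units. A few rows verified as a sketch:

# block1_conv1: 3x3 kernels over 3 input channels, 64 filters
print((3 * 3 * 3 + 1) * 64)      # 1792
# block1_conv2: 3x3 kernels over 64 channels, 64 filters
print((3 * 3 * 64 + 1) * 64)     # 36928
# fc1: 25088 flattened inputs (7*7*512) into 4096 units
print((25088 + 1) * 4096)        # 102764544
# predictions: 4096 inputs into 2 classes
print((4096 + 1) * 2)            # 8194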

10. Compile the Model

'''Compile the model'''
model.compile(optimizer="adam",
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
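
Because the training loop below overwrites the learning rate each epoch starting from 1e-4, an equivalent compile step could pass an explicit Adam optimizer with that initial rate; this is a sketch of an alternative, not what was run here:

# Alternative sketch: make the initial learning rate explicit instead of the "adam" default
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])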

11. Train the Model

'''Train the model'''
epochs = 10
lr = 1e-4

# Record training metrics for later analysis
history_train_loss = []
history_train_accuracy = []
history_val_loss = []
history_val_accuracy = []

for epoch in range(epochs):
    train_total = len(train_ds)
    val_total = len(val_ds)

    """
    total:预期的迭代数目
    ncols:控制进度条宽度
    mininterval:进度更新最小间隔,以秒为单位(默认值:0.1)
    """
    with tqdm(total=train_total, desc=f'Epoch {epoch + 1}/{epochs}', mininterval=1, ncols=120) as pbar:

        # Decay the learning rate by a factor of 0.92 at the start of each epoch
        lr = lr * 0.92
        K.set_value(model.optimizer.lr, lr)

        train_loss = []
        train_accuracy = []
        for image, label in train_ds:
            """
            训练模型,简单理解train_on_batch就是:它是比model.fit()更高级的一个用法
            """
            # 这里生成的是每一个batch的acc与loss
            history = model.train_on_batch(image, label)

            train_loss.append(history[0])
            train_accuracy.append(history[1])

            pbar.set_postfix({"train_loss": "%.4f" % history[0],
                              "train_acc": "%.4f" % history[1],
                              "lr": K.get_value(model.optimizer.lr)})
            pbar.update(1)

        history_train_loss.append(np.mean(train_loss))
        history_train_accuracy.append(np.mean(train_accuracy))

    with tqdm(total=val_total, desc=f'Epoch {epoch + 1}/{epochs}', mininterval=0.3, ncols=120) as pbar:

        val_loss = []
        val_accuracy = []
        for image, label in val_ds:
            # This returns the loss and accuracy for the current batch
            history = model.test_on_batch(image, label)

            val_loss.append(history[0])
            val_accuracy.append(history[1])

            pbar.set_postfix({"val_loss": "%.4f" % history[0],
                              "val_acc": "%.4f" % history[1]})
            pbar.update(1)
        history_val_loss.append(np.mean(val_loss))
        history_val_accuracy.append(np.mean(val_accuracy))

The training log:

Epoch 1/10: 100%|███████████████████████| 43/43 [00:39<00:00,  1.08it/s, train_loss=0.4403, train_acc=0.7500, lr=9.2e-5]
Epoch 1/10: 100%|██████████████████████████████████████| 11/11 [00:03<00:00,  2.97it/s, val_loss=0.5084, val_acc=0.7250]
Epoch 2/10: 100%|██████████████████████| 43/43 [00:16<00:00,  2.64it/s, train_loss=0.1960, train_acc=0.9375, lr=8.46e-5]
Epoch 2/10: 100%|██████████████████████████████████████| 11/11 [00:01<00:00,  7.58it/s, val_loss=0.2440, val_acc=0.9000]
Epoch 3/10: 100%|██████████████████████| 43/43 [00:16<00:00,  2.63it/s, train_loss=0.1424, train_acc=0.9531, lr=7.79e-5]
Epoch 3/10: 100%|██████████████████████████████████████| 11/11 [00:01<00:00,  7.31it/s, val_loss=0.1420, val_acc=0.9750]
Epoch 4/10: 100%|██████████████████████| 43/43 [00:16<00:00,  2.63it/s, train_loss=0.0187, train_acc=1.0000, lr=7.16e-5]
Epoch 4/10: 100%|██████████████████████████████████████| 11/11 [00:01<00:00,  7.58it/s, val_loss=0.0803, val_acc=0.9750]
Epoch 5/10: 100%|██████████████████████| 43/43 [00:16<00:00,  2.63it/s, train_loss=0.0381, train_acc=0.9844, lr=6.59e-5]
Epoch 5/10: 100%|██████████████████████████████████████| 11/11 [00:01<00:00,  7.61it/s, val_loss=0.0712, val_acc=0.9500]
Epoch 6/10: 100%|██████████████████████| 43/43 [00:16<00:00,  2.59it/s, train_loss=0.1966, train_acc=0.9375, lr=6.06e-5]
Epoch 6/10: 100%|██████████████████████████████████████| 11/11 [00:01<00:00,  7.12it/s, val_loss=0.0186, val_acc=1.0000]
Epoch 7/10: 100%|██████████████████████| 43/43 [00:16<00:00,  2.59it/s, train_loss=0.0389, train_acc=0.9688, lr=5.58e-5]
Epoch 7/10: 100%|██████████████████████████████████████| 11/11 [00:01<00:00,  7.08it/s, val_loss=0.0170, val_acc=1.0000]
Epoch 8/10: 100%|██████████████████████| 43/43 [00:16<00:00,  2.57it/s, train_loss=0.0032, train_acc=1.0000, lr=5.13e-5]
Epoch 8/10: 100%|██████████████████████████████████████| 11/11 [00:01<00:00,  7.17it/s, val_loss=0.0439, val_acc=0.9750]
Epoch 9/10: 100%|██████████████████████| 43/43 [00:16<00:00,  2.59it/s, train_loss=0.0491, train_acc=0.9688, lr=4.72e-5]
Epoch 9/10: 100%|██████████████████████████████████████| 11/11 [00:01<00:00,  7.57it/s, val_loss=0.0836, val_acc=0.9500]
Epoch 10/10: 100%|█████████████████████| 43/43 [00:16<00:00,  2.63it/s, train_loss=0.0077, train_acc=1.0000, lr=4.34e-5]
Epoch 10/10: 100%|█████████████████████████████████████| 11/11 [00:01<00:00,  7.58it/s, val_loss=0.0207, val_acc=1.0000]
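
For comparison, the same exponential decay (multiplying the learning rate by 0.92 each epoch) can be expressed with model.fit() and a LearningRateScheduler callback. This is a sketch of an equivalent loop, not the code that produced the log above; with model.fit(), history.history would also contain the 'accuracy' and 'val_accuracy' keys directly:

def scheduler(epoch, lr):
    # Mirror the manual loop: decay the learning rate by 0.92 every epoch
    return lr * 0.92

history = model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=epochs,
    callbacks=[tf.keras.callbacks.LearningRateScheduler(scheduler)])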


12. Evaluate the Model

'''Evaluate the model'''
# Use the metrics recorded by the manual training loop above;
# history.history only exists when training with model.fit()
acc = history_train_accuracy
val_acc = history_val_accuracy
loss = history_train_loss
val_loss = history_val_loss
epochs_range = range(epochs)
plt.figure(figsize=(12, 4))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')
plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.plot(epochs_range, val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()

Output:
[Figure: training/validation accuracy and loss curves over 10 epochs]

13. Predict on Specified Images

'''Predict on specified images'''
plt.figure(figsize=(18, 4))  # figure width 18, height 4
plt.suptitle("预测结果展示")  # "Prediction results"; requires the Chinese font setup from section III.2
for images, labels in val_ds.take(1):
    for i in range(8):
        ax = plt.subplot(1, 8, i + 1)
        # Show the image (already scaled to [0, 1], which imshow accepts)
        plt.imshow(images[i].numpy())
        # Add a batch dimension
        img_array = tf.expand_dims(images[i], 0)
        # Use the model to predict the image's class
        predictions = model.predict(img_array)
        plt.title(class_names[np.argmax(predictions)], fontsize=40)
        plt.axis("off")
# Show the figure
plt.show()

Output:

1/1 [==============================] - 0s 424ms/step
1/1 [==============================] - 0s 12ms/step
1/1 [==============================] - 0s 13ms/step
1/1 [==============================] - 0s 14ms/step
1/1 [==============================] - 0s 12ms/step
1/1 [==============================] - 0s 13ms/step
1/1 [==============================] - 0s 13ms/step
1/1 [==============================] - 0s 15ms/step
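
To predict a single image from disk instead of a validation batch, a minimal sketch is shown below; the file path is hypothetical, and the image must be scaled the same way as the training data:

# Hypothetical path: point this at a real image on disk
img = tf.keras.utils.load_img(r"D:\DeepLearning\data\CatDog\cat\1.png",
                              target_size=(img_height, img_width))
img_array = tf.keras.utils.img_to_array(img) / 255.0  # same [0, 1] scaling as training
img_array = tf.expand_dims(img_array, 0)              # add a batch dimension
predictions = model.predict(img_array)
print(class_names[np.argmax(predictions)])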

III. Knowledge Points in Detail

1. Titles in matplotlib.pyplot

1.1 Setting a subplot title with title()

title() can display up to three titles on one subplot at the same time: center, left, and right.
Function signature: matplotlib.pyplot.title(label, fontdict=None, loc=None, pad=None, *, y=None, **kwargs)
Parameters and values:

  • label: string; the title text.
  • fontdict: dict; controls the font properties of the text. Default:
{'fontsize': rcParams['axes.titlesize'],
 'fontweight': rcParams['axes.titleweight'],
 'color': rcParams['axes.titlecolor'],
 'verticalalignment': 'baseline',
 'horizontalalignment': loc}
  • loc: one of {'left', 'center', 'right'}; default rcParams["axes.titlelocation"] ('center'); the position of the title.
  • pad: float; default rcParams["axes.titlepad"] (6.0); the offset between the title and the top of the axes, in points.
  • y: float; default rcParams["axes.titley"] (None); the vertical position of the title within the subplot, as a fraction of the subplot height (1.0 is the very top). The default None positions the title automatically to avoid overlapping other elements.
  • **kwargs: Text object keyword properties controlling the appearance of the text, such as font and color.

The return value is a Text object.

rcParams relevant to title():

#axes.titlelocation: center  # alignment of the title: {left, right, center}
#axes.titlesize:     large   # fontsize of the axes title
#axes.titleweight:   normal  # font weight of title
#axes.titlecolor:    auto    # color of the axes title, auto falls back to
                             # text.color as default value
#axes.titley:        None    # position title (axes relative units).  None implies auto
#axes.titlepad:      6.0     # pad between axes and title in points

The underlying Axes methods are:

Axes.set_title(self, label, fontdict=None, loc=None, pad=None, *, y=None, **kwargs)
Axes.get_title(self, loc='center'): note that this returns the title text at the specified position.

Example
Set all three subplot titles at once.

import matplotlib.pyplot as plt

# Note: a subplot can show center, left, and right titles at the same time
plt.plot([1, 1])
# Show a subplot title at the bottom right
plt.title("right bottom", y=0, loc='right')
# Show a subplot title at the top left
plt.title("left top", y=1, loc='left')
# Show the default (center) subplot title
plt.title("default")
plt.show()

[Figure: a subplot with titles at the bottom right, top left, and center]

1.2 Setting a figure title with suptitle()

Adds a centered title to the figure.
Function signature: matplotlib.pyplot.suptitle(t, **kwargs)
Parameters and values:

  • t: string; the title text.
  • x: float; the horizontal position of the title within the figure; default 0.5.
  • y: float; the vertical position of the title within the figure; default 0.98.
  • fontdict: dict; controls the font properties of the text. Default:
{'fontsize': rcParams['axes.titlesize'],
 'fontweight': rcParams['axes.titleweight'],
 'color': rcParams['axes.titlecolor'],
 'verticalalignment': 'baseline',
 'horizontalalignment': loc}
  • horizontalalignment, ha: string in {'center', 'left', 'right'}; default 'center'; horizontal alignment relative to (x, y).
  • verticalalignment, va: string in {'top', 'center', 'bottom', 'baseline'}; default 'top'; vertical alignment relative to (x, y).
  • fontsize, size: float or one of {'xx-small', 'x-small', 'small', 'medium', 'large', 'x-large', 'xx-large'}; default rcParams["figure.titlesize"] ('large'); the font size of the text.
  • fontweight, weight: the font weight of the text; see the documentation for accepted values.
  • **kwargs: Text object keyword properties controlling the appearance of the text, such as font and color.

The return value is a Text object.

rcParams relevant to suptitle():

#figure.titlesize:   large     # size of the figure title (``Figure.suptitle()``)
#figure.titleweight: normal    # weight of the figure title

Example
Add a figure title and set its position, font size, text color, and other properties.

import matplotlib.pyplot as plt

plt.plot([1, 1])
plt.title("title")
plt.suptitle("suptitle", x=0.1, y=0.98, fontsize=16, color='red')

plt.show()

[Figure: a plot with a centered subplot title and a red suptitle at the top left]
Reference: Titles in matplotlib.pyplot (title() and suptitle())

2. Setting Chinese Fonts in matplotlib.pyplot

Method 1:

import matplotlib
matplotlib.rc("font", family='DengXian')  # set a Chinese font for figure text

Method 2:

import matplotlib.pyplot as plt
# Chinese text support
plt.rcParams['font.sans-serif'] = ['SimHei']  # render Chinese labels correctly
plt.rcParams['axes.unicode_minus'] = False  # render the minus sign correctly
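
Method 3 (a sketch): set the font per text element with FontProperties instead of changing global rcParams; the font file path below assumes a standard Windows installation of SimHei:

import matplotlib.pyplot as plt
from matplotlib.font_manager import FontProperties

# Assumed path to SimHei on a standard Windows install
font = FontProperties(fname=r"C:\Windows\Fonts\simhei.ttf")
plt.plot([1, 1])
plt.title("预测结果展示", fontproperties=font)  # only this title uses the Chinese font
plt.show()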

Summary

In this exercise, adjusting the batch size appropriately helped improve the model's performance; parameters need to be tuned for each dataset to achieve good results.
