Building a Pretrained VGG16 Model: Batch Image Prediction, Class Activation Maps, and ROC Curves

Author: 努力学计算机的菜鸡

License: CC 4.0 BY-SA

Tags: deep learning, machine learning, neural networks, CNN, classification

Article link: https://blog.youkuaiyun.com/zxm214csdn/article/details/130043607

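Summary: This post shows how to use a pretrained VGG16 model for feature extraction and fine-tuning when only a limited number of images is available for classification. Working in TensorFlow 2.0 and using ImageDataGenerator to handle the data, it walks through building both the feature-extraction model and the fine-tuned model, including compiling, training, and evaluation. It also covers batch prediction on new images, generating class activation maps, and plotting ROC curves to assess model performance.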

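As discussed in the earlier post "Building a convolutional neural network, predicting new images, and class activation maps: improving the interpretability of CNN models", a CNN classifier is severely limited when only a small number of images is available. This post therefore introduces a common remedy: using a pretrained model, that is, a model that has already been trained on a large dataset.

A natural question is whether an already-trained model can be applied to a new image classification problem. The underlying assumption is that when the original dataset is large and general enough, the spatial hierarchy of features learned by the pretrained model can serve as a generic model of the visual world. These features can then be reused for different computer vision problems, even when the new problem is completely different from the original one. In short, the features learned on the very large original dataset are portable.

This post builds VGG16 in two ways (feature extraction and fine-tuning), then runs predictions on a batch of new images and produces their class activation maps and ROC curves. The data and code used here are available at: https://pan.baidu.com/s/1MKkFTUAWnm2EyJSrKm5yoA
Extraction code: 2o4n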

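I. Building two pretrained VGG16 models: feature extraction and fine-tuning

Feature extraction, as the name suggests, uses what the already-trained VGG16 network has learned to extract features from new samples; these features are then fed into a new classifier that is trained from scratch. Fine-tuning freezes the VGG16 base used for feature extraction, "unfreezes" only its top few layers, and trains those unfrozen layers jointly with the newly added part (the fully connected classifier).
Environment: TensorFlow 2.0, Jupyter Notebook, Python 3.7

1. VGG16 for feature extraction

To use the pretrained VGG16 model, download the trained VGG16 weights in advance; they are available at the link above. Feature extraction with VGG16 involves four steps: (1) load the pretrained VGG16, (2) load and preprocess the data and extract features, (3) compile and train the model, (4) plot the training results.
1.1 Loading the pretrained VGG16 model
Only the feature-extraction part of VGG16 is loaded, i.e. the structure and weights of its convolutional and pooling layers (include_top=False). The fully connected layers used for classification have to be trained anew.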

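from tensorflow.keras.applications import VGG16

# Local path to the downloaded VGG16 weights file
path='C:/Users/D/python基础学习/VGG16_weights.h5'
# include_top=False drops the fully connected classifier and keeps only the convolutional base
conv_base=VGG16(weights=path,include_top=False,input_shape=(150,150,3))
conv_base.summary()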

1.2 Loading the data and extracting features

import os
import numpy as np
from keras.preprocessing.image import ImageDataGenerator

nb_train_samples = 600
nb_validation_samples = 161
nb_test_samples = 35
epochs = 50
batch_size = 16
num_of_categories = 3

# Dataset split into train / validation / test directories
base_dir = 'D:/AM Learning/data/kaggle/3categories'

train_dir = os.path.join(base_dir, 'train')
validation_dir = os.path.join(base_dir, 'validation')
test_dir = os.path.join(base_dir, 'test')

datagen = ImageDataGenerator(rescale=1./255)

# Run every image through the convolutional base and collect the resulting feature maps
def extract_features(directory, sample_count):
    features = np.zeros(shape=(sample_count, 4, 4, 512))
    labels = np.zeros(shape=(sample_count, num_of_categories))
    generator = datagen.flow_from_directory(directory,
                                            target_size=(150, 150),
                                            batch_size=batch_size,
                                            class_mode='categorical')
    i = 0
    for inputs_batch, labels_batch in generator:
        features_batch = conv_base.predict(inputs_batch)
        features[i * batch_size : (i + 1) * batch_size] = features_batch
        labels[i * batch_size : (i + 1) * batch_size] = labels_batch
        i += 1
        if i * batch_size >= sample_count:
            # Generators yield data indefinitely in a loop,
            # so we must break once every image has been seen once.
            break
    return features, labels

train_features, train_labels = extract_features(train_dir, nb_train_samples)
validation_features, validation_labels = extract_features(validation_dir, nb_validation_samples)
test_features, test_labels = extract_features(test_dir, nb_test_samples)

# The features are fed to a densely connected classifier, so flatten them to (samples, 8192)
train_features = np.reshape(train_features, (nb_train_samples, 4 * 4 * 512))
validation_features = np.reshape(validation_features, (nb_validation_samples, 4 * 4 * 512))
test_features = np.reshape(test_features, (nb_test_samples, 4 * 4 * 512))
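
The (4, 4, 512) feature shape and the flattened size of 8192 used above follow from the VGG16 architecture. The short sketch below reproduces that arithmetic; the helper name vgg16_feature_shape is made up for illustration, and it assumes the standard VGG16 layout (convolutions with 'same' padding, five 2x2 max-pooling layers, 512 channels in the last block).

```python
def vgg16_feature_shape(height, width, n_pool=5, channels=512):
    """Output shape of the VGG16 convolutional base for a given input size.

    Assumption: the 3x3 convolutions keep the spatial size ('same' padding),
    and each of the five 2x2 max-pooling layers halves it, rounding down.
    """
    for _ in range(n_pool):
        height //= 2
        width //= 2
    return height, width, channels


shape = vgg16_feature_shape(150, 150)      # 150 -> 75 -> 37 -> 18 -> 9 -> 4
flat = shape[0] * shape[1] * shape[2]      # size after flattening
print(shape, flat)                         # (4, 4, 512) 8192
```

If the input size is changed, the shapes in extract_features and in the reshape calls need to change accordingly; for a 224x224 input the same arithmetic gives (7, 7, 512).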

1.3 Compiling and training the model
This part adds fully connected layers as the new classifier, which can then be trained and tuned.

from keras import models
from keras import layers
from keras import optimizers

# New fully connected classifier on top of the extracted features
model = models.Sequential()
model.add(layers.Dense(256, input_dim=4 * 4 * 512))
model.add(layers.BatchNormalization())
model.add(layers.Activation('relu'))
#model.add(layers.Dropout(0.2))
# Use sigmoid for binary classification and softmax for multi-class classification
model.add(layers.Dense(3, activation='softmax'))
# Compile (learning_rate replaces the deprecated lr argument)
model.compile(optimizer=optimizers.RMSprop(learning_rate=1e-4),
              loss='categorical_crossentropy',
              metrics=['acc'])
# Train
history = model.fit(train_features, train_labels,
                    epochs=epochs,
                    batch_size=batch_size,
                    validation_data=(validation_features, validation_labels))
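
To make the softmax output layer and the categorical_crossentropy loss used above more concrete, here is a minimal pure-Python sketch of what the two compute for a single sample. The logit values are made up for illustration, and the function names are not Keras API; Keras performs the same calculation on whole batches.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)                                  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_crossentropy(one_hot, probs, eps=1e-7):
    """Loss for one sample: minus the log-probability assigned to the true class."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot, probs))


# Made-up scores for the three classes (cat, dog, rabbit)
probs = softmax([2.0, 1.0, 0.1])
loss = categorical_crossentropy([1, 0, 0], probs)    # true class: cat (label 0)
print(probs, loss)
```

The loss shrinks toward 0 as the probability of the true class approaches 1, which is why it pairs naturally with one-hot labels from class_mode='categorical'.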

1.4 Plotting the training results
The training curves can be used to tune the hyperparameters and improve the accuracy of the image classifier.

import matplotlib.pyplot as plt

acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']
# Use a separate name so the integer epochs setting is not overwritten
epoch_range = range(1, len(acc)+1)
plt.plot(epoch_range, acc, color='steelblue',linestyle='-',marker = 'o',linewidth=2, label = 'Training accuracy')
plt.plot(epoch_range, val_acc, color='cornflowerblue',linewidth=2, linestyle='-.',label = 'Validation accuracy')
plt.title('Accuracy vs. training epoch',fontsize=16)
plt.tick_params(labelsize=12)
plt.xlabel('Epoch',fontsize=16)
plt.ylabel('Accuracy',fontsize=16)
plt.legend()
plt.figure()
plt.plot(epoch_range, loss, color='tomato',linestyle='-',marker = 'o', linewidth=2,label = 'Training loss')
plt.plot(epoch_range, val_loss, color='salmon',linewidth=2,linestyle='-.', label = 'Validation loss')
plt.xlabel('Epoch',fontsize=16)
plt.ylabel('Loss',fontsize=16)
plt.title('Loss vs. training epoch',fontsize=16)
plt.tick_params(labelsize=12)
plt.legend()
plt.show()
# Save the model so it can be used to classify new images
model.save('3animals1.h5')

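2. Fine-tuning the VGG16 model

Fine-tuning VGG16 differs little from using VGG16 for feature extraction, and the steps are similar, so they are not repeated here. The one point to note is that fine-tuning also makes the last few convolutional and pooling layers trainable, so the layers to be trained must be explicitly "unfrozen".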

# Load the data
import os

# Training and validation sets, given by their storage paths
train_dir = 'D:/AM Learning/data/kaggle/3categories/train'
validation_dir = 'D:/AM Learning/data/kaggle/3categories/validation'

# Number of images in the training and validation sets
nb_train_samples = 600
nb_validation_samples = 161

epochs = 50
batch_size = 16

# Image preprocessing
from keras.preprocessing.image import ImageDataGenerator

# Data augmentation: random transformations applied to the training images
train_datagen = ImageDataGenerator(rescale=1./255,rotation_range =40,
                                   width_shift_range=0.2, height_shift_range=0.2,
                                   shear_range=0.2,zoom_range=0.2,horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1./255)                               # only rescale pixel values to [0, 1]

train_generator = train_datagen.flow_from_directory(train_dir,
                                                    target_size=(150, 150),     # image size
                                                    batch_size=batch_size,
                                                    class_mode='categorical')   # one-hot class labels

validation_generator = test_datagen.flow_from_directory(validation_dir,
                                                        target_size=(150, 150),
                                                        batch_size=batch_size,
                                                        class_mode='categorical')


# Build the model with the functional Model() API so class activation maps can be computed later
from keras import layers
from keras import optimizers
from keras.models import Model
from keras.applications import VGG16

# Load the VGG16 weights
path='C:/Users/D/python基础学习/VGG16_weights.h5'
conv_base=VGG16(weights=path,include_top=False,input_shape=(150,150,3))
#conv_base.trainable=False

# New classifier on top of the convolutional base: Flatten -> Dense(256) -> Dense(3)
mid1_layer=layers.Flatten()
mid1_layer_tensor=mid1_layer(conv_base.output)
mid2_layer=layers.Dense(256, activation='relu')
mid2_layer_tensor=mid2_layer(mid1_layer_tensor)
# mid3_layer=layers.Dropout(0.2)
# mid3_layer_tensor=mid3_layer(mid2_layer_tensor)
output_layer=layers.Dense(3,activation='softmax')
output_layer_tensor=output_layer(mid2_layer_tensor)
model = Model(inputs=conv_base.input,outputs=output_layer_tensor)

# First train only the new classifier: freeze every layer except the last two (the Dense layers)
conv_base.trainable = True
for layer in model.layers[:-2]:
    layer.trainable = False
model.compile(loss="categorical_crossentropy",
              optimizer=optimizers.RMSprop(learning_rate=1e-4),
              metrics=["acc"])

model.summary()

# Draw the model graph (plot_model needs pydot and Graphviz to be installed)
from tensorflow.keras import utils
utils.plot_model(model)

# Fit the model with the batch generators
history = model.fit_generator(train_generator,
                              steps_per_epoch=len(train_generator),        # whole number of batches per epoch
                              epochs=epochs,
                              validation_data=validation_generator,
                              validation_steps=len(validation_generator))
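
The steps_per_epoch and validation_steps arguments count batches, so they are expected to be whole numbers, while 600 / 16 and 161 / 16 are not. The sketch below shows the rounding with the sample counts from this post; steps_for is a hypothetical helper name, not part of Keras.

```python
import math


def steps_for(n_samples, batch_size):
    """Number of batches needed to see every sample once (last batch may be smaller)."""
    return math.ceil(n_samples / batch_size)


train_steps = steps_for(600, 16)        # 37 full batches + 1 partial batch
validation_steps = steps_for(161, 16)   # 10 full batches + 1 batch with a single image
print(train_steps, validation_steps)    # 38 11
```

Rounding up covers every image once per epoch; rounding down (600 // 16) would skip the final partial batch instead.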

import matplotlib.pyplot as plt

acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']
# Use a separate name so the integer epochs setting is not overwritten
epoch_range = range(1, len(acc)+1)
plt.plot(epoch_range, acc, color='steelblue',linestyle='-',marker = 'o',linewidth=2, label = 'Training accuracy')
plt.plot(epoch_range, val_acc, color='cornflowerblue',linewidth=2, linestyle='-.',label = 'Validation accuracy')
plt.title('Accuracy vs. training epoch',fontsize=16)
plt.tick_params(labelsize=12)
plt.xlabel('Epoch',fontsize=16)
plt.ylabel('Accuracy',fontsize=16)
plt.legend()
plt.figure()
plt.plot(epoch_range, loss, color='tomato',linestyle='-',marker = 'o', linewidth=2,label = 'Training loss')
plt.plot(epoch_range, val_loss, color='salmon',linewidth=2,linestyle='-.', label = 'Validation loss')
plt.xlabel('Epoch',fontsize=16)
plt.ylabel('Loss',fontsize=16)
plt.title('Loss vs. training epoch',fontsize=16)
plt.tick_params(labelsize=12)
plt.legend()
plt.show()

# Fine-tune the last layers of the convolutional base
conv_base.trainable = True
for layer in conv_base.layers[:-2]:
    layer.trainable = False
# Recompile so the changed trainable flags take effect
model.compile(loss="categorical_crossentropy",
              optimizer=optimizers.RMSprop(learning_rate=1e-4),
              metrics=["acc"])

# Train again after recompiling. This call is not in the listing as published;
# it is assumed here (with the same settings as the first run) because the plots below need a new history.
history = model.fit_generator(train_generator,
                              steps_per_epoch=len(train_generator),
                              epochs=50,
                              validation_data=validation_generator,
                              validation_steps=len(validation_generator))

import matplotlib.pyplot as plt

acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']
epoch_range = range(1, len(acc)+1)
plt.plot(epoch_range, acc, color='steelblue',linestyle='-',marker = 'o',linewidth=2, label = 'Training accuracy')
plt.plot(epoch_range, val_acc, color='cornflowerblue',linewidth=2, linestyle='-.',label = 'Validation accuracy')
plt.title('Accuracy vs. training epoch',fontsize=16)
plt.tick_params(labelsize=12)
plt.xlabel('Epoch',fontsize=16)
plt.ylabel('Accuracy',fontsize=16)
plt.legend()
plt.figure()
plt.plot(epoch_range, loss, color='tomato',linestyle='-',marker = 'o', linewidth=2,label = 'Training loss')
plt.plot(epoch_range, val_loss, color='salmon',linewidth=2,linestyle='-.', label = 'Validation loss')
plt.xlabel('Epoch',fontsize=16)
plt.ylabel('Loss',fontsize=16)
plt.title('Loss vs. training epoch',fontsize=16)
plt.tick_params(labelsize=12)
plt.legend()
plt.show()

# Save the fine-tuned model for predicting new images
model.save('3animals0.h5')
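
It can help to see exactly which layers the slice conv_base.layers[:-2] leaves trainable. The sketch below applies the same slicing to a plain list of layer names; the names follow the standard VGG16 layout with include_top=False, except that "input" is a placeholder for the input layer, whose real name is generated by Keras.

```python
# Layer order of the VGG16 convolutional base (include_top=False).
# "input" is a placeholder name; the block*_ names are the standard VGG16 layer names.
vgg16_base_layers = [
    "input",
    "block1_conv1", "block1_conv2", "block1_pool",
    "block2_conv1", "block2_conv2", "block2_pool",
    "block3_conv1", "block3_conv2", "block3_conv3", "block3_pool",
    "block4_conv1", "block4_conv2", "block4_conv3", "block4_pool",
    "block5_conv1", "block5_conv2", "block5_conv3", "block5_pool",
]

frozen = vgg16_base_layers[:-2]        # same slice as conv_base.layers[:-2]
trainable = vgg16_base_layers[-2:]     # what stays unfrozen

# Pooling layers have no weights, so only convolutional layers are actually updated
updated = [name for name in trainable if "conv" in name]

print(trainable)   # ['block5_conv3', 'block5_pool']
print(updated)     # ['block5_conv3']
```

With [:-2], only block5_conv3 in the base is actually updated alongside the new Dense layers. Unfreezing the whole of block 5 would correspond to the slice [:-4], a variation not used in this post.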

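II. Batch image prediction and class activation maps

Many image classification tutorials online cover only building and training the model; few explain how to apply the trained model to new images, and examples that run batch prediction and output class activation maps in bulk are rarer still. This section therefore briefly shows how to classify a batch of images and produce their activation maps. The model used is the fine-tuned VGG16 described above; the feature-extraction version is not suitable for class activation map analysis because its VGG16 convolutional layers are frozen.
Because of space constraints, only the key code is given; the complete code is available at the link at the beginning of the post.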

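import cv2
import numpy as np
import matplotlib.pyplot as plt
import keras
from keras import backend as K
from keras.preprocessing.image import img_to_array
from imutils import paths                                   # assumed source of paths.list_images

# Note: K.gradients needs graph mode; with tf.keras under eager execution, tf.GradientTape is used instead.

test_path = 'D:/AM Learning/data/kaggle/3categories/test'   # directory of images to predict
image_paths = sorted(list(paths.list_images(test_path)))    # collect all image paths
model = keras.models.load_model('3animals0.h5')             # load the fine-tuned model saved above
for each in image_paths:                                    # read and predict each image in turn
    image = cv2.imread(each)
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)          # OpenCV loads BGR; the training generators fed RGB
    image = cv2.resize(image,(150,150))
    image = img_to_array(image)/255.0
    image = np.expand_dims(image,axis=0)
    result = model.predict(image)                           # class probabilities
    for i in result[0]:
        print('percent:{:.10f}'.format(i))
    proba = np.max(result)                                  # highest class probability
    index = np.argmax(result[0])                            # predicted class index
    cat_output = model.output[:, index]

    # Output feature map of the last convolutional layer, block5_conv3.
    # For another CNN, replace this with the name of its last convolutional layer.
    last_conv_layer = model.get_layer('block5_conv3')

    # Gradient of the predicted class with respect to the block5_conv3 output feature map
    grads = K.gradients(cat_output, last_conv_layer.output)[0]

    # Vector of shape (512,): mean gradient of each feature-map channel
    pooled_grads = K.mean(grads, axis=(0, 1, 2))

    # Function returning pooled_grads and the block5_conv3 feature map for a given input image
    iterate = K.function([model.input], [pooled_grads, last_conv_layer.output[0]])

    # For the sample image, both values are NumPy arrays
    pooled_grads_value, conv_layer_output_value = iterate([image])

    # Weight each channel of the feature map by "how important this channel is for the class"
    for i in range(conv_layer_output_value.shape[-1]):
        conv_layer_output_value[:, :, i] *= pooled_grads_value[i]
    # The channel-wise mean of the weighted feature map is the class activation heatmap
    heatmap = np.mean(conv_layer_output_value, axis=-1)
    max_heat = np.max(heatmap)
    if max_heat == 0:
        max_heat = 1e-10
    heatmap /= max_heat
    plt.matshow(heatmap)
    plt.show()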
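
The channel-weighting step in the loop above can be followed on a tiny example. The sketch below is a NumPy-only illustration on made-up 2x2 feature maps with two channels; in the real model, block5_conv3 produces a 9x9x512 feature map for a 150x150 input. The function name grad_cam_heatmap is hypothetical, and the clamp of negative values to zero is a common Grad-CAM step added here as an assumption; the loop above does not include it.

```python
import numpy as np


def grad_cam_heatmap(feature_map, grads):
    """Combine a feature map and its gradients, both of shape (H, W, C), into an (H, W) heatmap."""
    weights = grads.mean(axis=(0, 1))                 # (C,): importance of each channel
    heatmap = (feature_map * weights).mean(axis=-1)   # weight the channels, then average them
    heatmap = np.maximum(heatmap, 0)                  # assumption: keep only positive evidence
    peak = heatmap.max()
    return heatmap / peak if peak > 0 else heatmap    # scale to [0, 1] when possible


# Made-up data: channel 0 fires top-left, channel 1 fires bottom-right
fmap = np.zeros((2, 2, 2))
fmap[0, 0, 0] = 1.0
fmap[1, 1, 1] = 1.0

# Made-up gradients: channel 0 supports the class, channel 1 counts against it
grads = np.zeros((2, 2, 2))
grads[:, :, 0] = 4.0
grads[:, :, 1] = -1.0

heatmap = grad_cam_heatmap(fmap, grads)
print(heatmap)
```

Only the top-left cell ends up highlighted, because it is the only location where a positively weighted channel is active. To overlay such a heatmap on the original photo it would first be resized to the image size.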

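Running this produces output like the figure below, which is the basis for the ROC curves drawn later. First, the probability of the image belonging to each class is printed; there are three classes here, cat, dog, and rabbit, labeled 0, 1, and 2 respectively. Next comes the class activation map, which shows how much attention different features received during classification. Finally, the original class of the image, its predicted class, and the predicted probability are reported.
[Figure: example output for one image, with class probabilities and class activation map]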

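III. Plotting the receiver operating characteristic (ROC) curve

The receiver operating characteristic (ROC) curve is widely used for classification tasks, and the area under the ROC curve (AUC) serves as a measure of classification performance. An AUC of 0.5 corresponds to random guessing, i.e. no discriminative ability; the closer the AUC is to 1, the better the classifier, and an AUC of 1 means perfect separation. To plot the ROC curve, the batch prediction results above first need to be collected into a table like the one shown below. The ROC curve is then drawn from that table; it shows the trade-off between the true positive rate (TPR) and the false positive rate (FPR). For details on TPR and FPR, please refer to other articles.
[Figure: table of true labels and predicted class probabilities for each test image]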
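
One possible way to assemble such a table from the batch predictions is sketched below with the standard csv module. The probabilities and labels are made up, and the column names other than y_real are hypothetical; the post itself reads the table from an Excel file, so this is only an illustration of the layout.

```python
import csv
import io


def build_rows(true_labels, probabilities):
    """One row per image: true label, predicted label, then the class probabilities."""
    rows = []
    for y_real, probs in zip(true_labels, probabilities):
        y_pred = max(range(len(probs)), key=probs.__getitem__)   # index of the largest probability
        rows.append([y_real, y_pred] + list(probs))
    return rows


# Hypothetical column names (only y_real appears in the post's code)
header = ["y_real", "y_pred", "p_cat", "p_dog", "p_rabbit"]

# Made-up predictions for three images
true_labels = [0, 1, 2]
probabilities = [[0.7, 0.2, 0.1], [0.3, 0.6, 0.1], [0.5, 0.1, 0.4]]
rows = build_rows(true_labels, probabilities)

# Written to an in-memory buffer here; open("predictions.csv", "w", newline="") would write a file
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(header)
writer.writerows(rows)
table_text = buffer.getvalue()
print(table_text)
```

In the batch prediction loop, each result[0] would supply one row of probabilities, and the true label would come from the folder the image was read from.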
Part of the code for plotting the ROC curve is given below; the complete code is available at the link:

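import pandas as pd
from sklearn.metrics import roc_curve, auc
from keras.utils import to_categorical                 # assumed source of to_categorical

# Raw string so the backslashes in the Windows path are not treated as escape sequences
data=pd.read_excel(r"D:\AM Learning\data\single_track\模型调优参数.xlsx", sheet_name='Sheet1')
true_y=data['y_real'].to_numpy()
true_y=to_categorical(true_y)                          # one-hot encode the true labels
# Spreadsheet columns holding the predicted probabilities for cat, dog, and rabbit
PM_y=data[['0（猫）','狗（1）','兔子（3）']].to_numpy()
PM_y.shape
n_classes=PM_y.shape[1]
fpr = dict()
tpr = dict()
roc_auc = dict()
# One-vs-rest ROC curve and AUC for each class
for i in range(n_classes):
    fpr[i], tpr[i], _ = roc_curve(true_y[:, i], PM_y[:, i])
    roc_auc[i] = auc(fpr[i], tpr[i])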
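
For readers who want to see what the per-class loop computes, here is a simplified pure-Python sketch of one ROC curve: the threshold is swept over the predicted scores, and TPR and FPR are recorded at each step. The labels and scores are made up, and roc_points and trapezoid_auc are illustrative names; roc_curve and auc in scikit-learn do the same job with more care (for example, dropping redundant points).

```python
def roc_points(y_true, scores):
    """Sweep the decision threshold from high to low and collect (FPR, TPR) points.

    y_true holds 0/1 labels for one class against the rest; both classes must be present.
    """
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    fpr, tpr = [0.0], [0.0]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 0)
        tpr.append(tp / n_pos)      # true positive rate: share of positives caught
        fpr.append(fp / n_neg)      # false positive rate: share of negatives wrongly flagged
    return fpr, tpr


def trapezoid_auc(fpr, tpr):
    """Area under the curve by the trapezoidal rule."""
    return sum((fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2
               for i in range(1, len(fpr)))


# Made-up example: 1 = "is a cat", 0 = "is not a cat", scores = predicted cat probability
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
fpr_demo, tpr_demo = roc_points(y_true, scores)
area = trapezoid_auc(fpr_demo, tpr_demo)
print(area)
```

Here eight of the nine (positive, negative) pairs are ranked correctly, so the area is 8/9; a classifier that ranks every positive above every negative reaches 1.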

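import numpy as np

# Collect every false positive rate (FPR) value across the classes
all_fpr = np.unique(np.concatenate([fpr[i] for i in range(n_classes)]))

# Interpolate each class's ROC curve at these points
# (np.interp is used because scipy.interp was only a deprecated alias of it)
mean_tpr = np.zeros_like(all_fpr)
for i in range(n_classes):
    mean_tpr += np.interp(all_fpr, fpr[i], tpr[i])

# Average the curves and compute the macro-average AUC
mean_tpr /= n_classes

fpr["macro"] = all_fpr
tpr["macro"] = mean_tpr
roc_auc["macro"] = auc(fpr["macro"], tpr["macro"])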
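
The macro-averaging step is easier to follow on two small made-up curves. The sketch below repeats the same procedure with NumPy only (the variable names with a demo_ prefix are illustrative): the class curves are sampled on a shared FPR grid with np.interp, averaged point by point, and the area is taken with the trapezoidal rule.

```python
import numpy as np

# Two made-up per-class ROC curves given as (FPR, TPR) points
demo_fpr = {0: np.array([0.0, 0.5, 1.0]), 1: np.array([0.0, 0.25, 1.0])}
demo_tpr = {0: np.array([0.0, 1.0, 1.0]), 1: np.array([0.0, 0.5, 1.0])}

# Shared grid: every FPR value that occurs in any class curve
grid = np.unique(np.concatenate([demo_fpr[i] for i in demo_fpr]))

# Sample each curve on the grid by linear interpolation, then average
demo_mean_tpr = np.zeros_like(grid)
for i in demo_fpr:
    demo_mean_tpr += np.interp(grid, demo_fpr[i], demo_tpr[i])
demo_mean_tpr /= len(demo_fpr)

# Trapezoidal area under the averaged curve
macro_auc = np.sum(np.diff(grid) * (demo_mean_tpr[1:] + demo_mean_tpr[:-1]) / 2)
print(grid, demo_mean_tpr, macro_auc)
```

In this example the macro-average area (0.6875) equals the mean of the two class areas (0.75 and 0.625), as expected when the curves are piecewise linear and the grid contains all their breakpoints.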

import matplotlib.pyplot as plt
from itertools import cycle

# Plot all ROC curves
lw = 2
plt.figure()
labels=['Class 0','Class 1','Class 2']
plt.plot(fpr["macro"], tpr["macro"],
         label='Macro-average ROC curve (area = {0:0.4f})'
               ''.format(roc_auc["macro"]),
         color='navy', linestyle=':', linewidth=4)

colors = cycle(['aqua', 'darkorange','blue'])
for i, color in zip(range(n_classes), colors):
    plt.plot(fpr[i], tpr[i], color=color, lw=lw,
             label=labels[i]+' (area = {0:0.4f})'.format(roc_auc[i]))

plt.xlabel('False positive rate')
plt.ylabel('True positive rate')
plt.legend(loc='lower right')
plt.show()

The result is an ROC plot similar to the one below:
[Figure: ROC curves for the three classes and their macro-average]

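Conclusion

That concludes the post. Using three-class classification of cat, dog, and rabbit images as an example, it gave a brief introduction to building, training, and testing the pretrained VGG16 network and to presenting the results. Because the focus was on the overall workflow, classification accuracy was not a strict requirement, and the prediction accuracy still has room for improvement. The author's experience is limited, so errors are likely; discussion and corrections are welcome. The post draws on the book "Deep Learning with Python" (《python深度学习》) and on ideas from other users online; if anything infringes your rights, please contact the author.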