08 Practice: Building VGG16 with the Sequential Model for Cat vs. Dog Classification

This article shows how to use the VGG16 model for a cat vs. dog classification task. We first build the VGG16 model, then use Keras for data augmentation and preprocessing, then train and save the model, and finally test it, reaching 91.5% accuracy.


1. Building the VGG16 Model

This section builds a general-purpose VGG16 model, following the model-building examples in the official Keras source code.

I think studying the model-building examples in the official Keras code is well worth the effort: reading the source teaches not only how to use the framework, but also the authors' coding style and conventions.

First, import the functions we need:

from tensorflow.keras.layers import *
from tensorflow.keras.models import *
from tensorflow.keras.optimizers import *
from tensorflow.keras.utils import *
import os

Download links for the pretrained weight files. If downloading through the code is slow, paste the links below into a download manager (e.g., Thunder), which is usually much faster:


VGG16_WEIGHTS_PATH = ('https://github.com/fchollet/deep-learning-models/'
                'releases/download/v0.1/'
                'vgg16_weights_tf_dim_ordering_tf_kernels.h5')
VGG16_WEIGHTS_PATH_NO_TOP = ('https://github.com/fchollet/deep-learning-models/'
                       'releases/download/v0.1/'
                       'vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5')

Define the function that builds the VGG16 model:

def vgg16(include_top=True, weights='imagenet',
          input_shape=None, pooling=None, classes=1000):
    """Build the VGG16 model as a tf.keras Sequential model.

    # Arguments
        include_top: whether to include the 3 fully connected layers at the
            top of the network; defaults to True.
        weights: one of 'imagenet' (the default, pre-trained on ImageNet),
            None (random weight initialization), or the path to a weights file.
        input_shape: input shape tuple. Should be (224, 224, 3) when
            include_top is True; otherwise it can be customized, since the
            input image size determines the number of fully connected
            parameters.
        pooling: optional global pooling mode applied when include_top is
            False ('avg' or 'max').
        classes: number of output classes.
    # Returns
        A tf.keras Sequential model instance.
    # Raises
        ValueError: raised for invalid argument combinations.
    """

    # Validate the weights argument; raise if it is not usable
    if not (weights in {'imagenet', None} or os.path.exists(weights)):
        raise ValueError("the input of weights is not valid")

    # include_top and classes must be consistent with the pretrained weights
    if weights == 'imagenet' and include_top and classes != 1000:
        raise ValueError("if using weights='imagenet' and include_top=True, "
                         "classes should be 1000.")

    # Define the model
    model = Sequential()

    # Block 1
    model.add(Conv2D(input_shape=input_shape,filters=64,kernel_size=3,
                            strides=1,padding='same',activation='relu',name='block1_conv1'))
    model.add(Conv2D(64,3,strides=1,padding='same',activation='relu',name='block1_conv2'))
    model.add(MaxPooling2D(2,2,'same',name='block1_maxpool'))

    # Block 2
    model.add(Conv2D(128,3,strides=1,padding='same',activation='relu',name='block2_conv1'))
    model.add(Conv2D(128,3,strides=1,padding='same',activation='relu',name='block2_conv2'))
    model.add(MaxPooling2D(2,2,'same',name='block2_maxpool'))

    # Block 3
    model.add(Conv2D(256,3,strides=1,padding='same',activation='relu',name='block3_conv1'))
    model.add(Conv2D(256,3,strides=1,padding='same',activation='relu',name='block3_conv2'))
    model.add(Conv2D(256,3,strides=1,padding='same',activation='relu',name='block3_conv3'))
    model.add(MaxPooling2D(2,2,'same',name='block3_maxpool'))

    # Block 4
    model.add(Conv2D(512,3,strides=1,padding='same',activation='relu',name='block4_conv1'))
    model.add(Conv2D(512,3,strides=1,padding='same',activation='relu',name='block4_conv2'))
    model.add(Conv2D(512,3,strides=1,padding='same',activation='relu',name='block4_conv3'))
    model.add(MaxPooling2D(2,2,'same',name='block4_maxpool'))

    # Block 5
    model.add(Conv2D(512,3,strides=1,padding='same',activation='relu',name='block5_conv1'))
    model.add(Conv2D(512,3,strides=1,padding='same',activation='relu',name='block5_conv2'))
    model.add(Conv2D(512,3,strides=1,padding='same',activation='relu',name='block5_conv3'))
    model.add(MaxPooling2D(2,2,'same',name='block5_maxpool'))

    if include_top: # include the default fully connected top
        model.add(Flatten(name='flatten'))
        model.add(Dense(4096,activation='relu',name='fc_layer1'))
        model.add(Dense(4096,activation='relu',name='fc_layer2'))
        model.add(Dense(classes,activation='softmax',name='predictions_layer'))
    else:
        if pooling == 'avg':
            model.add(GlobalAveragePooling2D())
        elif pooling == 'max':
            model.add(GlobalMaxPooling2D())

    # Load weights
    if weights == 'imagenet': # download the weights from GitHub
        if include_top:
            weights_path = get_file(
                'vgg16_weights_tf_dim_ordering_tf_kernels.h5',
                VGG16_WEIGHTS_PATH,
                cache_subdir='models',
                file_hash='64373286793e3c8b2b4e3219cbf3544b')
        else:
            weights_path = get_file(
                'vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5',
                VGG16_WEIGHTS_PATH_NO_TOP,
                cache_subdir='models',
                file_hash='6d6bbae143d832006294945121d1f1fc')
        model.load_weights(weights_path)
        print("Loading weights from "+weights_path+" finished!")
    elif weights is not None: # load weights from the given path
        model.load_weights(weights)
        print("Loading weights from "+weights+" finished!")

    # Return the finished model
    return model

Function notes:

  • include_top determines whether the returned model contains only the convolutional layers, or the convolutional plus fully connected layers.
  • weights selects where the pretrained weights are loaded from; set it to None to train from scratch.
  • input_shape sets the model's input size, as a tuple.
  • pooling sets the global pooling mode; it only takes effect when include_top is False.
  • classes is the number of classes.
  • For training below, include_top is set to False to drop the fully connected layers and keep only the convolutional base; we then build our own fully connected top and append it to the VGG16 conv layers.
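To see why dropping the top matters, here is a back-of-the-envelope parameter count in pure Python (no Keras required), using the standard 224×224×3 input and 1000 classes. It shows the three fully connected layers account for roughly 90% of VGG16's ~138M parameters, which is why the conv-only weight file is so much smaller:

```python
# Conv layer params = (kh*kw*in_channels + 1) * out_channels (the +1 is the bias);
# Dense layer params = (in_features + 1) * out_features.

def conv_params(in_ch, out_ch, k=3):
    return (k * k * in_ch + 1) * out_ch

# Channel progression of the 13 conv layers built above
channels = [(3, 64), (64, 64),                       # block 1
            (64, 128), (128, 128),                   # block 2
            (128, 256), (256, 256), (256, 256),      # block 3
            (256, 512), (512, 512), (512, 512),      # block 4
            (512, 512), (512, 512), (512, 512)]      # block 5

conv_total = sum(conv_params(i, o) for i, o in channels)

# Top: 224 / 2**5 = 7, so Flatten yields 7*7*512 = 25088 features
fc_total = ((7 * 7 * 512 + 1) * 4096      # fc_layer1
            + (4096 + 1) * 4096           # fc_layer2
            + (4096 + 1) * 1000)          # predictions_layer

print(conv_total)             # 14714688 params in the conv base
print(conv_total + fc_total)  # 138357544 total -- the top is ~90% of it
```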

Complete code for this part:
vgg.py

2. Training the Cat vs. Dog Model

2.1 Data Preprocessing

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# data augmentation for the training set
train_datagen = ImageDataGenerator(
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    rescale=1/255,
    shear_range=20,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest')
test_datagen = ImageDataGenerator(rescale=1/255)  # only rescale the test data

batch_size = 32

# train data
train_generator = train_datagen.flow_from_directory(
    '01_tf_keras/sequential_model/data/cat_vs_dog/train',
    target_size=(150,150),
    batch_size=batch_size)

# test data
test_generator = test_datagen.flow_from_directory(
    '01_tf_keras/sequential_model/data/cat_vs_dog/test',
    target_size=(150,150),
    batch_size=batch_size)
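flow_from_directory infers the class labels from the subdirectory names (class indices are assigned alphabetically), and with the default class_mode='categorical' it yields one-hot labels that match the 2-unit softmax head built later. A layout like the following is assumed (the cat/dog folder names here are illustrative; any two class folders work):

```text
data/cat_vs_dog/
├── train/
│   ├── cat/    # e.g. cat.0.jpg, cat.1.jpg, ...
│   └── dog/
└── test/
    ├── cat/
    └── dog/
```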

2.2 Loading the Model

# imports
from tensorflow.keras.models import *
from tensorflow.keras.layers import *
from tensorflow.keras.optimizers import *
from tensorflow.keras.preprocessing.image import *

from vgg import vgg16

# load the weights from a local file by passing its path
weights_path = '01_tf_keras/sequential_model/weights/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5'

# include_top is set to False
model = vgg16(weights=weights_path,
              include_top=False, input_shape=(150,150,3), classes=2)

# custom fully connected top
model.add(Flatten(input_shape=model.output_shape[1:], name='flatten'))
model.add(Dense(256, activation='relu', name='fc_layer1'))
model.add(Dropout(0.3, name='dropout1'))
model.add(Dense(2, activation='softmax', name='predictions_layer'))
model.summary()
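The custom Flatten layer above receives model.output_shape, which for a 150×150 input works out to (5, 5, 512): with padding='same' and stride 2, each of the five max-pooling layers halves the spatial size, rounding up (assuming all conv strides are 1, as in standard VGG16). A quick sketch of the size arithmetic:

```python
import math

# With padding='same', out_size = ceil(in_size / stride) for each pooling layer
size = 150
for _ in range(5):            # five MaxPooling2D layers in VGG16
    size = math.ceil(size / 2)

print(size)                   # 5 -> final feature map is 5x5x512
print(size * size * 512)      # 12800 inputs to the first custom Dense layer
```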

2.3 Training and Saving the Model:

# train
model.compile(optimizer=SGD(lr=0.001, momentum=0.9),
              loss='categorical_crossentropy', metrics=['accuracy'])
model.fit_generator(train_generator,steps_per_epoch=len(train_generator),
                    epochs=50,validation_data=test_generator,
                    validation_steps=len(test_generator))

model.save('01_tf_keras/sequential_model/weights/model_vgg16.h5')
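steps_per_epoch=len(train_generator) works because the length of a flow_from_directory generator is the number of batches per epoch, i.e. ceil(samples / batch_size). A quick sketch (the sample count below is hypothetical; substitute your own dataset size):

```python
import math

num_samples = 2000   # hypothetical number of training images
batch_size = 32

# len(train_generator) would equal this batch count
steps_per_epoch = math.ceil(num_samples / batch_size)
print(steps_per_epoch)  # 63
```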

After 50 epochs, the accuracy reached 91.5%.

Complete code for this part:
07_fintune_vgg16_cat_vs_dog.py

3. Testing the Trained Model

Usage:

python 08_predict_cat_or_dog.py -i <image_path>

Import the load_model function to load the model saved earlier:

import argparse as ap
import numpy as np
import cv2 as cv

from tensorflow.keras.preprocessing.image import *
from tensorflow.keras.models import load_model

Load the model, read the image with OpenCV, and convert it into the input format the model expects:

label = np.array(['cat', 'dog'])
model = load_model('/home/peco/Desktop/Learn_TensorFlow2.0/01_tf_keras/sequential_model/weights/model_vgg16.h5')

def pred(img):
    image = load_img(img)
    cv.imshow("input", cv.cvtColor(np.asarray(image), cv.COLOR_RGB2BGR))
    image = image.resize((150,150))
    image = img_to_array(image)
    image = image / 255                 # same rescaling as during training
    image = np.expand_dims(image, 0)    # add the batch dimension: (1, 150, 150, 3)
    print(label[model.predict_classes(image)])
    cv.waitKey(0)

# parse the -i/--image_path command line argument
parser = ap.ArgumentParser()
parser.add_argument('-i', '--image_path', required=True, help='path to the input image')
args = parser.parse_args()

pred(args.image_path)
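predict_classes returns the argmax of the softmax output, and indexing the label array with it yields the class name. A minimal pure-Python illustration of that last step (the probabilities below are made up):

```python
label = ['cat', 'dog']
probs = [0.13, 0.87]   # hypothetical softmax output for one image

# argmax over the class probabilities
pred_index = max(range(len(probs)), key=lambda i: probs[i])
print(label[pred_index])  # dog
```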

Prediction results: the script displays the input image and prints the predicted label (result screenshots omitted).
