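CNN Image Classification (PyTorch)

This article walks through a simple CNN image classification task in PyTorch. For the theory behind CNNs, see my earlier article, "Deep Learning (1): CNN Convolutional Neural Networks".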

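Image Classification

We use food image classification as the example. The dataset has three folders: testing, training, and validation. The download link is below.
Click to download; extraction code: nefu

In each file name, the number before the underscore is the class (a leading 0 means class 0), and the number after it is the image's serial number.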
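As a small illustration of this naming scheme, the sketch below splits a file name on the underscore to recover the class and the serial number. The file name `0_12.jpg` and the helper `parse_name` are made up for the example; they are not taken from the dataset listing.

```python
import os


def parse_name(file_name):
    """Split a '<class>_<number>.<ext>' file name into (class, number)."""
    stem, _ext = os.path.splitext(file_name)  # drop the extension
    label, number = stem.split("_")           # text before / after the underscore
    return int(label), int(number)


# Hypothetical file name, used only for illustration
print(parse_name("0_12.jpg"))  # (0, 12)
```

The readfile function later in the article only needs the class, which is why it keeps just the part before the underscore.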


Importing the required packages

# Import the required packages
import os
import numpy as np
import cv2
import torch
import torch.nn as nn
import torchvision.transforms as transforms
import pandas as pd
from torch.utils.data import DataLoader, Dataset
import time

I installed cv2 with the following command:

pip install opencv-python

For torch I installed the CUDA 10.2 build. The install command is given below; for how to set up GPU acceleration, please look it up online.

pip3 install torch==1.10.0+cu102 torchvision==0.11.1+cu102 torchaudio===0.10.0+cu102 -f https://download.pytorch.org/whl/cu102/torch_stable.html
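The training code later in this article calls .cuda() directly, which assumes a CUDA-capable GPU is present. As a minimal sketch (not part of the original code), the snippet below checks whether PyTorch can see a GPU and falls back to the CPU otherwise. The variable name `device` is my own choice.

```python
import torch

# Pick the GPU when PyTorch can see one, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("Using device:", device)

# Tensors (and models) can be moved with .to(device) instead of .cuda()
x = torch.zeros(2, 3).to(device)
print(x.device.type)
```

If the first line prints cpu even though a GPU is installed, the usual suspects are a CPU-only torch build or a driver that does not match the CUDA version of the wheel.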

Reading the data

Read the training, validation, and test sets into NumPy arrays. x holds the image pixel arrays and y holds the labels.

# Read images with OpenCV (cv2) and store them in a NumPy array
def readfile(path, label):
    # label is a boolean that indicates whether the y values should be returned
    image_dir = sorted(os.listdir(path))  # os.listdir(path) returns the file names under path as a list
    # print(os.listdir(path))
    # print(image_dir)
    x = np.zeros((len(image_dir), 128, 128, 3), dtype=np.uint8)
    y = np.zeros((len(image_dir)), dtype=np.uint8)
    for i, file in enumerate(image_dir):
        img = cv2.imread(os.path.join(path, file))  # os.path.join(path, file) joins the directory and the file name
        x[i, :, :] = cv2.resize(img, (128, 128))
        if label:
            y[i] = int(file.split("_")[0])
    if label:
        return x, y
    else:
        return x


# Read the training, validation, and testing sets with the readfile function
workspace_dir = './food-11'
print("Reading data")
print("...")
train_x, train_y = readfile(os.path.join(workspace_dir, "training"), True)
# print("Size of training data = {}".format(len(train_x)))
val_x, val_y = readfile(os.path.join(workspace_dir, "validation"), True)
# print("Size of validation data = {}".format(len(val_x)))
test_x = readfile(os.path.join(workspace_dir, "testing"), False)
# print("Size of Testing data = {}".format(len(test_x)))
print("Reading data complete")

Data processing

Define the data augmentation operations (random flip, random rotation) and the batch size.

''' Dataset '''
print("Dataset")
print("...")
# Apply data augmentation during training
# transforms.Compose chains the image operations together
train_transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.RandomHorizontalFlip(),  # randomly flip the image horizontally
    transforms.RandomRotation(15),  # randomly rotate the image by an angle in (-15, 15) degrees
    transforms.ToTensor(),  # convert the image to a Tensor and scale the values to [0, 1]
])
# No data augmentation is needed at test time
test_transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.ToTensor(),
])


class ImgDataset(Dataset):
    def __init__(self, x, y=None, transform=None):
        self.x = x
        # label is required to be a LongTensor
        self.y = y
        if y is not None:
            self.y = torch.LongTensor(y)
        self.transform = transform

    def __len__(self):
        return len(self.x)

    def __getitem__(self, index):
        X = self.x[index]
        if self.transform is not None:
            X = self.transform(X)
        if self.y is not None:
            Y = self.y[index]
            return X, Y
        else:  # if there is no label, return only X
            return X


batch_size = 32
train_set = ImgDataset(train_x, train_y, train_transform)
val_set = ImgDataset(val_x, val_y, test_transform)
train_loader = DataLoader(train_set, batch_size=batch_size, shuffle=True)
val_loader = DataLoader(val_set, batch_size=batch_size, shuffle=False)
print("Dataset complete")
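To make the ToTensor step more concrete, here is a NumPy-only sketch of the conversion it performs on a uint8 image: the layout changes from height x width x channel to channel x height x width, and the values are scaled from [0, 255] to [0, 1]. The function name `to_tensor_like` is hypothetical, and the real transform returns a torch tensor rather than a NumPy array.

```python
import numpy as np


def to_tensor_like(img):
    """Mimic transforms.ToTensor on an HWC uint8 image, returning a NumPy array."""
    chw = img.transpose(2, 0, 1)           # HWC -> CHW
    return chw.astype(np.float32) / 255.0  # scale [0, 255] -> [0, 1]


# A dummy all-white 128x128 image with 3 channels, same shape as one entry of x
img = np.full((128, 128, 3), 255, dtype=np.uint8)
out = to_tensor_like(img)
print(out.shape, out.dtype, out.max())
```

This is why the model's input is described as [3, 128, 128] even though readfile stores each image as 128 x 128 x 3.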

Model structure

Define the structure of the CNN.

''' Model '''
print("Model")
print("...")


class Classifier(nn.Module):
    def __init__(self):
        super(Classifier, self).__init__()
        # torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride, padding)
        # torch.nn.MaxPool2d(kernel_size, stride, padding)
        # input dimensions [3, 128, 128]
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 64, 3, 1, 1),  # [64, 128, 128]
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.MaxPool2d(2, 2, 0),  # [64, 64, 64]

            nn.Conv2d(64, 128, 3, 1, 1),  # [128, 64, 64]
            nn.BatchNorm2d(128),
            nn.ReLU(),
            nn.MaxPool2d(2, 2, 0),  # [128, 32, 32]

            nn.Conv2d(128, 256, 3, 1, 1),  # [256, 32, 32]
            nn.BatchNorm2d(256),
            nn.ReLU(),
            nn.MaxPool2d(2, 2, 0),  # [256, 16, 16]

            nn.Conv2d(256, 512, 3, 1, 1),  # [512, 16, 16]
            nn.BatchNorm2d(512),
            nn.ReLU(),
            nn.MaxPool2d(2, 2, 0),  # [512, 8, 8]

            nn.Conv2d(512, 512, 3, 1, 1),  # [512, 8, 8]
            nn.BatchNorm2d(512),
            nn.ReLU(),
            nn.MaxPool2d(2, 2, 0),  # [512, 4, 4]
        )
        self.fc = nn.Sequential(
            nn.Linear(512 * 4 * 4, 1024),
            nn.ReLU(),
            nn.Linear(1024, 512),
            nn.ReLU(),
            nn.Linear(512, 11)
        )

    def forward(self, x):
        out = self.cnn(x)
        out = out.view(out.size()[0], -1)
        return self.fc(out)


print("Model complete")
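The shape comments in the model can be checked with the standard output-size formula for convolution and pooling layers, floor((size + 2 * padding - kernel) / stride) + 1. The sketch below is plain Python and only reproduces the arithmetic; the helper name `out_size` is my own.

```python
def out_size(size, kernel, stride, padding):
    """Spatial output size of a conv or pooling layer (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


size = 128  # input images are 128x128
for channels in (64, 128, 256, 512, 512):
    size = out_size(size, kernel=3, stride=1, padding=1)  # Conv2d(..., 3, 1, 1) keeps the size
    size = out_size(size, kernel=2, stride=2, padding=0)  # MaxPool2d(2, 2, 0) halves it
    print([channels, size, size])

flat_features = 512 * size * size
print(flat_features)  # 8192
```

The last value is the 512 * 4 * 4 input width of the first Linear layer, so changing the input resolution or the number of pooling stages means changing that number too.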

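Training the model

Train the model for 30 epochs and evaluate it on the validation set after each epoch. Finally, merge the training and validation sets and train a fresh model on the combined data.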

''' Training '''
print("Training")
print("...")
# Train on the training set and use the validation set to look for good parameters
model = Classifier().cuda()
loss = nn.CrossEntropyLoss()  # this is a classification task, so the loss is CrossEntropyLoss
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)  # use Adam as the optimizer
num_epoch = 30  # train for 30 epochs

for epoch in range(num_epoch):
    epoch_start_time = time.time()
    train_acc = 0.0
    train_loss = 0.0
    val_acc = 0.0
    val_loss = 0.0

    model.train()  # make sure the model is in train mode (affects layers such as Dropout and BatchNorm)
    for i, data in enumerate(train_loader):
        optimizer.zero_grad()  # use the optimizer to reset the gradients of the model parameters to zero
        train_pred = model(data[0].cuda())  # run the model to get the predicted class scores (logits); this calls the model's forward function
        batch_loss = loss(train_pred, data[1].cuda())  # compute the loss (prediction and label must both be on the CPU or both on the GPU)
        batch_loss.backward()  # use back propagation to compute the gradient of each parameter
        optimizer.step()  # let the optimizer update the parameters with the gradients

        train_acc += np.sum(np.argmax(train_pred.cpu().data.numpy(), axis=1) == data[1].numpy())
        train_loss += batch_loss.item()

    model.eval()
    with torch.no_grad():
        for i, data in enumerate(val_loader):
            val_pred = model(data[0].cuda())
            batch_loss = loss(val_pred, data[1].cuda())

            val_acc += np.sum(np.argmax(val_pred.cpu().data.numpy(), axis=1) == data[1].numpy())
            val_loss += batch_loss.item()

        # Print the results
        print('[%03d/%03d] %2.2f sec(s) Train Acc: %3.6f Loss: %3.6f | Val Acc: %3.6f loss: %3.6f' % \
              (epoch + 1, num_epoch, time.time() - epoch_start_time, \
               train_acc / train_set.__len__(), train_loss / train_set.__len__(), val_acc / val_set.__len__(),
               val_loss / val_set.__len__()))

train_val_x = np.concatenate((train_x, val_x), axis=0)
train_val_y = np.concatenate((train_y, val_y), axis=0)
train_val_set = ImgDataset(train_val_x, train_val_y, train_transform)
train_val_loader = DataLoader(train_val_set, batch_size=batch_size, shuffle=True)

model_best = Classifier().cuda()
loss = nn.CrossEntropyLoss()  # this is a classification task, so the loss is CrossEntropyLoss
optimizer = torch.optim.Adam(model_best.parameters(), lr=0.001)  # use Adam as the optimizer
num_epoch = 30

for epoch in range(num_epoch):
    epoch_start_time = time.time()
    train_acc = 0.0
    train_loss = 0.0

    model_best.train()
    for i, data in enumerate(train_val_loader):
        optimizer.zero_grad()
        train_pred = model_best(data[0].cuda())
        batch_loss = loss(train_pred, data[1].cuda())
        batch_loss.backward()
        optimizer.step()

        train_acc += np.sum(np.argmax(train_pred.cpu().data.numpy(), axis=1) == data[1].numpy())
        train_loss += batch_loss.item()

    # Print the results
    print('[%03d/%03d] %2.2f sec(s) Train Acc: %3.6f Loss: %3.6f' % \
          (epoch + 1, num_epoch, time.time() - epoch_start_time, \
           train_acc / train_val_set.__len__(), train_loss / train_val_set.__len__()))

print("Training complete")

Output:

[001/030] 70.94 sec(s) Train Acc: 0.260997 Loss: 0.065946 | Val Acc: 0.303499 loss: 0.060955
[002/030] 56.79 sec(s) Train Acc: 0.362051 Loss: 0.057194 | Val Acc: 0.372595 loss: 0.057390
[003/030] 57.03 sec(s) Train Acc: 0.409588 Loss: 0.053193 | Val Acc: 0.395335 loss: 0.054268
[004/030] 57.81 sec(s) Train Acc: 0.455504 Loss: 0.049251 | Val Acc: 0.454519 loss: 0.048942
[005/030] 57.97 sec(s) Train Acc: 0.499899 Loss: 0.045356 | Val Acc: 0.455977 loss: 0.051678
[006/030] 58.28 sec(s) Train Acc: 0.535982 Loss: 0.042452 | Val Acc: 0.378717 loss: 0.074250
[007/030] 59.17 sec(s) Train Acc: 0.553720 Loss: 0.040124 | Val Acc: 0.568513 loss: 0.039936
[008/030] 59.99 sec(s) Train Acc: 0.579769 Loss: 0.038209 | Val Acc: 0.556268 loss: 0.041599
[009/030] 59.79 sec(s) Train Acc: 0.596392 Loss: 0.036224 | Val Acc: 0.502332 loss: 0.045684
[010/030] 60.00 sec(s) Train Acc: 0.618690 Loss: 0.033986 | Val Acc: 0.552770 loss: 0.043603
[011/030] 60.34 sec(s) Train Acc: 0.640482 Loss: 0.032332 | Val Acc: 0.569679 loss: 0.040512
[012/030] 60.72 sec(s) Train Acc: 0.664403 Loss: 0.030329 | Val Acc: 0.537609 loss: 0.047755
[013/030] 59.88 sec(s) Train Acc: 0.685181 Loss: 0.028571 | Val Acc: 0.534111 loss: 0.045569
[014/030] 60.37 sec(s) Train Acc: 0.690249 Loss: 0.028063 | Val Acc: 0.612828 loss: 0.037240
[015/030] 60.48 sec(s) Train Acc: 0.709811 Loss: 0.026130 | Val Acc: 0.649563 loss: 0.034160
[016/030] 60.54 sec(s) Train Acc: 0.720961 Loss: 0.025035 | Val Acc: 0.641691 loss: 0.034600
[017/030] 60.84 sec(s) Train Acc: 0.738901 Loss: 0.023416 | Val Acc: 0.635277 loss: 0.034863
[018/030] 60.48 sec(s) Train Acc: 0.757551 Loss: 0.022210 | Val Acc: 0.616035 loss: 0.039769
[019/030] 59.98 sec(s) Train Acc: 0.774073 Loss: 0.020678 | Val Acc: 0.649271 loss: 0.035323
[020/030] 60.62 sec(s) Train Acc: 0.779343 Loss: 0.019825 | Val Acc: 0.662099 loss: 0.033701
[021/030] 60.04 sec(s) Train Acc: 0.790999 Loss: 0.018790 | Val Acc: 0.682216 loss: 0.032581
[022/030] 60.72 sec(s) Train Acc: 0.800426 Loss: 0.017761 | Val Acc: 0.620408 loss: 0.041586
[023/030] 60.28 sec(s) Train Acc: 0.810156 Loss: 0.016732 | Val Acc: 0.674344 loss: 0.036074
[024/030] 60.17 sec(s) Train Acc: 0.825461 Loss: 0.015653 | Val Acc: 0.649271 loss: 0.039717
[025/030] 59.96 sec(s) Train Acc: 0.838638 Loss: 0.014406 | Val Acc: 0.639067 loss: 0.041005
[026/030] 58.78 sec(s) Train Acc: 0.842793 Loss: 0.014155 | Val Acc: 0.657434 loss: 0.040948
[027/030] 60.47 sec(s) Train Acc: 0.854247 Loss: 0.013192 | Val Acc: 0.664140 loss: 0.042358
[028/030] 59.34 sec(s) Train Acc: 0.861443 Loss: 0.012012 | Val Acc: 0.687755 loss: 0.038089
[029/030] 59.39 sec(s) Train Acc: 0.876748 Loss: 0.010853 | Val Acc: 0.676385 loss: 0.038813
[030/030] 59.35 sec(s) Train Acc: 0.882222 Loss: 0.010558 | Val Acc: 0.648105 loss: 0.043327
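Two details of the bookkeeping in the training loop are easy to miss when reading this log. Accuracy is the count of correct argmax predictions divided by the dataset size. The logged loss is the sum of per-batch mean losses divided by the number of samples, so it is roughly the per-sample loss divided by the batch size. The sketch below uses made-up logits to show the accuracy count, then rescales the first logged training loss. The rescaling is only approximate, since the final batch may be smaller than 32.

```python
import math
import numpy as np

# Made-up logits for 4 samples and 3 classes, with their true labels
logits = np.array([[2.0, 0.1, 0.3],
                   [0.2, 1.5, 0.1],
                   [0.3, 0.2, 0.9],
                   [1.2, 0.4, 0.8]])
labels = np.array([0, 1, 1, 0])

# Same counting as in the training loop: argmax over classes vs. labels
correct = np.sum(np.argmax(logits, axis=1) == labels)
print(correct / len(labels))  # 0.75

# Rescale the logged epoch-1 training loss by the batch size
batch_size = 32
approx_per_sample = 0.065946 * batch_size
print(round(approx_per_sample, 2))  # 2.11
print(round(math.log(11), 2))       # 2.4, the cross-entropy of a uniform guess over 11 classes
```

Read this way, the first epoch starts a little below the uniform-guess level, which is what one would expect from an untrained 11-class classifier after one pass over the data.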

Testing

Run predictions on the test set.

''' Testing '''
print("Testing")
print("...")
test_set = ImgDataset(test_x, transform=test_transform)
test_loader = DataLoader(test_set, batch_size=batch_size, shuffle=False)
model_best.eval()
prediction = []
with torch.no_grad():
    for i, data in enumerate(test_loader):
        test_pred = model_best(data.cuda())
        test_label = np.argmax(test_pred.cpu().data.numpy(), axis=1)
        for y in test_label:
            prediction.append(y)
# Write the results to a csv file
with open("predict.csv", 'w') as f:
    f.write('Id,Category\n')
    for i, y in enumerate(prediction):
        f.write('{},{}\n'.format(i, y))
print("Testing complete")
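As a quick sanity check on a file in this format, the sketch below writes a few made-up predictions in the same Id,Category layout to a temporary file, reads them back with the standard csv module, and counts the predictions per class. The prediction values and the temporary path are illustrative only.

```python
import csv
import os
import tempfile
from collections import Counter

# Made-up predictions, standing in for the `prediction` list above
prediction = [0, 3, 3, 10]

path = os.path.join(tempfile.mkdtemp(), "predict.csv")
with open(path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["Id", "Category"])
    for i, y in enumerate(prediction):
        writer.writerow([i, y])

# Read the file back and count how many images fall into each class
with open(path, newline="") as f:
    rows = list(csv.DictReader(f))
counts = Counter(int(row["Category"]) for row in rows)
print(len(rows), dict(counts))
```

A class distribution that is heavily skewed toward one category is often a sign that the model or the preprocessing went wrong somewhere.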