Building a Convolutional Neural Network in PyTorch for Image Classification

Author: 依旧seven

License: CC 4.0 BY-SA

Article link: https://blog.youkuaiyun.com/seven08290/article/details/93618809

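This article walks through the constructor parameters of a convolutional layer in a convolutional neural network (CNN), such as the number of input channels, the number of output channels, and the kernel size. It also shows how to read image data from local disk, reorder its dimensions to match the CNN's input, and prepare labels with one-hot encoding (for MSE) or as class indices (for cross-entropy).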

The constructor of Conv2d, as quoted from an older PyTorch release:

def __init__(self, in_channels, out_channels, kernel_size, stride=1,
             padding=0, dilation=1, groups=1, bias=True):
    kernel_size = _pair(kernel_size)
    stride = _pair(stride)
    padding = _pair(padding)
    dilation = _pair(dilation)
    super(Conv2d, self).__init__(
        in_channels, out_channels, kernel_size, stride, padding, dilation,
        False, _pair(0), groups, bias)

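Parameters:

in_channels: number of channels in the input

out_channels: number of channels produced by the convolution

kernel_size: size of the convolution kernel

stride: step size of the convolution

padding: zero-padding added to both sides of the input

dilation: spacing between kernel elements

groups: number of groups the input channels are split into for the convolution

bias: whether to add a learnable bias to the output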
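As a rough illustration of how these parameters interact, the sketch below computes the output size of a convolution along one spatial dimension, following the shape formula given in the PyTorch Conv2d documentation. The helper name conv_output_size is hypothetical and is not part of PyTorch.

```python
def conv_output_size(size, kernel_size, stride=1, padding=0, dilation=1):
    """Output length along one spatial dimension (hypothetical helper, not a PyTorch API)."""
    # A dilated kernel covers dilation * (kernel_size - 1) + 1 input positions
    effective_kernel = dilation * (kernel_size - 1) + 1
    return (size + 2 * padding - effective_kernel) // stride + 1

# A 32-pixel side with a 3x3 kernel and no padding shrinks to 30
print(conv_output_size(32, kernel_size=3))                       # 30
# padding=1 keeps the side at 32
print(conv_output_size(32, kernel_size=3, padding=1))            # 32
# stride=2 halves it
print(conv_output_size(32, kernel_size=3, stride=2, padding=1))  # 16
```

This makes it easier to work out the input size of the first fully connected layer after a stack of convolutions.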

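import cv2
import numpy as np

def load_img(imgPath):
    """Read 1.png ... 10000.png from imgPath into one array of shape [N, H, W, 3]."""
    file = []
    for i in range(1, 10001):
        filename = imgPath + str(i) + ".png"
        img = cv2.imread(filename)  # returns None if the file cannot be read
        if img is None:
            raise FileNotFoundError(filename)
        file.append(img)
        if i % 500 == 0:
            print("Read %d samples." % i)
    file = np.array(file)
    return file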

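This function reads the images from local disk; only the first 10,000 are read. Because the files are simply named 1.png, 2.png, and so on, the loop loads them in order, appends each one to file, and returns the result.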

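Import cv2 and read each image directly with imread. Note that imread returns an array of shape [32,32,3] for these images (height, width, channels), whereas the CNN expects the channel dimension first, so I used permute to reorder the dimensions and converted the data to a tensor at the same time. The arguments of permute list the original dimensions in their new order: permute(0,3,2,1) puts original dimension 3 in position 1, keeps dimension 2 in position 2, and puts dimension 1 in position 3. Be aware that this order also swaps height and width; permute(0,3,1,2) would keep height before width. For square 32x32 images the resulting shape is the same either way.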

Several dimension-reordering functions are described here: https://www.cnblogs.com/yifdu25/p/9399047.html

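# Convert to float tensors and move channels first: num,3,32,32
# (with this order, the two 32-sized dimensions are width, then height)
x_train_pre = torch.FloatTensor(x_train_pre).permute(0,3,2,1)
x_test_pre = torch.FloatTensor(x_test_pre).permute(0,3,2,1)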
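To make the axis order concrete, here is a small sketch using NumPy's transpose, which takes the same axis-order argument as permute. The array batch is made-up data with a non-square 2x4 image so that height and width can be told apart; it is an assumption for illustration, not the article's dataset.

```python
import numpy as np

# Made-up batch: 5 images, height 2, width 4, 3 channels (N, H, W, C)
batch = np.zeros((5, 2, 4, 3), dtype=np.float32)

# Order used in the article: channels first, but width comes before height
nchw_swapped = np.transpose(batch, (0, 3, 2, 1))
print(nchw_swapped.shape)  # (5, 3, 4, 2)

# Channels first while keeping height before width
nchw = np.transpose(batch, (0, 3, 1, 2))
print(nchw.shape)  # (5, 3, 2, 4)
```

Either order can train, as long as the training and test data are reordered the same way.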

When computing the loss during the training iterations, I converted the labels to one-hot encoding and used MSE as the loss.

import numpy as np
from sklearn.preprocessing import LabelBinarizer

# Convert the labels to one-hot encoding
class_num = 10
lb = LabelBinarizer().fit(np.array(range(class_num)))
y_train = lb.transform(y_train)
y_test = lb.transform(y_test)
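For reference, the same one-hot layout can be sketched with plain NumPy by indexing an identity matrix. The sample labels array is made up, and np.eye here stands in for LabelBinarizer; it is not what the article's code uses.

```python
import numpy as np

class_num = 10
labels = np.array([3, 0, 9])  # made-up integer class labels

# Row i of the identity matrix is the one-hot vector for class i
one_hot = np.eye(class_num, dtype=np.float32)[labels]
print(one_hot.shape)  # (3, 10)

# argmax along each row recovers the original labels
recovered = one_hot.argmax(axis=1)
print(recovered)
```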

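I also wanted to try cross-entropy, and in that case the labels are not converted to one-hot. Pay particular attention to the types: the input to the cross-entropy loss must be a FloatTensor, whose NumPy counterpart is float32.

The target must be a LongTensor, whose NumPy counterpart is int64.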

Cross-entropy is explained here: https://blog.youkuaiyun.com/geter_CS/article/details/79849386

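# Cast the labels to int64 so that from_numpy yields a LongTensor of class indices
# (np.float has been removed from recent NumPy releases)
y_train = torch.from_numpy(y_train.astype(np.int64))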
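The integer-target requirement is easier to see from how the loss is computed: each target is used as an index that selects one log-probability per sample. A minimal NumPy sketch of that calculation, with made-up logits and targets, might look like this (the function name cross_entropy is hypothetical, and this is not PyTorch's implementation):

```python
import numpy as np

def cross_entropy(logits, targets):
    """Mean cross-entropy from raw scores and integer class indices (illustrative only)."""
    # Subtract the row maximum for numerical stability before exponentiating
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Integer targets pick the log-probability of the true class in each row
    picked = log_probs[np.arange(len(targets)), targets]
    return -picked.mean()

logits = np.array([[2.0, 0.5, 0.1], [0.2, 0.3, 3.0]], dtype=np.float32)  # made-up scores
targets = np.array([0, 2], dtype=np.int64)                               # made-up class indices
loss = cross_entropy(logits, targets)
print(loss)
```

Because the targets index into the scores directly, no one-hot conversion is needed for this loss.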

Finally, I used GPU acceleration. When feeding in the test data, remember to bring out_test back to the CPU.

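# Move the test inputs to the GPU when one is available (the model must be on the same device)
if torch.cuda.is_available():
    x_test_pre = x_test_pre.cuda()
out_test = cnn_model(x_test_pre)
# Bring the outputs back to the CPU before converting to NumPy
out_test = out_test.detach().cpu()
# The index of the largest score in each row is the predicted class
pre_y = torch.max(out_test, 1)[1].numpy().squeeze()
accuracy = (y_test == pre_y).sum() / len(y_test)
print('Epoch:', epoch, '|Step:', step, '|Accuracy:', accuracy)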
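The accuracy calculation can be checked on a tiny example without a GPU. The scores and labels below are made up, and NumPy's argmax plays the role of torch.max(out_test, 1)[1].

```python
import numpy as np

# Made-up network outputs for 4 samples and 3 classes
out_test = np.array([[0.1, 0.7, 0.2],
                     [0.9, 0.05, 0.05],
                     [0.2, 0.3, 0.5],
                     [0.6, 0.3, 0.1]])
y_test = np.array([1, 0, 1, 0])  # made-up true labels

pre_y = out_test.argmax(axis=1)  # predicted classes: [1, 0, 2, 0]
accuracy = (y_test == pre_y).sum() / len(y_test)
print(accuracy)  # 0.75
```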