Building a simple feed-forward CNN in PyTorch to process three-channel color images

Latest recommended article published 2024-04-22 22:00:32

volareneo

Tags: pytorch

Article link: https://blog.youkuaiyun.com/ng323/article/details/108789551

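This article describes how to build a simple feed-forward CNN in PyTorch to process three-channel color images. It first shows how to define three custom 4x4 convolution kernels and combine them with ReLU activation and pooling. It then compares this with kernels initialized by torch's defaults: because the kernels are randomly initialized on each run, the processed image differs from run to run. Finally, fully connected layers are added, without a softmax layer; note that the data must be flattened before it is passed to the fully connected layers.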

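I recently wanted to relearn how to build the forward pass of a simple CNN, and found that there seem to be no examples online that use PyTorch to process three-channel color images.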

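1. Define and initialize three custom 4*4 convolution kernels, then apply one ReLU activation and one pooling step. PyTorch processes images batch by batch, so the data has to be prepared in the format [batch_size, n_channels, height, width]. Here is the code for the record: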
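Before the full listing, here is a minimal sketch of the [batch_size, n_channels, height, width] layout and of how the spatial size shrinks through a 4*4 convolution and a 2*2 pooling step. It is not the author's code: the random 32x48 array standing in for an image read with cv2.imread, the all-ones kernel, and the names hwc_img, chw_img and batch are assumptions made only so the example runs without an image file.

```python
import numpy as np
import torch
import torch.nn.functional as F

# Stand-in for an image loaded with cv2.imread: height x width x channels, uint8.
# (Assumption: a random array is used so the sketch needs no image file.)
rng = np.random.default_rng(0)
hwc_img = rng.integers(0, 256, size=(32, 48, 3), dtype=np.uint8)

# HWC -> CHW, scale to [0, 1], then add the batch dimension: [1, 3, H, W].
chw_img = hwc_img.transpose(2, 0, 1).astype(np.float32) / 255.0
batch = torch.from_numpy(chw_img).unsqueeze(0)

# One 4x4 kernel spanning all three input channels:
# [out_channels, in_channels, kernel_height, kernel_width].
# (Assumption: an all-ones kernel, chosen only to illustrate the shapes.)
weight = torch.ones(1, 3, 4, 4, dtype=torch.float32)

conv_out = F.conv2d(batch, weight)               # no padding, stride 1: H-3, W-3
pooled = F.max_pool2d(F.relu(conv_out), 2, 2)    # 2x2 window, stride 2

print(batch.shape)     # torch.Size([1, 3, 32, 48])
print(conv_out.shape)  # torch.Size([1, 1, 29, 45])
print(pooled.shape)    # torch.Size([1, 1, 14, 22])
```

The input tensor and the kernel must share a dtype (float32 here); a float64 kernel would need a float64 input.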

import cv2
import matplotlib.pyplot as plt


img_path = '1.jpg'

bgr_img = cv2.imread(img_path)

# bgr_img = bgr_img.transpose(2,0,1)
print(bgr_img.shape)
b,g,r = cv2.split(bgr_img)
img_rgb = cv2.merge([r,g,b])
gray_img = cv2.cvtColor(img_rgb, cv2.COLOR_RGB2GRAY)  # img_rgb is already in RGB order
img_rgb1 = img_rgb.transpose(2,0,1)  # move the channel axis to the front: (H, W, C) -> (C, H, W)
print(img_rgb1.shape)


plt.figure()
plt.imshow(gray_img)
print(gray_img.shape)
# Normalise
gray_img = gray_img.astype("float32")/255
# print(gray_img.shape)
plt.figure()
plt.imshow(gray_img)
plt.figure()
plt.imshow(img_rgb)

# plt.show()

import numpy as np

filter_vals = np.array([[-1, -1, 1, 1],
                        [-1, -1, 1, 1],
                        [-1, -1, 1, 1],
                        [-1, -1, 1, 1]], dtype=np.float64)
print('Filter shape: ', filter_vals.shape)
filter_1 = filter_vals
filter_2 = -filter_1
filter_3 = filter_1.T
# filter_4 = -filter_3
filters = np.array([filter_1, filter_2, filter_3])#, filter_4])
import torch
import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):

    def __init__(self,weight):
        super(Net, self).__init__()
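        # the three 4x4 filters are used as the three input-channel slices of a single kernel,
        # so weight has shape [out_channels=1, in_channels=3, k_height, k_width]
        out_channels, in_channels, k_height, k_width = weight.shape
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=(k_height, k_width), bias=False)
        # overwrite the randomly initialized weights with the hand-defined filters
        self.conv.weight = torch.nn.Parameter(weight)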
        # define a pooling layer
        self.pool = nn.MaxPool2d(2, 2)

    def forward(self, x):
        # calculates the output of a convolutional layer
        # pre- and post-activation
        conv_x = self.conv(x)
        activated_x = F.relu(conv_x)
        print('activated_x-shape',activated_x.shape)
        # applies pooling layer
        pooled_x = self.pool(activated_x)

        # returns all layers
        return conv_x, activated_x, pooled_x


# instantiate the model and set the weights
weight = torch.from_numpy(filters).unsqueeze(0)#.type(torch.FloatTensor)
print("weight shape",weight.shape)
model = Net(weight)

# print out the layer in the network
print(model)


def viz_layer(layer, n_filters=1):
    fig = plt.figure(figsize=(20, 20))
    for i in range(n_filters):
        ax = fig.add_subplot(1, n_filters, i + 1)
        ax.imshow(np.squeeze(layer[0, i].data.numpy()))
        ax.set_title('Output %s' % str(i + 1))


# plt.imshow(gray_img, cmap='gray')

fig = plt.figure(figsize=(12, 6))
fig.subplots_adjust(left=0, right=1.5, bottom=0.8, top=1, hspace=0.05, wspace=0.05)
for i in range(3):
    ax = fig.add_subplot