Residual Network - Python

This post walks through the network structure of the ResNet-50 model and its implementation, including which convolutional blocks and identity blocks are used and the specific parameter settings of each stage.


The network structure implemented is as follows:

(Figure: ResNet-50 architecture diagram, residual network.png)

The specific parameters are as follows:

The details of this ResNet-50 model are as follows (the resulting spatial dimensions are traced after the list):

- Zero-padding pads the input with a pad of (3,3).
- Stage 1:
  - The 2D convolution has 64 filters of shape (7,7) and uses a stride of (2,2). Its name is "conv1".
  - BatchNorm is applied to the channels axis of the input.
  - MaxPooling uses a (3,3) window and a (2,2) stride.
- Stage 2:
  - The convolutional block uses three sets of filters of size [64,64,256], "f" is 3, "s" is 1, and the block is "a".
  - The 2 identity blocks use three sets of filters of size [64,64,256], "f" is 3, and the blocks are "b" and "c".
- Stage 3:
  - The convolutional block uses three sets of filters of size [128,128,512], "f" is 3, "s" is 2, and the block is "a".
  - The 3 identity blocks use three sets of filters of size [128,128,512], "f" is 3, and the blocks are "b", "c", and "d".
- Stage 4:
  - The convolutional block uses three sets of filters of size [256,256,1024], "f" is 3, "s" is 2, and the block is "a".
  - The 5 identity blocks use three sets of filters of size [256,256,1024], "f" is 3, and the blocks are "b", "c", "d", "e", and "f".
- Stage 5:
  - The convolutional block uses three sets of filters of size [512,512,2048], "f" is 3, "s" is 2, and the block is "a".
  - The 2 identity blocks use three sets of filters of size [512,512,2048], "f" is 3, and the blocks are "b" and "c".
- The 2D average pooling uses a window of shape (2,2) and its name is "avg_pool".
- The flatten doesn't have any hyperparameters or name.
- The fully connected (Dense) layer reduces its input to the number of classes using a softmax activation. Its name should be 'fc' + str(classes).
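Putting these together for the default 64×64×3 input, the spatial size shrinks as: 64×64 → 70×70 after padding → 32×32 after conv1 (stride 2) → 15×15 after max pooling → 15×15 after stage 2 (s = 1) → 8×8 after stage 3 → 4×4 after stage 4 → 2×2 after stage 5 → 1×1 after average pooling, so the flattened feature vector fed to the Dense layer has 2048 entries.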

The implementation is as follows:
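The ResNet50 function below calls two helpers, identity_block and convolutional_block, which the post does not define. Here is a minimal sketch of both, assuming the standard bottleneck design from the well-known course exercise this code follows; the 'res{stage}{block}_branch' layer-naming convention used here is likewise an assumption carried over from that exercise:

```python
from keras.initializers import glorot_uniform
from keras.layers import Activation, Add, BatchNormalization, Conv2D


def identity_block(X, f, filters, stage, block):
    """Identity block: the shortcut is the unchanged input (shapes already match)."""
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'
    F1, F2, F3 = filters
    X_shortcut = X

    # First component: 1x1 convolution reduces the channel count to F1
    X = Conv2D(F1, (1, 1), strides=(1, 1), padding='valid', name=conv_name_base + '2a',
               kernel_initializer=glorot_uniform(seed=0))(X)
    X = BatchNormalization(axis=3, name=bn_name_base + '2a')(X)
    X = Activation('relu')(X)

    # Second component: f x f convolution with 'same' padding keeps the spatial size
    X = Conv2D(F2, (f, f), strides=(1, 1), padding='same', name=conv_name_base + '2b',
               kernel_initializer=glorot_uniform(seed=0))(X)
    X = BatchNormalization(axis=3, name=bn_name_base + '2b')(X)
    X = Activation('relu')(X)

    # Third component: 1x1 convolution restores the channel count to F3 (no ReLU yet)
    X = Conv2D(F3, (1, 1), strides=(1, 1), padding='valid', name=conv_name_base + '2c',
               kernel_initializer=glorot_uniform(seed=0))(X)
    X = BatchNormalization(axis=3, name=bn_name_base + '2c')(X)

    # Add the shortcut to the main path, then apply the final ReLU
    X = Add()([X, X_shortcut])
    X = Activation('relu')(X)
    return X


def convolutional_block(X, f, filters, stage, block, s=2):
    """Convolutional block: the shortcut carries a 1x1 convolution so shapes match after striding."""
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'
    F1, F2, F3 = filters
    X_shortcut = X

    # Main path: 1x1 (stride s) -> f x f ('same') -> 1x1
    X = Conv2D(F1, (1, 1), strides=(s, s), padding='valid', name=conv_name_base + '2a',
               kernel_initializer=glorot_uniform(seed=0))(X)
    X = BatchNormalization(axis=3, name=bn_name_base + '2a')(X)
    X = Activation('relu')(X)

    X = Conv2D(F2, (f, f), strides=(1, 1), padding='same', name=conv_name_base + '2b',
               kernel_initializer=glorot_uniform(seed=0))(X)
    X = BatchNormalization(axis=3, name=bn_name_base + '2b')(X)
    X = Activation('relu')(X)

    X = Conv2D(F3, (1, 1), strides=(1, 1), padding='valid', name=conv_name_base + '2c',
               kernel_initializer=glorot_uniform(seed=0))(X)
    X = BatchNormalization(axis=3, name=bn_name_base + '2c')(X)

    # Shortcut path: 1x1 convolution with stride s to match the main path's shape
    X_shortcut = Conv2D(F3, (1, 1), strides=(s, s), padding='valid', name=conv_name_base + '1',
                        kernel_initializer=glorot_uniform(seed=0))(X_shortcut)
    X_shortcut = BatchNormalization(axis=3, name=bn_name_base + '1')(X_shortcut)

    X = Add()([X, X_shortcut])
    X = Activation('relu')(X)
    return X
```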

```python
from keras.initializers import glorot_uniform
from keras.layers import (Input, ZeroPadding2D, Conv2D, BatchNormalization,
                          Activation, MaxPooling2D, AveragePooling2D, Flatten, Dense)
from keras.models import Model


def ResNet50(input_shape=(64, 64, 3), classes=6):
    """
    Implementation of the popular ResNet50 with the following architecture:
    CONV2D -> BATCHNORM -> RELU -> MAXPOOL -> CONVBLOCK -> IDBLOCK*2 -> CONVBLOCK -> IDBLOCK*3
    -> CONVBLOCK -> IDBLOCK*5 -> CONVBLOCK -> IDBLOCK*2 -> AVGPOOL -> TOPLAYER

    Arguments:
    input_shape -- shape of the images of the dataset
    classes -- integer, number of classes

    Returns:
    model -- a Model() instance in Keras
    """
    # Define the input as a tensor with shape input_shape
    X_input = Input(input_shape)

    # Zero-Padding
    X = ZeroPadding2D((3, 3))(X_input)

    # Stage 1
    X = Conv2D(64, (7, 7), strides=(2, 2), name='conv1',
               kernel_initializer=glorot_uniform(seed=0))(X)
    X = BatchNormalization(axis=3, name='bn_conv1')(X)
    X = Activation('relu')(X)
    X = MaxPooling2D((3, 3), strides=(2, 2))(X)

    # Stage 2
    X = convolutional_block(X, f=3, filters=[64, 64, 256], stage=2, block='a', s=1)
    X = identity_block(X, 3, [64, 64, 256], stage=2, block='b')
    X = identity_block(X, 3, [64, 64, 256], stage=2, block='c')

    # Stage 3
    X = convolutional_block(X, f=3, filters=[128, 128, 512], stage=3, block='a', s=2)
    X = identity_block(X, 3, [128, 128, 512], stage=3, block='b')
    X = identity_block(X, 3, [128, 128, 512], stage=3, block='c')
    X = identity_block(X, 3, [128, 128, 512], stage=3, block='d')

    # Stage 4
    X = convolutional_block(X, f=3, filters=[256, 256, 1024], stage=4, block='a', s=2)
    X = identity_block(X, 3, [256, 256, 1024], stage=4, block='b')
    X = identity_block(X, 3, [256, 256, 1024], stage=4, block='c')
    X = identity_block(X, 3, [256, 256, 1024], stage=4, block='d')
    X = identity_block(X, 3, [256, 256, 1024], stage=4, block='e')
    X = identity_block(X, 3, [256, 256, 1024], stage=4, block='f')

    # Stage 5
    X = convolutional_block(X, f=3, filters=[512, 512, 2048], stage=5, block='a', s=2)
    X = identity_block(X, 3, [512, 512, 2048], stage=5, block='b')
    X = identity_block(X, 3, [512, 512, 2048], stage=5, block='c')

    # AVGPOOL (named "avg_pool" per the spec above)
    X = AveragePooling2D(pool_size=(2, 2), name='avg_pool')(X)

    # Output layer
    X = Flatten()(X)
    X = Dense(classes, activation='softmax', name='fc' + str(classes),
              kernel_initializer=glorot_uniform(seed=0))(X)

    # Create model
    model = Model(inputs=X_input, outputs=X, name='ResNet50')

    return model
```
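A quick usage sketch; the compilation settings below are illustrative, not part of the original post:

```python
# Build the model with the default 64x64 RGB input and 6 output classes
model = ResNet50(input_shape=(64, 64, 3), classes=6)

# Hypothetical training setup; the post does not specify optimizer or loss
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()
```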



Author: 海街diary
Link: https://www.jianshu.com/p/bd67b8662f55
Source: 简书 (Jianshu)
The copyright belongs to the author. For any form of reproduction, please contact the author for authorization and cite the source.

### Residual Network (ResNet) Architecture and Implementation

#### Background and Concepts

Residual Networks (ResNets) are an architecture designed to ease the training of very deep neural networks. As the depth of a convolutional neural network (CNN) grows, vanishing or exploding gradients significantly degrade model performance. The core design of ResNets, the residual block, introduces skip connections that let information flow directly to later layers[^1].

This mechanism keeps the error signal stable during backpropagation, making it possible to train networks with hundreds or even thousands of layers. Experiments show that deeper ResNet architectures usually perform better on many computer vision tasks[^1].

---

#### Structure

The basic unit of a ResNet is the **residual block**, whose defining feature is the skip (shortcut) connection. Concretely, the input \( x \) passes through a series of transformations (such as convolution, batch normalization, and an activation function) to produce \( F(x) \), which is then added back to the original input[^1]:

\[
y = F(x) + x
\]

Here \( F(x) \) is the feature map produced by the stacked layers, and \( x \) is the unmodified input tensor. This simple additive "identity mapping" mitigates vanishing/exploding gradients and eases optimization.

When the dimensions do not match (for example, a different number of channels), the shortcut connection applies a learned projection to adapt the input[^2].

---

#### Implementation

Below is a simple ResNet block implemented with Python and TensorFlow/Keras:

```python
import tensorflow as tf
from tensorflow.keras.layers import Conv2D, BatchNormalization, ReLU, Add

def residual_block(input_tensor, filters, strides=1):
    """
    A standard residual block.

    Arguments:
    input_tensor: input tensor
    filters: number of convolution filters
    strides: stride, defaults to 1

    Returns:
    output tensor
    """
    # Main path
    x = Conv2D(filters=filters, kernel_size=(3, 3), strides=strides, padding='same')(input_tensor)
    x = BatchNormalization()(x)
    x = ReLU()(x)
    x = Conv2D(filters=filters, kernel_size=(3, 3), strides=1, padding='same')(x)
    x = BatchNormalization()(x)

    # Shortcut path: project when the spatial size or channel count changes
    if strides != 1 or input_tensor.shape[-1] != filters:
        shortcut = Conv2D(filters=filters, kernel_size=(1, 1), strides=strides)(input_tensor)
        shortcut = BatchNormalization()(shortcut)
    else:
        shortcut = input_tensor

    # Merge the main path with the shortcut
    output = Add()([x, shortcut])
    output = ReLU()(output)
    return output
```

This snippet defines a basic version of the residual block, suitable as a building block for more complex ResNet models.

---

#### Applications

ResNet has become a staple in image classification, object detection, and many other areas. For example, ResNet repeatedly set records in competitions on the ImageNet dataset and is widely deployed in production environments[^1].

Thanks to its simplicity and efficiency, researchers have also developed variants such as Pre-activation ResNet and Wide ResNet[^2].

---
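As a quick sanity check for the residual_block defined above, the following sketch (with a hypothetical input shape) wires two blocks together and verifies the downsampling behaviour of the projection shortcut:

```python
import tensorflow as tf

# Hypothetical 32x32 feature map with 64 channels
inputs = tf.keras.Input(shape=(32, 32, 64))
x = residual_block(inputs, filters=64)         # identity shortcut: shapes already match
x = residual_block(x, filters=128, strides=2)  # projection shortcut: downsamples and widens
model = tf.keras.Model(inputs, x)
print(model.output_shape)  # (None, 16, 16, 128)
```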