### ResNet18 的完整网络结构
ResNet18 是残差神经网络家族中层数较浅的版本,共包含 18 个带可训练权重的层。该模型的设计参考了 VGG19 的堆叠卷积思路,并引入了短路(shortcut)机制以形成残差单元[^2]。
#### 初始卷积模块 Conv1
初始阶段由单个较大的卷积操作构成,具体参数设置如下:
- 卷积核尺寸:7×7
- 输出通道数:64
- 步幅(Stride):2
- 填充(Padding):3
紧随其后的是批标准化(Batch Normalization, BN)、ReLU 激活函数和最大池化(Max Pooling)[^3]:
```python
import torch.nn as nn

class InitialConv(nn.Module):
    """ResNet18 的初始卷积模块:Conv -> BN -> ReLU -> MaxPool。"""
    def __init__(self):
        super().__init__()
        # 7x7 大卷积核,stride=2,将 224x224 输入降采样到 112x112
        self.conv = nn.Conv2d(in_channels=3, out_channels=64,
                              kernel_size=7, stride=2, padding=3, bias=False)
        self.bn = nn.BatchNorm2d(num_features=64)
        self.relu = nn.ReLU(inplace=True)
        # 3x3 最大池化,stride=2,空间尺寸再次减半到 56x56
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)

    def forward(self, x):
        x = self.conv(x)
        x = self.bn(x)
        x = self.relu(x)
        x = self.maxpool(x)
        return x
```
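下面是一个最小的形状验证草例(假设输入为 ImageNet 标准的 224×224 尺寸,`dummy` 为示意用的随机张量):
```python
import torch

dummy = torch.randn(1, 3, 224, 224)  # batch=1, 3 通道, 224x224
out = InitialConv()(dummy)
print(out.shape)  # torch.Size([1, 64, 56, 56]):两次 stride=2,空间尺寸缩小为 1/4
```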
#### 主干网络 BasicBlock 结构
ResNet18 中的基本构建块称为 `BasicBlock`,它由两个 3×3 的标准二维卷积层和一条跳跃连接(skip connection)组成。每当特征图(feature map)尺寸减半时,通道数随之加倍,以保持每层的计算量大致不变:
```python
import torch.nn as nn
import torch.nn.functional as F

def conv3x3(in_planes, out_planes, stride=1):
    """3x3 卷积,padding=1(stride=1 时保持空间尺寸不变)。"""
    return nn.Conv2d(in_planes, out_planes, kernel_size=3,
                     stride=stride, padding=1, bias=False)

class BasicBlock(nn.Module):
    expansion = 1  # BasicBlock 不扩展通道数(Bottleneck 中该值为 4)

    def __init__(self, inplanes, planes, stride=1, downsample=None):
        super().__init__()
        # 第一个卷积层按给定 stride 降采样
        self.conv1 = conv3x3(inplanes, planes, stride)
        self.bn1 = nn.BatchNorm2d(planes)
        # 第二个卷积层的 stride 恒为 1
        self.conv2 = conv3x3(planes, planes)
        self.bn2 = nn.BatchNorm2d(planes)
        self.downsample = downsample
        self.stride = stride

    def forward(self, x):
        identity = x
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        # 若输入输出形状不一致,需对捷径分支做变换后再相加
        if self.downsample is not None:
            identity = self.downsample(x)
        out += identity
        out = F.relu(out)
        return out
```
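下面用一个示意例子验证降采样路径的形状变化(`downsample` 这里手工构造为 1×1 卷积加 BN,对应后文 `_make_layer` 中的做法):
```python
import torch
import torch.nn as nn

# 当 stride=2 且通道数翻倍时,捷径分支需要 1x1 卷积对齐形状
downsample = nn.Sequential(
    nn.Conv2d(64, 128, kernel_size=1, stride=2, bias=False),
    nn.BatchNorm2d(128),
)
block = BasicBlock(inplanes=64, planes=128, stride=2, downsample=downsample)
x = torch.randn(1, 64, 56, 56)
print(block(x).shape)  # torch.Size([1, 128, 28, 28]):尺寸减半,通道加倍
```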
整个 ResNet18 架构可以划分为四个 stage(代码中对应 layer1 至 layer4),每个 stage 重复堆叠上述基本块若干次:
| Stage | Number of Blocks | Output Channels | Stride |
|-------|------------------|-----------------|--------|
| Layer 1 | 2 | 64 | 1 |
| Layer 2 | 2 | 128 | 2 |
| Layer 3 | 2 | 256 | 2 |
| Layer 4 | 2 | 512 | 2 |
四个 stage 共 8 个 BasicBlock、即 16 个卷积层,加上最前面的 7×7 卷积和最末端的全连接层,恰好构成 18 个带权层。最后,全局平均池化层和全连接分类器完成最终预测输出:
```python
import torch
import torch.nn as nn

class ResNet(nn.Module):
    def __init__(self, block, num_blocks, num_classes=10):
        super().__init__()
        self.inplanes = 64  # 当前通道数,随各 stage 逐步翻倍
        self.initial_conv = InitialConv()  # 前文定义的初始卷积模块
        self.layer1 = self._make_layer(block, 64, num_blocks[0], stride=1)
        self.layer2 = self._make_layer(block, 128, num_blocks[1], stride=2)
        self.layer3 = self._make_layer(block, 256, num_blocks[2], stride=2)
        self.layer4 = self._make_layer(block, 512, num_blocks[3], stride=2)
        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        self.fc = nn.Linear(512 * block.expansion, num_classes)

    def _make_layer(self, block, planes, blocks, stride=1):
        downsample = None
        # 尺寸或通道数发生变化时,为捷径分支构造 1x1 卷积下采样
        if stride != 1 or self.inplanes != block.expansion * planes:
            downsample = nn.Sequential(
                nn.Conv2d(self.inplanes, block.expansion * planes,
                          kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(block.expansion * planes),
            )
        layers = [block(self.inplanes, planes, stride, downsample)]
        self.inplanes = planes * block.expansion
        # 同一 stage 中其余 block 的 stride 均为 1,不再改变形状
        for _ in range(1, blocks):
            layers.append(block(self.inplanes, planes))
        return nn.Sequential(*layers)

    def forward(self, x):
        x = self.initial_conv(x)
        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)
        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.fc(x)
        return x

def resnet18():
    return ResNet(BasicBlock, [2, 2, 2, 2])
```
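可以用随机张量做一次前向冒烟测试(示意代码,输入尺寸假设为 224×224,类别数沿用上面的默认值 10):
```python
import torch

model = resnet18()
x = torch.randn(2, 3, 224, 224)  # batch=2 的随机输入,仅用于验证形状
logits = model(x)
print(logits.shape)  # torch.Size([2, 10])
```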