### Densely Connected Convolutional Networks (DenseNet): Architecture and Implementation
#### Architecture
DenseNet is a densely connected convolutional neural network whose core idea is to strengthen feature reuse and reduce redundant parameters through dense connections between layers. In a traditional convolutional network, each layer is connected only to the layer immediately before it; in DenseNet, each layer receives not only the output of the preceding layer but the outputs of all earlier layers as additional input[^3].
This design lets DenseNet markedly reduce model complexity and computational cost, because individual layers do not need to relearn the same feature maps. Concretely, DenseNet treats each layer's output as part of a global state that all subsequent layers can access, so the final classifier can draw on every feature map in the network when making its prediction[^5].
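To make the dense connectivity concrete: if a block's input has `k0` channels and each layer contributes `growth_rate` new channels, then the i-th layer inside the block sees `k0 + i * growth_rate` input channels. A minimal sketch of this bookkeeping (the function name and the example numbers are illustrative, not taken from the text above):

```python
def dense_block_channels(k0, growth_rate, num_layers):
    """Input channel count seen by each layer of a dense block."""
    return [k0 + i * growth_rate for i in range(num_layers)]

# e.g. a 64-channel input and growth rate 32 over 6 layers
print(dense_block_channels(64, 32, 6))  # [64, 96, 128, 160, 192, 224]
```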
#### Implementation
The basic building blocks of DenseNet, implemented in Python below, are:
1. **Dense Block**: a dense block consists of several convolutional layers, each of which concatenates its own output with the outputs of the layers before it.
2. **Transition Layer**: these layers control the number of feature maps and shrink the spatial resolution, typically combining batch normalization, ReLU, a 1×1 convolution, and average pooling.
Here is a simple PyTorch implementation:
```python
import torch
import torch.nn as nn


class Bottleneck(nn.Module):
    """Bottleneck layer: BN-ReLU-1x1 conv, then BN-ReLU-3x3 conv."""

    def __init__(self, in_channels, growth_rate):
        super(Bottleneck, self).__init__()
        inter_channel = 4 * growth_rate  # 1x1 conv reduces to 4k channels
        self.bn1 = nn.BatchNorm2d(in_channels)
        self.conv1 = nn.Conv2d(in_channels, inter_channel, kernel_size=1, bias=False)
        self.relu = nn.ReLU(inplace=True)
        self.bn2 = nn.BatchNorm2d(inter_channel)
        self.conv2 = nn.Conv2d(inter_channel, growth_rate, kernel_size=3, padding=1, bias=False)

    def forward(self, x):
        out = self.conv1(self.relu(self.bn1(x)))
        out = self.conv2(self.relu(self.bn2(out)))
        out = torch.cat((x, out), dim=1)  # concatenate along the channel dimension
        return out


class Transition(nn.Module):
    """Transition layer: BN-ReLU-1x1 conv to shrink channels, then 2x2 average pooling."""

    def __init__(self, in_channels, out_channels):
        super(Transition, self).__init__()
        self.bn = nn.BatchNorm2d(in_channels)
        self.relu = nn.ReLU(inplace=True)
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False)
        self.avg_pool = nn.AvgPool2d(kernel_size=2, stride=2)

    def forward(self, x):
        out = self.conv(self.relu(self.bn(x)))
        out = self.avg_pool(out)
        return out


class DenseNet(nn.Module):
    """DenseNet backbone."""

    def __init__(self, num_init_features, growth_rate, block_config, num_classes=1000):
        super(DenseNet, self).__init__()
        # Initial convolution and pooling (the stem)
        self.features = nn.Sequential(
            nn.Conv2d(3, num_init_features, kernel_size=7, stride=2, padding=3, bias=False),
            nn.BatchNorm2d(num_init_features),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
        )
        num_features = num_init_features
        for i, num_layers in enumerate(block_config):
            dense_block = self._make_dense_block(growth_rate, num_layers, num_features)
            self.features.add_module(f"denseblock{i + 1}", dense_block)
            num_features += num_layers * growth_rate
            if i != len(block_config) - 1:
                # Halve the channel count between dense blocks
                trans_layer = Transition(num_features, num_features // 2)
                self.features.add_module(f"transition{i + 1}", trans_layer)
                num_features //= 2
        self.classifier = nn.Linear(num_features, num_classes)

    def _make_dense_block(self, growth_rate, n_layers, input_channels):
        layers = []
        for _ in range(n_layers):
            layers.append(Bottleneck(input_channels, growth_rate))
            input_channels += growth_rate  # each layer adds growth_rate channels
        return nn.Sequential(*layers)

    def forward(self, x):
        features = self.features(x)
        out = nn.functional.adaptive_avg_pool2d(features, (1, 1))
        out = torch.flatten(out, 1)
        out = self.classifier(out)
        return out
```
The code above defines a basic DenseNet built from bottleneck layers (`Bottleneck`) and transition layers (`Transition`). By adjusting the `growth_rate` and `block_config` parameters, different DenseNet variants can be configured.
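As a usage sketch, the channel bookkeeping in `DenseNet.__init__` can be traced by hand. The configuration below (`growth_rate=32`, `block_config=(6, 12, 24, 16)`, 64 initial features) corresponds to DenseNet-121 from the original paper and is assumed here for illustration:

```python
# Trace num_features through DenseNet-121's configuration, mirroring
# the loop in DenseNet.__init__ above (pure arithmetic, no torch needed).
growth_rate = 32
block_config = (6, 12, 24, 16)
num_features = 64  # num_init_features
for i, num_layers in enumerate(block_config):
    num_features += num_layers * growth_rate  # dense block adds channels
    if i != len(block_config) - 1:
        num_features //= 2  # transition layer halves them
print(num_features)  # 1024 -> the classifier's input dimension
```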
---