### Implementing a ResNet for CIFAR-10 Classification
To implement a ResNet in PyTorch for image classification on the CIFAR-10 dataset, a complete Python code example is given below. The implementation relies on the official PyTorch library together with the utilities provided by torchvision.
#### Loading the required libraries and defining the transforms
First, import the necessary packages and define how the data will be preprocessed:
```python
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms

# Augmentation + normalization for training; normalization only for testing,
# so that evaluation is performed on unaltered images.
transform_train = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

transform_test = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
```
The training transform applies random cropping, horizontal flipping, and normalization as augmentation to improve generalization, while the test transform only converts the images to tensors and normalizes them[^2].
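The values `(0.5, 0.5, 0.5)` passed to `Normalize` are only a rough approximation of the true CIFAR-10 channel statistics. If exact values are preferred, they can be estimated once from the raw training images and substituted in; a minimal sketch reusing the imports above (the numbers quoted in the comment are approximate):
```python
# Optional: estimate the per-channel mean/std of the CIFAR-10 training images
# so they can replace the rough (0.5, 0.5, 0.5) values in transforms.Normalize.
raw_set = torchvision.datasets.CIFAR10(root='./data', train=True,
                                       download=True,
                                       transform=transforms.ToTensor())
raw_loader = torch.utils.data.DataLoader(raw_set, batch_size=1000, num_workers=2)

count, channel_sum, channel_sq_sum = 0, torch.zeros(3), torch.zeros(3)
for images, _ in raw_loader:                    # images: (B, 3, 32, 32) in [0, 1]
    count += images.size(0) * images.size(2) * images.size(3)
    channel_sum += images.sum(dim=[0, 2, 3])
    channel_sq_sum += (images ** 2).sum(dim=[0, 2, 3])

mean = channel_sum / count
std = (channel_sq_sum / count - mean ** 2).sqrt()
print(mean, std)  # commonly reported as roughly (0.49, 0.48, 0.45) / (0.25, 0.24, 0.26)
```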
#### Preparing the training and test datasets
Next, create `DataLoader` objects so that batches of samples can be fetched during iteration:
```python
trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform_train)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=128,
                                          shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform_test)
testloader = torch.utils.data.DataLoader(testset, batch_size=100,
                                         shuffle=False, num_workers=2)
```
Here the data loaders are configured with a batch size of 128 for the training set and 100 for the test set.
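As a quick sanity check of the loader configuration, one batch can be drawn and its shapes inspected; a small sketch:
```python
# One training batch should contain 128 RGB images of size 32x32.
images, labels = next(iter(trainloader))
print(images.shape)  # torch.Size([128, 3, 32, 32])
print(labels.shape)  # torch.Size([128])
```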
#### Defining the ResNet model architecture
Using the ResNet design from TorchVision as a blueprint, define the network and set the final fully connected layer to match the 10 CIFAR-10 classes:
```python
def conv3x3(in_planes, out_planes, stride=1):
    """3x3 convolution with padding."""
    return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
                     padding=1, bias=False)


def conv1x1(in_planes, out_planes, stride=1):
    """1x1 convolution (used on the shortcut path)."""
    return nn.Conv2d(in_planes, out_planes, kernel_size=1, stride=stride,
                     bias=False)


class BasicBlock(nn.Module):
    expansion = 1

    def __init__(self, in_planes, planes, stride=1, downsample=None):
        super(BasicBlock, self).__init__()
        self.conv1 = conv3x3(in_planes, planes, stride)
        self.bn1 = nn.BatchNorm2d(planes)
        self.relu = nn.ReLU(inplace=True)
        self.conv2 = conv3x3(planes, planes)
        self.bn2 = nn.BatchNorm2d(planes)
        self.downsample = downsample

    def forward(self, x):
        identity = x
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        if self.downsample is not None:
            identity = self.downsample(x)
        out += identity  # residual (shortcut) connection
        return self.relu(out)


class ResNet(nn.Module):
    def __init__(self, block, layers, num_classes=10):
        super(ResNet, self).__init__()
        self.inplanes = 64
        self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3,
                               bias=False)
        self.bn1 = nn.BatchNorm2d(64)
        self.relu = nn.ReLU(inplace=True)
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.layer1 = self._make_layer(block, 64, layers[0])
        self.layer2 = self._make_layer(block, 128, layers[1], stride=2)
        self.layer3 = self._make_layer(block, 256, layers[2], stride=2)
        self.layer4 = self._make_layer(block, 512, layers[3], stride=2)
        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        self.fc = nn.Linear(512 * block.expansion, num_classes)
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
            elif isinstance(m, (nn.BatchNorm2d, nn.GroupNorm)):
                nn.init.constant_(m.weight, 1)
                nn.init.constant_(m.bias, 0)

    def _make_layer(self, block, planes, blocks, stride=1):
        downsample = None
        # Downsample the shortcut when the spatial size or channel count changes
        if stride != 1 or self.inplanes != block.expansion * planes:
            downsample = nn.Sequential(
                conv1x1(self.inplanes, block.expansion * planes, stride),
                nn.BatchNorm2d(block.expansion * planes),
            )
        layers = [block(self.inplanes, planes, stride, downsample)]
        self.inplanes = planes * block.expansion
        for _ in range(1, blocks):
            layers.append(block(self.inplanes, planes))
        return nn.Sequential(*layers)

    def forward(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.maxpool(x)
        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)
        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.fc(x)
        return x


model = ResNet(BasicBlock, [2, 2, 2, 2])  # [2, 2, 2, 2] blocks per stage -> ResNet-18
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model.to(device)
```
This code illustrates the design of the custom ResNet model and how it is constructed[^1].
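A dummy forward pass is a cheap way to confirm that a CIFAR-sized input produces 10 logits. Alternatively, the ready-made `torchvision.models.resnet18` can be used with its classification head set to 10 classes, which follows the same overall design as the custom model above; a hedged sketch:
```python
# Shape check: two random 32x32 RGB images -> two vectors of 10 class scores.
model.eval()
with torch.no_grad():
    dummy = torch.randn(2, 3, 32, 32).to(device)
    print(model(dummy).shape)  # torch.Size([2, 10])

# Alternative: use TorchVision's built-in ResNet-18, only changing the output size.
from torchvision.models import resnet18
tv_model = resnet18(num_classes=10).to(device)
```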
#### Setting up the optimizer and loss function
Use the Adam optimizer together with the cross-entropy loss:
```python
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=7, gamma=0.1)
```
This sets up a standard gradient-based optimization scheme (Adam) to minimize the prediction error, with a `StepLR` scheduler that decays the learning rate by a factor of 10 every 7 epochs.
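To make the schedule concrete: with `step_size=7` and `gamma=0.1`, the learning rate is multiplied by 0.1 after every 7 calls to `scheduler.step()`. The current value can be read back at any time, as sketched below (the printed values assume the settings above):
```python
# Learning-rate schedule: 0.001 for epochs 0-6, 0.0001 for epochs 7-13, and so on.
print(optimizer.param_groups[0]['lr'])  # 0.001 before any scheduler step
print(scheduler.get_last_lr())          # [0.001]
```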
#### Training loop
Write the loop that repeatedly performs forward propagation, backpropagation, and weight updates (the number of epochs below is an arbitrary choice and can be tuned):
```python
epochs = 20  # number of passes over the training data; adjust as needed

for epoch in range(epochs):
    running_loss = 0.0
    model.train()
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data[0].to(device), data[1].to(device)
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
        if i % 200 == 199:  # print the average loss every 200 mini-batches
            print(f'[Epoch {epoch + 1}, Batch {i + 1}] '
                  f'Loss: {running_loss / 200:.3f}')
            running_loss = 0.0
    scheduler.step()  # decay the learning rate once per epoch
```
These steps make up the core of the training procedure: learning the model parameters.
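After training, it is common (though not part of the loop above) to persist the learned weights so the model can be reloaded later without retraining; a minimal sketch with an arbitrary file name:
```python
# Save only the parameters (state_dict), not the full model object.
torch.save(model.state_dict(), 'resnet18_cifar10.pth')

# Later, reload into an identically constructed model:
# model.load_state_dict(torch.load('resnet18_cifar10.pth', map_location=device))
```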
#### Evaluating on the test set
Finally, check the final model's performance on the held-out test set:
```python
correct = 0
total = 0
model.eval()
with torch.no_grad():
    for data in testloader:
        images, labels = data[0].to(device), data[1].to(device)
        outputs = model(images)
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print(f'Accuracy on the 10000 test images: {100 * correct / total:.2f} %')
```
This step gives a direct measure of the model's actual recognition accuracy.
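Beyond the overall figure, a per-class breakdown often reveals which categories the model confuses; a sketch reusing `testloader` (the class-name tuple follows the standard CIFAR-10 label order):
```python
classes = ('plane', 'car', 'bird', 'cat', 'deer',
           'dog', 'frog', 'horse', 'ship', 'truck')
class_correct = [0] * 10
class_total = [0] * 10
model.eval()
with torch.no_grad():
    for images, labels in testloader:
        images, labels = images.to(device), labels.to(device)
        predicted = model(images).argmax(dim=1)
        for label, pred in zip(labels, predicted):
            class_total[label.item()] += 1
            class_correct[label.item()] += int((label == pred).item())

for i in range(10):
    print(f'{classes[i]}: {100 * class_correct[i] / class_total[i]:.1f} %')
```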