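The previous sections covered basic PyTorch usage. Starting with this section, we look at how PyTorch is applied to deep learning. Before building a network, let's get familiar with some commonly used concepts and layers.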
class torch.nn.Module
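- The base class for all neural network modules; custom network modules must inherit from it
- Subclasses must override the forward method, which defines the forward pass
Example: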
import torch.nn as nn
import torch.nn.functional as F

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.conv1 = nn.Conv2d(1, 20, 5)
        self.conv2 = nn.Conv2d(20, 20, 5)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        return F.relu(self.conv2(x))
The linear regression and logistic regression examples in earlier sections used this class in the same way.
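As a quick check of how such a module behaves, the sketch below repeats the Model class and runs it on a dummy batch. The input size (4 single-channel 28x28 images) is an assumption chosen for illustration; it is not prescribed by the class itself.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 20, 5)
        self.conv2 = nn.Conv2d(20, 20, 5)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        return F.relu(self.conv2(x))


model = Model()

# Assumed dummy input: 4 single-channel 28x28 images.
x = torch.randn(4, 1, 28, 28)

# Calling the module runs forward() through nn.Module.__call__.
y = model(x)

# Each 5x5 convolution without padding shrinks height and width by 4.
print(y.shape)  # torch.Size([4, 20, 20, 20])

# Submodules assigned as attributes register their parameters automatically.
names = [name for name, _ in model.named_parameters()]
print(names)
```

Because the parameters are registered automatically, `model.parameters()` can be handed straight to an optimizer.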
class torch.nn.Sequential(*args)
- Modules are added to the network in the order in which they are passed to the constructor
Example:
# Example of using Sequential
model = nn.Sequential(
    nn.Conv2d(1, 20, 5),
    nn.ReLU(),
    nn.Conv2d(20, 64, 5),
    nn.ReLU()
)
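To see that a Sequential container simply applies its layers in order, the following sketch feeds a dummy batch through the layers one at a time and compares the result with calling the container directly. The input shape is an assumption for illustration.

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(1, 20, 5),
    nn.ReLU(),
    nn.Conv2d(20, 64, 5),
    nn.ReLU(),
)

x = torch.randn(2, 1, 28, 28)  # assumed dummy batch

# A Sequential is iterable; layers come back in insertion order.
out = x
shapes = []
for layer in model:
    out = layer(out)
    shapes.append(tuple(out.shape))

# Applying the layers one by one matches calling the container itself.
same = torch.allclose(out, model(x))

# Submodules can also be accessed by position.
first_layer = model[0]

print(shapes)
print(same)
```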
2D Convolution
class torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True)
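where
- in_channels is the number of channels in the input
- out_channels is the number of channels in the output
- kernel_size is the size of the convolution kernel
The remaining parameters have the same meaning as in Caffe.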
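For readers who do not know Caffe, the effect of the remaining parameters can be observed directly. The sketch below is only an illustration with an assumed 28x28 dummy input; the specific layer configurations are examples, not settings used elsewhere in this section except for the first one.

```python
import torch
import torch.nn as nn

x = torch.randn(1, 1, 28, 28)  # assumed dummy input

# padding=2 with a 5x5 kernel keeps the spatial size unchanged.
same_size = nn.Conv2d(1, 16, kernel_size=5, padding=2)
# stride=2 halves the spatial size here.
strided = nn.Conv2d(1, 16, kernel_size=5, stride=2, padding=2)
# dilation=2 spreads the 5x5 kernel over a 9x9 window.
dilated = nn.Conv2d(1, 16, kernel_size=5, dilation=2)
# groups=4 splits the 16 input channels into 4 groups of 4 channels each.
grouped = nn.Conv2d(16, 32, kernel_size=3, groups=4)

print(same_size(x).shape)      # torch.Size([1, 16, 28, 28])
print(strided(x).shape)        # torch.Size([1, 16, 14, 14])
print(dilated(x).shape)        # torch.Size([1, 16, 20, 20])

# Weight layout is (out_channels, in_channels / groups, kH, kW).
print(same_size.weight.shape)  # torch.Size([16, 1, 5, 5])
print(grouped.weight.shape)    # torch.Size([32, 4, 3, 3])
```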
2D Normalization
class torch.nn.BatchNorm2d(num_features, eps=1e-05, momentum=0.1, affine=True)
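where
- num_features is the number of channels in the input
BatchNorm2d normalizes the features of each channel separately. The formula is
$$y = \frac{x - \mathrm{E}[x]}{\sqrt{\mathrm{Var}[x] + \epsilon}} \cdot \gamma + \beta$$
where the mean and variance are computed per channel over the mini-batch, \epsilon is the eps argument, and \gamma and \beta are learnable parameters when affine=True.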
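The per-channel normalization can be checked numerically. In training mode, each channel has its batch mean subtracted, is divided by the square root of its biased batch variance plus eps, and is then scaled and shifted by the affine parameters. The sketch below, on an assumed random batch, compares this manual computation with the module's output.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(8, 3, 5, 5)  # assumed dummy batch: N=8, C=3, H=W=5

# A freshly created layer is in training mode with weight=1 and bias=0.
bn = nn.BatchNorm2d(3)
y = bn(x)

# Manual version: statistics are taken per channel over N, H and W.
mean = x.mean(dim=(0, 2, 3), keepdim=True)
var = x.var(dim=(0, 2, 3), unbiased=False, keepdim=True)  # biased variance
gamma = bn.weight.view(1, -1, 1, 1)
beta = bn.bias.view(1, -1, 1, 1)
y_manual = (x - mean) / torch.sqrt(var + bn.eps) * gamma + beta

max_diff = (y - y_manual).abs().max().item()
print(max_diff)  # should be close to zero
```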
2D Pooling
class torch.nn.MaxPool2d(kernel_size, stride=None, padding=0, dilation=1, return_indices=False, ceil_mode=False)
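where
- kernel_size is the size of the pooling window
The remaining parameters have the same meaning as in Caffe.
With the default ceil_mode=False, the height of the feature map after pooling is
$$H_{out} = \left\lfloor \frac{H_{in} + 2 \times padding - dilation \times (kernel\_size - 1) - 1}{stride} + 1 \right\rfloor$$
and the width is computed in the same way. When stride is None, it defaults to kernel_size.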
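The size calculation can be written as a small pure-Python helper. The function name `pool2d_out_size` is hypothetical, introduced only for this illustration, and it covers the default ceil_mode=False case only.

```python
def pool2d_out_size(size, kernel_size, stride=None, padding=0, dilation=1):
    """Output length along one spatial dimension, assuming ceil_mode=False."""
    if stride is None:
        # MaxPool2d uses kernel_size as the stride when none is given.
        stride = kernel_size
    # Integer floor division implements the floor in the formula.
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


# A 28x28 MNIST image pooled twice with a 2x2 window.
after_pool1 = pool2d_out_size(28, kernel_size=2)
after_pool2 = pool2d_out_size(after_pool1, kernel_size=2)

# An odd-sized input loses its last row and column in floor mode.
odd_input = pool2d_out_size(7, kernel_size=2)

print(after_pool1, after_pool2, odd_input)  # 14 7 3
```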
Convolutional Neural Network
Now for the interesting part: let's see how a simple convolutional neural network is built.
First, load the data as in the previous sections.
import torch
import torch.nn as nn
import torchvision.datasets as dsets
import torchvision.transforms as transforms
from torch.autograd import Variable
# Hyper Parameters
num_epochs = 5
batch_size = 100
learning_rate = 0.001
# MNIST Dataset
train_dataset = dsets.MNIST(root='./data/',
                            train=True,
                            transform=transforms.ToTensor(),
                            download=True)

test_dataset = dsets.MNIST(root='./data/',
                           train=False,
                           transform=transforms.ToTensor())

# Data Loader (Input Pipeline)
train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                           batch_size=batch_size,
                                           shuffle=True)

test_loader = torch.utils.data.DataLoader(dataset=test_dataset,
                                          batch_size=batch_size,
                                          shuffle=False)
Build the convolutional neural network: two convolutional layers followed by one linear layer.
# CNN Model (2 conv layer)
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.layer1 = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=5, padding=2),
            nn.BatchNorm2d(16),
            nn.ReLU(),
            nn.MaxPool2d(2))
        self.layer2 = nn.Sequential(
            nn.Conv2d(16, 32, kernel_size=5, padding=2),
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.MaxPool2d(2))
        self.fc = nn.Linear(7*7*32, 10)

    def forward(self, x):
        out = self.layer1(x)
        out = self.layer2(out)
        out = out.view(out.size(0), -1)
        out = self.fc(out)
        return out

cnn = CNN()
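The input size 7*7*32 of the linear layer follows from the shapes inside the network: each block keeps the spatial size in its convolution (5x5 kernel, padding 2) and halves it in its pooling layer. The sketch below repeats the CNN class and traces an assumed dummy batch of MNIST-sized images through each stage.

```python
import torch
import torch.nn as nn


class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=5, padding=2),
            nn.BatchNorm2d(16),
            nn.ReLU(),
            nn.MaxPool2d(2))
        self.layer2 = nn.Sequential(
            nn.Conv2d(16, 32, kernel_size=5, padding=2),
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.MaxPool2d(2))
        self.fc = nn.Linear(7 * 7 * 32, 10)

    def forward(self, x):
        out = self.layer1(x)
        out = self.layer2(out)
        out = out.view(out.size(0), -1)
        return self.fc(out)


cnn = CNN()
x = torch.randn(4, 1, 28, 28)  # assumed dummy batch of MNIST-sized images

feat1 = cnn.layer1(x)                  # 28x28 -> 14x14, 16 channels
feat2 = cnn.layer2(feat1)              # 14x14 -> 7x7, 32 channels
flat = feat2.view(feat2.size(0), -1)   # 32 * 7 * 7 = 1568 features per image
logits = cnn.fc(flat)                  # one score per digit class

print(feat1.shape, feat2.shape, flat.shape, logits.shape)
```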
Define the loss and the optimization algorithm.
# Loss and Optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(cnn.parameters(), lr=learning_rate)
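One detail worth noting is that nn.CrossEntropyLoss expects raw, unnormalized scores, which is why the network's forward method ends with the linear layer and applies no softmax. The sketch below, using assumed random scores and labels, shows that the loss equals log-softmax followed by negative log-likelihood.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(5, 10)             # assumed raw scores: 5 samples, 10 classes
labels = torch.tensor([3, 0, 9, 1, 7])  # assumed class indices

criterion = nn.CrossEntropyLoss()
loss = criterion(logits, labels)

# Equivalent computation: log-softmax over classes, then mean negative log-likelihood.
manual = F.nll_loss(F.log_softmax(logits, dim=1), labels)

print(loss.item(), manual.item())
```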
Start training.
# Train the Model
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        # Variable is a no-op wrapper since PyTorch 0.4; plain tensors work as well.
        images = Variable(images)
        labels = Variable(labels)

        # Forward + Backward + Optimize
        optimizer.zero_grad()
        outputs = cnn(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        if (i+1) % 100 == 0:
            # loss is a 0-dim tensor; .item() extracts its value as a Python float.
            print('Epoch [%d/%d], Iter [%d/%d] Loss: %.4f'
                  % (epoch+1, num_epochs, i+1, len(train_dataset)//batch_size, loss.item()))
Testing.
# When the model contains BN or dropout layers, call cnn.eval() first, because these layers compute differently at training and test time; see the Caffe source code for reference.
# Test the Model
cnn.eval()  # Change model to 'eval' mode (BN uses moving mean/var).
correct = 0
total = 0
with torch.no_grad():  # gradients are not needed for evaluation
    for images, labels in test_loader:
        images = Variable(images)
        outputs = cnn(images)
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Test Accuracy of the model on the 10000 test images: %d %%' % (100 * correct / total))
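The difference between training and evaluation mode can be observed on a standalone layer. The sketch below is an illustration on assumed random data, not part of the training script: in training mode BatchNorm2d normalizes with the batch statistics and updates its running estimates, while in evaluation mode it uses the running estimates and leaves them unchanged. Dropout becomes the identity in evaluation mode.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bn = nn.BatchNorm2d(2)
x = torch.randn(16, 2, 4, 4) * 3 + 5  # assumed data with mean near 5, std near 3

bn.train()
y_train = bn(x)  # normalizes with batch statistics, updates running statistics
running_after_train = bn.running_mean.clone()

bn.eval()
y_eval = bn(x)   # normalizes with running statistics, does not update them
running_after_eval = bn.running_mean.clone()

drop = nn.Dropout(p=0.5)
drop.eval()
dropout_is_identity = torch.equal(drop(x), x)

print(y_train.mean().item(), y_eval.mean().item())
print(torch.equal(running_after_train, running_after_eval), dropout_is_identity)
```

After a single training step the running estimates are still far from the true statistics, so the two outputs differ noticeably; this is why forgetting cnn.eval() changes test results.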
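The final accuracy is 99%, and training takes about 7 minutes. After switching to the GPU version as described in the previous section, the accuracy is still 99% and training takes 34 seconds. Because data was being loaded by a single worker, setting num_workers of the DataLoader to 4 brings the training time down further, to 17 seconds.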
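The previous section's GPU code is not repeated here, so the following is only a minimal sketch of the two changes under stated assumptions: it uses the `torch.device` / `.to(device)` idiom (which may differ from the approach used earlier), random stand-in data instead of MNIST, and a tiny linear model in place of the CNN. The helper `train_step` is a hypothetical name introduced for this illustration.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Fall back to the CPU when no GPU is available.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Stand-in data with MNIST-like shapes (assumption, used instead of a download).
dataset = TensorDataset(torch.randn(200, 1, 28, 28),
                        torch.randint(0, 10, (200,)))

# num_workers=4 loads batches in 4 background processes;
# the workers are only started once the loader is iterated.
loader = DataLoader(dataset, batch_size=100, shuffle=True, num_workers=4)

# Tiny placeholder model; the CNN from this section would be moved the same way.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10)).to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)


def train_step(images, labels):
    """Run one optimization step; inputs must live on the model's device."""
    images, labels = images.to(device), labels.to(device)
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()


# One step on the first 100 samples, taken directly from the dataset.
images, labels = dataset[:100]
loss_value = train_step(images, labels)
print(device, loss_value)
```

In a full script the loop would read `for images, labels in loader:` and call `train_step` on every batch.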
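Compared with the two linear layers used in the previous section, the two convolutional layers plus one linear layer used here improve accuracy and need fewer parameters. The hidden layer in the previous section has 784x500=392000 parameters, while the convolutional layers in this section have 16x1x5x5+32x16x5x5=13200 parameters (weights only, excluding biases).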
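The parameter counts can be reproduced with a few lines of arithmetic. The helper names `conv2d_weights` and `linear_weights` are hypothetical and count weights only, matching the figures quoted in the text; the last line additionally counts the final linear layer of this section's network, which the comparison leaves out.

```python
def conv2d_weights(in_channels, out_channels, kernel_size):
    """Number of weights in a square-kernel Conv2d layer (bias excluded)."""
    return out_channels * in_channels * kernel_size * kernel_size


def linear_weights(in_features, out_features):
    """Number of weights in a Linear layer (bias excluded)."""
    return out_features * in_features


# Hidden layer of the previous section: 784 inputs, 500 hidden units.
hidden_fc = linear_weights(784, 500)

# The two convolutional layers of this section.
conv_total = conv2d_weights(1, 16, 5) + conv2d_weights(16, 32, 5)

# The final linear layer of this section, for completeness.
final_fc = linear_weights(7 * 7 * 32, 10)

print(hidden_fc, conv_total, final_fc)  # 392000 13200 15680
```

Even with the final linear layer included, the convolutional network has far fewer weights than the single 784x500 hidden layer.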