20. Getting started with PyTorch: train and test on CUDA, or train on CUDA and test on CPU

游客26024

Last modified: 2022-02-17 10:24:49

Category: 手把手学习Pytorch (Learning PyTorch Step by Step). Tags: pytorch, deep learning, neural networks, artificial intelligence, computer vision

First published: 2022-01-28 23:42:28

Article link: https://blog.youkuaiyun.com/XiaoyYidiaodiao/article/details/122737775

1. Training

To train on CUDA, three places in the code have to change so that PyTorch knows to use the GPU. There are two ways to make each change (both are shown below):

1. The network structure (the model)

2. The loss function

3. The data, immediately before it is used

There are two ways to move something to CUDA:
1. xx.cuda()
2. xx.to(device=torch.device("cuda"))

Way 1:

1. Network structure
model.cuda()

2. Loss function
cross_entropy_loss.cuda()

3. Data, immediately before use
imgs, targets = data
imgs = imgs.cuda()
targets = targets.cuda()
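
One detail in step 3 is easy to miss: for tensors, `.cuda()` and `.to()` return the moved tensor and leave the original where it was, so the result has to be assigned back. For an `nn.Module`, the same call moves the parameters in place and returns the module itself. A minimal sketch, assuming a small `nn.Linear` as a stand-in for the real network and falling back to the CPU when no GPU is present:

```python
import torch
from torch import nn

# Pick the GPU when one is available, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in model (hypothetical; the article uses LeNet_5).
model = nn.Linear(4, 2)
returned = model.to(device)        # moves the parameters in place
same_object = returned is model    # Module.to returns the module itself

# Tensors: keep the return value, otherwise the moved copy is discarded.
imgs = torch.randn(8, 4)
imgs = imgs.to(device)

outputs = model(imgs)
print(same_object, imgs.device.type, tuple(outputs.shape))
```

Writing `model = model.to(device)` is therefore harmless but not strictly required, while `imgs = imgs.to(device)` is required.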

Note: in a real project it is better to make the device choice configurable with argparse.ArgumentParser() at the top of the training script. To keep the code easy to read, that is left out here.
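
For readers who want to try the argparse approach mentioned in the note, here is a minimal sketch. The `--device` flag, its `auto` choice, the helper name `parse_device`, and the script name `train.py` are all assumptions made for illustration; they do not come from the original code.

```python
import argparse

import torch


def parse_device(argv=None):
    """Return the torch.device requested on the command line."""
    parser = argparse.ArgumentParser(description="training options")
    # --device is a hypothetical flag name chosen for this sketch.
    parser.add_argument("--device", default="auto", choices=["auto", "cpu", "cuda"])
    args = parser.parse_args(argv)
    if args.device == "auto":
        return torch.device("cuda" if torch.cuda.is_available() else "cpu")
    if args.device == "cuda" and not torch.cuda.is_available():
        # Fall back rather than crash on a machine without a GPU.
        return torch.device("cpu")
    return torch.device(args.device)


# Equivalent to running: python train.py --device cpu
device = parse_device(["--device", "cpu"])
print(device)
```

The returned `device` can then be passed to `.to(device)` for the model, the loss function, and each batch.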

Code:

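import os

from torch.utils.data import DataLoader
from LeNet_5 import *
import torchvision
import torch
from torch import nn
from torch.utils.tensorboard import SummaryWriter


# 1.Create SummaryWriter and the checkpoint directory
writer = SummaryWriter("log_loss")
# torch.save fails if the target directory does not exist
os.makedirs("model_save", exist_ok=True)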

# 2.Ready dataset
train_dataset = torchvision.datasets.CIFAR10(root="data", train=True, transform=torchvision.transforms.ToTensor(),
                                             download=True)

# 3.Length
train_dataset_size = len(train_dataset)
print("the train dataset size is {}".format(train_dataset_size))

# 4.DataLoader
train_dataloader = DataLoader(dataset=train_dataset, batch_size=64)

# 5.Create model
model = LeNet_5()
# a.add cuda
if torch.cuda.is_available():
    model = model.cuda()

# 6.Create loss
cross_entropy_loss = nn.CrossEntropyLoss()
# b.add cuda (guarded, so the script still runs on a CPU-only machine)
if torch.cuda.is_available():
    cross_entropy_loss = cross_entropy_loss.cuda()

# 7.Optimizer
learning_rate = 1e-2
optim = torch.optim.SGD(model.parameters(), lr=learning_rate)

# 8. Set some parameters to control loop
# epoch
epoch = 80

total_train_step = 0

for i in range(epoch):
    print(" -----------------the {} number of training epoch --------------".format(i + 1))
    model.train()
    for data in train_dataloader:
        imgs, targets = data
        # c.add cuda
        if torch.cuda.is_available():
            imgs = imgs.cuda()
            targets = targets.cuda()
        outputs = model(imgs)
        loss_train = cross_entropy_loss(outputs, targets)

        optim.zero_grad()
        loss_train.backward()
        optim.step()
        total_train_step = total_train_step + 1
        if total_train_step % 100 == 0:
            print("the training step is {} and its loss of model is {}".format(total_train_step, loss_train.item()))
            writer.add_scalar("train_loss", loss_train.item(), total_train_step)
            # checkpoint every 10000 steps
            if total_train_step % 10000 == 0:
                torch.save(model.state_dict(), "model_save/model_{}_GPU.pth".format(total_train_step))
                print("the model of {} training step was saved! ".format(total_train_step))
            # in the last epoch, also checkpoint every 100 steps
            # (this check sits inside the "% 100" branch above)
            if i == (epoch - 1):
                torch.save(model.state_dict(), "model_save/model_{}_GPU.pth".format(total_train_step))
                print("the model of {} training step was saved! ".format(total_train_step))
writer.close()

Way 2:

1. Network structure
model.to(device=torch.device("cuda"))

2. Loss function
cross_entropy_loss.to(device=torch.device("cuda"))

3. Data, immediately before use
imgs, targets = data
imgs = imgs.to(device=torch.device("cuda"))
targets = targets.to(device=torch.device("cuda"))

Code:

import os

from torch.utils.data import DataLoader
from LeNet_5 import *
import torchvision
import torch
from torch import nn
from torch.utils.tensorboard import SummaryWriter

# 1. Choose cuda when it is available, otherwise cpu
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# 2.Create SummaryWriter and the checkpoint directory
writer = SummaryWriter("log_loss")
# torch.save fails if the target directory does not exist
os.makedirs("model_save", exist_ok=True)

# 3.Ready dataset
train_dataset = torchvision.datasets.CIFAR10(root="data", train=True, transform=torchvision.transforms.ToTensor(),
                                             download=True)

# 4.Length
train_dataset_size = len(train_dataset)
print("the train dataset size is {}".format(train_dataset_size))

# 5.DataLoader
train_dataloader = DataLoader(dataset=train_dataset, batch_size=64)

# 6.Create model
model = LeNet_5()
# a.add cuda
model = model.to(device=device)

# 7.Create loss
cross_entropy_loss = nn.CrossEntropyLoss()
# b.add cuda
cross_entropy_loss = cross_entropy_loss.to(device=device)

# 8.Optimizer
learning_rate = 1e-2
optim = torch.optim.SGD(model.parameters(), lr=learning_rate)

# 9. Set some parameters to control loop
# epoch
epoch = 80

total_train_step = 0

for i in range(epoch):
    print(" -----------------the {} number of training epoch --------------".format(i + 1))
    model.train()
    for data in train_dataloader:
        imgs, targets = data
        imgs = imgs.to(device)
        targets = targets.to(device)
        outputs = model(imgs)
        loss_train = cross_entropy_loss(outputs, targets)

        optim.zero_grad()
        loss_train.backward()
        optim.step()
        total_train_step = total_train_step + 1
        if total_train_step % 100 == 0:
            print("the training step is {} and its loss of model is {}".format(total_train_step, loss_train.item()))
            writer.add_scalar("train_loss", loss_train.item(), total_train_step)
            # checkpoint every 10000 steps
            if total_train_step % 10000 == 0:
                torch.save(model.state_dict(), "model_save/model_{}_GPU.pth".format(total_train_step))
                print("the model of {} training step was saved! ".format(total_train_step))
            # in the last epoch, also checkpoint every 100 steps
            # (this check sits inside the "% 100" branch above)
            if i == (epoch - 1):
                torch.save(model.state_dict(), "model_save/model_{}_GPU.pth".format(total_train_step))
                print("the model of {} training step was saved! ".format(total_train_step))
writer.close()
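
The device-agnostic pattern of way 2 can be tried without downloading CIFAR10. The sketch below is an illustration under stated assumptions, not the article's training script: a one-layer classifier stands in for `LeNet_5`, and a single random batch stands in for the DataLoader. The three `.to(device)` calls are the same three changes listed at the start of this section.

```python
import torch
from torch import nn

torch.manual_seed(0)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in classifier for 3x32x32 inputs and 10 classes (hypothetical, not LeNet_5).
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10)).to(device)
loss_fn = nn.CrossEntropyLoss().to(device)
optim = torch.optim.SGD(model.parameters(), lr=1e-2)

# One random batch in place of a CIFAR10 batch.
imgs = torch.randn(64, 3, 32, 32).to(device)
targets = torch.randint(0, 10, (64,)).to(device)

losses = []
model.train()
for _ in range(5):
    outputs = model(imgs)
    loss = loss_fn(outputs, targets)
    optim.zero_grad()
    loss.backward()
    optim.step()
    losses.append(loss.item())

print(losses[0], losses[-1])
```

Because the device is chosen once at the top, the same script runs unchanged on a machine with or without a GPU.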

2. Testing

2.1. Train on CUDA, test on CPU

Code:

import torch
from torch.utils.data import DataLoader
from LeNet_5 import *
import torchvision

# test

# 1.Create model
model = LeNet_5()

# 2.Ready Dataset
test_dataset = torchvision.datasets.CIFAR10(root="data", train=False, transform=torchvision.transforms.ToTensor(),
                                            download=True)
# 3.Length
test_dataset_size = len(test_dataset)
print("the test dataset size is {}".format(test_dataset_size))

# 4.DataLoader
test_dataloader = DataLoader(dataset=test_dataset, batch_size=64)

# 5. Set some parameters for testing the network
total_accuracy = 0

# test
# Load the GPU-trained weights once, before the loop, remapping them onto the CPU
model_load = torch.load("model_save/model_62500_GPU.pth", map_location=torch.device("cpu"))
model.load_state_dict(model_load)
model.eval()
with torch.no_grad():
    for data in test_dataloader:
        imgs, targets = data
        outputs = model(imgs)
        # number of correct predictions in this batch
        accuracy = (outputs.argmax(1) == targets).sum().item()
        total_accuracy = total_accuracy + accuracy
    # fraction of the whole test set that was classified correctly
    accuracy = total_accuracy / test_dataset_size
    print("the total accuracy is {}".format(accuracy))
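
The `map_location` argument of `torch.load` is what allows a checkpoint saved from a GPU to be opened on a CPU-only machine: every stored tensor is remapped to the given device while loading. A minimal round trip, assuming a small `nn.Linear` in place of `LeNet_5` and a temporary file named `model_demo.pth` (both hypothetical):

```python
import os
import tempfile

import torch
from torch import nn

# "Training" device: the GPU when present, otherwise the CPU.
train_device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
trained = nn.Linear(4, 2).to(train_device)  # stand-in for the trained network

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model_demo.pth")  # hypothetical file name
    torch.save(trained.state_dict(), path)

    # Remap every stored tensor onto the CPU while loading.
    state = torch.load(path, map_location=torch.device("cpu"))

cpu_model = nn.Linear(4, 2)
cpu_model.load_state_dict(state)
cpu_model.eval()

loaded_devices = {t.device.type for t in state.values()}
print(loaded_devices)
```

Without `map_location`, loading a checkpoint that holds CUDA tensors on a machine with no GPU raises an error.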

2.2. Train on CUDA, test on CUDA

import torch
from torch.utils.data import DataLoader
from LeNet_5 import *
import torchvision

# test

# 1.Create model
model = LeNet_5()
if torch.cuda.is_available():
    model = model.cuda()

# 2.Ready Dataset
test_dataset = torchvision.datasets.CIFAR10(root="data", train=False, transform=torchvision.transforms.ToTensor(),
                                            download=True)
# 3.Length
test_dataset_size = len(test_dataset)
print("the test dataset size is {}".format(test_dataset_size))

# 4.DataLoader
test_dataloader = DataLoader(dataset=test_dataset, batch_size=64)

# 5. Set some parameters for testing the network
total_accuracy = 0

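# test
# Load the weights once, before the loop; fall back to the CPU if CUDA is missing
map_location = None if torch.cuda.is_available() else torch.device("cpu")
model_load = torch.load("model_save/model_62500_GPU.pth", map_location=map_location)
model.load_state_dict(model_load)
model.eval()
with torch.no_grad():
    for data in test_dataloader:
        imgs, targets = data
        # add cuda
        if torch.cuda.is_available():
            imgs = imgs.cuda()
            targets = targets.cuda()
        outputs = model(imgs)
        # number of correct predictions in this batch
        accuracy = (outputs.argmax(1) == targets).sum().item()
        total_accuracy = total_accuracy + accuracy
    # fraction of the whole test set that was classified correctly
    accuracy = total_accuracy / test_dataset_size
    print("the total accuracy is {}".format(accuracy))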
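
The accuracy line in both test scripts compares the index of the highest score in each row (`argmax(1)`) with the target label and counts the matches. A small worked example with made-up scores for 4 samples and 3 classes:

```python
import torch

# Scores for 4 samples and 3 classes (made-up numbers for illustration).
outputs = torch.tensor([[0.1, 0.7, 0.2],
                        [0.8, 0.1, 0.1],
                        [0.2, 0.3, 0.5],
                        [0.6, 0.3, 0.1]])
targets = torch.tensor([1, 0, 1, 0])

preds = outputs.argmax(1)                    # predicted classes: 1, 0, 2, 0
correct = (preds == targets).sum().item()    # 3 of the 4 predictions match
accuracy = correct / len(targets)
print(correct, accuracy)                     # 3 0.75
```

Summing `correct` over all batches and dividing by the dataset size gives the total accuracy printed by the test code.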

For the results, refer to the previous chapters; they are not repeated here.

Previous chapter: 19. Getting started with PyTorch: the complete model training routine (cleaned-up code)

To be continued…