TensorRT in Practice (2): Building a Simple VGG16 Network with the TRT Python API

This article walks through migrating a model from PyTorch to TensorRT, using VGG16 as the example. It covers the key steps of reading weights, building the network, and mixed precision, explains the API correspondence between PyTorch and TensorRT, and provides a complete code implementation.

2020-01-01 Initial version
2020-01-10 Changed the VGG structure to torchvision.models.vgg and updated the code


1. Reading the Weights and Building the Network

Refer to the official python_samples documentation shipped with TRT. Note that those samples target TRT 6.0, while TRT has since been updated to 7.0; the Release Notes show, however, that the API did not change between TRT 6.0 and TRT 7.0, so there is no need to worry about the version gap. Also, since you would normally have to download the entire TRT package just to see the Python API docs in these samples, the link here points to my own repository instead. NVIDIA also provides C++ API documentation on GitHub, see "Building a Simple MNIST Network Layer by Layer", but this article builds the network with the Python API, so the C++ API will not be discussed further.

1.1 Analyzing the Source Code

python_samples/network_api_pytorch_mnist contains README.md, model.py, sample.py, and requirements.txt. Clearly, the two files to study are model.py and sample.py: model.py builds the MNIST network with PyTorch, while sample.py builds it with the TRT API. The former contains both the training and testing procedures, whereas the latter only runs inference, which is why it omits the F.log_softmax operation. Below are the core snippets I have excerpted; anyone familiar with the two frameworks will understand them at a glance:

model.py

import torch.nn as nn
import torch.nn.functional as F

# Network definition
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 20, kernel_size=5)
        self.conv2 = nn.Conv2d(20, 50, kernel_size=5)
        self.conv2_drop = nn.Dropout2d()
        self.fc1 = nn.Linear(800, 500)
        self.fc2 = nn.Linear(500, 10)

    def forward(self, x):
        x = F.max_pool2d(self.conv1(x), kernel_size=2, stride=2)
        x = F.max_pool2d(self.conv2(x), kernel_size=2, stride=2)
        x = x.view(-1, 800)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)
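
An aside on why sample.py can safely drop F.log_softmax at inference time: log-softmax is a monotonic transform, so the argmax over the raw fc2 logits equals the argmax over the log-probabilities. A minimal NumPy sketch, with made-up logits, illustrates this:

import numpy as np

def log_softmax(x):
    # Numerically stable log-softmax over the last axis.
    x = x - x.max(axis=-1, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=-1, keepdims=True))

logits = np.array([[1.0, 3.0, 0.5, 2.0]])   # hypothetical fc2 output
print(logits.argmax(axis=1))                # [1]
print(log_softmax(logits).argmax(axis=1))   # [1] -- same predicted class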

sample.py

import tensorrt as trt

# ModelData (input/output names, shape, dtype) is defined elsewhere in sample.py.
def populate_network(network, weights):
    # Configure the network layers based on the weights provided.
    # Mark the network input
    input_tensor = network.add_input(name=ModelData.INPUT_NAME, dtype=ModelData.DTYPE, shape=ModelData.INPUT_SHAPE)

    # Corresponds to PyTorch's self.conv1
    conv1_w = weights['conv1.weight'].numpy()
    conv1_b = weights['conv1.bias'].numpy()
    conv1 = network.add_convolution(input=input_tensor, num_output_maps=20, kernel_shape=(5, 5), kernel=conv1_w, bias=conv1_b)
    conv1.stride = (1, 1)

    # Corresponds to PyTorch's F.max_pool2d
    pool1 = network.add_pooling(input=conv1.get_output(0), type=trt.PoolingType.MAX, window_size=(2, 2))
    pool1.stride = (2, 2)

    # Corresponds to PyTorch's self.conv2
    conv2_w = weights['conv2.weight'].numpy()
    conv2_b = weights['conv2.bias'].numpy()
    conv2 = network.add_convolution(pool1.get_output(0), 50, (5, 5), conv2_w, conv2_b)
    conv2.stride = (1, 1)

    # Corresponds to PyTorch's F.max_pool2d
    pool2 = network.add_pooling(conv2.get_output(0), trt.PoolingType.MAX, (2, 2))
    pool2.stride = (2, 2)

    # Corresponds to PyTorch's self.fc1 (add_fully_connected implicitly flattens CHW)
    fc1_w = weights['fc1.weight'].numpy()
    fc1_b = weights['fc1.bias'].numpy()
    fc1 = network.add_fully_connected(input=pool2.get_output(0), num_outputs=500, kernel=fc1_w, bias=fc1_b)

    # Corresponds to PyTorch's F.relu
    relu1 = network.add_activation(input=fc1.get_output(0), type=trt.ActivationType.RELU)

    # Corresponds to PyTorch's self.fc2
    fc2_w = weights['fc2.weight'].numpy()
    fc2_b = weights['fc2.bias'].numpy()
    fc2 = network.add_fully_connected(relu1.get_output(0), ModelData.OUTPUT_SIZE, fc2_w, fc2_b)

    # Name the output tensor of this layer
    fc2.get_output(0).name = ModelData.OUTPUT_NAME
    # Mark the network output
    network.mark_output(tensor=fc2.get_output(0))

In populate_network in sample.py, network serves as the output (it is populated in place rather than returned) and weights is the input, corresponding to the state_dict() of Net from model.py; note that the weights are loaded on the CPU.
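
For context, here is a minimal sketch of how populate_network might be driven end to end: load the trained PyTorch weights onto the CPU, create a builder and an empty network, populate it, and build the engine. The checkpoint path 'mnist.pth' is a hypothetical placeholder, and the builder calls follow the TRT 6/7-era Python API used by the official sample:

import torch
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

# Load the trained PyTorch weights on the CPU ('mnist.pth' is a placeholder path).
model = Net()
model.load_state_dict(torch.load('mnist.pth', map_location='cpu'))
weights = model.state_dict()

# Build the engine from the populated network (TRT 6/7-style API).
with trt.Builder(TRT_LOGGER) as builder, builder.create_network() as network:
    builder.max_workspace_size = 1 << 30  # 1 GiB of build workspace
    populate_network(network, weights)
    engine = builder.build_cuda_engine(network)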

Comparing the two side by side: in PyTorch, the input x passes through the conv1 convolution, F.max_pool2d pooling, the conv2 convolution, F.max_pool2d pooling again, view(-1, 800) flattening, the fc1 fully connected layer, a ReLU activation, the fc2 fully connected layer, and finally F.log_softmax, which outputs a probability distribution. The TRT pipeline must behave identically along the whole chain, except that TRT does no training, so log_softmax can be dropped. Note also that view(-1, 800) has no explicit TRT counterpart, because add_fully_connected implicitly flattens the CHW dimensions of its input. The correspondence can be summarized as follows; it is straightforward, if verbose:

PyTorch Operators → TRT API Operators

self.conv1 = nn.Conv2d(1, 20, kernel_size=5)
→   conv1_w = weights['conv1.weight'].numpy()
    conv1_b = weights['conv1.bias'].numpy()
    conv1 = network.add_convolution(input=input_tensor, num_output_maps=20, kernel_shape=(5, 5), kernel=conv1_w, bias=conv1_b)
    conv1.stride = (1, 1)

F.max_pool2d(self.conv1(x), kernel_size=2, stride=2)
→   pool1 = network.add_pooling(input=conv1.get_output(0), type=trt.PoolingType.MAX, window_size=(2, 2))
    pool1.stride = (2, 2)

self.conv2 = nn.Conv2d(20, 50, kernel_size=5)
→   conv2_w = weights['conv2.weight'].numpy()
    conv2_b = weights['conv2.bias'].numpy()
    conv2 = network.add_convolution(pool1.get_output(0), 50, (5, 5), conv2_w, conv2_b)
    conv2.stride = (1, 1)

F.max_pool2d(self.conv2(x), kernel_size=2, stride=2)
→   pool2 = network.add_pooling(conv2.get_output(0), trt.PoolingType.MAX, (2, 2))
    pool2.stride = (2, 2)

x = F.relu(self.fc1(x))
→   fc1_w = weights['fc1.weight'].numpy()
    fc1_b = weights['fc1.bias'].numpy()
    fc1 = network.add_fully_connected(input=pool2.get_output(0), num_outputs=500, kernel=fc1_w, bias=fc1_b)
    relu1 = network.add_activation(input=fc1.get_output(0), type=trt.ActivationType.RELU)

x = self.fc2(x)
→   fc2_w = weights['fc2.weight'].numpy()
    fc2_b = weights['fc2.bias'].numpy()
    fc2 = network.add_fully_connected(relu1.get_output(0), ModelData.OUTPUT_SIZE, fc2_w, fc2_b)
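
Since the goal of this article is VGG16, the same mapping extends directly. As a hedged sketch (not the official sample), here is how the first block of torchvision.models.vgg16, i.e. Conv2d(3, 64, 3, padding=1) + ReLU + Conv2d(64, 64, 3, padding=1) + ReLU + MaxPool2d(2, 2), might be written with the same TRT API. The function name populate_vgg16_block1 is hypothetical; the weight keys follow torchvision's features.N naming:

def populate_vgg16_block1(network, weights, input_tensor):
    # Conv 3x3, padding 1 -- torchvision key 'features.0'
    conv1 = network.add_convolution(input=input_tensor, num_output_maps=64,
                                    kernel_shape=(3, 3),
                                    kernel=weights['features.0.weight'].numpy(),
                                    bias=weights['features.0.bias'].numpy())
    conv1.stride = (1, 1)
    conv1.padding = (1, 1)  # matches nn.Conv2d(..., padding=1)
    relu1 = network.add_activation(input=conv1.get_output(0), type=trt.ActivationType.RELU)

    # Conv 3x3, padding 1 -- torchvision key 'features.2'
    conv2 = network.add_convolution(input=relu1.get_output(0), num_output_maps=64,
                                    kernel_shape=(3, 3),
                                    kernel=weights['features.2.weight'].numpy(),
                                    bias=weights['features.2.bias'].numpy())
    conv2.stride = (1, 1)
    conv2.padding = (1, 1)
    relu2 = network.add_activation(input=conv2.get_output(0), type=trt.ActivationType.RELU)

    # MaxPool 2x2, stride 2 -- torchvision key 'features.4'
    pool1 = network.add_pooling(input=relu2.get_output(0), type=trt.PoolingType.MAX, window_size=(2, 2))
    pool1.stride = (2, 2)
    return pool1

The remaining four VGG16 blocks follow the same pattern with larger channel counts, so the whole feature extractor reduces to repeating this conv/ReLU/pool recipe with the appropriate features.N keys.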