TensorRT Notes 11 (YOLOv5 Model Export, Compilation, and Inference)

I. yolov5-6.0 ONNX modifications

  • The ONNX exported by yolov5-6.0 has several dynamic dimensions, a messy graph, and redundant outputs, which wastes GPU memory and slows inference. The following series of modifications cleans up the ONNX.

1. The input has three dynamic dimensions; change it so only batch is dynamic

  • Modify the `export_onnx` function in `export.py`; a quick check of the resulting graph is sketched after the screenshots below.
# Before
torch.onnx.export(model, im, f, verbose=False, opset_version=opset,
                          training=torch.onnx.TrainingMode.TRAINING if train else torch.onnx.TrainingMode.EVAL,
                          do_constant_folding=not train,
                          input_names=['images'],
                          output_names=['output'],
                          dynamic_axes={'images': {0: 'batch', 2: 'height', 3: 'width'},  # shape(1,3,640,640)
                                        'output': {0: 'batch', 1: 'anchors'}  # shape(1,25200,85)
                                        } if dynamic else None)
# After
torch.onnx.export(model, im, f, verbose=False, opset_version=opset,
                          training=torch.onnx.TrainingMode.TRAINING if train else torch.onnx.TrainingMode.EVAL,
                          do_constant_folding=not train,
                          input_names=['images'],
                          output_names=['output'],
                          dynamic_axes={'images': {0: 'batch'},  # shape(1,3,640,640)
                                        'output': {0: 'batch'}  # shape(1,25200,85)
                                        } if dynamic else None)
  • ONNX input before the modification: (screenshot)
  • ONNX input after the modification: (screenshot)
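
After re-exporting, it is worth confirming that only the batch dimension is still symbolic. A minimal check with the onnx package (the file name yolov5s.onnx is an assumption); until the next step removes them, the three extra feature-map outputs will also be listed here:

# Inspect the exported graph's input/output dimensions (assumes yolov5s.onnx).
import onnx

model = onnx.load("yolov5s.onnx")

def dims(value_info):
    # A dimension is dynamic when dim_param is set; otherwise dim_value holds the fixed size.
    return [d.dim_param or d.dim_value for d in value_info.type.tensor_type.shape.dim]

for inp in model.graph.input:
    print("input :", inp.name, dims(inp))    # expect: images ['batch', 3, 640, 640]
for out in model.graph.output:
    print("output:", out.name, dims(out))    # 'output' should show ['batch', 25200, 85]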

2. The original ONNX has four outputs; keep only `output` and remove the others

  • Modify the `forward` function of the `Detect` class in `models/yolo.py`; a quick onnxruntime check is sketched after the screenshots below.
# Before
return x if self.training else (torch.cat(z, 1), x)
# After
return x if self.training else torch.cat(z, 1)
  • Before: (screenshot)
  • After: (screenshot)
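
With only the concatenated prediction kept, the re-exported model should expose a single output. A quick onnxruntime run to confirm (a sketch; it assumes a 640x640 export saved as yolov5s.onnx):

# Run the re-exported model once and check that there is exactly one (batch, 25200, 85) output.
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("yolov5s.onnx", providers=["CPUExecutionProvider"])
print([o.name for o in sess.get_outputs()])     # expect: ['output']

dummy = np.random.rand(1, 3, 640, 640).astype(np.float32)
outputs = sess.run(None, {"images": dummy})
print(len(outputs), outputs[0].shape)           # expect: 1 (1, 25200, 85)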

3. Messy Gather and Unsqueeze nodes in the ONNX: remove the extra nodes caused by feeding traced shape values into the reshape

  • Modify the `forward` function of the `Detect` class in `models/yolo.py`
# Before: bs, ny, nx receive values from the tensor shape, so the view() arguments are traced tensors, which adds extra nodes
  bs, _, ny, nx = x[i].shape  # x(bs,255,20,20) to x(bs,3,20,20,85)
  x[i] = x[i].view(bs, self.na, self.no, ny, nx).permute(0, 1, 3, 4, 2).contiguous()
# After: use map(int, ...) to turn bs, ny, nx into plain Python constants
  bs, _, ny, nx = map(int, x[i].shape)  # x(bs,255,20,20) to x(bs,3,20,20,85)
  x[i] = x[i].view(bs, self.na, self.no, ny, nx).permute(0, 1, 3, 4, 2).contiguous()
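
Whether the extra shape-handling nodes are really gone can be verified by counting op types in the graph before and after re-exporting (a sketch; yolov5s.onnx is an assumed file name):

# Count ONNX op types; after the map(int, ...) change the Gather/Unsqueeze/Concat
# nodes feeding the Reshape ops should drop noticeably.
from collections import Counter
import onnx

model = onnx.load("yolov5s.onnx")
counts = Counter(node.op_type for node in model.graph.node)
for op in ("Gather", "Unsqueeze", "Concat", "Reshape"):
    print(op, counts.get(op, 0))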

  • Before: (screenshot)
  • After: (screenshot)

4. A non-dynamic batch in the reshape causes errors
  • Modify the `forward` function of the `Detect` class in `models/yolo.py`; a dynamic-batch sanity check is sketched after the screenshots below.
# Before
 x[i] = self.m[i](x[i])  # conv
 bs, _, ny, nx = map(int, x[i].shape)  # x(bs,255,20,20) to x(bs,3,20,20,85)
 x[i] = x[i].view(bs, self.na, self.no, ny, nx).permute(0, 1, 3, 4, 2).contiguous()

 if not self.training:  # inference
     if self.grid[i].shape[2:4] != x[i].shape[2:4] or self.onnx_dynamic:
         self.grid[i], self.anchor_grid[i] = self._make_grid(nx, ny, i)

     y = x[i].sigmoid()
     if self.inplace:
         y[..., 0:2] = (y[..., 0:2] * 2. - 0.5 + self.grid[i]) * self.stride[i]  # xy
         y[..., 2:4] = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i]  # wh
     else:  # for YOLOv5 on AWS Inferentia https://github.com/ultralytics/yolov5/pull/2953
         xy = (y[..., 0:2] * 2. - 0.5 + self.grid[i]) * self.stride[i]  # xy
         wh = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i]  # wh
         y = torch.cat((xy, wh, y[..., 4:]), -1)
     z.append(y.view(bs, -1, self.no))
 # After: keep bs = -1 so the batch dimension of the view() stays dynamic, and give the
 # flattened middle dimension explicitly as self.na * ny * nx, i.e.
 #     z.append(y.view(bs, self.na * ny * nx, self.no))
 # The complete modified block appears as the "before" code of the next step.

  • Before: (screenshot)
  • After: (screenshot)
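
With bs = -1, the exported Reshape leaves the batch dimension free, so the same ONNX should now accept different batch sizes. A quick sanity check (a sketch; it assumes onnxruntime and a file named yolov5s.onnx):

# Feed several batch sizes through the same session; each should succeed with
# an output of shape (batch, 25200, 85).
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("yolov5s.onnx", providers=["CPUExecutionProvider"])
for batch in (1, 2, 4):
    x = np.random.rand(batch, 3, 640, 640).astype(np.float32)
    (out,) = sess.run(None, {"images": x})
    print(batch, out.shape)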

5. Messy Expand nodes in the ONNX

  • Modify the `forward` function of the `Detect` class in `models/yolo.py`; a numerical comparison against the PyTorch model is sketched after the screenshots below.
# Before
z = []  # inference output
for i in range(self.nl):
 x[i] = self.m[i](x[i])  # conv
 bs, _, ny, nx = map(int, x[i].shape)  # x(bs,255,20,20) to x(bs,3,20,20,85)
 bs = -1
 x[i] = x[i].view(bs, self.na, self.no, ny, nx).permute(0, 1, 3, 4, 2).contiguous()

 if not self.training:  # inference
     if self.grid[i].shape[2:4] != x[i].shape[2:4] or self.onnx_dynamic:
         self.grid[i], self.anchor_grid[i] = self._make_grid(nx, ny, i)

     y = x[i].sigmoid()
     if self.inplace:
         y[..., 0:2] = (y[..., 0:2] * 2. - 0.5 + self.grid[i]) * self.stride[i]  # xy
         y[..., 2:4] = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i]  # wh
     else:  # for YOLOv5 on AWS Inferentia https://github.com/ultralytics/yolov5/pull/2953
         xy = (y[..., 0:2] * 2. - 0.5 + self.grid[i]) * self.stride[i]  # xy
         wh = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i]  # wh
         y = torch.cat((xy, wh, y[..., 4:]), -1)
     z.append(y.view(bs, self.na * ny * nx, self.no))
# After: the anchor_grid built by _make_grid comes from the anchors and stride tensors via expand,
# so it keeps being traced into the ONNX graph; compute anchor_grid locally from the constants
# so the expand disappears and the values become a constant in the graph.
  z = []  # inference output
  for i in range(self.nl):
      x[i] = self.m[i](x[i])  # conv
      bs, _, ny, nx = map(int, x[i].shape)  # x(bs,255,20,20) to x(bs,3,20,20,85)
      bs = -1
      x[i] = x[i].view(bs, self.na, self.no, ny, nx).permute(0, 1, 3, 4, 2).contiguous()

      if not self.training:  # inference
          if self.grid[i].shape[2:4] != x[i].shape[2:4] or self.onnx_dynamic:
              self.grid[i], self.anchor_grid[i] = self._make_grid(nx, ny, i)

          anchor_grid = (self.anchors[i].clone() * self.stride[i]).view(1, -1, 1, 1, 2)

          y = x[i].sigmoid()
          if self.inplace:
              y[..., 0:2] = (y[..., 0:2] * 2. - 0.5 + self.grid[i]) * self.stride[i]  # xy
              y[..., 2:4] = (y[..., 2:4] * 2) ** 2 * anchor_grid  # wh
          else:  # for YOLOv5 on AWS Inferentia https://github.com/ultralytics/yolov5/pull/2953
              xy = (y[..., 0:2] * 2. - 0.5 + self.grid[i]) * self.stride[i]  # xy
              wh = (y[..., 2:4] * 2) ** 2 * anchor_grid  # wh
              y = torch.cat((xy, wh, y[..., 4:]), -1)
          z.append(y.view(bs, self.na * ny * nx, self.no))

  return x if self.training else torch.cat(z, 1)
  • Before: (screenshot)
  • After: (screenshot)
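
Because these edits change Detect.forward, it is worth checking that the modified export still matches the PyTorch model numerically. A sketch of such a comparison (attempt_load and the file names yolov5s.pt / yolov5s.onnx are assumptions based on the standard yolov5-6.0 repo layout):

# Compare the modified PyTorch model against the re-exported ONNX on one random input.
import numpy as np
import torch
import onnxruntime as ort
from models.experimental import attempt_load   # standard yolov5 helper (assumed available)

model = attempt_load("yolov5s.pt", map_location="cpu")
model.eval()

x = torch.rand(1, 3, 640, 640)
with torch.no_grad():
    torch_out = model(x).numpy()   # with the modified Detect.forward this is (1, 25200, 85)

sess = ort.InferenceSession("yolov5s.onnx", providers=["CPUExecutionProvider"])
(onnx_out,) = sess.run(None, {"images": x.numpy()})

print("max abs diff:", np.abs(torch_out - onnx_out).max())   # should be on the order of 1e-5 or less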

II. Preprocessing with CUDA

III. Postprocessing with CUDA
