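1 Introduction
In deep learning, model deployment is a critical step, and converting and optimizing a model to fit different hardware and frameworks is a central part of it. TVM is a deep learning framework for model optimization and deployment that provides a rich set of interfaces for deploying models efficiently. For some frameworks and model formats, however, TVM needs to be extended or customized before it supports them well. In this post, we look at how to add support for PaddlePaddle models in NHWC format so that they integrate cleanly with TVM.
2 Preparation
To add TVM support for Paddle's NHWC models, we first need to understand how Relay operators are structured in TVM. Taking the convolution operator as an example, the definition of nn.conv2d can be found in nn.py: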
def conv2d(
    data,
    weight,
    strides=(1, 1),
    padding=(0, 0),
    dilation=(1, 1),
    groups=1,
    channels=None,
    kernel_size=None,
    data_layout="NCHW",
    kernel_layout="OIHW",
    out_layout="",
    out_dtype="",
):
    """
    Parameters
    ----------
    data : tvm.relay.Expr
        The input data to the operator.
    weight : tvm.relay.Expr
        The weight expressions.
    strides : Optional[int, Tuple[int]]
        The strides of convolution.
    padding : Optional[int, Tuple[int]]
        The padding of convolution on both sides of inputs before convolution.
    dilation : Optional[int, Tuple[int]]
        Specifies the dilation rate to be used for dilated convolution.
    groups : Optional[int]
        Number of groups for grouped convolution.
    channels : Optional[int]
        Number of output channels of this convolution.
    kernel_size : Optional[int, Tuple[int]]
        The spatial of the convolution kernel.
    data_layout : Optional[str]
        Layout of the input.
    kernel_layout : Optional[str]
        Layout of the weight.
    out_layout : Optional[str]
        Layout of the output, by default, out_layout is the same as data_layout
    out_dtype : Optional[str]
        Specifies the output data type for mixed precision conv2d.

    Returns
    -------
    result : tvm.relay.Expr
        The computed result.
    """
    pass
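To make the roles of strides, padding, dilation and kernel_size more concrete, here is a minimal sketch of the standard output-size arithmetic for one spatial axis of a convolution. The helper name conv2d_out_size is hypothetical and is not part of TVM; it only illustrates how the parameters interact.

```python
def conv2d_out_size(in_size, kernel, stride=1, pad_before=0, pad_after=0, dilation=1):
    """Output length along one spatial axis of a 2D convolution."""
    # A dilated kernel spans dilation * (kernel - 1) + 1 input elements.
    dilated_kernel = dilation * (kernel - 1) + 1
    padded = in_size + pad_before + pad_after
    return (padded - dilated_kernel) // stride + 1


# 224-wide input, 7-wide kernel, stride 2, padding 3 on each side
print(conv2d_out_size(224, 7, stride=2, pad_before=3, pad_after=3))  # 112
# A 3-wide kernel with dilation 2 covers a 5-wide window
print(conv2d_out_size(32, 3, dilation=2))  # 28
```

The same formula applies to height and width independently, whichever data_layout is used; the layout only changes where those axes sit in the tensor.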
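Most of these parameters are self-explanatory. The one less familiar attribute is kernel_layout, which describes the data layout of the convolution weights. kernel_layout has been the subject of several rounds of discussion in the TVM community: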
- [TF] Kernel Layout: HWIO vs. HWOI
- [TFLite] OHWI kernel layout for 2D convolution
- [RFC][BYOC] Arm Compute Library integration
From these discussions we can conclude that TVM uses kernel_layout "OIHW" when the data layout is "NCHW", and kernel_layout "HWIO" when the data layout is "NHWC".
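Layout strings such as "OIHW" and "HWIO" simply name the order of the weight axes, so the axis permutation between two layouts can be derived mechanically. The sketch below uses only plain Python; the helper names layout_permutation and permute_shape are hypothetical and do not exist in TVM. The resulting tuple has the form that numpy.transpose accepts.

```python
def layout_permutation(src, dst):
    """Axes tuple that reorders a tensor from layout `src` to layout `dst`."""
    if sorted(src) != sorted(dst) or len(set(src)) != len(src):
        raise ValueError(f"cannot convert layout {src!r} to {dst!r}")
    # For each axis of the destination layout, look up where it sits in the source.
    return tuple(src.index(axis) for axis in dst)


def permute_shape(shape, axes):
    """Shape of a tensor after its axes are reordered by `axes`."""
    return tuple(shape[a] for a in axes)


axes = layout_permutation("OIHW", "HWIO")
print(axes)  # (2, 3, 1, 0)
print(permute_shape((64, 3, 7, 7), axes))  # (7, 7, 3, 64)
print(layout_permutation("OHWI", "HWIO"))  # (1, 2, 3, 0)
```

A weight tensor with 64 output channels, 3 input channels and a 7x7 kernel therefore changes shape from (64, 3, 7, 7) in "OIHW" to (7, 7, 3, 64) in "HWIO".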
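3 Porting the conv2d operator
Before porting the convolution operator, note that Paddle stores convolution weights in "OIHW" layout. To port the operator, we therefore have to transpose the weights first, converting kernel_layout from "OIHW" to "HWIO". The TFLite frontend is a useful reference for how this is handled; the core code in tflite.py looks like this: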
def convert_conv(self, op, conv_type):
    """convolution implementation."""
    ......
    params["kernel_layout"] = "HWIO" if input_c == 1 else "HWOI"
    # TFLite kernel layout:
    # convolution:
    # OC KH KW IC, we require KH KW IC OC (HWIO)
    # depthwise convolution:
    # 1 KH KW C(input_c * depth_multiplier), we require
    # KH KW IC M (depth_multiplier) (HWOI)
    if is_depthwise_conv:
        weight_value = weight_value.reshape(kernel_h, kernel_w, input_c, depth_multiplier)
    else:
        weight_value = weight_value.transpose((1, 2, 3, 0))
    weight_expr = self.exp_tab.new_const(
        weight_value, dtype=weight_tensor_type_str, source_name=weight_tensor.tensor.Name()
    )
    ......
As we can see, the frontend first decides which kernel_layout applies and then rearranges the weight data to match it. We can add an implementation with the same logic to paddlepaddle.py:
def convert_conv2d(g, op, block):
    """Operator converter for conv2d."""
    dilations = op.attr("dilations")
    groups = op.attr("groups")
    paddings = op.attr("paddings")
    padding_algorithm = op.attr("padding_algorithm")
    strides = op.attr("strides")
    kernel = g.get_node(op.input("Filter")[0])
    kernel_layout = "OIHW"
    input_x = g.get_node(op.input("Input")[0])
    data_layout = op.attr("data_format")
    # The shape is read while the weight is still in Paddle's "OIHW" layout.
    out_channels, _, k_h, k_w = infer_shape(kernel)
    if padding_algorithm == "VALID":
        paddings = [0, 0]
    elif padding_algorithm == "SAME":
        # Handle history issue of PaddlePaddle
        # while padding_algorithm == "SAME"
        # dilations will be set to [1, 1]
        dilations = [1, 1]
        input_x = autopad(input_x, strides, [k_h, k_w], dilations)
        paddings = [0, 0]
    elif padding_algorithm == "EXPLICIT":
        # Reorder to TVM's (top, left, bottom, right) padding order.
        if len(paddings) == 2:
            paddings = [paddings[0], paddings[1], paddings[0], paddings[1]]
        elif len(paddings) == 4:
            paddings = [paddings[0], paddings[2], paddings[1], paddings[3]]
    else:
        msg = f'Value {padding_algorithm} in attribute "padding" of operator Conv is not "valid."'
        raise tvm.error.OpAttributeInvalid(msg)
    if data_layout == "NHWC":
        kernel_layout = "HWIO"
        # PaddlePaddle weight layout is "OIHW"; TVM needs "HWIO" when the op data_format is "NHWC".
        kernel_data = g.get_params(op.input("Filter")[0])
        kernel_data = kernel_data.asnumpy()
        kernel_data = kernel_data.transpose((2, 3, 1, 0))
        kernel_data = _nd.array(kernel_data)
        g.modify_node(op.input("Filter")[0], kernel_data)
        kernel = g.get_node(op.input("Filter")[0])
    out = _op.nn.conv2d(
        input_x,
        kernel,
        strides=strides,
        padding=paddings,
        dilation=dilations,
        groups=groups,
        channels=out_channels,
        kernel_size=[k_h, k_w],
        data_layout=data_layout,
        kernel_layout=kernel_layout,
    )
    g.add_node(op.output("Output")[0], out)
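The "EXPLICIT" branch of the converter reorders the padding list before handing it to conv2d. The sketch below isolates that step in a standalone function. It assumes that a 2-element Paddle padding means [pad_h, pad_w] and a 4-element one means [top, bottom, left, right], which is what the index shuffling in the converter implies. The helper name normalize_explicit_paddings is hypothetical.

```python
def normalize_explicit_paddings(paddings):
    """Reorder Paddle-style explicit paddings into (top, left, bottom, right)."""
    if len(paddings) == 2:
        # Assumed meaning: symmetric padding [pad_h, pad_w].
        pad_h, pad_w = paddings
        return [pad_h, pad_w, pad_h, pad_w]
    if len(paddings) == 4:
        # Assumed meaning: [top, bottom, left, right].
        top, bottom, left, right = paddings
        return [top, left, bottom, right]
    raise ValueError(f"expected 2 or 4 padding values, got {len(paddings)}")


print(normalize_explicit_paddings([1, 2]))        # [1, 2, 1, 2]
print(normalize_explicit_paddings([1, 2, 3, 4]))  # [1, 3, 2, 4]
```

Unlike the converter, which passes any other length through unchanged, this sketch raises an error for unexpected lengths; that is a design choice made for the illustration only.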
4 Porting the batch_norm operator
Add the following code to paddlepaddle.py:
def convert_batch_norm(g, op, block):
    """Operator converter for batch_norm."""
    ipt_name = op.input("X")[0]
    scale_name = op.input("Scale")[0]
    bias_name = op.input("Bias")[0]
    mean_name = op.input("Mean")[0]
    variance_name = op.input("Variance")[0]
    epsilon = op.attr("epsilon")
    data_layout = op.attr("data_layout")
    # batch_norm normalizes along the channel axis, whose position depends on the layout.
    if data_layout == "NCHW":
        axis = 1
    elif data_layout == "NHWC":
        axis = 3
    else:
        msg = f'Value {data_layout} in attribute "data_layout" of operator batch_norm is not valid.'
        raise tvm.error.OpAttributeInvalid(msg)
    out = _op.nn.batch_norm(
        g.get_node(ipt_name),  # data
        g.get_node(scale_name),  # gamma
        g.get_node(bias_name),  # beta
        g.get_node(mean_name),  # moving_mean
        g.get_node(variance_name),  # moving_var
        axis=axis,
        epsilon=epsilon,
    )
    # batch_norm returns a tuple; the first element is the normalized output.
    g.add_node(op.output("Y")[0], out[0])
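The axis values 1 and 3 used in the converter are simply the position of the channel dimension "C" in the layout string. A minimal sketch of that lookup, written as a hypothetical helper channel_axis that is not part of TVM or Paddle:

```python
def channel_axis(data_layout):
    """Index of the channel dimension 'C' in a layout string such as 'NHWC'."""
    if data_layout.count("C") != 1:
        raise ValueError(f"layout {data_layout!r} must contain exactly one 'C'")
    return data_layout.index("C")


print(channel_axis("NCHW"))  # 1
print(channel_axis("NHWC"))  # 3
```

The converter itself accepts only the two layouts it names and rejects everything else, which is the more conservative choice for a frontend.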
5 Adding unit tests
Paddle Inference currently has problems running inference on Paddle models in NHWC data format (see Paddle Issues 61234), so unit tests for this part are not added for now.
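Even without an end-to-end test, the weight transposition can be sanity-checked independently of both frameworks. The sketch below is not a TVM or Paddle test: it is a plain-Python illustration that stores tensors as dicts keyed by index tuples, runs a naive stride-1, unpadded convolution in both layouts, and checks that the results agree once the kernel has been transposed with (2, 3, 1, 0). All function names are hypothetical, and integer data is used so that the comparison is exact.

```python
import itertools
import random


def transpose_keys(tensor, axes):
    """Re-key a dict-based tensor the way numpy.transpose(axes) reorders axes."""
    return {tuple(idx[a] for a in axes): v for idx, v in tensor.items()}


def random_tensor(shape, rng):
    """Dict mapping every index tuple of `shape` to a small random integer."""
    return {idx: rng.randint(-3, 3) for idx in itertools.product(*map(range, shape))}


def conv2d_nchw_oihw(data, weight, n, c, h, w, o, kh, kw):
    """Naive stride-1, unpadded convolution: NCHW data, OIHW weights."""
    out = {}
    for b, oc, y, x in itertools.product(range(n), range(o), range(h - kh + 1), range(w - kw + 1)):
        out[(b, oc, y, x)] = sum(
            data[(b, ic, y + dy, x + dx)] * weight[(oc, ic, dy, dx)]
            for ic, dy, dx in itertools.product(range(c), range(kh), range(kw))
        )
    return out


def conv2d_nhwc_hwio(data, weight, n, c, h, w, o, kh, kw):
    """Naive stride-1, unpadded convolution: NHWC data, HWIO weights."""
    out = {}
    for b, y, x, oc in itertools.product(range(n), range(h - kh + 1), range(w - kw + 1), range(o)):
        out[(b, y, x, oc)] = sum(
            data[(b, y + dy, x + dx, ic)] * weight[(dy, dx, ic, oc)]
            for dy, dx, ic in itertools.product(range(kh), range(kw), range(c))
        )
    return out


rng = random.Random(0)
n, c, h, w, o, kh, kw = 1, 3, 5, 5, 4, 3, 3
data_nchw = random_tensor((n, c, h, w), rng)
weight_oihw = random_tensor((o, c, kh, kw), rng)

ref = conv2d_nchw_oihw(data_nchw, weight_oihw, n, c, h, w, o, kh, kw)
data_nhwc = transpose_keys(data_nchw, (0, 2, 3, 1))      # NCHW -> NHWC
weight_hwio = transpose_keys(weight_oihw, (2, 3, 1, 0))  # OIHW -> HWIO
out = conv2d_nhwc_hwio(data_nhwc, weight_hwio, n, c, h, w, o, kh, kw)

# The NCHW result, moved to NHWC, should equal the NHWC result exactly.
layouts_agree = transpose_keys(ref, (0, 2, 3, 1)) == out
print(layouts_agree)  # True
```

A check of this kind only covers the layout arithmetic; it says nothing about padding, groups, or the behavior of Paddle Inference itself.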
6 References
- [Frontend][PaddlePaddle] Support conv2d when data_format is NHWC
- [Frontend][PaddlePaddle] Fixed the bug that prevented the model from being successfully converted to microTVM on MacOS
- 卷积的权重(Weight/Kernel/Filter)数据格式采用HWIO/OHWI,还是其他…… (on whether convolution weights (Weight/Kernel/Filter) should use HWIO, OHWI, or another layout)
- [TF] Kernel Layout: HWIO vs. HWOI
- [TFLite] OHWI kernel layout for 2D convolution
- [RFC][BYOC] Arm Compute Library integration