[TVM] Adding Support for Paddle NHWC Models

Article link: https://blog.youkuaiyun.com/qq_37380933/article/details/136395002

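This article explains how to add support for PaddlePaddle models in NHWC format to the TVM deep learning compiler. It focuses on porting the conv2d and batch_norm operators and on converting weights between data layouts.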

1 Introduction

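Deployment is a key step in deep learning, and converting and optimizing a model for different hardware and frameworks is a central part of it. TVM is a deep learning compiler for optimizing and deploying models, and it provides a rich set of interfaces for efficient deployment. For some frameworks and model formats, however, TVM has to be extended before it supports them well. This post shows how to add support for PaddlePaddle models in NHWC format so that they can be imported into TVM.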

2 Preparation

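To add support for Paddle NHWC models in TVM, we first need to understand the structure of TVM's Relay operators. Taking convolution as an example, the definition of nn.conv2d can be found in nn.py: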

def conv2d(
    data,
    weight,
    strides=(1, 1),
    padding=(0, 0),
    dilation=(1, 1),
    groups=1,
    channels=None,
    kernel_size=None,
    data_layout="NCHW",
    kernel_layout="OIHW",
    out_layout="",
    out_dtype="",
):
    """
    Parameters
    ----------
    data : tvm.relay.Expr
        The input data to the operator.

    weight : tvm.relay.Expr
        The weight expressions.

    strides : Optional[int, Tuple[int]]
        The strides of convolution.

    padding : Optional[int, Tuple[int]]
        The padding of convolution on both sides of inputs before convolution.

    dilation : Optional[int, Tuple[int]]
        Specifies the dilation rate to be used for dilated convolution.

    groups : Optional[int]
        Number of groups for grouped convolution.

    channels : Optional[int]
        Number of output channels of this convolution.

    kernel_size : Optional[int, Tuple[int]]
        The spatial of the convolution kernel.

    data_layout : Optional[str]
        Layout of the input.

    kernel_layout : Optional[str]
        Layout of the weight.

    out_layout : Optional[str]
        Layout of the output, by default, out_layout is the same as data_layout

    out_dtype : Optional[str]
        Specifies the output data type for mixed precision conv2d.
    
    Returns
    -------
    result : tvm.relay.Expr
        The computed result.
    """
    pass

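Most of these parameters are self-explanatory. The least familiar one is kernel_layout, which specifies the layout of the convolution weights. It has been discussed several times in the TVM community.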

From those discussions we can conclude that in TVM, kernel_layout is "OIHW" when the data layout is "NCHW", and "HWIO" when the data layout is "NHWC".
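The relationship between two layout strings can be made concrete with a few lines of plain Python. The sketch below is only an illustration: layout_permutation and permute_shape are hypothetical helper names, not part of TVM or Paddle. It derives the axes tuple that moves a tensor from one layout to another, which is the tuple a transpose call needs.

```python
def layout_permutation(src: str, dst: str) -> tuple:
    """Return the axes tuple that transposes a tensor from layout `src` to `dst`.

    Hypothetical helper for illustration; it only handles layouts made of
    distinct single-letter axes such as "OIHW" or "NHWC".
    """
    if sorted(src) != sorted(dst) or len(set(src)) != len(src):
        raise ValueError(f"incompatible layouts: {src!r} -> {dst!r}")
    # Output axis i takes its data from the position of that letter in `src`.
    return tuple(src.index(axis) for axis in dst)


def permute_shape(shape, src: str, dst: str) -> tuple:
    """Shape of the tensor after moving it from layout `src` to `dst`."""
    return tuple(shape[i] for i in layout_permutation(src, dst))


kernel_perm = layout_permutation("OIHW", "HWIO")
data_perm = layout_permutation("NCHW", "NHWC")
hwio_shape = permute_shape((64, 3, 7, 7), "OIHW", "HWIO")

print(kernel_perm)  # (2, 3, 1, 0)
print(data_perm)    # (0, 2, 3, 1)
print(hwio_shape)   # (7, 7, 3, 64)
```

For example, a weight tensor with 64 output channels, 3 input channels and a 7x7 kernel has shape (64, 3, 7, 7) in "OIHW" and shape (7, 7, 3, 64) in "HWIO".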

3 Porting the conv2d operator

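Before porting the convolution operator, note that Paddle stores convolution weights in "OIHW" layout. To support NHWC we therefore have to transpose the weights from "OIHW" to "HWIO". The TFLite frontend is a useful reference for how to handle this; the core code in tflite.py is: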

def convert_conv(self, op, conv_type):
    """convolution implementation."""
    ......
    params["kernel_layout"] = "HWIO" if input_c == 1 else "HWOI"
    # TFLite kernel layout:
    # convolution:
    # OC KH KW IC, we require KH KW IC OC (HWIO)
    # depthwise convolution:
    # 1 KH KW C(input_c * depth_multiplier), we require
    # KH KW IC M (depth_multiplier) (HWOI)
    if is_depthwise_conv:
        weight_value = weight_value.reshape(kernel_h, kernel_w, input_c, depth_multiplier)
    else:
        weight_value = weight_value.transpose((1, 2, 3, 0))

    weight_expr = self.exp_tab.new_const(
        weight_value, dtype=weight_tensor_type_str, source_name=weight_tensor.tensor.Name()
    )
    ......

As the code shows, the frontend first decides which kernel_layout to use and then permutes the axes of the weights to match it. We can add the same logic to paddlepaddle.py:

def convert_conv2d(g, op, block):
    """Operator converter for conv2d."""

    dilations = op.attr("dilations")
    groups = op.attr("groups")
    paddings = op.attr("paddings")
    padding_algorithm = op.attr("padding_algorithm")
    strides = op.attr("strides")

    kernel = g.get_node(op.input("Filter")[0])
    kernel_layout = "OIHW"
    input_x = g.get_node(op.input("Input")[0])
    data_layout = op.attr("data_format")
    out_channels, _, k_h, k_w = infer_shape(kernel)
    if padding_algorithm == "VALID":
        paddings = [0, 0]
    elif padding_algorithm == "SAME":
        # Handle history issue of PaddlePaddle
        # while padding_algorithm == "SAME"
        # dilations will be set to [1, 1]
        dilations = [1, 1]
        input_x = autopad(input_x, strides, [k_h, k_w], dilations)
        paddings = [0, 0]
    elif padding_algorithm == "EXPLICIT":
        if len(paddings) == 2:
            paddings = [paddings[0], paddings[1], paddings[0], paddings[1]]
        elif len(paddings) == 4:
            paddings = [paddings[0], paddings[2], paddings[1], paddings[3]]
    else:
        msg = f'Value {padding_algorithm} in attribute "padding_algorithm" of operator conv2d is not valid.'
        raise tvm.error.OpAttributeInvalid(msg)

    if data_layout == "NHWC":
        kernel_layout = "HWIO"
        # PaddlePaddle weight layout is "OIHW"; TVM needs "HWIO" when the op's data_format is "NHWC".
        kernel_data = g.get_params(op.input("Filter")[0])
        kernel_data = kernel_data.asnumpy()
        kernel_data = kernel_data.transpose((2, 3, 1, 0))
        kernel_data = _nd.array(kernel_data)
        g.modify_node(op.input("Filter")[0], kernel_data)
        kernel = g.get_node(op.input("Filter")[0])

    out = _op.nn.conv2d(
        input_x,
        kernel,
        strides=strides,
        padding=paddings,
        dilation=dilations,
        groups=groups,
        channels=out_channels,
        kernel_size=[k_h, k_w],
        data_layout=data_layout,
        kernel_layout=kernel_layout,
    )
    g.add_node(op.output("Output")[0], out)
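To check that the (2, 3, 1, 0) transpose used in the converter above is the right one, the two layouts can be compared numerically. The following is a minimal NumPy sketch, not TVM code: conv2d_nchw and conv2d_nhwc are hypothetical reference implementations written for this check, and they assume stride 1, no padding, no dilation and groups=1.

```python
import numpy as np


def conv2d_nchw(x, w):
    """Reference conv: x is NCHW, w is OIHW (stride 1, no padding, groups=1)."""
    n, c, h, wd = x.shape
    o, i, kh, kw = w.shape
    oh, ow = h - kh + 1, wd - kw + 1
    out = np.zeros((n, o, oh, ow), dtype=x.dtype)
    for r in range(kh):
        for s in range(kw):
            patch = x[:, :, r:r + oh, s:s + ow]          # (N, C, OH, OW)
            out += np.einsum("nchw,oc->nohw", patch, w[:, :, r, s])
    return out


def conv2d_nhwc(x, w):
    """Reference conv: x is NHWC, w is HWIO (stride 1, no padding, groups=1)."""
    n, h, wd, c = x.shape
    kh, kw, i, o = w.shape
    oh, ow = h - kh + 1, wd - kw + 1
    out = np.zeros((n, oh, ow, o), dtype=x.dtype)
    for r in range(kh):
        for s in range(kw):
            patch = x[:, r:r + oh, s:s + ow, :]          # (N, OH, OW, C)
            out += np.einsum("nhwc,co->nhwo", patch, w[r, s])
    return out


rng = np.random.default_rng(0)
x_nchw = rng.standard_normal((2, 3, 5, 5))
w_oihw = rng.standard_normal((4, 3, 3, 3))

y_nchw = conv2d_nchw(x_nchw, w_oihw)

# Same permutations as in the converter: data NCHW -> NHWC, weights OIHW -> HWIO.
x_nhwc = x_nchw.transpose((0, 2, 3, 1))
w_hwio = w_oihw.transpose((2, 3, 1, 0))
y_nhwc = conv2d_nhwc(x_nhwc, w_hwio)

# Bring the NHWC result back to NCHW before comparing.
layouts_agree = np.allclose(y_nchw, y_nhwc.transpose((0, 3, 1, 2)))
print(layouts_agree)
```

If a different permutation were used for the weights, the shapes would either fail to line up or the comparison would report a mismatch, which makes this a cheap sanity check before testing a full model.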

4 Porting the batch_norm operator

Add the following code to paddlepaddle.py:

def convert_batch_norm(g, op, block):
    """Operator converter for batch_norm."""

    ipt_name = op.input("X")[0]
    scale_name = op.input("Scale")[0]
    bias_name = op.input("Bias")[0]
    mean_name = op.input("Mean")[0]
    variance_name = op.input("Variance")[0]
    epsilon = op.attr("epsilon")
    data_layout = op.attr("data_layout")

    if data_layout == "NCHW":
        axis = 1
    elif data_layout == "NHWC":
        axis = 3
    else:
        msg = f'Value {data_layout} in attribute "data_layout" of operator batch_norm is not valid.'
        raise tvm.error.OpAttributeInvalid(msg)

    out = _op.nn.batch_norm(
        g.get_node(ipt_name),  # data
        g.get_node(scale_name),  # gamma
        g.get_node(bias_name),  # beta
        g.get_node(mean_name),  # moving_mean
        g.get_node(variance_name),  # moving_var
        axis=axis,
        epsilon=epsilon,
    )
    g.add_node(op.output("Y")[0], out[0])
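The only layout-dependent part of the batch_norm converter is the axis argument, which selects the channel dimension: 1 for "NCHW" and 3 for "NHWC". The NumPy sketch below illustrates why that is sufficient. batch_norm_infer is a hypothetical reference function for inference-mode batch normalization, written for this illustration and not taken from TVM or Paddle.

```python
import numpy as np


def batch_norm_infer(x, gamma, beta, mean, var, axis, eps=1e-5):
    """Inference-mode batch norm along `axis` (hypothetical reference helper)."""
    # Reshape the per-channel parameters so they broadcast along `axis` only.
    shape = [1] * x.ndim
    shape[axis] = x.shape[axis]
    gamma, beta, mean, var = (p.reshape(shape) for p in (gamma, beta, mean, var))
    return (x - mean) / np.sqrt(var + eps) * gamma + beta


rng = np.random.default_rng(0)
channels = 3
x_nchw = rng.standard_normal((2, channels, 4, 5))
gamma = rng.standard_normal(channels)
beta = rng.standard_normal(channels)
mean = rng.standard_normal(channels)
var = rng.random(channels) + 0.5  # keep the variance strictly positive

# NCHW: channels live on axis 1.
y_nchw = batch_norm_infer(x_nchw, gamma, beta, mean, var, axis=1)

# NHWC: same data, channels moved to axis 3; the parameters are unchanged.
x_nhwc = x_nchw.transpose((0, 2, 3, 1))
y_nhwc = batch_norm_infer(x_nhwc, gamma, beta, mean, var, axis=3)

bn_layouts_agree = np.allclose(y_nchw, y_nhwc.transpose((0, 3, 1, 2)))
print(bn_layouts_agree)
```

Unlike the convolution weights, the scale, bias, mean and variance are one-dimensional per-channel vectors, so they need no transposition when the data layout changes.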