[TensorRT] ERROR: xxx/Conv2D: kernel weights has count 1728 but was expected


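A note on an error I hit while deploying a model with TensorRT, written down in case future me loses it to old age and assorted ailments. Likes are hard to come by these days.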


The error

As follows:

[TensorRT] ERROR: lanenet_model/vgg_frontend/vgg16_encode_module/conv1_1/conv/Conv2D: kernel weights has count 1728 but 294912 was expected
[TensorRT] ERROR: lanenet_model/vgg_frontend/vgg16_encode_module/conv1_1/conv/Conv2D: count of 1728 weights in kernel, but kernel dimensions (3,3) with 512 input channels, 64 output channels and 1 groups were specified. Expected Weights count is 512 * 3*3 * 64 / 1 = 294912
[TensorRT] ERROR: lanenet_model/vgg_frontend/vgg16_encode_module/conv1_1/conv/Conv2D: kernel weights has count 1728 but 294912 was expected
[TensorRT] ERROR: lanenet_model/vgg_frontend/vgg16_encode_module/conv1_1/conv/Conv2D: count of 1728 weights in kernel, but kernel dimensions (3,3) with 512 input channels, 64 output channels and 1 groups were specified. Expected Weights count is 512 * 3*3 * 64 / 1 = 294912
[TensorRT] ERROR: Layer lanenet_model/vgg_frontend/vgg16_encode_module/conv1_1/conv/Conv2D failed validation
[TensorRT] ERROR: Network validation failed.
Completed creating Engine
Traceback (most recent call last):
  File "lanenet_to_tensorRT.py", line 119, in <module>
    main()
  File "lanenet_to_tensorRT.py", line 88, in main
    with get_engine(onnx_file_path, engine_file_path) as engine, engine.create_execution_context() as context:
  File "lanenet_to_tensorRT.py", line 58, in get_engine
    return build_engine()
  File "lanenet_to_tensorRT.py", line 49, in build_engine
    f.write(engine.serialize())
AttributeError: 'NoneType' object has no attribute 'serialize'
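
The AttributeError at the end is only a follow-on failure: network validation failed, so the engine handed to serialize() was None. Below is a minimal sketch of a guard that surfaces the real problem earlier. The helper name `save_engine` is my own invention for illustration (it is not part of lanenet_to_tensorRT.py or of the TensorRT API); it only assumes the engine object exposes serialize(), as the traceback shows.

```python
def save_engine(engine, engine_file_path):
    """Write a serialized engine to disk, failing loudly if the build failed.

    `save_engine` is a hypothetical helper for illustration. `engine` is
    whatever the builder returned; the traceback shows it is None when
    network validation fails.
    """
    if engine is None:
        raise RuntimeError(
            "Engine build failed; read the [TensorRT] ERROR lines "
            "before trying to serialize."
        )
    with open(engine_file_path, "wb") as f:
        f.write(engine.serialize())
    return engine_file_path
```

With a guard like this the script would stop with a message pointing back at the validation errors instead of the unrelated-looking 'NoneType' error.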

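My guess at the cause

TensorRT needs the input data layout changed from "NHWC" to "NCHW", and when the operator is converted it seems to mistake the W dimension for the corresponding C dimension. That would explain why the log asks for 512 input channels on the very first convolution.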
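
The numbers in the log fit this guess. Here is a quick check in plain Python; `conv_weight_count` is a helper name I made up for illustration, and the formula is the one printed in the error message. It assumes the image has 3 channels, as in the (1,408,408,3) example, and that 512 is a spatial dimension of the input that got read as the channel count.

```python
def conv_weight_count(in_channels, kernel_h, kernel_w, out_channels, groups=1):
    """Weight count formula quoted in the TensorRT error message."""
    return in_channels * kernel_h * kernel_w * out_channels // groups


# conv1_1 on a 3-channel image: 3x3 kernel, 64 output channels.
actual = conv_weight_count(3, 3, 3, 64)
# What TensorRT expected after reading 512 as the channel count.
expected_by_trt = conv_weight_count(512, 3, 3, 64)

print(actual)           # 1728
print(expected_by_trt)  # 294912
```

So the kernel itself is fine (1728 weights is correct for 3 input channels); it is the inferred input channel count that is wrong.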


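Solution

  1. Before the input reaches TensorFlow's conv2d op, insert a tf.reshape() to break the link between the conv2d and the input layout. Concretely, if the input is input_tensor with shape (1,408,408,3), add tf.reshape(input_tensor, (1,408,408,3)). This wedges a tf.reshape op between the input and the first conv2d and cuts their direct connection. In one sentence:
    Add a tf.reshape node that does not change the tensor shape between the entry placeholder and the first conv2d, to isolate the entry node from the first conv2d node.

  2. It is also best to change every tf.shape().as_list() call to np.shape().as_list(), although in my tests this made little difference O(∩_∩)O haha.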


Some readers asked about this, so maybe my explanation was not clear. Here is a supplement:

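'''
Solution:
Add a tf.reshape node that does not change the tensor shape between the entry placeholder and the first conv2d, to isolate the entry node from the first conv2d node. Modify your TensorFlow model file. A model file usually takes tf.placeholder() as the model input, for example:
'''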
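# Before the change
# Model (TensorFlow 1.x; `image` is your own input image array)
import tensorflow as tf

input_tensor = tf.placeholder(tf.float32, shape=[None] + list(image.shape), name="input_tensor")
conv1_1 = tf.layers.conv2d(input_tensor, filters=8, kernel_size=3)
conv2_1 = tf.layers.conv2d(conv1_1, filters=64, kernel_size=3)
output = conv2_1

# After the change
# Model
input_tensor = tf.placeholder(tf.float32, shape=[None] + list(image.shape), name="input_tensor")
# Add a tf.reshape node that does not change the tensor shape between the entry
# placeholder and the first conv2d, to isolate the entry node from the first conv2d node.
# tf.reshape does not accept None, so the batch dimension is written as -1.
reshape_node = tf.reshape(input_tensor, shape=[-1] + list(image.shape), name="reshape_node")
conv1_1 = tf.layers.conv2d(reshape_node, filters=8, kernel_size=3)
conv2_1 = tf.layers.conv2d(conv1_1, filters=64, kernel_size=3)
output = conv2_1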

I used to assume that anyone hitting this problem was deploying lanenet, so the example below is the lanenet one.



Example

The example comes from https://github.com/MaybeShewill-CV/lanenet-lane-detection

Before adding the reshape:

@staticmethod
def conv2d(inputdata, out_channel, kernel_size, padding='SAME',
           stride=1, w_init=None, b_init=None,
           split=1, use_bias=True, data_format='NHWC', name=None):
    """
    Packing the tensorflow conv2d function.
    :param name: op name
    :param inputdata: A 4D tensorflow tensor which must have known number of channels, but can have other
    unknown dimensions.
    :param out_channel: number of output channel.
    :param kernel_size: int so only support square kernel convolution
    :param padding: 'VALID' or 'SAME'
    :param stride: int so only support square stride
    :param w_init: initializer for convolution weights
    :param b_init: initializer for bias
    :param split: split channels as used in Alexnet mainly group for GPU memory save.
    :param use_bias:  whether to use bias.
    :param data_format: default set to NHWC according tensorflow
    :return: tf.Tensor named ``output``
    """
    with tf.variable_scope(name):
        in_shape = np.shape(inputdata).as_list()
        channel_axis = 3 if data_format == 'NHWC' else 1
        in_channel = in_shape[channel_axis]
        assert in_channel is not None, "[Conv2D] Input cannot have unknown channel!"
        assert in_channel % split == 0
        assert out_channel % split == 0

        padding = padding.upper()

        if isinstance(kernel_size, list):
            filter_shape = [kernel_size[0], kernel_size[1]] + [in_channel / split, out_channel]
        else:
            filter_shape = [kernel_size, kernel_size] + [in_channel / split, out_channel]

        if isinstance(stride, list):
            strides = [1, stride[0], stride[1], 1] if data_format == 'NHWC' \
                else [1, 1, stride[0], stride[1]]
        else:
            strides = [1, stride, stride, 1] if data_format == 'NHWC' \
                else [1, 1, stride, stride]

        if w_init is None:
            w_init = tf.contrib.layers.variance_scaling_initializer()
        if b_init is None:
            b_init = tf.constant_initializer()

        w = tf.get_variable('W', filter_shape, initializer=w_init)
        b = None

        if use_bias:
            b = tf.get_variable('b', [out_channel], initializer=b_init)

        if split == 1:
            conv = tf.nn.conv2d(inputdata, w, strides, padding, data_format=data_format)
        else:
            inputs = tf.split(inputdata, split, channel_axis)
            kernels = tf.split(w, split, 3)
            outputs = [tf.nn.conv2d(i, k, strides, padding, data_format=data_format)
                       for i, k in zip(inputs, kernels)]
            conv = tf.concat(outputs, channel_axis)

        ret = tf.identity(tf.nn.bias_add(conv, b, data_format=data_format)
                          if use_bias else conv, name=name)

    return ret

After adding the reshape:

@staticmethod
def conv2d(inputdata, out_channel, kernel_size, padding='SAME',
           stride=1, w_init=None, b_init=None,
           split=1, use_bias=True, data_format='NHWC', name=None):
    """
    Packing the tensorflow conv2d function.
    :param name: op name
    :param inputdata: A 4D tensorflow tensor which must have known number of channels, but can have other
    unknown dimensions.
    :param out_channel: number of output channel.
    :param kernel_size: int so only support square kernel convolution
    :param padding: 'VALID' or 'SAME'
    :param stride: int so only support square stride
    :param w_init: initializer for convolution weights
    :param b_init: initializer for bias
    :param split: split channels as used in Alexnet mainly group for GPU memory save.
    :param use_bias:  whether to use bias.
    :param data_format: default set to NHWC according tensorflow
    :return: tf.Tensor named ``output``
    """
    with tf.variable_scope(name):
        in_shape = np.shape(inputdata).as_list()
        channel_axis = 3 if data_format == 'NHWC' else 1
        in_channel = in_shape[channel_axis]
        assert in_channel is not None, "[Conv2D] Input cannot have unknown channel!"
        assert in_channel % split == 0
        assert out_channel % split == 0

        padding = padding.upper()

        if isinstance(kernel_size, list):
            filter_shape = [kernel_size[0], kernel_size[1]] + [in_channel / split, out_channel]
        else:
            filter_shape = [kernel_size, kernel_size] + [in_channel / split, out_channel]

        if isinstance(stride, list):
            strides = [1, stride[0], stride[1], 1] if data_format == 'NHWC' \
                else [1, 1, stride[0], stride[1]]
        else:
            strides = [1, stride, stride, 1] if data_format == 'NHWC' \
                else [1, 1, stride, stride]

        if w_init is None:
            w_init = tf.contrib.layers.variance_scaling_initializer()
        if b_init is None:
            b_init = tf.constant_initializer()

        w = tf.get_variable('W', filter_shape, initializer=w_init)
        b = None

        if use_bias:
            b = tf.get_variable('b', [out_channel], initializer=b_init)

        if split == 1:
            input_shape = np.shape(inputdata)
            shaped_input = tf.reshape(inputdata, input_shape, name='reshape')
            conv = tf.nn.conv2d(shaped_input, w, strides, padding, data_format=data_format)
        else:
            input_shape = np.shape(inputdata)
            shaped_input = tf.reshape(inputdata, input_shape, name='reshape')
            inputs = tf.split(shaped_input, split, channel_axis)
            kernels = tf.split(w, split, 3)
            outputs = [tf.nn.conv2d(i, k, strides, padding, data_format=data_format)
                       for i, k in zip(inputs, kernels)]
            conv = tf.concat(outputs, channel_axis)

        ret = tf.identity(tf.nn.bias_add(conv, b, data_format=data_format)
                          if use_bias else conv, name=name)

    return ret

TensorRT speed

A forward pass that took 0.7 s in TensorFlow (pb file) dropped to 0.09 s with TensorRT, which is an impressive speedup.
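
For anyone who wants to compare timings on their own model, here is a rough sketch of a timing helper. `average_latency` and `run_once` are names I made up for illustration; `run_once` stands for any zero-argument callable that wraps one forward pass. It is not the script used to get the numbers above.

```python
import time


def average_latency(run_once, warmup=3, runs=10):
    """Average wall-clock seconds per call of `run_once`.

    `run_once` is a hypothetical zero-argument callable wrapping one
    forward pass (for example a TensorFlow session run or a TensorRT
    execution followed by copying the result back to the host).
    """
    for _ in range(warmup):  # warm-up calls are not timed
        run_once()
    start = time.perf_counter()
    for _ in range(runs):
        run_once()
    return (time.perf_counter() - start) / runs


# Speedup implied by the two timings reported above.
print(round(0.7 / 0.09, 1))  # 7.8
```

If the forward pass runs asynchronously on the GPU, `run_once` should wait for the result before returning, otherwise the measured time will be too small.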

References:

https://github.com/NVIDIA/TensorRT/issues/442
