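A Conv2D error
A note on an error I hit while deploying a model with TensorRT, written down so that I don't forget it later. Likes are hard to come by these days.
The error
It reads as follows: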
[TensorRT] ERROR: lanenet_model/vgg_frontend/vgg16_encode_module/conv1_1/conv/Conv2D: kernel weights has count 1728 but 294912 was expected
[TensorRT] ERROR: lanenet_model/vgg_frontend/vgg16_encode_module/conv1_1/conv/Conv2D: count of 1728 weights in kernel, but kernel dimensions (3,3) with 512 input channels, 64 output channels and 1 groups were specified. Expected Weights count is 512 * 3*3 * 64 / 1 = 294912
[TensorRT] ERROR: lanenet_model/vgg_frontend/vgg16_encode_module/conv1_1/conv/Conv2D: kernel weights has count 1728 but 294912 was expected
[TensorRT] ERROR: lanenet_model/vgg_frontend/vgg16_encode_module/conv1_1/conv/Conv2D: count of 1728 weights in kernel, but kernel dimensions (3,3) with 512 input channels, 64 output channels and 1 groups were specified. Expected Weights count is 512 * 3*3 * 64 / 1 = 294912
[TensorRT] ERROR: Layer lanenet_model/vgg_frontend/vgg16_encode_module/conv1_1/conv/Conv2D failed validation
[TensorRT] ERROR: Network validation failed.
Completed creating Engine
Traceback (most recent call last):
File "lanenet_to_tensorRT.py", line 119, in <module>
main()
File "lanenet_to_tensorRT.py", line 88, in main
with get_engine(onnx_file_path, engine_file_path) as engine, engine.create_execution_context() as context:
File "lanenet_to_tensorRT.py", line 58, in get_engine
return build_engine()
File "lanenet_to_tensorRT.py", line 49, in build_engine
f.write(engine.serialize())
AttributeError: 'NoneType' object has no attribute 'serialize'
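The AttributeError at the end is only a symptom. The network failed validation, so the build produced None instead of an engine, and the script then called serialize() on None. A minimal sketch of a guard that surfaces this earlier, assuming a hypothetical helper named save_engine (not part of TensorRT) and an engine object that exposes serialize() as in the traceback above:

```python
def save_engine(engine, engine_file_path):
    """Write a serialized engine to disk, failing early if the build returned None.

    `save_engine` is a hypothetical helper for illustration. `engine` is whatever
    the builder returned; it only needs a serialize() method.
    """
    if engine is None:
        # The real cause is in the [TensorRT] ERROR lines, not in serialize().
        raise RuntimeError(
            "Engine build failed (builder returned None); "
            "check the [TensorRT] ERROR lines printed above."
        )
    with open(engine_file_path, "wb") as f:
        f.write(engine.serialize())
    return engine
```

With a guard like this, the script stops with a message that points at the validation errors rather than at the serialize() call.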
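My guess at the cause
TensorRT needs the input layout changed from "NHWC" to "NCHW", and when the operator is converted, W is mistakenly passed in as the C dimension. That would explain the log: the kernel really holds 3 * 3*3 * 64 = 1728 weights (3 input channels), but TensorRT expects 512 input channels.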
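The numbers in the log can be checked by hand. The sketch below reproduces them, assuming conv1_1 is a 3x3 convolution over a 3-channel image with 64 output channels (this matches the 1728 weights TensorRT found). The input height of 256 is an assumption used only for illustration; only the value 512 appears in the log.

```python
import numpy as np

def conv_weight_count(in_channels, out_channels, kernel_h, kernel_w, groups=1):
    """Kernel weight count, following the formula printed in the TensorRT log."""
    return in_channels * kernel_h * kernel_w * out_channels // groups

# Weights actually stored in the model: C = 3 (RGB input).
actual = conv_weight_count(3, 64, 3, 3)
# What TensorRT expected: it took 512 as the input channel count.
expected_by_trt = conv_weight_count(512, 64, 3, 3)
print(actual, expected_by_trt)  # 1728 294912

# NHWC -> NCHW is a real transpose of the data, not a relabeling of axes.
nhwc = np.zeros((1, 256, 512, 3), dtype=np.float32)  # height 256 is assumed
nchw = nhwc.transpose(0, 3, 1, 2)
print(nchw.shape)  # (1, 3, 256, 512)

# Axis 2 (W) of the NHWC tensor holds 512, the value reported as "input channels".
print(nhwc.shape[2])  # 512
```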
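Solution
- Before the input reaches the TensorFlow conv2d operator, insert a tf.reshape() to decouple the convolution from the input's layout. Concretely, if the input is input_tensor with shape (1,408,408,3), add tf.reshape(input_tensor, (1,408,408,3)). This wedges a tf.reshape operator between the input and the first conv2d and breaks the direct link between them. In one sentence: add a tf.reshape node that does not change the tensor's shape between the entry placeholder and the first conv2d, to isolate the entry node from the first conv2d node.
- It is also best to change every tf.shape().as_list() call to np.shape().as_list(), although in my tests this made little difference.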
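Some readers asked about this, so my explanation may not have been clear. Here is a supplement:
'''
Solution:
Add a tf.reshape node that does not change the tensor's shape between the entry placeholder and the first conv2d, to isolate the entry node from the first conv2d node. Modify your TensorFlow model file. A model file usually takes tf.placeholder() as the model input, for example:
'''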
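# Before the change
# Model (TensorFlow 1.x; `image` is assumed to be a NumPy array of shape (H, W, C))
import tensorflow as tf
input_tensor = tf.placeholder(tf.float32, shape=[None] + list(image.shape), name="input_tensor")
conv1_1 = tf.layers.conv2d(input_tensor, filters=8, kernel_size=3)
conv2_1 = tf.layers.conv2d(conv1_1, filters=64, kernel_size=3)
output = conv2_1
# After the change
# Model
input_tensor = tf.placeholder(tf.float32, shape=[None] + list(image.shape), name="input_tensor")
# Add a shape-preserving tf.reshape node between the entry placeholder and the first conv2d,
# to isolate the entry node from the first conv2d node.
# tf.reshape does not accept None, so -1 stands in for the unknown batch dimension.
reshape_node = tf.reshape(input_tensor, shape=[-1] + list(image.shape), name="reshape_node")
conv1_1 = tf.layers.conv2d(reshape_node, filters=8, kernel_size=3)
conv2_1 = tf.layers.conv2d(conv1_1, filters=64, kernel_size=3)
output = conv2_1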
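I used to assume that anyone hitting this problem was deploying LaneNet, so the example below is taken from LaneNet.
Example
The example comes from https://github.com/MaybeShewill-CV/lanenet-lane-detection
Before adding the reshape: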
@staticmethod
def conv2d(inputdata, out_channel, kernel_size, padding='SAME',
           stride=1, w_init=None, b_init=None,
           split=1, use_bias=True, data_format='NHWC', name=None):
    """
    Packing the tensorflow conv2d function.
    :param name: op name
    :param inputdata: A 4D tensorflow tensor which must have known number of channels, but can have other
        unknown dimensions.
    :param out_channel: number of output channel.
    :param kernel_size: int so only support square kernel convolution
    :param padding: 'VALID' or 'SAME'
    :param stride: int so only support square stride
    :param w_init: initializer for convolution weights
    :param b_init: initializer for bias
    :param split: split channels as used in Alexnet mainly group for GPU memory save.
    :param use_bias: whether to use bias.
    :param data_format: default set to NHWC according tensorflow
    :return: tf.Tensor named ``output``
    """
    with tf.variable_scope(name):
        in_shape = np.shape(inputdata).as_list()
        channel_axis = 3 if data_format == 'NHWC' else 1
        in_channel = in_shape[channel_axis]
        assert in_channel is not None, "[Conv2D] Input cannot have unknown channel!"
        assert in_channel % split == 0
        assert out_channel % split == 0

        padding = padding.upper()

        if isinstance(kernel_size, list):
            filter_shape = [kernel_size[0], kernel_size[1]] + [in_channel / split, out_channel]
        else:
            filter_shape = [kernel_size, kernel_size] + [in_channel / split, out_channel]

        if isinstance(stride, list):
            strides = [1, stride[0], stride[1], 1] if data_format == 'NHWC' \
                else [1, 1, stride[0], stride[1]]
        else:
            strides = [1, stride, stride, 1] if data_format == 'NHWC' \
                else [1, 1, stride, stride]

        if w_init is None:
            w_init = tf.contrib.layers.variance_scaling_initializer()
        if b_init is None:
            b_init = tf.constant_initializer()

        w = tf.get_variable('W', filter_shape, initializer=w_init)
        b = None

        if use_bias:
            b = tf.get_variable('b', [out_channel], initializer=b_init)

        if split == 1:
            conv = tf.nn.conv2d(inputdata, w, strides, padding, data_format=data_format)
        else:
            inputs = tf.split(inputdata, split, channel_axis)
            kernels = tf.split(w, split, 3)
            outputs = [tf.nn.conv2d(i, k, strides, padding, data_format=data_format)
                       for i, k in zip(inputs, kernels)]
            conv = tf.concat(outputs, channel_axis)

        ret = tf.identity(tf.nn.bias_add(conv, b, data_format=data_format)
                          if use_bias else conv, name=name)

    return ret
After adding the reshape:
@staticmethod
def conv2d(inputdata, out_channel, kernel_size, padding='SAME',
           stride=1, w_init=None, b_init=None,
           split=1, use_bias=True, data_format='NHWC', name=None):
    """
    Packing the tensorflow conv2d function.
    :param name: op name
    :param inputdata: A 4D tensorflow tensor which must have known number of channels, but can have other
        unknown dimensions.
    :param out_channel: number of output channel.
    :param kernel_size: int so only support square kernel convolution
    :param padding: 'VALID' or 'SAME'
    :param stride: int so only support square stride
    :param w_init: initializer for convolution weights
    :param b_init: initializer for bias
    :param split: split channels as used in Alexnet mainly group for GPU memory save.
    :param use_bias: whether to use bias.
    :param data_format: default set to NHWC according tensorflow
    :return: tf.Tensor named ``output``
    """
    with tf.variable_scope(name):
        in_shape = np.shape(inputdata).as_list()
        channel_axis = 3 if data_format == 'NHWC' else 1
        in_channel = in_shape[channel_axis]
        assert in_channel is not None, "[Conv2D] Input cannot have unknown channel!"
        assert in_channel % split == 0
        assert out_channel % split == 0

        padding = padding.upper()

        if isinstance(kernel_size, list):
            filter_shape = [kernel_size[0], kernel_size[1]] + [in_channel / split, out_channel]
        else:
            filter_shape = [kernel_size, kernel_size] + [in_channel / split, out_channel]

        if isinstance(stride, list):
            strides = [1, stride[0], stride[1], 1] if data_format == 'NHWC' \
                else [1, 1, stride[0], stride[1]]
        else:
            strides = [1, stride, stride, 1] if data_format == 'NHWC' \
                else [1, 1, stride, stride]

        if w_init is None:
            w_init = tf.contrib.layers.variance_scaling_initializer()
        if b_init is None:
            b_init = tf.constant_initializer()

        w = tf.get_variable('W', filter_shape, initializer=w_init)
        b = None

        if use_bias:
            b = tf.get_variable('b', [out_channel], initializer=b_init)

        if split == 1:
            # Shape-preserving reshape that isolates the conv from its input node.
            # This relies on the input's static shape being fully known.
            input_shape = np.shape(inputdata)
            shaped_input = tf.reshape(inputdata, input_shape, name='reshape')
            conv = tf.nn.conv2d(shaped_input, w, strides, padding, data_format=data_format)
        else:
            # Same shape-preserving reshape, applied before splitting into groups.
            input_shape = np.shape(inputdata)
            shaped_input = tf.reshape(inputdata, input_shape, name='reshape')
            inputs = tf.split(shaped_input, split, channel_axis)
            kernels = tf.split(w, split, 3)
            outputs = [tf.nn.conv2d(i, k, strides, padding, data_format=data_format)
                       for i, k in zip(inputs, kernels)]
            conv = tf.concat(outputs, channel_axis)

        ret = tf.identity(tf.nn.bias_add(conv, b, data_format=data_format)
                          if use_bias else conv, name=name)

    return ret
TensorRT speed
TensorRT brought a forward pass that took 0.7 s in TensorFlow (pb file) down to 0.09 s, roughly 7.8x faster. Quite impressive.