First, a rant: the quantization tooling in TensorFlow Lite (1.13) is really unfriendly. Not only are the related papers dense and hard to follow, but inexplicable problems keep popping up in practice. Chasing these bugs down took a lot of head-scratching, but I finally managed to quantize the model before losing all my hair.
Anyway, back to the topic.
1. My goal with TensorFlow Lite is to quantize a CNN model from float32 down to uint8, which in theory shrinks the model to roughly 1/4 of its original size. For simplicity I used quantization aware training (see: https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/speech_commands). I first trained a 3-layer CNN in which every layer is conv2d + relu + dropout. Once training finished, the usual routine -- freeze the checkpoint, then convert with TOCO -- quantized the model successfully.
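The "roughly 1/4" claim is just byte accounting: each weight goes from 4 bytes (float32) to 1 byte (uint8). A quick sketch with hypothetical filter shapes (the real counts come from `filter_list` in the training script, so these numbers are illustrative only):

```python
# Hypothetical shapes for a 3-layer conv2d stack:
# (filter_height, filter_width, in_channels, out_channels).
# The real out_channels values come from filter_list.
layers = [(3, 5, 1, 64), (3, 5, 64, 64), (3, 5, 64, 64)]

weights = sum(h * w * cin * cout for h, w, cin, cout in layers)
biases = sum(cout for _, _, _, cout in layers)
params = weights + biases

size_float32 = params * 4  # 4 bytes per parameter
size_uint8 = params * 1    # 1 byte per parameter
print(size_float32 / size_uint8)  # -> 4.0
```

In practice the saving is slightly under 4x, since the quantized model also stores per-tensor scale/zero-point metadata, and biases are typically kept at higher precision.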
2. Later I changed each layer's structure to conv2d + batchnorm + relu. Training and freezing followed the same routine as before, but TOCO conversion threw a bug:
Code:
# filter_list, channels, fingerprint_4d and is_training are defined
# elsewhere in the training script (as in the speech_commands example).
filter1_height = 3
filter1_width = 5
filter1_count = filter_list[0]  # number of output channels for layer 1
weights1 = tf.Variable(tf.truncated_normal(
    [filter1_height, filter1_width, channels, filter1_count],
    stddev=0.01))
bias1 = tf.Variable(tf.zeros([filter1_count]))
# Stride 2 along the width dimension, no padding
conv1 = tf.nn.conv2d(fingerprint_4d, weights1, [1, 1, 2, 1], 'VALID') + bias1
# Batch norm inserted between the conv and the ReLU -- this is what
# trips up TOCO during quantized conversion
bn1 = tf.layers.batch_normalization(conv1, training=is_training)
relu1 = tf.nn.relu(bn1)
bug:
F tensorflow/lite/toco/tooling_util.cc:1702] Array batch_normalization/FusedBatchNorm_mul_0, which is an input to the Add operator producing the output array Relu, is lacking min/max data, which is necessary for quantization. If accuracy matters, either target a non-quan
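The error complains about missing min/max data because TF Lite's uint8 scheme maps each float tensor onto 0..255 through an affine transform derived from the tensor's observed real-valued range; without a recorded range for the batchnorm output, TOCO cannot build that mapping. A minimal sketch of the asymmetric quantization arithmetic (the function names here are my own, not TF Lite API):

```python
def quantize_params(rmin, rmax, qmin=0, qmax=255):
    """Derive scale and zero point for asymmetric uint8 quantization.

    The real range is widened to include 0.0 so that zero is exactly
    representable (needed for things like zero padding and ReLU).
    """
    rmin = min(rmin, 0.0)
    rmax = max(rmax, 0.0)
    scale = (rmax - rmin) / (qmax - qmin)
    zero_point = int(round(qmin - rmin / scale))
    return scale, zero_point

def quantize(x, scale, zero_point, qmin=0, qmax=255):
    # Map a real value to its nearest uint8 code, clamped to [qmin, qmax]
    q = int(round(x / scale)) + zero_point
    return max(qmin, min(qmax, q))

def dequantize(q, scale, zero_point):
    # Recover the (approximate) real value from a uint8 code
    return scale * (q - zero_point)
```

This is why quantization aware training matters: the fake-quant nodes it inserts are what record min/max during training, and a freshly added FusedBatchNorm output that never got such a node has no range to convert with.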