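Model Compression: ncnn Quantization
ncnn
ncnn is a high-performance neural network inference framework optimized for mobile phones. It was designed from the start with mobile deployment and use in mind. It has no third-party dependencies, is cross-platform, and, according to the project, runs faster on mobile CPUs than all known open-source frameworks. With ncnn, developers can port deep learning algorithms to phones and run them efficiently, build AI apps, and bring AI to your fingertips. ncnn is currently used in many Tencent applications, such as QQ, Qzone, WeChat, and Pitu (天天P图).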
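1. Installation
- Download ncnn
$ git clone https://github.com/Tencent/ncnn
- Enter the ncnn root directory
$ cd <ncnn-root-dir>
- Note: uncomment add_subdirectory(examples) in CMakeLists.txt so that the cpp files in examples are compiled.
- Run the following commands to build ncnn: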
$ mkdir -p build
$ cd build
$ cmake ..
$ make -j4
$ make install
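This produces executables for several models under build/examples.
2. Usage
Classification test with SqueezeNet
- Copy the param and bin files that ncnn needs into build/examples/ (paths below are relative to the ncnn root directory)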
$ cp examples/squeezenet_v1.1.param build/examples/
$ cp examples/squeezenet_v1.1.bin build/examples/
- Run the classification
$ cd build/examples/
$ ./squeezenet dog.jpg
- Prediction results
258 = 0.191417
257 = 0.109412
151 = 0.060365
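Each line of the result above has the form "class index = score", listing the three highest-scoring classes. As a rough illustration of how such a listing can be produced from a score vector, here is a minimal Python sketch. The function name top_k and the sample scores are assumptions made for this example; they are not taken from ncnn's source or from real SqueezeNet output.

```python
def top_k(scores, k=3):
    """Return the k highest (index, score) pairs, best first."""
    indexed = list(enumerate(scores))
    # Sort by score, highest first; the sort is stable, so ties keep
    # the lower class index first.
    indexed.sort(key=lambda pair: pair[1], reverse=True)
    return indexed[:k]


if __name__ == "__main__":
    # Hypothetical scores for a 5-class model (illustration only).
    scores = [0.05, 0.60, 0.10, 0.20, 0.05]
    for index, score in top_k(scores):
        # Same "index = score" layout as the result lines above.
        print(f"{index} = {score:f}")
```

The class indices are only meaningful together with the label list the model was trained on.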
3. ncnn quantization: the 8-bit integer implementation
- Initial the int8 quantize inference implement
- Tool: caffe-int8-convert-tools
- 8-bit Inference with TensorRT
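Quantizing the cifar_small model
cifar_small is a small network from the DarkNet framework with 7 convolutional layers. For how to rewrite its model file cifar_small.cfg as a Caffe configuration file, see the linked reference.
- Prepare the Caffe network and model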
train.prototxt
deploy.prototxt
snapshot_10000.caffemodel
- Run the quantization script
python caffe-int8-convert-tool-dev.py --proto=cifar_small-master/cifar_small_deploy.prototxt --model=cifar_small-master/cifar_small_iter_10000.caffemodel --mean 125.3 123.0 113.9 --norm=1 --images=cifar_test_100/ --output=cifar_small.table --group=1 --gpu=0
- Quantization output:
cifar_test_100//128_dog.png forward time : 0.001 s
loop stage 2 : 54
add cost 0.003 s
normalize_distribution 168664 2048
caffe-int8-convert-tool-dev.py:193: RuntimeWarning: divide by zero encountered in true_divide
return np.sum(dist_a[nonzero_inds] * np.log(dist_a[nonzero_inds] / dist_b[nonzero_inds]))
conv1 group : 0 bin : 2034 threshold : 140.169904 interval : 0.068896 scale : 0.906043
normalize_distribution 450560 2048
conv2 group : 0 bin : 1189 threshold : 3.555792 interval : 0.002989 scale : 35.716380
normalize_distribution 225280 2048
conv3 group : 0 bin : 1545 threshold : 2.780417 interval : 0.001799 scale : 45.676601
normalize_distribution 225280 2048
conv4 group : 0 bin : 1569 threshold : 1.943042 interval : 0.001238 scale : 65.361413
normalize_distribution 112640 2048
conv5 group : 0 bin : 1588 threshold : 2.604649 interval : 0.001640 scale : 48.758978
normalize_distribution 450560 2048
conv6 group : 0 bin : 1287 threshold : 1.208493 interval : 0.000939 scale : 105.089525
normalize_distribution 225280 2048
conv7 group : 0 bin : 1175 threshold : 5.926937 interval : 0.005042 scale : 21.427592
Caffe Int8 Calibration table create success, it's cost 0:00:41.287920, best wish for your INT8 inference has a low accuracy loss...\(^▽^)/...2333...
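The RuntimeWarning in the log comes from the KL-divergence line it quotes: when a bin is nonzero in dist_a but zero in dist_b, the division hits zero and that candidate's divergence evaluates to infinity. The following pure-Python sketch computes the same sum and makes that case explicit. It assumes nonzero_inds selects the nonzero bins of dist_a, and the function name kl_divergence is chosen for this example; it is not the tool's own code.

```python
import math


def kl_divergence(dist_a, dist_b):
    """Sum of a * log(a / b) over the bins where a is nonzero."""
    total = 0.0
    for a, b in zip(dist_a, dist_b):
        if a == 0:
            # Bins with zero mass in dist_a contribute nothing.
            continue
        if b == 0:
            # a > 0 but b == 0: the divergence is infinite, which is the
            # case behind the divide-by-zero warning.
            return math.inf
        total += a * math.log(a / b)
    return total
```

An infinite divergence simply means that candidate can never be the minimum, which would explain why the calibration still completes after the warning.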
This generates the cifar_small.table file.
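The per-layer numbers in the log are consistent with two simple relations: threshold ≈ (bin + 0.5) × interval, and scale = 127 / threshold. For conv1, 127 / 140.169904 ≈ 0.906043. The Python sketch below reproduces these relations and shows one common way such a scale is applied. The function names are hypothetical, and the round-then-clamp step is an assumption about symmetric int8 quantization in general, not a statement about ncnn's exact kernels.

```python
INT8_MAX = 127  # largest magnitude in the symmetric int8 range


def threshold_from_bin(bin_index, interval):
    """Activation threshold at the centre of the chosen histogram bin."""
    return (bin_index + 0.5) * interval


def scale_from_threshold(threshold):
    """Scale that maps [-threshold, threshold] onto [-127, 127]."""
    return INT8_MAX / threshold


def quantize(value, scale):
    """Scale, round to the nearest integer, then saturate to int8 range."""
    # Note: Python's round() rounds exact halves to the nearest even integer.
    q = round(value * scale)
    return max(-INT8_MAX, min(INT8_MAX, q))


if __name__ == "__main__":
    # conv1 values copied from the log above.
    threshold = threshold_from_bin(2034, 0.068896)
    print(threshold, scale_from_threshold(140.169904))
```

Values beyond the threshold saturate at ±127, so a smaller threshold gives finer resolution at the cost of clipping more outliers.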
- Evaluate the quantized model
Starting from squeezenet.cpp, use darknet's validate_classifier_single function to run batch testing:
// Tencent is pleased to support the open source community by making ncnn available.
//
// Copyright (C) 2017 THL A29 Limited, a Tencent company. All rights reserved.
//
// Licensed under the BSD 3-Clause License (the "License"); you may not use this file except
// in compliance with the License. You may obtain a copy of the License at
//
// https://opensource.org/licenses/BSD-3-Clause
//
// Unless required by applicable law or agreed to in writing, software distributed
// under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR
// CONDITIONS OF ANY KIND, either express or implied. See the License for the
// specific language governing permissions and limitations under the License.

#include