Pruning:
[1] Song Han, Huizi Mao, and William J. Dally. Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding. In ICLR, 2016.
Quantization:
[1] Benoit Jacob, Skirmantas Kligys, Bo Chen, Menglong Zhu, Matthew Tang, Andrew Howard, Hartwig Adam, and Dmitry Kalenichenko. Quantization and training of neural networks for efficient integer-arithmetic-only inference. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2018.
[2] Raghuraman Krishnamoorthi. Quantizing deep convolutional networks for efficient inference: A whitepaper. CoRR, abs/1806.08342, 2018.
[3] Jiaxiang Wu, Cong Leng, Yuhang Wang, Qinghao Hu, and Jian Cheng. Quantized convolutional neural networks for mobile devices. CoRR, abs/1512.06473, 2015.
[4] Aojun Zhou, Anbang Yao, Yiwen Guo, Lin Xu, and Yurong Chen. Incremental network quantization: Towards lossless CNNs with low-precision weights. CoRR, abs/1702.03044, 2017.