Standard Convolutions and Separable Convolutions: An Introduction to Separable Convolutions with Literature Review

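This article introduces the differences between standard and separable convolutions, focuses on the principles and advantages of separable convolutions, and reviews the related literature as a guide to understanding convolution operations in computer vision.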


Standard and separable convolutions

And why you would want to apply them in your machine learning projects.

By Greta Elvestuen, PhD, Data Scientist consultant at Inmeta


Throughout the last decade, convolutional neural networks (CNNs) have brought significant improvements in performance to machine learning models. Deep learning models in particular have become highly popular in computer vision, owing to their achievements and their use in a wide variety of everyday applications.

However, training such networks is very time-consuming. Large datasets are necessary to train a high-performing model, leading to training times of up to several weeks. This is especially unfortunate when you need to test how well the network performs in order to make the necessary adjustments. Even with virtual machines and today's far more powerful GPUs, training times still represent a challenge in machine learning projects.

So how are separable convolutions relevant in this context? Networks built on them are far more efficient, as they decrease computational complexity and require less memory during training. In addition, they tend to perform better than networks using standard convolutions and have a wide variety of applications. Sounds unrealistic? Well, the reason lies in the number of multiplications during the training process, which is explained in more detail later on. First, how did the concept begin?

The origins of separable convolutions

Spatial separable convolutions have a longer history than the depthwise type. They have been applied in neural networks at least since the work of Mamalet and Garcia on simplifying ConvNets for fast learning in 2012, and probably even earlier. The pioneering work on depthwise separable convolutions, on the other hand, was inspired by the research of Sifre and Mallat on transformation-invariant scattering. Thereafter, Laurent Sifre, as part of his Google Brain internship in 2013, further developed this concept and employed it in AlexNet, obtaining somewhat higher accuracy as well as a decrease in model size.

Over the following two years, Szegedy and colleagues used separable convolutions as the first layer of the well-known Inception V1, and Ioffe and Szegedy did the same in Inception V2. Jin and colleagues in 2015 and Wang and colleagues in 2016 applied separable convolutions to decrease the size and computational cost of convolutional neural networks. Abadi and colleagues (2016) implemented depthwise separable convolutions in the TensorFlow framework, which significantly facilitated further work with the concept.

Probably the best-known applications of these convolutions come from the work of Howard and colleagues, who introduced efficient mobile models under the name MobileNets, and Chollet, who applied them in the Xception architecture. Now, let's get into more detail on what this concept is all about.

Spatial separable convolutions

The first version of separable convolutions deals primarily with the spatial dimensions of an image and kernel: its height and width. It divides a kernel into two smaller kernels; the most common case is dividing a 3x3 kernel into a 3x1 and a 1x3 kernel. Hence, instead of one convolution with 9 multiplications per output position, two convolutions with 3 multiplications each are carried out (6 in total), achieving the same effect.

Fewer multiplications → lower computational complexity → faster network


Wang (2018)


One of the best-known kernels that can be separated spatially is the Sobel kernel (used for edge detection):
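
The Sobel case can be sketched in a few lines of plain Python. The helper names `outer` and `conv2d_valid` and the 4x4 test image are illustrative assumptions, not taken from the cited articles. The sketch checks that the horizontal Sobel kernel is the outer product of a 3x1 smoothing column and a 1x3 difference row, and that applying the two small kernels in sequence gives the same output as the full 3x3 kernel:

```python
def outer(col, row):
    """Outer product of a column vector and a row vector, as a 2D list."""
    return [[c * r for r in row] for c in col]


def conv2d_valid(image, kernel):
    """Sliding-window sum of products (no padding, stride 1, no kernel flip),
    which is what deep learning libraries usually call a convolution."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


sobel_x = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
column = [1, 2, 1]   # 3x1 smoothing kernel
row = [-1, 0, 1]     # 1x3 difference kernel

# The 3x3 kernel is exactly the outer product of the two 1D kernels.
assert outer(column, row) == sobel_x

# Arbitrary illustrative 4x4 image (an assumption for this sketch).
image = [[1, 2, 0, 3],
         [4, 1, 1, 0],
         [2, 5, 3, 1],
         [0, 2, 4, 6]]

full = conv2d_valid(image, sobel_x)
# Two passes: first the 3x1 column kernel, then the 1x3 row kernel.
vertical_pass = conv2d_valid(image, [[c] for c in column])
two_pass = conv2d_valid(vertical_pass, [row])
assert two_pass == full
```

Integer inputs are used so that the comparison is exact; with floating-point data the two results would agree only up to rounding error.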

In addition, a spatially separable convolution needs fewer multiplications than a standard convolution over the same image:
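
A rough way to see the saving is to count multiplications directly. The sketch below assumes a square n x n single-channel image, an m x m kernel, no padding and stride 1; the function names and the 5x5 example are illustrative choices rather than figures quoted from the source:

```python
def standard_mults(n, m):
    """Multiplications for one m x m kernel slid over an n x n image."""
    out = n - m + 1                 # output height and width
    return out * out * m * m


def spatial_separable_mults(n, m):
    """Multiplications for an m x 1 pass followed by a 1 x m pass."""
    out = n - m + 1
    first_pass = out * n * m        # m x 1 kernel: out rows, n columns
    second_pass = out * out * m     # 1 x m kernel on the (out x n) result
    return first_pass + second_pass


# Illustrative example: 5x5 image, 3x3 kernel.
assert standard_mults(5, 3) == 81
assert spatial_separable_mults(5, 3) == 72
```

For images much larger than the kernel, the ratio of the two counts approaches 2/m, so the saving grows with kernel size.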

Bai (2019)


However, despite their advantages, spatial separable convolutions are seldom applied in deep learning. The main reason is that not every kernel can be divided into two smaller ones. Replacing all standard convolutions with spatial separable ones would therefore restrict the set of kernels the training process can search over, which may lead to worse training results.
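
Whether a given kernel can be split this way is a rank question: a kernel factors into a column times a row exactly when its matrix has rank at most 1, which is equivalent to every 2x2 minor being zero. A minimal sketch of that check follows; the function name, the tolerance and the Laplacian counterexample are assumptions added for illustration:

```python
def is_spatially_separable(kernel, tol=1e-9):
    """Return True if the kernel has rank at most 1, i.e. every 2x2 minor
    vanishes, so it can be written as (column vector) x (row vector)."""
    rows, cols = len(kernel), len(kernel[0])
    for i in range(rows):
        for k in range(i + 1, rows):
            for j in range(cols):
                for l in range(j + 1, cols):
                    minor = (kernel[i][j] * kernel[k][l]
                             - kernel[i][l] * kernel[k][j])
                    if abs(minor) > tol:
                        return False
    return True


sobel_x = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
laplacian = [[0, 1, 0],
             [1, -4, 1],
             [0, 1, 0]]

assert is_spatially_separable(sobel_x)
assert not is_spatially_separable(laplacian)
```

A kernel learned freely during training will generally not pass this test, which is the restriction described above.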

Depthwise separable convolutions


The second version of separable convolutions also handles kernels that cannot be divided into two smaller ones, and is more commonly used. This is the type of separable convolution implemented in Keras and TensorFlow. In addition to the spatial dimensions, it also works with the depth dimension (the number of channels). It separates the convolution into two stages: depthwise and pointwise.
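
The two stages can be illustrated with a small forward pass in plain Python. This is a sketch under stated assumptions (no padding, stride 1, no bias, nested lists instead of tensors, hypothetical function names), not the Keras or TensorFlow implementation. The depthwise stage filters each input channel with its own spatial kernel, and the pointwise stage is a 1x1 convolution that mixes the filtered channels at every pixel:

```python
def correlate2d(channel, kernel):
    """Slide one 2D kernel over one 2D channel (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(channel) - kh + 1
    out_w = len(channel[0]) - kw + 1
    return [[sum(channel[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)] for i in range(out_h)]


def depthwise_separable_conv(image, depthwise_kernels, pointwise_weights):
    """image: list of M channels, each a 2D list.
    depthwise_kernels: M 2D kernels, one per input channel.
    pointwise_weights: N rows of M weights, one row per output channel."""
    # Stage 1 (depthwise): each channel is filtered independently.
    filtered = [correlate2d(ch, k) for ch, k in zip(image, depthwise_kernels)]
    out_h, out_w = len(filtered[0]), len(filtered[0][0])
    # Stage 2 (pointwise): a 1x1 convolution combines channels per pixel.
    return [[[sum(w * filtered[m][i][j] for m, w in enumerate(weights))
              for j in range(out_w)] for i in range(out_h)]
            for weights in pointwise_weights]


# Illustrative input: 2 channels of 3x3, 2x2 depthwise kernels, 3 outputs.
image = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]],
         [[1, 1, 1], [1, 1, 1], [1, 1, 1]]]
depthwise_kernels = [[[1, 0], [0, 1]],
                     [[1, 1], [1, 1]]]
pointwise_weights = [[1, 0], [0, 1], [1, 1]]

output = depthwise_separable_conv(image, depthwise_kernels, pointwise_weights)
assert len(output) == 3                      # three output channels
assert output[2] == [[10, 12], [16, 18]]     # sum of both filtered channels
```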

Here are a couple of examples of their advantages, as well as their incorporation in the Xception architecture:


Sifre (2014)


Experimental evaluation of MobileNets

Howard et al. (2017)


Experimental evaluation of Xception

Chollet (2017)


So, how do depthwise separable convolutions achieve such significantly better results? The answer lies in how they differ from standard (dense) convolutions:

Bai (2019)


Let us examine the number of multiplications:

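
The count can be reproduced with a short script. The dimensions used here (a 5x5 kernel, 3 input channels, 256 output channels and an 8x8 output map) are an assumed example for illustration, since the original figure is not available in this text; the function and parameter names are also hypothetical:

```python
def standard_conv_mults(kernel_size, in_channels, out_channels, out_size):
    """Each of the out_channels kernels spans all input channels."""
    return kernel_size ** 2 * in_channels * out_channels * out_size ** 2


def depthwise_separable_mults(kernel_size, in_channels, out_channels, out_size):
    """Depthwise stage (one spatial kernel per input channel) plus
    pointwise stage (1x1 kernels mixing the channels)."""
    depthwise = kernel_size ** 2 * in_channels * out_size ** 2
    pointwise = in_channels * out_channels * out_size ** 2
    return depthwise + pointwise


# Assumed example: 5x5 kernel, 3 -> 256 channels, 8x8 output feature map.
standard = standard_conv_mults(5, 3, 256, 8)
separable = depthwise_separable_mults(5, 3, 256, 8)
assert standard == 1_228_800
assert separable == 53_952
print(f"standard: {standard}, separable: {separable}, "
      f"ratio: {standard / separable:.1f}x")
```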

In case this did not seem impressive enough, let us look at an example with larger dimensions:

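
Why the saving grows with the dimensions can be seen from the general ratio of the two costs. The notation below follows the convention used in the MobileNets paper (Howard et al., 2017) and is an addition to this text: $D_K$ is the kernel width, $M$ the number of input channels, $N$ the number of output channels and $D_F$ the width of the output feature map.

```latex
\frac{\overbrace{D_K^2 \, M \, D_F^2}^{\text{depthwise}} + \overbrace{M \, N \, D_F^2}^{\text{pointwise}}}
     {\underbrace{D_K^2 \, M \, N \, D_F^2}_{\text{standard}}}
  = \frac{1}{N} + \frac{1}{D_K^2}
```

With 3x3 kernels and a large number of output channels, the ratio is therefore a little above 1/9, and larger kernels or more output channels shrink it further.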

Hence,


Fewer computations → faster network


For more information on how to implement separable convolutions in your own model, the Keras documentation covers the topic thoroughly:

Chollet (2015)

Examples of their wide application

As previously mentioned, separable convolutions have gained noticeable popularity in machine learning, and specifically in computer vision, over the last few years, with a number of cases emphasizing their advantages and ease of use. This section presents some of the most relevant examples, both to show the variety of their applications and to offer inspiration through their achievements:

ShuffleNet


In 2017, Zhang and colleagues introduced a highly computation-efficient CNN architecture named ShuffleNet, designed specifically for mobile devices with very limited computing power. This architecture uses two operations, pointwise group convolution and channel shuffle, to retain accuracy while significantly reducing computation cost. The results showed that ShuffleNet maintains comparable accuracy while achieving approximately 13 times actual speedup over AlexNet on an ARM-based mobile device.
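
The channel shuffle operation can be sketched on a plain list of channel indices. The version below follows the usual description of the operation (view the channels as a groups x n grid, transpose it, then flatten); the function name and the list-based representation are assumptions for illustration, not the authors' code:

```python
def channel_shuffle(channels, groups):
    """Interleave channels across groups: view the list as (groups, n),
    transpose to (n, groups), then flatten back to a list."""
    if len(channels) % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    per_group = len(channels) // groups
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]


# Six channels in three groups: [0, 1], [2, 3], [4, 5].
assert channel_shuffle([0, 1, 2, 3, 4, 5], groups=3) == [0, 2, 4, 1, 3, 5]
```

After the shuffle, each new group of consecutive channels contains channels from every original group, so the next group convolution can mix information across groups.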

Zhang et al. (2017)


clcNet

Zhang (2018) suggested that depthwise convolution and grouped convolution can be considered as special cases of a generalized convolution operation named channel local convolution (CLC), where an output channel is computed using a subset of the input channels. This definition entails computation dependency relations between input and output channels, which can be represented by a channel dependency graph (CDG). By modifying the CDG of grouped convolution, a new CLC kernel named interlaced grouped convolution (IGC) is created.

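
The idea that an output channel depends only on a subset of the input channels can be made concrete for the three familiar cases. The sketch below is a hypothetical helper, not code from Zhang (2018), and it does not implement IGC. It lists, for each output channel, the input channels it reads from under a grouped convolution; regular convolution is the case of one group, and depthwise convolution is the case where the number of groups equals the number of channels:

```python
def channel_dependencies(in_channels, out_channels, groups):
    """Map each output channel to the list of input channels it depends on
    in a grouped convolution with the given number of groups."""
    if in_channels % groups or out_channels % groups:
        raise ValueError("channel counts must be divisible by groups")
    in_per_group = in_channels // groups
    out_per_group = out_channels // groups
    deps = {}
    for out_ch in range(out_channels):
        g = out_ch // out_per_group          # group this output belongs to
        deps[out_ch] = list(range(g * in_per_group, (g + 1) * in_per_group))
    return deps


regular = channel_dependencies(4, 4, groups=1)
grouped = channel_dependencies(4, 4, groups=2)
depthwise = channel_dependencies(4, 4, groups=4)

assert regular[0] == [0, 1, 2, 3]    # every output sees every input
assert grouped[3] == [2, 3]          # outputs see only their own group
assert depthwise[2] == [2]           # one input channel per output
```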

Stacking IGC and GC kernels results in a convolution block (named CLC block) for approximating regular convolution. Using the CDG as an analysis tool, a rule was derived for setting the meta-parameters of IGC and GC, along with a framework for minimizing the computational cost. A CNN model named clcNet was then constructed from CLC blocks, showing significantly higher computational efficiency and fewer parameters than state-of-the-art networks when tested on the ImageNet1K dataset.

Figure: Channel dependency graphs (CDG). Convolutions: a) regular, b) grouped, c) depthwise. Convolution blocks: a) ResNet bottleneck structure, b) ResNeXt block, c) depthwise separable convolution in MobileNet and Xception. Zhang (2018)

Network Decoupling

Also in 2018, Guo and colleagues analyzed the mathematical relationship between regular convolutions and depthwise separable convolutions, and showed that the former can be approximated by the latter in closed form. Depthwise separable convolutions were shown to be the principal components of regular convolutions. Moreover, they proposed network decoupling (ND), a training-free method that accelerates CNNs by converting pre-trained CNN models into a MobileNet-like depthwise separable convolution structure, with a promising speedup and negligible accuracy loss.

In addition, it was experimentally verified that this method is orthogonal to other training-free methods (e.g. channel decomposition and spatial decomposition), and combining them yielded an even larger CNN speedup. Finally, ND's wide applicability to classification networks and object detection networks was demonstrated.

Guo et al. (2018)


ChannelNets

Gao et al. (2018) proposed compressing deep models by using channel-wise convolutions, which replace dense connections among feature maps with sparse ones in CNNs. Based on this operation, they built lightweight CNNs known as ChannelNets. ChannelNets use three instances of channel-wise convolutions: group channel-wise convolutions, depth-wise separable channel-wise convolutions, and the convolutional classification layer. Compared to prior CNNs designed for mobile devices, ChannelNets achieve a significant reduction in the number of parameters and in computational cost without loss in accuracy. This was the first attempt to compress the fully-connected classification layer, which usually accounts for about 25% of total parameters in compact CNNs. Experimental results on the ImageNet dataset demonstrated that ChannelNets achieve consistently better performance than prior methods.

Figure: Different compact convolutions: a) depth-wise separable convolution, b) group convolution, c) group channel-wise convolution for information fusion, d) depth-wise separable channel-wise convolution. Gao et al. (2018)

Encoder-Decoder with Atrous Separable Convolution


Also in 2018, Chen and colleagues proposed combining the advantages of the spatial pyramid pooling module and the encoder-decoder structure used in deep neural networks for semantic segmentation. Specifically, the DeepLabv3+ model extended DeepLabv3 by adding a simple yet effective decoder module to refine the segmentation results, especially along object boundaries. In addition, they explored the Xception model and applied depthwise separable convolution to both the Atrous Spatial Pyramid Pooling and decoder modules, resulting in a faster and stronger encoder-decoder network.

Chen et al. (2018)


CNN-based methods for LF SSR

Yeung et al. (2019) proposed effective and efficient end-to-end convolutional neural network models for spatially super-resolving light field (LF) images. Specifically, these models have an hourglass shape, which allows feature extraction to be performed at the low-resolution level in order to save both computational and memory costs. To make full use of the 4D structure information of LF data in both the spatial and angular domains, 4D convolution was proposed to characterize the relationship among pixels. Moreover, as an approximation of 4D convolution, spatial-angular separable (SAS) convolutions were also proposed for more computationally efficient and memory-efficient extraction of spatial-angular joint features.

Extensive experimental results on 57 test LF images with various challenging natural scenes showed significant advantages of the proposed models over state-of-the-art methods. Specifically, an average PSNR gain of more than 3.0 dB and higher visual quality were achieved, and the proposed methods better preserved the LF structure of the super-resolved LF images, which is highly desirable for subsequent applications. In addition, the SAS convolution-based model achieves a threefold speedup with only a negligible decrease in reconstruction quality compared to the 4D convolution-based model.

Yeung et al. (2019)


Zhu-Net

Also in 2019, Zhang and colleagues designed a new CNN structure to improve the detection accuracy of spatial-domain steganography. They used 3 x 3 kernels instead of the traditional 5 x 5 kernels and optimized the convolution kernels in the preprocessing layer. The smaller convolution kernels were applied to reduce the number of parameters and to model features in a small local region. Further, separable convolutions were employed to exploit the channel correlation of the residuals, compress the image content and increase the signal-to-noise ratio (between the stego signal and the image signal).

Moreover, spatial pyramid pooling (SPP) was used to aggregate the local features and enhance their representation ability through multi-level pooling. Finally, data augmentation was adopted to further improve network performance. The experimental results showed that this CNN structure is significantly better than five other methods (SRM, Ye-Net, Xu-Net, Yedroudj-Net and SRNet) at detecting three spatial-domain steganography algorithms (WOW, S-UNIWARD and HILL) across a wide variety of datasets and payloads.

Zhang et al. (2019)


In conclusion, what are the key takeaways from separable convolutions?

Of the two types of separable convolutions (spatial and depthwise), both save computational power and demand less memory than standard convolutions. Spatial separable convolutions are the simpler of the two, dealing primarily with the spatial dimensions of an image and kernel. However, they are rarely used in deep learning, since not all kernels can be divided into two smaller ones as they require. In addition, this version of separable convolutions restricts the set of kernels that can be found during training, implying that training results may be suboptimal.

Depthwise separable convolutions, on the other hand, work with the depth dimension (number of channels) in addition to the spatial dimensions. They drastically improve efficiency without significantly reducing effectiveness, as they learn richer representations with fewer parameters. On the downside, this decrease in the number of parameters is suboptimal for small networks. Hence, the advantages of depthwise separable convolutions are most apparent in large networks in neural computer vision architectures, and they may well become a foundation for future designs, since they are as easy to use as standard convolutional layers.

Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., … & Ghemawat, S. (2016). Tensorflow: Large-scale machine learning on heterogeneous distributed systems. arXiv preprint arXiv:1603.04467.

Bai, K. (2019). A Comprehensive Introduction to Different Types of Convolutions in Deep Learning. Towards Data Science. Accessed on October 25th, 2019.

Chen, L. C., Zhu, Y., Papandreou, G., Schroff, F. & Adam, H. (2018). Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation. In Proceedings of the European Conference on Computer Vision (ECCV). 801–818.

Chollet, F. (2015). Keras.

Chollet, F. (2017). Xception: Deep Learning with Depthwise Separable Convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 1251–1258.

Gao, H., Wang, Z. & Ji, S. (2018). ChannelNets: Compact and Efficient Convolutional Neural Networks via Channel-Wise Convolutions. In Advances in Neural Information Processing Systems. 5197–5205.

Guo, J., Li, Y., Lin, W., Chen, Y. & Li, J. (2018). Network Decoupling: From Regular to Depthwise Separable Convolutions. arXiv preprint arXiv:1808.05517.

Howard, A. G., Zhu, M., Chen, B., Kalenichenko, D. (2017). MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. arXiv preprint arXiv:1704.04861.

Ioffe, S. & Szegedy, C. (2015). Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. arXiv preprint arXiv:1502.03167.

Jin, J., Dundar, A. & Culurciello, E. (2015). Flattened Convolutional Neural Networks for Feedforward Acceleration. arXiv preprint arXiv:1412.5474.

Mamalet, F. & Garcia, C. (2012). Simplifying ConvNets for Fast Learning. In International Conference on Artificial Neural Networks. Springer, Berlin, Heidelberg, 58–65.

Sifre, L. (2014). Rigid-Motion Scattering For Image Classification. PhD Thesis. Ecole Polytechnique, CMAP.

Sifre, L. & Mallat, S. (2013). Rotation, Scaling and Deformation Invariant Scattering for Texture Discrimination. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 1233–1240.

Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V. & Rabinovich, A. (2014). Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 1–9.

Wang, C. F. (2018). A Basic Introduction to Separable Convolutions. Towards Data Science. Accessed on October 5th, 2019.

Wang, M., Liu, B. & Foroosh, H. (2016). Factorized Convolutional Neural Networks. In Proceedings of the IEEE International Conference on Computer Vision. 545–553.

Yeung, H. W. F., Hou, J., Chen, X., Chen, J., Chen, Z. & Chung, Y. Y. (2019). Light Field Spatial Super-Resolution Using Deep Efficient Spatial-Angular Separable Convolution. IEEE Transactions on Image Processing, 28(5): 2319–2330.

Zhang, D. Q. (2018). clcNet: Improving the Efficiency of Convolutional Neural Network using Channel Local Convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 7912–7919.

Zhang, R., Zhu, F., Liu, J. & Liu, G. (2019). Depth-wise separable convolutions and multi-level pooling for an efficient spatial CNN-based steganalysis. IEEE Transactions on Information Forensics and Security.

Zhang, X., Zhou, X., Lin, M. & Sun, J. (2017). ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 6848–6856.