Understanding Locally Connected Layers In Convolutional Neural Networks

This article explores the concept of locally connected layers in convolutional neural networks and how they differ from convolutional layers. It focuses on how locally connected layers use unshared weights to improve model performance, and discusses how they perform in applications such as DeepFace.


Convolutional Neural Networks (CNNs) have been phenomenal in the field of image recognition. Researchers have been focusing heavily on building deep learning models for various tasks, and these models just keep getting better every year. As we know, a CNN is composed of many types of layers like convolution, pooling, fully connected, and so on. Convolutional layers are great at dealing with image data, but they come with a couple of restrictions as well. The DeepFace network built by Facebook used another type of layer to speed up training and get amazing results. This layer is called a locally connected layer with unshared weights. So what exactly does it do that other layers can't?

Fully connected and convolutional layers

Before we start discussing locally connected layers, we need to understand where they come from. Back when neural networks started gaining traction, people were heavily into fully connected layers. A fully connected layer basically connects all the neurons in one layer to all the neurons in the next layer. Let's see what fully connected and convolutional layers look like:

[Figure: a fully connected layer (left) and a convolutional layer operating on a 2D image (right)]

The one on the left is the fully connected layer. The figure on the right shows a convolutional layer operating on a 2D image. Connecting everything to everything gives the neural network a lot of freedom to train and optimize all the parameters. But at the same time, it's computationally intensive! So in order to take advantage of the 2D structure of image data, researchers came up with convolutional layers.
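To put "computationally intensive" in rough numbers, here is a quick count of the weights in a single fully connected layer applied to a flattened image; the 128 x 128 input and 1024 hidden units are made-up sizes for illustration, not taken from any particular network.

```python
# Rough parameter count for one fully connected layer on a small image.
# The 128x128 input and 1024 hidden units are illustrative choices only.
input_pixels = 128 * 128          # flattened grayscale image
hidden_units = 1024               # one modest fully connected layer

fc_weights = input_pixels * hidden_units   # every pixel connects to every unit
print(fc_weights)                 # 16,777,216 weights for a single layer
```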

On each 2D array of data, we train a whole bunch of N x N kernels. These kernels are nothing but filters that we run across the 2D array. For each position (x, y), we compute the dot product between this kernel and the image values around that point. This process is called convolution, hence the name convolutional neural networks. As we can see in the image above, the kernels remain the same everywhere (as indicated by the green color).
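Here's a minimal NumPy sketch of that operation for a single-channel image and a single kernel; strictly speaking it computes cross-correlation, which is what most deep learning frameworks implement under the name "convolution". The sizes are illustrative.

```python
import numpy as np

def conv2d_shared(image, kernel):
    """Slide one shared kernel over the image and take the dot product
    at every valid (x, y) position."""
    H, W = image.shape
    K = kernel.shape[0]                      # assume a square K x K kernel
    out = np.zeros((H - K + 1, W - K + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            patch = image[y:y + K, x:x + K]
            out[y, x] = np.sum(patch * kernel)   # same weights everywhere
    return out

image = np.random.rand(128, 128)
kernel = np.random.rand(5, 5)
print(conv2d_shared(image, kernel).shape)        # (124, 124)
```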

We’re like halfway through and we still haven’t talked about “locally connected” layers

Well, that’s not exactly true! Convolutional layers are technically locally connected layers. To be precise, they are locally connected layers with shared weights. We run the same filter for all the (x, y) positions in the image. In other words, all the pixel positions “share” the same filter weights. We allow the network to tune the filter weights until we arrive at the desired performance. While this works great for image classification tasks, we tend to miss out on some subtle nuances of spatial arrangement.

[Figure: locally connected layer, where different colors indicate different filters at different positions]

So researchers tried out an approach where they have a different filter for each (x, y) position. There's no convolution as such! It's just a dot product at each pixel position. Let's say we want to train 256 filters of size 5 x 5 at the current stage of the network. Here, each filter learns to detect a different aspect of the image. The input image is of size, say, 128 x 128. A 5 x 5 filter fits at 124 x 124 positions in that image grid (draw it on a piece of paper to verify). As seen in the figure above, the different colors indicate different filters. In a convolutional layer, we just need to train 5 x 5 x 256 parameters. But if we are dealing with a locally connected layer with unshared weights, we will be dealing with 5 x 5 x 256 x 124 x 124.
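A minimal sketch of the unshared version, using the sizes above (128 x 128 input, 5 x 5 filters, 124 x 124 valid positions); the single-channel layout and random weights are purely for illustration.

```python
import numpy as np

def local2d_unshared(image, kernels):
    """Locally connected layer: kernels has shape (H_out, W_out, K, K),
    i.e. a different K x K filter for every output position."""
    H_out, W_out, K, _ = kernels.shape
    out = np.zeros((H_out, W_out))
    for y in range(H_out):
        for x in range(W_out):
            patch = image[y:y + K, x:x + K]
            out[y, x] = np.sum(patch * kernels[y, x])  # position-specific weights
    return out

image = np.random.rand(128, 128)
kernels = np.random.rand(124, 124, 5, 5)            # one filter per position
print(local2d_unshared(image, kernels).shape)       # (124, 124)

# Parameter counts for 256 such filter banks:
print(5 * 5 * 256)                 # convolutional: 6,400 weights
print(5 * 5 * 256 * 124 * 124)     # locally connected: 98,406,400 weights
```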

That’s a lot of parameters

Now you might think: won’t that result in an enormous number of parameters? After all, we are building a separate filter for each pixel position, and there are a lot of positions in the image. The answer is yes! But the DeepFace authors optimized the method and created a fast implementation on the GPU. Also, since every pixel position gets its own filter, they reduce the overall number of filters that need to be trained. Locally connected layers with unshared weights seemed to work well for the DeepFace architecture. It is yet to undergo rigorous testing to see whether it generalizes well to other types of image recognition tasks.
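For reference, some frameworks have shipped this layer directly. Older Keras / tf.keras releases provided a LocallyConnected2D layer (it has since been removed from recent versions, so treat this as a version-dependent sketch rather than a guaranteed API):

```python
# Sketch only: LocallyConnected2D shipped with older Keras / tf.keras releases
# and has since been removed, so this depends on your installed version.
from tensorflow import keras

model = keras.Sequential([
    # 256 filters of size 5x5 with a separate set of weights at every
    # output position -- no weight sharing across the image.
    keras.layers.LocallyConnected2D(256, (5, 5), input_shape=(128, 128, 1)),
])
model.summary()  # the weight count is far larger than an equivalent Conv2D
```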

In essence, this layer is just like a convolutional layer in the deep neural network, but without any sharing of weights. If you keep this fact aside, it’s pretty much the same as a convolutional layer.


https://prateekvjoshi.com/2016/04/12/understanding-locally-connected-layers-in-convolutional-neural-networks/

