ReceptiveField Compute

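This article walks through the per-layer computations in a convolutional neural network (CNN), including convolution and pooling layers, and uses concrete parameters to show how to compute each layer's number of features, jump, receptive field size, and start position.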


# Each layer of the network is described as [filter size, stride, padding].
# Assume the two spatial dimensions are the same.
# Each kernel requires the following parameters:
# - k_i: kernel size
# - s_i: stride
# - p_i: padding (if the padding is uneven, the right padding is larger than the left padding; the "SAME" option in TensorFlow)

# Each layer i requires the following parameters to be fully represented:
# - n_i: number of features (the data layer has n_0 = image size)
# - j_i: jump, i.e. the distance (in image pixels) between the centers of two adjacent features
# - r_i: receptive field size of a feature in layer i
# - start_i: center position of the first feature's receptive field in layer i (indices start from 0; a negative value means the center falls into the padding)


import math
convnet =   [[11,4,0],[3,2,0],[5,1,2],[3,2,0],[3,1,1],[3,1,1],[3,1,1],[3,2,0],[6,1,0], [1, 1, 0]]
layer_names = ['conv1','pool1','conv2','pool2','conv3','conv4','conv5','pool5','fc6-conv', 'fc7-conv']
imsize = 227


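def outFromIn(conv, layerIn):
        """Propagate (n, j, r, start) of one layer through a conv/pool layer."""
        n_in = layerIn[0]
        j_in = layerIn[1]
        r_in = layerIn[2]
        start_in = layerIn[3]
        k = conv[0]
        s = conv[1]
        p = conv[2]

        # Output feature count, and the total padding that is actually used
        # (pL on the left, pR on the right; pR >= pL when it is uneven).
        n_out = math.floor((n_in - k + 2*p)/s) + 1
        actualP = (n_out-1)*s - n_in + k
        pR = math.ceil(actualP/2)
        pL = math.floor(actualP/2)

        # The jump grows by the stride; the receptive field grows by (k - 1) input jumps.
        j_out = j_in * s
        r_out = r_in + (k - 1)*j_in
        # Center of the first output feature, shifted left by the left padding.
        start_out = start_in + ((k-1)/2 - pL)*j_in
        return n_out, j_out, r_out, start_out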


def printLayer(layer, layer_name):
        print(layer_name + ":")
        print("\t n features: %s \n \t jump: %s \n \t receptive size: %s \t start: %s " % (layer[0], layer[1], layer[2], layer[3]))


layerInfos = []
if __name__ == '__main__':
        # The first layer is the data layer (image) with n_0 = image size; j_0 = 1; r_0 = 1; and start_0 = 0.5
        print("-------Net summary------")
        currentLayer = [imsize, 1, 1, 0.5]
        printLayer(currentLayer, "input image")
        for i in range(len(convnet)):
                currentLayer = outFromIn(convnet[i], currentLayer)
                layerInfos.append(currentLayer)
                printLayer(currentLayer, layer_names[i])
        print("------------------------")
        layer_name = input("Layer name where the feature is: ")
        layer_idx = layer_names.index(layer_name)
        idx_x = int(input("Index of the feature in x dimension (from 0): "))
        idx_y = int(input("Index of the feature in y dimension (from 0): "))

        n = layerInfos[layer_idx][0]
        j = layerInfos[layer_idx][1]
        r = layerInfos[layer_idx][2]
        start = layerInfos[layer_idx][3]
        assert(idx_x < n)
        assert(idx_y < n)

        print("receptive field: (%s, %s)" % (r, r))
        print("center: (%s, %s)" % (start+idx_x*j, start+idx_y*j))
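
The center and size printed by the script can be turned into pixel bounds on the input image. The block below is a minimal sketch and not part of the original script: the helper name `receptive_field_bounds` and the clipping to the image are assumptions made for illustration. It takes one `(n, j, r, start)` tuple in the format returned by `outFromIn`; the `pool5` values are the ones the script produces for the network defined in `convnet`.

```python
def receptive_field_bounds(layer_info, idx, image_size):
    """Return (center, low, high, low_clipped, high_clipped) along one axis.

    layer_info is an (n, j, r, start) tuple in the format returned by outFromIn.
    receptive_field_bounds is a hypothetical helper, not part of the original script.
    Coordinates are continuous: pixel i covers [i, i + 1) and has center i + 0.5.
    """
    n, j, r, start = layer_info
    if not 0 <= idx < n:
        raise IndexError("feature index %s out of range [0, %s)" % (idx, n))
    center = start + idx * j
    low = center - r / 2.0
    high = center + r / 2.0
    # Parts of the field outside [0, image_size] fall into the padding.
    return center, low, high, max(low, 0), min(high, image_size)


# pool5 of the network above: 6 features, jump 32, receptive field 195, start 33.5
pool5 = (6, 32, 195, 33.5)
bounds_first = receptive_field_bounds(pool5, 0, 227)
bounds_last = receptive_field_bounds(pool5, 5, 227)
print(bounds_first)
print(bounds_last)
```

The unclipped bounds show how far a feature's field extends beyond the image, while the clipped bounds give the region of real pixels it can see.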
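### The concept of the local receptive field

In a convolutional neural network (CNN), the local receptive field is the spatial extent of the previous layer's input that corresponds to a single pixel of a layer's output feature map. The concept describes how a given position in the current layer perceives its input through a set of weights applied over a limited region[^2].

More specifically, during convolution each filter (kernel) slides over the input image, covers part of it at a time, and computes a weighted sum over that part to produce a new feature representation. Each step therefore looks only at the information inside a small window, which is what is called "local connectivity". Any given output unit thus depends only on a subset of the original input, namely its local receptive field[^3].

### Computing the receptive field

To understand and quantify these local connections, a recursive formula tracks how the receptive field grows from layer to layer. Let \(R_l\) be the receptive field size of layer \(l\), measured in input pixels. Then:

\[ R_{l} = R_{l-1} + (F_l - 1)\times J_{l-1}, \qquad J_{l} = J_{l-1}\times S_l \]

where

- \(F_l\) is the filter size used by layer \(l\);
- \(S_l\) is the stride of layer \(l\);
- \(J_l\) is the jump, that is, the distance in input pixels between two adjacent features of layer \(l\) (the product of all strides up to layer \(l\)).

The initial condition is that the receptive field at the input layer is a single pixel, with a jump of one pixel (\(R_0=1\), \(J_0=1\)). Applying the recursion layer by layer, with each layer's own parameters, gives the overall receptive field of the whole architecture[^3].

Note also that a pooling layer is handled like a convolution, with the pooling window as its filter size, whereas a dilated (atrous) convolution needs an adapted formula that accounts for the dilation rate.

```python
def calculate_receptive_field(layers_info):
    """
    Calculate the receptive field of a convolutional neural network.

    Args:
        layers_info (list): A list where each element is another list containing three integers.
                            Each inner list represents one layer and contains [kernel_size, stride, padding].

    Returns:
        int: The total receptive field after all specified layers have been applied sequentially.
    """
    r = 1  # receptive field of a single input pixel
    j = 1  # jump: distance in input pixels between adjacent features
    for k, s, p in layers_info:
        # Padding (p) does not change the receptive field size.
        # The kernel spans (k - 1) steps of the jump of its input.
        r += (k - 1) * j
        # The stride enlarges the jump seen by the next layer.
        j *= s
    return r


# Example usage: two convolutions with kernel size 3, stride 1, padding 0,
# followed by a max-pooling layer with pool size 2, stride 2 and no padding.
layers_example = [[3, 1, 0], [3, 1, 0], [2, 2, 0]]
print(f"The calculated receptive field size is {calculate_receptive_field(layers_example)}")
```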
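
The dilated (atrous) case mentioned above can be sketched by replacing the kernel size with its effective size, \(d(k-1)+1\) for a dilation rate \(d\). This is a minimal sketch under that assumption: the function name `receptive_field_dilated` and the `[kernel_size, stride, dilation]` layer format are introduced here for illustration and do not come from the original post.

```python
def receptive_field_dilated(layers):
    """Receptive field size for layers given as [kernel_size, stride, dilation].

    Hypothetical helper: the layer format is an assumption made for this sketch.
    """
    r = 1  # receptive field of a single input pixel
    j = 1  # jump between adjacent features, in input pixels
    for k, s, d in layers:
        k_eff = d * (k - 1) + 1  # kernel extent once the gaps between taps are counted
        r += (k_eff - 1) * j
        j *= s
    return r


# Three convolutions with kernel size 3, stride 1 and dilation rates 1, 2, 4.
dilated_stack = [[3, 1, 1], [3, 1, 2], [3, 1, 4]]
rf_dilated = receptive_field_dilated(dilated_stack)
# The same stack without dilation, for comparison.
rf_plain = receptive_field_dilated([[3, 1, 1]] * 3)
print(rf_dilated, rf_plain)
```

With a dilation rate of 1 the effective size equals the kernel size, so the function reduces to the undilated recursion.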