normalization_layer in Caffe

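This article walks through the implementation of normalization_layer in the Caffe library, in particular its use of L2 normalization. By analyzing NormalizeParameter in caffe.proto, it examines the role of the two key parameters, across_spatial and channel_shared. It then goes through the code logic of the forward_cpu function, distinguishing the normalization performed when across_spatial is true from the one performed when it is false, and describing how channel_shared affects the result.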


caffe-ssd contains an implementation of normalization, made up of .hpp, .cpp, and .cu files. What it implements is L2 normalization.
The formula for L2 normalization is:
$$\hat{x}_i = \frac{x_i}{\sqrt{\sum_{j} x_j^2}}$$
Now let's look at how the Caffe code implements it.
First is caffe.proto, which defines the normalization parameter:
message NormalizeParameter {
  optional bool across_spatial = 1 [default = true];
  // Initial value of scale. Default is 1.0 for all
  optional FillerParameter scale_filler = 2;
  // Whether or not scale parameters are shared across channels.
  optional bool channel_shared = 3 [default = true];
  // Epsilon for not dividing by zero while normalizing variance
  optional float eps = 4 [default = 1e-10];
}
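Two parameters here are especially important: across_spatial and channel_shared.
across_spatial determines the scope of the normalization. If it is true (the default), each sample in the batch is normalized as a whole over its channel*height*width values, so the sum of squares of x_i in the formula above has channel*height*width terms. If it is false, the normalization does not span the spatial dimensions: the sum has only channel terms, meaning that each of the height*width pixels is normalized separately across its channels, which greatly narrows the scope of the normalization.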
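The two scopes can be sketched for a single sample as follows. This is a minimal illustration, not the code from caffe-ssd: the function name `l2_normalize_sample`, the flat channel-major (C x H x W) layout, and the choice to add `eps` inside the square root are assumptions made for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not Caffe's API): L2-normalizes one sample stored
// as a flat channel-major buffer of channels * spatial_dim values.
std::vector<float> l2_normalize_sample(const std::vector<float>& x,
                                       int channels, int spatial_dim,
                                       bool across_spatial,
                                       float eps = 1e-10f) {
  assert(channels > 0 && spatial_dim > 0);
  assert(x.size() == static_cast<std::size_t>(channels) * spatial_dim);
  std::vector<float> y(x.size());
  if (across_spatial) {
    // One norm for the whole sample: channels * spatial_dim terms.
    float sum_sq = eps;  // assumption: eps is added inside the sqrt
    for (float v : x) sum_sq += v * v;
    const float norm = std::sqrt(sum_sq);
    for (std::size_t i = 0; i < x.size(); ++i) y[i] = x[i] / norm;
  } else {
    // One norm per pixel: only `channels` terms in each sum.
    for (int s = 0; s < spatial_dim; ++s) {
      float sum_sq = eps;
      for (int c = 0; c < channels; ++c) {
        const float v = x[c * spatial_dim + s];
        sum_sq += v * v;
      }
      const float norm = std::sqrt(sum_sq);
      for (int c = 0; c < channels; ++c) {
        y[c * spatial_dim + s] = x[c * spatial_dim + s] / norm;
      }
    }
  }
  return y;
}
```

For a sample with 2 channels and 2 pixels holding {3, 1, 4, 0}, the per-pixel mode normalizes the channel vectors (3, 4) and (1, 0) separately, whereas the across-spatial mode divides all four values by sqrt(26).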
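As for channel_shared: once the normalization above is done, top_data is multiplied by a scale (this scale is the only parameter of normalization_layer). If channel_shared is true (the default), every channel of top_data is multiplied by the same number; if channel_shared is false, each channel of top_data is multiplied by its own number.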
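The scaling step can be sketched in the same way. Again this is only an illustration under assumptions: `apply_scale` is a hypothetical name, the sample is a flat channel-major buffer, and the scale is passed as a vector holding one value when shared and one value per channel otherwise.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not Caffe's API): multiplies an already normalized
// C x H x W sample by the scale, in place.
void apply_scale(std::vector<float>& y, int channels, int spatial_dim,
                 const std::vector<float>& scale, bool channel_shared) {
  assert(channels > 0 && spatial_dim > 0);
  assert(y.size() == static_cast<std::size_t>(channels) * spatial_dim);
  // Assumption: a shared scale has 1 entry, otherwise one entry per channel.
  const std::size_t expected =
      channel_shared ? 1 : static_cast<std::size_t>(channels);
  assert(scale.size() == expected);
  for (int c = 0; c < channels; ++c) {
    // Shared: every channel uses scale[0]; otherwise channel c uses scale[c].
    const float s = channel_shared ? scale[0] : scale[c];
    for (int i = 0; i < spatial_dim; ++i) {
      y[c * spatial_dim + i] *= s;
    }
  }
}
```

With channel_shared set to false the number of scale values equals the number of channels, so each feature map can be rescaled by a different amount after normalization.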
Next, let's look at forward_cpu.

for (int n = 0; n < num; ++n) {
    caffe_sqr<Dtype>(dim, bottom_data, buffer_data);
    if (across_spatial_) {
      // add eps to avo