When using the BatchNorm layer I was never sure how to set use_global_stats and moving_average_fraction, especially when fine-tuning, so this post walks through the BatchNorm layer's source code.
————————————————————–caffe.proto——————————————————————–
First, let's look at the parameters BatchNorm accepts.
message BatchNormParameter {
// If false, accumulate global mean/variance values via a moving average. If
// true, use those accumulated values instead of computing mean/variance
// across the batch.
optional bool use_global_stats = 1;
// How much does the moving average decay each iteration?
optional float moving_average_fraction = 2 [default = .999];
// Small value to add to the variance estimate so that we don't divide by
// zero.
optional float eps = 3 [default = 1e-5];
}
- use_global_stats: if true, the layer uses the accumulated (global) mean and variance instead of recomputing them; if false, the mean and variance are computed over the current mini-batch and accumulated into the moving average.
- moving_average_fraction: the decay applied to the accumulated statistics at each iteration, i.e., the momentum of the moving average.
- Note: use_global_stats should be false during training, and during testing it defaults to true; a minimal configuration sketch follows below.
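As a point of reference, here is a minimal prototxt sketch of how a BatchNorm layer might be configured when fine-tuning. The layer/blob names (bn1, conv1) and the explicit parameter values are illustrative assumptions, not taken from any particular network; in practice use_global_stats can simply be left unset so that the phase decides it, as LayerSetUp below shows.

layer {
  name: "bn1"                # hypothetical layer name
  type: "BatchNorm"
  bottom: "conv1"            # hypothetical input blob
  top: "conv1"
  batch_norm_param {
    use_global_stats: false          # TRAIN / fine-tune: compute batch statistics and accumulate them
    moving_average_fraction: 0.999
    eps: 1e-5
  }
  # The layer's three internal blobs (mean, variance, moving-average scale) are
  # statistics rather than learned weights, so their learning rates are zeroed out:
  param { lr_mult: 0 }
  param { lr_mult: 0 }
  param { lr_mult: 0 }
}

In a test/deploy prototxt the same layer would either set use_global_stats: true or omit the field entirely, in which case the TEST phase makes it true automatically.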
——————————————————-BatchNormLayer::LayerSetUp—————————————————
template <typename Dtype>
void BatchNormLayer<Dtype>::LayerSetUp(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) {
  // Fetch this layer's parameters
  BatchNormParameter param = this->layer_param_.batch_norm_param();
  // Read moving_average_fraction
  moving_average_fraction_ = param.moving_average_fraction();
  // use_global_stats_ defaults to true in the TEST phase and false in the TRAIN phase;
  // an explicit use_global_stats in the prototxt overrides this phase-based default
  use_global_stats_ = this->phase_ == TEST;
  if (param.has_use_global_stats())
    use_global_stats_ = param.use_global_stats();
  if (bottom[0]->num_axes() == 1)
    channels_ = 1;
  else
    channels_ = bottom[0]->shape(1);  // normalization is done per channel (axis 1) by default
  eps_ = param.eps();
  // Initialize the parameter blobs (skip if they have already been set up)
  if (this->blobs_.size() > 0) {
    LOG(INFO) << "Skipping parameter initialization";
  }