Work in progress: validation accuracy drops sharply in the final network when training a CNN image classifier in MATLAB

Original post, last modified 2022-09-28 20:13:51

License: CC 4.0 BY-SA

First published 2022-09-28 17:17:36

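This post looks at a sharp drop in validation accuracy seen when using a CNN for image classification in MATLAB. It examines how the BatchNormalizationStatistics training option affects model accuracy, in particular the change in validation accuracy caused by the 'population' mode.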

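Problem description

As the figure shows, the blue line is the training accuracy and the black dots are the validation accuracy. Both stay fairly high throughout training, yet the validation accuracy reported for the final output model (the red box in the figure) drops abruptly to 84.8%, which is puzzling.

[Figure: training progress plot; the final validation accuracy is marked with a red box]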

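Investigating the cause

I have not found the root cause yet, so this post records what I have worked out so far.

Checking the official MATLAB documentation

On the trainNetwork page of the official MATLAB documentation, under Output Arguments - info, I found the following description: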

For networks containing batch normalization layers, if the BatchNormalizationStatistics training option is ‘population’ then the final validation metrics are often different from the validation metrics evaluated during training. This is because batch normalization layers in the final network perform different operations than during training. For more information, see batchNormalizationLayer.

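In other words:

For a network that contains BN layers, if the BatchNormalizationStatistics training option is set to 'population', the final validation metrics often differ from the validation metrics evaluated during training. The reason is that the BN layers in the final network perform different operations than they did during training. The batchNormalizationLayer page has more detail.

My reading is that this training option affects the validation accuracy of the final output model, but the passage does not say how.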
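To put a number on the drop, one can compare the last validation accuracy logged during training with the final value stored in the training info. The sketch below is only an illustration: it uses a mock `info` struct whose in-training values are made up (only the 84.8 comes from the figure above). In a real run `info` would be the second output of `trainNetwork`. The field names `ValidationAccuracy` and `FinalValidationAccuracy`, and the use of NaN for iterations without validation, are my understanding of the trainNetwork documentation and should be checked against your release.

```matlab
% Mock training info; in a real run use: [net, info] = trainNetwork(...)
% The in-training values are hypothetical; 84.8 is the final accuracy from the figure.
info.ValidationAccuracy = [NaN 91.0 NaN 95.5 NaN 97.2];  % NaN = no validation at that iteration
info.FinalValidationAccuracy = 84.8;

% Last validation accuracy that was actually evaluated during training
valAcc     = info.ValidationAccuracy;
lastIdx    = find(~isnan(valAcc), 1, 'last');
lastValAcc = valAcc(lastIdx);

% Gap between the in-training and the final validation accuracy
gap = lastValAcc - info.FinalValidationAccuracy;
fprintf('Last in-training validation accuracy: %.1f%%\n', lastValAcc);
fprintf('Final validation accuracy:            %.1f%%\n', info.FinalValidationAccuracy);
fprintf('Gap:                                  %.1f points\n', gap);
```

A large gap here, together with BN layers in the network, points toward the behaviour described in the quoted passage rather than toward ordinary overfitting.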

Next, here is how the Solver Options section of the trainingOptions documentation describes the BatchNormalizationStatistics training option:

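[Figure: screenshot of the BatchNormalizationStatistics entry in the trainingOptions documentation]

It says:

Mode to evaluate the statistics in batch normalization layers, specified as one of the following:
'population' - Use the population statistics. After training, the software finalizes the statistics by passing through the training data once more and uses the resulting mean and variance.
'moving' - Approximate the statistics during training using a running estimate, which is updated at every step by a decay-weighted rule.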
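The update rule referred to above, as far as I recall it from the MATLAB documentation, has the following form. The notation is my reconstruction, not a verbatim quote, so check it against the documentation before relying on it.

```latex
\mu^{*} = \lambda_{\mu}\,\hat{\mu} + (1-\lambda_{\mu})\,\mu,
\qquad
\sigma^{2*} = \lambda_{\sigma^{2}}\,\hat{\sigma}^{2} + (1-\lambda_{\sigma^{2}})\,\sigma^{2}
```

Here $\hat{\mu}$ and $\hat{\sigma}^{2}$ are the mean and variance of the current layer input, $\mu$ and $\sigma^{2}$ are the latest moving estimates, $\mu^{*}$ and $\sigma^{2*}$ are the updated estimates, and $\lambda_{\mu}$ and $\lambda_{\sigma^{2}}$ are the decay values for the mean and the variance.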

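At this point the difference between the two modes was still unclear to me, so let's read on and look at what the official documentation says about batchNormalizationLayer.

The entries for TrainedMean and TrainedVariance each contain a paragraph about the BatchNormalizationStatistics training option.

(TrainedMean and TrainedVariance appear to be the per-channel mean and variance that the layer stores for normalization at prediction time, rather than the statistics of a single batch. Note that they are not learnable parameters; the learnable parameters are Scale and Offset.)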

If the BatchNormalizationStatistics training option is ‘moving’, then the software approximates the batch normalization statistics during training using a running estimate and, after training, sets the TrainedMean and TrainedVariance properties to the latest values of the moving estimates of the mean and variance, respectively.

If the BatchNormalizationStatistics training option is ‘population’, then after network training finishes, the software passes through the data once more and sets the TrainedMean and TrainedVariance properties to the mean and variance computed from the entire training data set, respectively.
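The two paragraphs above describe two different ways of arriving at the stored mean and variance. The toy sketch below, written in base MATLAB only, illustrates that the two procedures generally give different numbers for the same data. It is not what trainNetwork does internally: there is no network, the activations are synthetic random numbers for a single channel, and the decay value, batch size, and initial estimates are arbitrary choices of mine.

```matlab
% Toy illustration only: one channel, synthetic activations, no real network.
rng(0);                          % reproducible random numbers
numBatches = 50;
batchSize  = 32;
decay      = 0.1;                % assumed decay value (arbitrary choice for this sketch)

% Synthetic activations: each column is one mini-batch
X = 2 + 3*randn(batchSize, numBatches);

% 'moving'-style estimate: update a running mean/variance after every mini-batch
movingMean = 0;                  % arbitrary starting values
movingVar  = 1;
for k = 1:numBatches
    batchMean  = mean(X(:, k));
    batchVar   = var(X(:, k), 1);            % second argument 1: normalize by N
    movingMean = decay*batchMean + (1 - decay)*movingMean;
    movingVar  = decay*batchVar  + (1 - decay)*movingVar;
end

% 'population'-style estimate: one extra pass over all of the data
popMean = mean(X(:));
popVar  = var(X(:), 1);

fprintf('moving:     mean = %.4f, var = %.4f\n', movingMean, movingVar);
fprintf('population: mean = %.4f, var = %.4f\n', popMean, popVar);
```

The moving estimate is dominated by the most recent mini-batches, while the population estimate weights every sample equally, so any statistic the final network uses for normalization depends on which mode was selected.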

In other words: