1. Batch normalization standardizes per feature: the values of one feature across all samples in a batch form a group of numbers, and that group is standardized.

2. Layer normalization standardizes per sample: all the features of one sample form a group of numbers, and that group is standardized.

3. The most common way to standardize is to subtract the mean and then divide by the standard deviation.
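
Points 1 to 3 can be sketched in a few lines of NumPy. This is a minimal illustration, not a full implementation: the helper name `standardize`, the toy array `x`, and the small constant `eps` (added for numerical stability) are assumptions made for this example, and the learnable scale and shift parameters and the running statistics that real batch normalization layers keep are left out.

```python
import numpy as np

def standardize(x, axis, eps=1e-5):
    """Subtract the mean and divide by the standard deviation along `axis`."""
    mean = x.mean(axis=axis, keepdims=True)
    var = x.var(axis=axis, keepdims=True)
    # eps is a small constant (an assumption here) that avoids division by zero
    return (x - mean) / np.sqrt(var + eps)

# Toy data: 4 samples (rows) with 3 features (columns)
x = np.array([[1.0, 2.0, 3.0],
              [4.0, 6.0, 8.0],
              [7.0, 10.0, 13.0],
              [10.0, 14.0, 18.0]])

# Batch normalization view: each feature (column) is standardized across the batch
batch_normed = standardize(x, axis=0)

# Layer normalization view: each sample (row) is standardized across its features
layer_normed = standardize(x, axis=1)

print(batch_normed.mean(axis=0))  # close to 0 for every feature
print(layer_normed.mean(axis=1))  # close to 0 for every sample
```

The only difference between the two calls is the axis: `axis=0` groups numbers by feature, `axis=1` groups them by sample.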

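4. Purpose of standardization: 1) speed up training; 2) help prevent exploding gradients.
Batch normalization is commonly used in CNNs, while layer normalization is a better fit for RNNs and transformers. Sequence data varies in length, so some features are absent from some samples, which makes feature-based standardization awkward.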
5. Drawbacks of batch normalization:
1) In batch normalization, we use the batch statistics: the mean and standard deviation corresponding to the current mini-batch. However, when the batch size is small, the sample mean and sample standard deviation are not representative enough of the actual distribution and the network cannot learn anything meaningful.
2) As batch normalization depends on batch statistics for normalization, it is less suited for sequence models. This is because, in sequence models, we may have sequences of potentially different lengths and smaller batch sizes corresponding to longer sequences.
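
Both drawbacks can be illustrated with a small simulation. This is only a sketch under assumed numbers: the synthetic feature distribution (mean 5, standard deviation 2), the batch sizes 2 and 256, the sequence lengths, and the helper name `batch_mean_spread` are all made up for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "true" distribution of one feature (assumed: mean 5, std 2)
feature = rng.normal(loc=5.0, scale=2.0, size=100_000)

def batch_mean_spread(values, batch_size, n_batches=1000):
    """Draw many mini-batches and measure how much their means vary."""
    batches = rng.choice(values, size=(n_batches, batch_size))
    return batches.mean(axis=1).std()

# Drawback 1: with a small batch, the batch mean is a noisy estimate
spread_small = batch_mean_spread(feature, batch_size=2)
spread_large = batch_mean_spread(feature, batch_size=256)
print(spread_small, spread_large)  # the small-batch spread is much larger

# Drawback 2: with variable-length sequences, later time steps are
# covered by fewer samples, so their per-step statistics rest on less data
lengths = [5, 3, 2]  # hypothetical sequence lengths in one batch
max_len = max(lengths)
samples_per_step = [sum(length > t for length in lengths) for t in range(max_len)]
print(samples_per_step)  # [3, 3, 2, 1, 1]
```

Layer normalization sidesteps both issues because each sample is standardized using only its own features, independent of the batch size and of the other sequences in the batch.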
Reference
https://www.pinecone.io/learn/batch-layer-normalization/