Xavier Initialization in Deep Learning

Latest recommended article published on 2025-07-10 19:13:58

License: CC 4.0 BY-SA

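This article looks at TensorFlow's variance_scaling_initializer function, which is based on the Xavier initialization method and aims to keep the scale of the input variance stable from layer to layer in a deep network, preventing vanishing or exploding gradients. It covers how the function works, including how the variance of the weight matrix is set for each mode, and lists recommended references.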

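TensorFlow provides an initializer function, tf.contrib.layers.variance_scaling_initializer (part of tf.contrib, which is available only in TensorFlow 1.x). The official TensorFlow documentation describes it as follows: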

variance_scaling_initializer(
    factor=2.0,
    mode='FAN_IN',
    uniform=False,
    seed=None,
    dtype=tf.float32
)
Returns an initializer that generates tensors without scaling variance.

When initializing a deep network, it is in principle advantageous to keep the scale of the input variance constant, so it does not explode or diminish by reaching the final layer. This initializer use the following formula:

if mode='FAN_IN': # Count only number of input connections.
    n = fan_in
elif mode='FAN_OUT': # Count only number of output connections.
    n = fan_out
elif mode='FAN_AVG': # Average number of inputs and output connections.
    n = (fan_in + fan_out)/2.0

truncated_normal(shape, 0.0, stddev=sqrt(factor / n))
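
As a rough illustration of the formula quoted above, here is a minimal sketch in plain Python (standard library only) for a 2-D weight matrix. It is not TensorFlow's implementation: the function names are hypothetical, the shape is assumed to be (fan_in, fan_out), and the truncated normal is approximated by redrawing any sample that falls more than two standard deviations from the mean.

```python
import math
import random


def truncated_normal(rng, stddev):
    """Draw from N(0, stddev^2), redrawing values beyond 2 standard deviations."""
    while True:
        value = rng.gauss(0.0, stddev)
        if abs(value) <= 2.0 * stddev:
            return value


def variance_scaling(shape, factor=2.0, mode="FAN_IN", seed=None):
    """Sketch of the quoted formula for a 2-D weight matrix."""
    # Assumption: for a 2-D shape, rows are input connections and
    # columns are output connections.
    fan_in, fan_out = shape
    if mode == "FAN_IN":      # count only input connections
        n = fan_in
    elif mode == "FAN_OUT":   # count only output connections
        n = fan_out
    elif mode == "FAN_AVG":   # average of input and output connections
        n = (fan_in + fan_out) / 2.0
    else:
        raise ValueError(f"unknown mode: {mode}")

    stddev = math.sqrt(factor / n)
    rng = random.Random(seed)
    return [[truncated_normal(rng, stddev) for _ in range(fan_out)]
            for _ in range(fan_in)]


weights = variance_scaling((256, 128), seed=0)
print(len(weights), len(weights[0]))  # 256 128
```

Note that truncation makes the empirical standard deviation of the samples somewhat smaller than sqrt(factor / n); the sketch does not correct for this.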
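In short, the documentation excerpt above says that this initialization keeps the scale of the input variance constant from layer to layer, so that it neither explodes nor vanishes by the time it reaches the final layer.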
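
To make the point about scale concrete, the following sketch pushes a random vector through a stack of linear layers and compares the output scale with the input scale. It is a simplified experiment under stated assumptions, not something from the original post: the layers are purely linear (no activation), the weights use an untruncated normal, factor is set to 1.0 with n = fan_in, and the helper names are hypothetical. With a ReLU activation, the factor of 2.0 shown in the default signature would be the relevant choice.

```python
import math
import random


def mean_square(values):
    """Average of squared entries, used here as a measure of signal scale."""
    return sum(v * v for v in values) / len(values)


def forward_linear(x, depth, stddev, rng):
    """Pass x through `depth` square linear layers with N(0, stddev^2) weights."""
    width = len(x)
    for _ in range(depth):
        # Each output unit gets its own freshly drawn row of weights.
        x = [sum(rng.gauss(0.0, stddev) * xi for xi in x) for _ in range(width)]
    return x


def scale_ratio(width, depth, stddev, seed=0):
    """Ratio of output mean square to input mean square after `depth` layers."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(width)]
    out = forward_linear(x, depth, stddev, rng)
    return mean_square(out) / mean_square(x)


width, depth = 256, 10
scaled = scale_ratio(width, depth, stddev=math.sqrt(1.0 / width))  # factor=1.0, n=fan_in
naive = scale_ratio(width, depth, stddev=0.01)                     # fixed small stddev
print(f"variance-scaled ratio: {scaled:.3g}")
print(f"fixed stddev 0.01 ratio: {naive:.3g}")
```

With the variance-scaled weights the ratio should stay within roughly an order of magnitude of 1, while the fixed stddev of 0.01 shrinks the signal by a factor of about 256 * 0.01^2 per layer, so it all but disappears after ten layers.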

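This is the idea behind Xavier initialization. According to the TensorFlow documentation for this function, the Glorot and Bengio variant corresponds to factor=1.0, mode='FAN_AVG', uniform=True, while the default arguments shown above correspond to the initialization of He et al. The following two papers describe the method: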

X. Glorot and Y. Bengio. Understanding the difficulty of training deep feedforward neural networks. In International Conference on Artificial Intelligence and Statistics, pages 249–256, 2010.
Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell. Caffe: Convolutional architecture for fast feature embedding. arXiv:1408.5093, 2014.
Alternatively, the following articles also cover it:

CNN numerics
Three weight initialization methods
Deep learning: the Xavier initialization method
---------------------
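Author: 路虽远在路上
Source: 优快云
Original: https://blog.youkuaiyun.com/u010185894/article/details/71104387
Copyright notice: This is an original article by the blogger. Please include a link to the original post when reposting.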
