Some key points about dropout in CNNs

SauryGo

License: CC 4.0 BY-SA

Article link: https://blog.youkuaiyun.com/sean2100/article/details/83782780

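The purpose of a dropout layer is to prevent overfitting during training. In conventional training, every iteration uses all of the nodes in a layer, so the whole network is updated at each step. With a dropout layer, we instead randomly sample the nodes of a weight layer, keeping each one with probability p (the retaining probability). Only the sampled nodes take part in that update, and the resulting sub-network is the network trained in that iteration. The benefit is that, because some nodes are randomly switched off, a feature cannot become useful only in a fixed combination with other features. The network is pushed to learn general patterns shared across the data rather than quirks of particular training samples, which makes the trained model more robust.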
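The sampling step described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the implementation of any particular framework: the function name `dropout_forward` and its arguments are hypothetical, and the division by `keep_prob` (the "inverted dropout" convention, which keeps the expected activation unchanged) is an assumption that the text above does not spell out.

```python
import random

def dropout_forward(activations, keep_prob, training, rng):
    """Hypothetical helper: apply inverted dropout to a list of activations."""
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    if not training:
        # Inference: every node is used and nothing is rescaled.
        return list(activations)
    out = []
    for a in activations:
        # Keep each node independently with probability keep_prob.
        keep = rng.random() < keep_prob
        # Assumption: scale kept nodes by 1/keep_prob so the expected
        # value of each activation matches the no-dropout case.
        out.append(a / keep_prob if keep else 0.0)
    return out

rng = random.Random(0)
layer_output = [1.0] * 10000

train_out = dropout_forward(layer_output, keep_prob=0.8, training=True, rng=rng)
test_out = dropout_forward(layer_output, keep_prob=0.8, training=False, rng=rng)

# Fraction of nodes that survived, and the mean activation after rescaling.
kept_fraction = sum(1 for v in train_out if v != 0.0) / len(train_out)
mean_train = sum(train_out) / len(train_out)
```

With a large layer, `kept_fraction` should land close to `keep_prob` and `mean_train` close to the original mean, while `test_out` is an untouched copy of the input.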

Dropout is applied only during training; it is not used at prediction or test time.
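Intuition: after dropout randomly removes neurons, each training iteration works on a smaller network, so a single iteration is cheaper in principle, although training with dropout often needs more iterations to converge.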
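Dropout can be shown, more formally, to behave as an adaptive form of regularization.
With dropout, the network cannot put a large weight on any single feature, because the input neuron carrying that feature may be randomly removed. The weights are spread across the inputs instead, which has the effect of shrinking the squared norm of the weights.
Dropout works similarly to L2 regularization, but it adapts better to inputs of different scales.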
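If you are worried that some layers are more prone to overfitting than others, set keep-prob lower for those layers than for the rest.
Dropout is used mainly in computer vision, where there is usually not enough data and models overfit easily. It is used less often in other fields.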
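A major drawback of dropout is that the cost function is no longer well defined, because a different random sub-network is evaluated at every iteration. As a result, the cost is not guaranteed to decrease monotonically during training, which makes it harder to use as a debugging check.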

TensorFlow: what's the difference between tf.nn.dropout and tf.layers.dropout?

I'm quite confused about whether to use tf.nn.dropout or tf.layers.dropout.

Many MNIST CNN examples seem to use tf.nn.dropout, with keep_prob as one of the parameters.

But how is it different from tf.layers.dropout? Is the "rate" parameter in tf.layers.dropout similar to keep_prob in tf.nn.dropout?

Or, generally speaking, does the difference between tf.nn.dropout and tf.layers.dropout apply to other similar cases, such as other functions that exist in both tf.nn and tf.layers?

The only differences between the two functions are:

tf.nn.dropout has a parameter keep_prob: "Probability that each element is kept"
tf.layers.dropout has a parameter rate: "The dropout rate"
Thus, keep_prob = 1 - rate
tf.layers.dropout has a training parameter: "Whether to return the output in training mode (apply dropout) or in inference mode (return the input untouched)."
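To make the relationship concrete, here is a small pure-Python sketch of the two calling conventions. The functions `nn_style_dropout` and `layers_style_dropout` are hypothetical stand-ins written for illustration; they are not the TensorFlow functions and only mimic the parameters quoted above (`keep_prob`, `rate`, `training`). The explicit `rng` argument is an assumption added so the two can be compared with the same random draws.

```python
import random

def nn_style_dropout(x, keep_prob, rng):
    """Stand-in for the keep_prob convention: dropout is always applied."""
    # Kept elements are scaled by 1/keep_prob; dropped elements become 0.
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in x]

def layers_style_dropout(x, rate, training=False, rng=None):
    """Stand-in for the rate convention, with a training switch."""
    if not training:
        # Inference mode: return the input untouched.
        return list(x)
    rng = rng or random.Random()
    # The two conventions are related by keep_prob = 1 - rate.
    return nn_style_dropout(x, keep_prob=1.0 - rate, rng=rng)

x = [1.0, 2.0, 3.0, 4.0]

# Same seed, keep_prob = 0.75 versus rate = 0.25: the results should match.
a = nn_style_dropout(x, keep_prob=0.75, rng=random.Random(42))
b = layers_style_dropout(x, rate=0.25, training=True, rng=random.Random(42))

# With training=False the rate-style wrapper is a no-op.
c = layers_style_dropout(x, rate=0.25, training=False)
```

In this sketch the wrapper with a `training` flag can stay in the model definition for both training and evaluation, whereas the keep_prob-style function needs the caller to pass keep_prob = 1.0 (or skip the call) at inference time.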