This article introduces tf.keras.layers.Dropout. Let's start with the explanation from Google's documentation.
The Dropout layer randomly sets input units to 0 with a frequency of rate at each step during training time, which helps prevent overfitting. Inputs not set to 0 are scaled up by 1/(1 - rate) such that the sum over all inputs is unchanged.
Note that the Dropout layer only applies when training is set to True such that no values are dropped during inference. When using model.fit, training will be appropriately set to True automatically, and in other contexts, you can set the kwarg explicitly to True when calling the layer.
(This is in contrast to setting trainable=False for a Dropout layer. trainable does not affect the layer’s behavior, as Dropout does not have any variables/weights that can be frozen during training.)
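To see the effect of the training flag, here is a minimal sketch (the rate, seed-free setup, and data below are my own choices for illustration, not from the documentation):

import numpy as np
import tensorflow as tf

layer = tf.keras.layers.Dropout(0.5)
data = np.ones((2, 3), dtype=np.float32)

# training=False (also the effective default outside model.fit):
# nothing is dropped, the input passes through unchanged.
print(layer(data, training=False))

# training=True: units are zeroed with probability 0.5 and the
# survivors are scaled by 1/(1 - 0.5) = 2.
print(layer(data, training=True))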
To summarize the documentation: Dropout randomly zeroes elements with the given probability rate, and the remaining (non-zeroed) elements are scaled up by a factor of 1/(1 - rate).
Here, rate must lie in [0, 1).
This mechanism helps prevent overfitting.
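The rescaling keeps the expected value of each input unchanged: an element x becomes 0 with probability rate and x / (1 - rate) with probability 1 - rate, so its expectation is rate * 0 + (1 - rate) * x / (1 - rate) = x. A quick numerical check (the averaging loop is my own illustration):

import numpy as np
import tensorflow as tf

tf.random.set_seed(0)
layer = tf.keras.layers.Dropout(0.2)
x = tf.ones((1, 10000))

# Average many dropout samples; the mean should be close to the
# original value 1.0, since E[output] = input.
mean = np.mean([layer(x, training=True).numpy().mean() for _ in range(100)])
print(mean)  # ~1.0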
Now let's look at a few experiments.
Experiment 1
import numpy as np
import tensorflow as tf

tf.random.set_seed(0)
# zero elements with probability 0.2; scale survivors by 1/(1 - 0.2) = 1.25
layer = tf.keras.layers.Dropout(0.2, input_shape=(2,))
data = np.arange(1, 21).reshape(5, 4).astype(np.float32)
print(data)
outputs = layer(data, training=True)  # training=True enables dropout
print(outputs)
Output:
[[ 1. 2. 3. 4.]
[ 5. 6. 7. 8.]
[ 9. 10. 11. 12.]
[13. 14. 15. 16.]
[17. 18. 19. 20.]]
tf.Tensor(
[[ 1.25 2.5 3.75 5. ]
[ 6.25 7.5 8.75 10. ]
[11.25 0. 13.75 15. ]
[16.25 17.5 18.75 20. ]
[21.25 22.5 23.75 0. ]], shape=(5, 4), dtype=float32)
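The surviving entries are exactly the inputs multiplied by 1.25. A quick check (reusing data and outputs from above; masking on non-zero outputs works here because every input is non-zero):

# positions that survived dropout satisfy output == input * 1.25
mask = outputs.numpy() != 0
print(np.allclose(outputs.numpy()[mask], data[mask] * 1.25))  # True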
Experiment 2
tf.random.set_seed(0)
# rate 0.5: survivors are scaled by 1/(1 - 0.5) = 2
layer = tf.keras.layers.Dropout(0.5, input_shape=(2,))
data = np.arange(1, 21).reshape(5, 4).astype(np.float32)
print(data)
outputs = layer(data, training=True)
print(outputs)
Output:
[[ 1. 2. 3. 4.]
[ 5. 6. 7. 8.]
[ 9. 10. 11. 12.]
[13. 14. 15. 16.]
[17. 18. 19. 20.]]
tf.Tensor(
[[ 0. 0. 6. 8.]
[ 0. 12. 0. 16.]
[18. 0. 22. 24.]
[26. 0. 0. 32.]
[34. 36. 38. 0.]], shape=(5, 4), dtype=float32)
Experiment 3
tf.random.set_seed(0)
# rate 0.99: almost everything is zeroed; survivors scaled by 1/(1 - 0.99) = 100
layer = tf.keras.layers.Dropout(0.99, input_shape=(2,))
data = np.arange(1, 21).reshape(5, 4).astype(np.float32)
print(data)
outputs = layer(data, training=True)
print(outputs)
Output:
[[ 1. 2. 3. 4.]
[ 5. 6. 7. 8.]
[ 9. 10. 11. 12.]
[13. 14. 15. 16.]
[17. 18. 19. 20.]]
WARNING:tensorflow:Large dropout rate: 0.99 (>0.5). In TensorFlow 2.x, dropout() uses dropout rate instead of keep_prob. Please ensure that this is intended.
tf.Tensor(
[[ 0. 0. 0. 0.]
[ 0. 0. 0. 800.]
[ 0. 0. 0. 0.]
[ 0. 0. 0. 0.]
[ 0. 0. 0. 0.]], shape=(5, 4), dtype=float32)
In these experiments, rate was set to 0.2, 0.5, and 0.99 in turn. Elements of the matrix are zeroed with probability rate, and the surviving elements are scaled up by 1/(1 - rate), i.e. by 1.25, 2, and 100, respectively.
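Finally, in a real model Dropout sits between layers: model.fit sets training=True automatically, while model.predict leaves it False, so no manual flag handling is needed. A minimal sketch (the architecture and random data below are my own illustration):

import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(20,)),
    tf.keras.layers.Dropout(0.5),   # active only during training
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

x = np.random.rand(100, 20).astype(np.float32)
y = np.random.rand(100, 1).astype(np.float32)

model.fit(x, y, epochs=1, verbose=0)  # dropout is applied here
model.predict(x[:1])                  # dropout is a no-op here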