dropout in training and testing #5357
unrealwill commented on 11 Feb 2017
Hello, you can see that dropout is only applied in the train phase.
radekosmulski commented on 24 Feb 2017
That is correct - dropout should be applied during training (drop inputs with probability p), but there also needs to be a corresponding scaling of the weights at test time, as outlined in the referenced paper. I suspect this is not happening at the moment; at least the results I got so far might indicate that there is an issue here. Will investigate this further and see if I can provide an example.
unrealwill commented on 24 Feb 2017
Hello, @radekosmulski
https://github.com/fchollet/keras/blob/master/keras/layers/core.py#L110
radekosmulski commented on 24 Feb 2017
Thank you for your reply @unrealwill. I am new to Keras, so sorry if I misunderstand something. I still feel there is something unusual when running model.predict or model.evaluate with dropout. Please see below:
import keras
import numpy as np
X = np.array(
[[2, 1],
[4, 2]])
y = np.array(
[[5],
[10]]
)
# Works as expected without dropout
model = keras.models.Sequential()
model.add(keras.layers.Dense(input_dim=2, output_dim=1))
model.compile(keras.optimizers.SGD(), loss='MSE')
model.fit(X, y, nb_epoch=10000, verbose=0)
model.evaluate(X, y) # => ~0
# With dropout
model = keras.models.Sequential()
model.add(keras.layers.Dense(input_dim=2, output_dim=1))
model.add(keras.layers.Dropout(0.5))
model.compile(keras.optimizers.SGD(), loss='MSE')
model.fit(X, y, nb_epoch=10000, verbose=0)
model.evaluate(X, y) # => converges to MSE of 15.625
model.predict(X) # => array([[ 2.5],
# [ 5. ]], dtype=float32)
The MSE this converges to is due to the outputs being exactly half of what they should be: (2.5^2 + 5^2) / 2 = 15.625.
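Editorial aside (a sketch of the arithmetic, not part of the thread): with Dropout(0.5) placed after the output Dense layer, the training-time output for each sample is either 0 (dropped) or 2 * w·x (kept and rescaled by 1/0.5), each with probability 0.5. The expected squared error is minimized at w·x = y/2, so at test time, where the layer simply passes its input through, the predictions come out at exactly half the targets:

# Sketch: why Dropout(0.5) after the output layer halves the predictions above.
import numpy as np

y = np.array([5.0, 10.0])

# During training the pre-dropout output `o` is either scaled to 2*o (kept)
# or set to 0 (dropped), so the expected loss 0.5*(2*o - y)**2 + 0.5*(0 - y)**2
# is minimized at o = y/2.
o = y / 2                        # what SGD converges to: [2.5, 5.0]
test_mse = np.mean((o - y) ** 2)
print(o, test_mse)               # [2.5 5. ] 15.625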
unrealwill commented on 24 Feb 2017 •
@radekosmulski The bias introduced by the dropout can be removed by adding another Dense layer after the first one (=> average result = 7.5).
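A minimal sketch of what this suggestion might look like (my reading of the comment, not code posted in the thread): the Dropout sits between two Dense layers, so the final layer can compensate for the dropped units and the output itself is never zeroed.

# Hypothetical sketch of the suggested fix (Keras 1.x style API, matching the
# snippet above): put the Dropout between two Dense layers instead of after
# the output layer.
model = keras.models.Sequential()
model.add(keras.layers.Dense(input_dim=2, output_dim=8))
model.add(keras.layers.Dropout(0.5))
model.add(keras.layers.Dense(output_dim=1))
model.compile(keras.optimizers.SGD(), loss='MSE')
model.fit(X, y, nb_epoch=10000, verbose=0)
model.predict(X)  # dropout is inactive at test time, so no systematic halving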
radekosmulski commented on 24 Feb 2017
@unrealwill thank you very much for taking the time to reply, I really appreciate it. I understand now.
stale bot added the stale label on 26 May 2017
stale bot commented on 26 May 2017
This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.
stale bot closed this on 25 Jun 2017
spearsem commented on 8 Dec 2017
@unrealwill There is another use case for dropout at testing or inference time: in order to get a notion of uncertainty and variability in the prediction of the network model, you might take a given input and run the forward pass many times with dropout still active, then look at the spread of the resulting predictions. In this sense, it would be very useful to have the ability to re-activate the Dropout settings from training, but specifically during testing or regular inference.
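One way to do this in the Keras of that era (a hedged sketch, not something posted in the thread) is to build a backend function that feeds learning_phase=1, so Dropout layers behave as they do during training:

# Sketch: Monte Carlo dropout at inference time by forcing the learning phase
# to 1 (training behaviour) in a backend function.
import numpy as np
from keras import backend as K

# `model` and `X` are assumed to be the Sequential model and inputs from above.
predict_with_dropout = K.function([model.input, K.learning_phase()],
                                  [model.output])

# Run the same input many times; dropout draws a different mask on each pass.
samples = np.stack([predict_with_dropout([X, 1])[0] for _ in range(100)])
mean, std = samples.mean(axis=0), samples.std(axis=0)  # prediction + spread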
What Dropout does
During training, some features are randomly dropped, which helps avoid overfitting.
Dropout layer source code
The Dropout layer lives in core.py under keras/layers:
class Dropout(Layer):
    '''Applies Dropout to the input. Dropout consists in randomly setting
    a fraction `p` of input units to 0 at each update during training time,
    which helps prevent overfitting.

    # Arguments
        p: float between 0 and 1. Fraction of the input units to drop.

    # References
        - [Dropout: A Simple Way to Prevent Neural Networks from Overfitting](http://www.cs.toronto.edu/~rsalakhu/papers/srivastava14a.pdf)
    '''
    def __init__(self, p, **kwargs):
        self.p = p
        if 0. < self.p < 1.:
            self.uses_learning_phase = True
        self.supports_masking = True
        super(Dropout, self).__init__(**kwargs)

    def call(self, x, mask=None):
        if 0. < self.p < 1.:
            x = K.in_train_phase(K.dropout(x, level=self.p), x)
        return x

    def get_config(self):
        config = {'p': self.p}
        base_config = super(Dropout, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))
Analysis
Implementation
The call method delegates the actual work to K.dropout.
From the imports earlier we know that K is the backend. Keras has two backends: Theano and TensorFlow.
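As a small aside (assuming the standard keras.backend module), you can check which backend, and therefore which dropout implementation, is in use:

# Minimal check of which backend implementation of dropout will be used.
from keras import backend as K
print(K.backend())  # 'theano' or 'tensorflow'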
Theano's dropout
In theano_backend.py under backend:
def dropout(x, level, seed=None):
    if level < 0. or level >= 1:
        raise Exception('Dropout level must be in interval [0, 1[.')
    if seed is None:
        seed = np.random.randint(10e6)
    rng = RandomStreams(seed=seed)
    retain_prob = 1. - level
    x *= rng.binomial(x.shape, p=retain_prob, dtype=x.dtype)
    x /= retain_prob
    return x
If no seed is given, the function draws a random seed with np.random.randint(10e6); the user can also pass one explicitly via the seed argument.
A binomial random stream then generates a mask of values that are either 0 or 1, following a binomial distribution. Multiplying x by this mask sets some entries of x to 0; x is then divided by (1 - level) so that the expected value of the activations is preserved.
This is how Dropout is implemented.
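A plain-NumPy illustration of the same idea (just a sketch mirroring the Theano code above, not part of the original post):

# NumPy sketch of inverted dropout: zero out entries with probability `level`,
# then rescale the survivors so the expected value is unchanged.
import numpy as np

def numpy_dropout(x, level, seed=None):
    rng = np.random.RandomState(seed)
    retain_prob = 1. - level
    mask = rng.binomial(1, p=retain_prob, size=x.shape).astype(x.dtype)
    return x * mask / retain_prob

x = np.ones((4, 4), dtype='float32')
dropped = numpy_dropout(x, level=0.5, seed=0)
print(dropped)         # roughly half the entries are 0, the rest are 2.0
print(dropped.mean())  # close to 1.0, matching the mean of the input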
in_train_phase
The purpose of this function is that dropout is applied when training but not when testing. Internally it uses a switch:
def switch(condition, then_expression, else_expression):
    '''condition: scalar tensor.
    '''
    return T.switch(condition, then_expression, else_expression)

def in_train_phase(x, alt):
    x = T.switch(_LEARNING_PHASE, x, alt)
    x._uses_learning_phase = True
    return x
Reposted from: https://blog.youkuaiyun.com/taiji1985/article/details/51251628
wenouyang commented on 11 Feb 2017
In this link, devinplatt gives the following way to include dropout in training:
In this post, the author mentioned that "Finally, if the training has finished, you'd use the complete network for testing (or in other words, you set the dropout probability to 0)."
In terms of the Keras implementation, does that mean we have to modify the line
model.add(Dropout(0.5, input_shape=(20,)))
after we load the trained weights?
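For what it's worth (an editorial sketch, not a reply from the thread): as the Dropout source above shows, the dropout op is wrapped in in_train_phase, so the same model definition can be used unchanged for inference; the layer is already a pass-through at test time.

# Sketch: the Dropout line does not need to be edited after loading weights,
# because in_train_phase makes it a no-op outside training.
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Dropout

model = Sequential()
model.add(Dropout(0.5, input_shape=(20,)))
model.add(Dense(1))
# model.load_weights('weights.h5')  # hypothetical: restore trained weights here
preds = model.predict(np.random.rand(3, 20))  # Dropout is inactive at predict time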