by Yangqing on 14 May 2014
For a sanity check, try running with a learning rate of 0 to see if any NaN errors pop up (they shouldn’t, since no learning takes place). If the data is not initialized well, even 0.0001 might be too high a learning rate.
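A minimal sketch of that sanity check, in plain Python rather than any particular framework. The one-feature linear model, the `train` and `has_nan` helpers, and the toy data are all made up for illustration; the point is only that with a learning rate of 0 the weights never move, so any NaN in the loss has to come from the data, the initial values, or the forward pass.

```python
import math


def forward_loss(w, b, xs, ys):
    """Mean squared error of a one-feature linear model (hypothetical toy model)."""
    total = 0.0
    for x, y in zip(xs, ys):
        pred = w * x + b
        total += (pred - y) ** 2
    return total / len(xs)


def train(xs, ys, lr, steps=5, w=0.5, b=0.0):
    """Plain gradient descent; returns the loss recorded before each update."""
    losses = []
    n = len(xs)
    for _ in range(steps):
        losses.append(forward_loss(w, b, xs, ys))
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return losses


def has_nan(values):
    """True if any recorded loss is NaN."""
    return any(math.isnan(v) for v in values)


clean_x, clean_y = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
bad_x = [1.0, float("nan"), 3.0]  # simulates one corrupt input value

clean_losses = train(clean_x, clean_y, lr=0.0)
bad_losses = train(bad_x, clean_y, lr=0.0)

print("clean data, lr=0 -> NaN?", has_nan(clean_losses))
print("bad data,   lr=0 -> NaN?", has_nan(bad_losses))
```

If the clean run stays finite and constant at a learning rate of 0 but blows up at a small positive rate, the update step is the likely culprit and the rate (or the initialization) needs to come down.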
by sguada on 13 May 2014
Try different initializations, for instance setting the bias to 0.1.
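A rough sketch of what changing the bias initialization can look like, again in plain Python. The reply does not say which layer type is involved, so the ReLU layer here is an assumption, and `init_layer` and `relu_forward` are hypothetical helpers, not any library's API. One commonly cited reason for a small positive bias is that it keeps ReLU units active at the start of training.

```python
import random


def init_layer(n_in, n_out, bias_value, weight_std=0.01, seed=0):
    """Small Gaussian weights plus a constant bias (illustrative helper)."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, weight_std) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [bias_value] * n_out
    return weights, biases


def relu_forward(weights, biases, x):
    """Output of a fully connected layer followed by ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


x = [0.0, 0.0, 0.0, 0.0]  # an all-zero input makes the bias effect easy to see

w0, b0 = init_layer(n_in=4, n_out=3, bias_value=0.0)
w1, b1 = init_layer(n_in=4, n_out=3, bias_value=0.1)

out_zero_bias = relu_forward(w0, b0, x)
out_small_bias = relu_forward(w1, b1, x)

print("bias 0.0:", out_zero_bias)
print("bias 0.1:", out_small_bias)
```

With the zero bias every unit outputs 0 for this input, while the 0.1 bias leaves every unit on the positive side of the ReLU, where gradients can flow.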
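Summary: when poor initialization keeps the model from converging to a reasonable solution, the replies suggest trying different initialization strategies, such as setting the bias to 0.1, and running with the learning rate set to 0 to check for NaN errors, which helps determine whether the learning rate needs to be adjusted.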



