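Many CVPR papers optimize their networks with SGD or SGD+Momentum rather than Adam, which is supposedly the most powerful optimizer in theory. Reposted from the blog: https://blog.youkuaiyun.com/u014381600/article/details/72867109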