Bag of Tricks for Image Classification with Convolutional Neural Networks

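This post covers the optimization tricks for convolutional neural network (CNN) image classification presented at CVPR 2019, including training strategies, initialization methods, batch-size tuning, low-precision computation, and model tweaks. The tricks apply not only to image classification but also transfer to detection and segmentation tasks. The experiments are based on ResNet-50 and improve model performance by adjusting random sampling, data augmentation, weight initialization, learning-rate scheduling, and similar choices. The post also describes low-precision FP16 training, which speeds up training while preserving accuracy, and network architecture tweaks such as ResNet-D that raise accuracy further.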


Paper link: Bag of Tricks for Image Classification with Convolutional Neural Networks


A collection of tricks for CNN image classification.
They also transfer well to other tasks such as detection and semantic segmentation.
CVPR 2019

  • Choosing a baseline
  • Tricks for efficient training on new hardware
  • Model tweaks to ResNet-50
  • Other training tricks
  • Transfer to other applications

baseline training procedure

The baseline is the optimized ResNet described at http://torch.ch/blog/2016/02/04/resnets.html.

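  • Training

    1. Randomly sample an image and decode it into 32-bit floating-point pixel values in [0, 255].
    2. Randomly crop a rectangular region whose aspect ratio is sampled from [3/4, 4/3] and whose area is sampled from [8%, 100%] of the image, then resize the crop to 224×224.
    3. Flip horizontally with probability 0.5.
    4. Scale hue, saturation, and brightness with coefficients drawn uniformly from [0.6, 1.4].
    5. Add PCA noise with a coefficient sampled from the normal distribution $\mathcal{N}(0, 0.1)$.
    6. Normalize the RGB channels by subtracting $[123.68, 116.779, 103.939]$ and dividing by $[58.393, 57.12, 57.375]$, respectively.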
  • Validation

    1. Resize the shorter edge to 256 pixels while keeping the aspect ratio.
    2. Crop the central 224×224 region.
    3. Normalize the RGB channels as in training.
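  • Initialization

    1. Weights of convolutional and fully connected layers: Xavier initialization, with values drawn uniformly from $[-a, a]$, where $a=\sqrt{6/(d_{in}+d_{out})}$ and $d_{in}$, $d_{out}$ are the input and output channel sizes.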
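
The numeric steps above (crop sampling, channel normalization, and the Xavier bound) can be sketched in plain Python. This is only an illustrative sketch, not the paper's implementation: the function names are hypothetical, the crop area and aspect ratio are assumed to be drawn uniformly, and the mean/std values are the ImageNet statistics on the 0-255 scale.

```python
import math
import random

# ImageNet per-channel mean and std on the 0-255 scale, in RGB order.
RGB_MEAN = (123.68, 116.779, 103.939)
RGB_STD = (58.393, 57.12, 57.375)


def normalize_pixel(rgb):
    """Subtract the channel mean, then divide by the channel std."""
    return tuple((v - m) / s for v, m, s in zip(rgb, RGB_MEAN, RGB_STD))


def sample_crop_size(width, height, rng, attempts=10):
    """Sample a crop (w, h): area in [8%, 100%], aspect ratio in [3/4, 4/3].

    Assumption: both quantities are drawn uniformly, and the full image is
    used if no sampled crop fits within `attempts` tries.
    """
    for _ in range(attempts):
        area = rng.uniform(0.08, 1.0) * width * height
        ratio = rng.uniform(3 / 4, 4 / 3)  # width / height
        w = round(math.sqrt(area * ratio))
        h = round(math.sqrt(area / ratio))
        if 0 < w <= width and 0 < h <= height:
            return w, h
    return width, height


def xavier_bound(d_in, d_out):
    """Return a = sqrt(6 / (d_in + d_out))."""
    return math.sqrt(6.0 / (d_in + d_out))


def xavier_uniform(d_in, d_out, rng):
    """Return a d_out x d_in weight matrix drawn uniformly from [-a, a]."""
    a = xavier_bound(d_in, d_out)
    return [[rng.uniform(-a, a) for _ in range(d_in)] for _ in range(d_out)]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    print(sample_crop_size(500, 375, rng))  # crop is later resized to 224x224
    print(normalize_pixel((255.0, 128.0, 0.0)))
    print(xavier_bound(2048, 1000))  # e.g. a 2048 -> 1000 fully connected layer
```

In practice a framework would apply these steps to whole tensors at once; the per-pixel and nested-list forms here are chosen only to keep the sketch dependency-free.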