Jo-SRC: A Contrastive Approach for Combating Noisy Labels

Abstract

Due to the memorization effect in Deep Neural Networks (DNNs), training with noisy labels usually results in inferior model performance.

Existing state-of-the-art methods primarily adopt a sample selection strategy, which selects small-loss samples for subsequent training.

However, prior literature tends to perform sample selection within each mini-batch, neglecting the imbalance of noise ratios in different mini-batches.

Moreover, the valuable knowledge contained in high-loss samples is wasted.

To this end, we propose a noise-robust approach named Jo-SRC (Joint Sample Selection and Model Regularization based on Consistency).

Specifically, we train the network in a contrastive learning manner. Predictions from two different views of each sample are used to estimate its "likelihood" of being clean or out-of-distribution.
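The two-view scoring idea above can be illustrated with a minimal sketch. This is not the paper's exact formulation: it assumes the Jensen-Shannon divergence is used to measure both agreement with the given label (cleanness) and agreement between the two augmented views (where high disagreement hints at an out-of-distribution sample); the function names and values are hypothetical.

```python
import numpy as np

def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence between two discrete distributions."""
    p, q = np.asarray(p, dtype=float) + eps, np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    m = 0.5 * (p + q)
    kl = lambda a, b: np.sum(a * np.log(a / b))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def score_sample(pred_v1, pred_v2, label_onehot):
    """Score one sample from the softmax predictions of its two views.

    Returns (cleanness, disagreement): a low cleanness divergence suggests
    the label is clean; a high view disagreement suggests the sample may be
    out-of-distribution.
    """
    mean_pred = 0.5 * (np.asarray(pred_v1) + np.asarray(pred_v2))
    cleanness = js_divergence(mean_pred, label_onehot)  # label agreement
    disagreement = js_divergence(pred_v1, pred_v2)      # view consistency
    return cleanness, disagreement
```

A confident prediction that is consistent across views and matches the label yields near-zero scores, while a sample whose views disagree (or contradict the label) scores high and can be down-weighted or relabeled.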

Furthermore, we propose a joint loss that improves the model's generalization performance by introducing consistency regularization.
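A joint loss of this kind can be sketched as a classification term plus a consistency term penalizing disagreement between the two views. This is a hedged illustration, not Jo-SRC's exact loss: the cross-entropy/KL combination and the weight `alpha` are assumptions for the sketch.

```python
import numpy as np

def joint_loss(pred_v1, pred_v2, label_onehot, alpha=0.5, eps=1e-12):
    """Cross-entropy on the mean two-view prediction, plus a KL-based
    consistency term that penalizes disagreement between the views."""
    p1 = np.asarray(pred_v1, dtype=float) + eps
    p2 = np.asarray(pred_v2, dtype=float) + eps
    y = np.asarray(label_onehot, dtype=float)
    mean_pred = 0.5 * (p1 + p2)
    ce = -np.sum(y * np.log(mean_pred))          # classification term
    consistency = np.sum(p1 * np.log(p1 / p2))   # KL(view1 || view2)
    return ce + alpha * consistency
```

When the two views agree, the consistency term vanishes and the loss reduces to the classification term; diverging views are penalized, encouraging prediction stability under augmentation.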

Extensive experiments have validated the superiority of our approach over existing state-of-the-art methods.

1. Introduction

DNNs have recently led to tremendous progress in various computer vision tasks [14, 28, 42, 25, 40, 21]. These successes are largely attributable to large-scale datasets with reliable annotations (e.g., ImageNet [4]).

However, collecting well-annotated datasets is extremely labor-intensive and time-consuming, especially in domains where expert knowledge is required (e.g., fine-grained categorization [37, 36]).

The high cost of acquiring large-scale well-labeled data poses a bottleneck to employing DNNs in real-world scenarios.

As an alternative, emplo
