Paper Reading: "Boosting Few-Shot Learning With Adaptive Margin Loss"


License: CC 4.0 BY-SA

Tags: #boosting #machine-learning #natural-language-processing

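This post looks at an improved approach to few-shot learning that combines word embeddings with adaptive margin losses (CRAML and TRAML) so that similar classes are separated more effectively. The authors propose a new adaptive margin loss that applies to both the standard FSL and the generalized FSL settings and makes the extracted features more discriminative.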


Background & Motivation

Metric-learning methods differ mainly in two things: how they extract features, and how they measure distance in the embedding space.

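Cross-entropy loss is commonly used to supervise a model so that it extracts highly discriminative visual features, and a variety of margin losses had already been proposed before this paper. The simplest is the Naive Additive Margin Loss: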
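The formula itself is not preserved in these notes. A reconstruction consistent with the description (a fixed constant m added for every non-target class) is given below. The notation is an assumption made here: F is the feature extractor, w_j is the representation of class j, s(·,·) is a similarity function, and n is the number of training samples.

```latex
L^{na} = -\frac{1}{n} \sum_{i=1}^{n}
\log \frac{e^{s(F(x_i),\, w_{y_i})}}
          {e^{s(F(x_i),\, w_{y_i})} + \sum_{j \neq y_i} e^{s(F(x_i),\, w_j) + m}}
```

Setting m = 0 recovers the ordinary softmax cross-entropy loss.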

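This loss assumes that all classes should be pushed equally far away from each other, so it adds a fixed constant m. It does not separate similar classes well, however, especially in the few-shot setting.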
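To make the effect of the fixed margin concrete, here is a minimal sketch for a single sample. It assumes the margin is added to the similarity score of every non-target class before the softmax; the function name, the default m, and the toy scores are illustrative and not taken from the paper's code.

```python
import math

def naive_additive_margin_loss(sims, label, m=0.5):
    """Loss for one sample.

    sims  : similarity scores between the sample and each class
    label : index of the ground-truth class
    m     : fixed margin added to every non-target class (assumed form)
    """
    # Add the same constant m to all classes except the true one.
    logits = [s if j == label else s + m for j, s in enumerate(sims)]
    # Numerically stable log-sum-exp over the adjusted logits.
    top = max(logits)
    log_z = top + math.log(sum(math.exp(z - top) for z in logits))
    # Negative log-probability of the true class.
    return log_z - logits[label]

sims = [2.0, 1.0, 0.0]
plain = naive_additive_margin_loss(sims, label=0, m=0.0)   # ordinary cross-entropy
margin = naive_additive_margin_loss(sims, label=0, m=0.5)  # harder objective
```

Because every competing class receives the same bonus, the loss cannot push a confusable class harder than an unrelated one, which is the limitation the notes point out.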

Other variants include the angular margin and the cosine margin.

By observing that the weights from the last fully connected layer of a classification DCNN trained on the softmax loss bear conceptual similarities with the centers of each class.

I did not fully understand this part.

However, these earlier margin losses are not well suited to few-shot learning, where data are very scarce. This is the motivation of the paper.

Methodology

This is the first time I have come across the terms standard FSL and generalized FSL.

Standard FSL
where the test data contain novel class samples only.

Generalized FSL (closer to real-world use)
where the label space of test data covers both base and novel classes.

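The paper proposes an adaptive margin loss that aims to separate samples of different classes better in the embedding space, in particular by pushing similar classes as far apart as possible, which makes it a better fit for few-shot learning. The paper illustrates the idea with a schematic figure.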

It uses an approach I had not seen before: the semantic similarity between classes (obtained from word embeddings) is incorporated into the naive additive margin loss.

Training strategy

The model is trained with the adaptive margin loss; at test time classification is done with a plain softmax.
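The split between training and testing can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: margins are taken to be per-class values added to the non-target similarity scores during training, and the function names are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def train_loss(sims, label, margins):
    """Training: add margins[j] to every non-target class j, then cross-entropy."""
    logits = [s if j == label else s + margins[j] for j, s in enumerate(sims)]
    return -math.log(softmax(logits)[label])

def predict(sims):
    """Testing: no margin, just pick the class with the highest softmax probability."""
    probs = softmax(sims)
    return max(range(len(probs)), key=probs.__getitem__)

sims = [2.0, 1.0, 0.0]
loss_with_margin = train_loss(sims, label=0, margins=[0.0, 0.5, 0.2])
loss_without_margin = train_loss(sims, label=0, margins=[0.0, 0.0, 0.0])
predicted = predict([0.1, 0.9, 0.3])
```

The margin only shapes the embedding during training, so inference needs no class-name or margin information.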

Class-Relevant Additive Margin Loss (CRAML)

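To suit few-shot learning better, the margin between classes should be set adaptively, on top of the naive additive margin loss, so that different class pairs are kept at different distances. CRAML is built on this idea: it brings in the semantic features of the classes (word embeddings) to adjust the margin between them. A class-relevant margin generator M is constructed, which takes the names of a pair of classes as input and returns their adaptive margin: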
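The formula for M is not preserved in these notes. One simple form consistent with the description is an affine function of the similarity between the two class-name embeddings, m(i, j) = alpha * sim(e_i, e_j) + beta. The sketch below assumes cosine similarity for sim and fixed values for alpha and beta (in practice they would be learned); the three-dimensional embeddings are toy values, not real word vectors.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def class_relevant_margin(emb_i, emb_j, alpha=1.0, beta=0.0):
    """Assumed margin generator: larger margin for semantically closer classes."""
    return alpha * cosine_similarity(emb_i, emb_j) + beta

# Toy word embeddings keyed by class name (illustrative values only).
word_embeddings = {
    "wolf": [0.9, 0.1, 0.0],
    "dog":  [0.8, 0.2, 0.1],
    "sofa": [0.0, 0.1, 0.9],
}

m_wolf_dog = class_relevant_margin(word_embeddings["wolf"], word_embeddings["dog"])
m_wolf_sofa = class_relevant_margin(word_embeddings["wolf"], word_embeddings["sofa"])
```

With alpha > 0, the similar pair (wolf, dog) gets a larger margin than the dissimilar pair (wolf, sofa), so the loss pushes confusable classes apart harder.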