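The English here was typed entirely by hand! It summarizes and paraphrases the original paper. Spelling and grammar mistakes are hard to avoid, so if you spot any, corrections in the comments are welcome! This post leans toward personal notes, so read with caution.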
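1. Takeaways
(1) Taking a look at few-shot learning
(2) The figures have a very math/signal-processing look; quite pretty, and a maximalist's delight
(3) I won't comment on novelty, since this is not directly my sub-field
(4) At first I thought it would be hard to get into? At a glance it is the formula-heavy, complicated-main-figure type. But it is actually written very clearly. I think anyone with a bit of background in matrices or transport algorithms can get the gist: even without knowing how each algorithm is implemented, you can still tell exactly which algorithm the authors used for what, and the reasoning looks very nice too. The figures are also very clear
(5) This is an extension of the authors' earlier MICCAI submission. Reading it, I feel the writing is understandable even without having read the earlier work
2. Section-by-Section Close Reading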
2.1. Abstract
①Limitations: a) data for rare diseases are scarce; b) few-shot learning (FSL) only improves performance on rare diseases and cannot perform well on both rare and common diseases
②So they proposed the Disentangle then Calibrate with Gradient Guidance (DCGG) framework
2.2. Introduction
①Existing FSL methods: self-supervised learning (SSL), meta-learning, or metric-learning techniques
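②⭐The authors argue that the three existing approaches make the model more sensitive to rare diseases: SSL, for example, trains on common diseases but is then fine-tuned on rare diseases; meta-learning likewise learns from few-shot rare diseases; metric learning classifies by comparing metrics (I haven't studied this one; it feels a lot like prototype learning, i.e. learning typical examples?). This could mean that in a dataset with, say, 1000 healthy subjects, 1000 common-disease patients, and 50 rare-disease patients, the model classifies rare diseases accurately but does worse on common diseases? That is actually a very interesting research question. Could you augment the rare disease by copy-pasting it up to 1000 samples, hahaha, though that would be a bit too fake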
2.3. Related Work
2.3.1. FSL Techniques
①Model-driven methods, such as meta-learning and metric learning, rely on the training strategy rather than on the data itself
②Data-driven methods, such as data augmentation, transfer features from majority classes to minority classes
2.3.2. Rare Disease Diagnosis
①Related methods:
2.3.3. Relationship With Multi-Task Learning
①Unlike multi-task learning (MTL), the authors focus on knowledge transfer within a single task
2.4. Method
2.4.1. Overview
①The overall framework of DCGG:
②The dataset consists of image/label pairs, each made up of a medical image and a one-hot label
③The dataset is divided into a common-disease subset and a rare-disease subset
2.4.2. GND Module
①Train on a mini-batch of each common disease and obtain the gradient of the cross-entropy loss for each one.
②Compute the average gradient over all the diseases:
③Map each disease's gradient onto the average gradient:
where each entry corresponds to one channel of the mapped gradient. Channels with higher values indicate consistency across diseases, so the authors define the highest-valued channels as the disease-shared channels and the remaining ones as the disease-specific channels
(Such a novel perspective; or is that just because I rarely read papers in this area?)
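To make steps ① to ③ concrete, here is a minimal NumPy sketch of splitting channels into shared and specific sets. It is only an illustration under stated assumptions: the per-channel score (each disease's gradient multiplied by the average gradient, summed over diseases), the function name `split_channels`, and the parameter `num_shared` are my own choices, not the paper's exact formula or notation.

```python
import numpy as np

def split_channels(per_disease_grads, num_shared):
    """Split channels into shared / specific index sets.

    per_disease_grads: (num_diseases, num_channels), one gradient row per common disease.
    num_shared: how many top-scoring channels to treat as disease-shared (hypothetical name).
    """
    grads = np.asarray(per_disease_grads, dtype=float)
    mean_grad = grads.mean(axis=0)  # average gradient over all diseases
    # ASSUMPTION: channel-wise agreement score = sum over diseases of g_i[c] * mean[c].
    # Channels where diseases pull in the same direction score high.
    agreement = (grads * mean_grad).sum(axis=0)
    order = np.argsort(-agreement, kind="stable")  # highest score first
    shared = np.sort(order[:num_shared])
    specific = np.sort(order[num_shared:])
    return shared, specific

# Toy example: channel 0 is consistent across diseases, channels 1 and 2 are not.
grads = np.array([[1.0, 1.0, 0.1],
                  [1.0, -1.0, 0.1],
                  [1.0, 1.0, -0.1]])
shared, specific = split_channels(grads, num_shared=1)
```

In this toy case the gradients on channel 0 all agree, so it ends up as the single shared channel, while the channels with conflicting signs are left as specific.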
④The gradients for common diseases and for rare diseases can each be further decomposed:
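⑤The authors want to further optimize these two gradients to influence the decision: ⭐on the shared channels, they want to improve only rare-disease detection, i.e. lower the rare-disease loss, without affecting common diseases. So the shared channels have to be optimized for the rare diseases, and the gradient being sought should point in the same direction as the common-disease gradient on those channels: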
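⑥For the specific features of each common disease: ⭐on the disease-specific channels, optimizing for rare diseases must not be influenced by the common diseases, so the rare-disease gradient needs to be orthogonal to the common-disease-specific channels:
Both objectives are solved with Gram-Schmidt orthogonalization and the Karush-Kuhn-Tucker (KKT) conditions (I never studied these, but after looking them up they seem to be an extension of the linear algebra I learned before)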
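The two constraints in ⑤ and ⑥ can be sketched in NumPy as follows. This is an illustration under my own assumptions, not the authors' exact objectives: "same direction" is read as a non-negative inner product with the common-disease gradient, for which the KKT conditions give a simple closed form, and "orthogonal" is enforced by Gram-Schmidt against the common-disease-specific gradients. The function names are hypothetical.

```python
import numpy as np

def align_with(g_rare, g_common):
    """Closest vector to g_rare whose inner product with g_common is >= 0.

    ASSUMPTION: this is the single-constraint projection obtained from the KKT
    conditions; the paper's shared-channel objective may be formulated differently.
    """
    g_rare = np.asarray(g_rare, dtype=float)
    g_common = np.asarray(g_common, dtype=float)
    dot = float(g_rare @ g_common)
    norm_sq = float(g_common @ g_common)
    if dot >= 0.0 or norm_sq == 0.0:
        return g_rare.copy()  # constraint already satisfied
    return g_rare - (dot / norm_sq) * g_common  # remove the conflicting component

def orthogonalize_against(g_rare, common_grads, eps=1e-12):
    """Make g_rare orthogonal to every vector in common_grads via Gram-Schmidt."""
    basis = []
    for g in common_grads:
        v = np.asarray(g, dtype=float).copy()
        for b in basis:
            v -= (v @ b) * b  # strip components along earlier basis vectors
        norm = np.linalg.norm(v)
        if norm > eps:  # skip linearly dependent gradients
            basis.append(v / norm)
    out = np.asarray(g_rare, dtype=float).copy()
    for b in basis:
        out -= (out @ b) * b
    return out

# Shared channels: the rare gradient conflicts with the common one, so it is projected.
g_common = np.array([0.0, 1.0])
aligned = align_with(np.array([1.0, -2.0]), g_common)

# Specific channels: remove everything lying in the span of the common-disease gradients.
common_specific = [np.array([1.0, 0.0, 0.0]), np.array([1.0, 1.0, 0.0])]
ortho = orthogonalize_against(np.array([1.0, 1.0, 1.0]), common_specific)
```

The projected shared-channel gradient no longer increases the common-disease loss to first order, and the specific-channel gradient has no component along any common-disease-specific direction.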
2.4.3. GFC Module
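①They model the common-disease side and the rare-disease side as discrete uniform distributions over the common diseases and the rare diseases, respectively (so this distribution is chosen by the authors themselves? Why not a Gaussian or something similar?):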
②Given the feature of a common disease and the feature of a rare disease at a given disease-shared channel, the transfer from common diseases to rare diseases can be described as an optimal transport (OT) problem:
where the unknown to be solved for is the transport plan, and the cost matrix specifies the cost paid when linking a common disease to a rare one
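For reference, the textbook discrete OT problem between two uniform distributions can be written as below. The symbols are placeholders I chose, not necessarily the paper's notation: $T$ is the transport plan, $C$ the cost matrix, and $N_c$, $N_r$ the numbers of common and rare diseases. Any per-channel indexing used in the paper is omitted here.

```latex
\min_{T \in \mathbb{R}_{\ge 0}^{N_c \times N_r}} \; \sum_{i=1}^{N_c} \sum_{j=1}^{N_r} T_{ij} \, C_{ij}
\quad \text{s.t.} \quad
\sum_{j=1}^{N_r} T_{ij} = \frac{1}{N_c} \;\; \forall i, \qquad
\sum_{i=1}^{N_c} T_{ij} = \frac{1}{N_r} \;\; \forall j
```

Each row constraint says that a common disease ships out exactly its uniform mass, and each column constraint says that a rare disease receives exactly its uniform mass.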
③They measure the cost by the Euclidean distance between gradients:
④Assuming that the common-disease feature in each channel follows a Gaussian distribution, they use the Sinkhorn algorithm to solve the OT problem and update the mean and standard deviation by:
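Steps ③ and ④ can be sketched with a plain NumPy Sinkhorn iteration. This is a minimal illustration, not the paper's implementation: the regularization strength `reg`, the iteration count, and especially the calibration rule in `calibrate` (a plan-weighted average of the common-disease statistics) are my own assumptions, since the original update formulas are not reproduced in these notes.

```python
import numpy as np

def euclidean_cost(common_grads, rare_grads):
    """Cost matrix: Euclidean distance between every common / rare gradient pair."""
    diff = common_grads[:, None, :] - rare_grads[None, :, :]
    return np.linalg.norm(diff, axis=-1)  # shape (n_common, n_rare)

def sinkhorn(cost, reg=0.5, num_iters=500):
    """Entropy-regularized OT between two uniform distributions."""
    n_c, n_r = cost.shape
    p = np.full(n_c, 1.0 / n_c)  # uniform mass over common diseases
    q = np.full(n_r, 1.0 / n_r)  # uniform mass over rare diseases
    kernel = np.exp(-cost / reg)
    u = np.ones(n_c)
    for _ in range(num_iters):  # alternate scaling of columns and rows
        v = q / (kernel.T @ u)
        u = p / (kernel @ v)
    return u[:, None] * kernel * v[None, :]  # transport plan

def calibrate(plan, common_mean, common_std):
    """ASSUMPTION: rare-disease statistics = plan-weighted average of common ones."""
    weights = plan / plan.sum(axis=0, keepdims=True)  # each rare column sums to 1
    return weights.T @ common_mean, weights.T @ common_std

# Toy example with 3 common diseases and 2 rare diseases.
common_grads = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
rare_grads = np.array([[0.1, 0.0], [0.9, 0.1]])
plan = sinkhorn(euclidean_cost(common_grads, rare_grads))
rare_mean, rare_std = calibrate(plan, np.array([0.0, 1.0, 2.0]), np.ones(3))
```

A common disease whose gradient is close to a rare disease's gradient gets a cheap link, so it receives more transport mass and contributes more to that rare disease's calibrated statistics.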
⑤The specific feature will be:
2.4.4. Summary
①The algorithm:
2.5. Experiments
2.5.1. Dataset
①Statistics of datasets:
2.5.2. Implementation
①Backbone: WideResNet, with details:
②Optimizer: Adam with 0.001 learning rate
③Input image size:
④Batch size:
⑤Cross-validation: 5-fold for common-disease training, with 20% of the training samples randomly selected for validation. Rare-disease samples are randomly selected for training and another 20% for validation; the remaining rare-disease samples form the test set. For each fold, the sampling is repeated 4 times without repetition
⑥Epoch: 300
2.5.3. Comparing to Existing Methods
①Comparison table on 3 datasets:
2.5.4. Ablation Study
①Module ablation:
where I is training on common diseases and then fine-tuning on rare diseases, II uses only GND, and III uses only GFC
2.6. Discussion
2.6.1. Hyper-Parameters
①Ablation on the shared-channel hyper-parameter:
2.6.2. Different Strategies for Feature Calibration
①Other transfer methods (surely this wasn't plotted in Excel? My plots look just like this, I can't keep a straight face):
2.6.3. Visualization of Feature Embedding
①t-SNE visualization:
2.6.4. Rare Disease Diagnosis With Different K Values
①Ablation on the number of rare-disease training samples:
2.6.5. Rare Disease Simulation and Data Limitation
①Hmm, they really did train with only a handful of samples, and the model can be tested on many rare diseases. That seems quite good
2.6.6. Future Work
①Enhance the generalization
②Zero-shot scenario
2.7. Conclusion
~