【general】[dropout] Paper Notes: Dropout Reduces Underfitting

Dropout Reduces Underfitting

Affiliations: Meta AI, UC Berkeley, MBZUAI

Paper: https://arxiv.org/abs/2303.01500

Code (just open-sourced):

https://github.com/facebookresearch/dropout

Submitted: March 2, 2023

Read on: 2023-03-08

Main contributions:

The paper proposes two modifications of standard dropout, early dropout and late dropout, which improve the performance of underfitting and overfitting models respectively.

Specifically:

  • early dropout: apply dropout only during an initial phase of training and disable it for the rest; this helps underfitting models fit the training data better.
  • late dropout: train without dropout at first and enable it only later in training; this reduces overfitting in models that already use standard dropout. Both variants are implemented by the scheduler below.

Paper code

Implementation of early dropout and late dropout

# https://github.com/facebookresearch/dropout/blob/main/drop_scheduler.py
import numpy as np

def drop_scheduler(drop_rate, epochs, niter_per_ep, cutoff_epoch=0, mode="standard", schedule="constant"):
    """
    drop_rate: target dropout rate
    epochs: total number of training epochs
    niter_per_ep: number of training steps (batches) per epoch
    cutoff_epoch: for mode "early", the epoch at which dropout stops;
                  for mode "late", the epoch at which dropout starts
    mode: one of ["standard", "early", "late"]
    schedule: whether the rate is fixed or linearly annealed, ["constant", "linear"]

    returns: np.array of length epochs * niter_per_ep,
             the dropout rate to use at each training step
    """
    assert mode in ["standard", "early", "late"]
    if mode == "standard":
        return np.full(epochs * niter_per_ep, drop_rate)

    early_iters = cutoff_epoch * niter_per_ep
    late_iters = (epochs - cutoff_epoch) * niter_per_ep

    if mode == "early":
        assert schedule in ["constant", "linear"]
        if schedule == 'constant':
            early_schedule = np.full(early_iters, drop_rate)
        elif schedule == 'linear':
            early_schedule = np.linspace(drop_rate, 0, early_iters)
        final_schedule = np.concatenate((early_schedule, np.full(late_iters, 0)))

    elif mode == "late":
        assert schedule in ["constant"]
        early_schedule = np.full(early_iters, 0)
        final_schedule = np.concatenate((early_schedule, np.full(late_iters, drop_rate)))

    assert len(final_schedule) == epochs * niter_per_ep
    return final_schedule
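
As a quick sanity check of the scheduler, an early-dropout schedule can be built and inspected like this (a minimal sketch; the hyperparameter values are illustrative, not the paper's settings):

import numpy as np

# Illustrative hyperparameters: 100 epochs, 500 steps per epoch,
# dropout annealed linearly from 0.1 to 0 over the first 20 epochs.
schedule = drop_scheduler(
    drop_rate=0.1, epochs=100, niter_per_ep=500,
    cutoff_epoch=20, mode="early", schedule="linear",
)
assert schedule.shape == (100 * 500,)
assert schedule[0] == 0.1                  # full dropout at the first step
assert np.all(schedule[20 * 500:] == 0.0)  # dropout off after the cutoff epoch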

Using early dropout and late dropout

# Add an update_dropout method to the model class, e.g.:
# https://github.com/facebookresearch/dropout/blob/main/models/vision_transformer.py
# ...
def update_dropout(self, drop_rate):
    self.drop_rate = drop_rate
    for module in self.modules():
        if isinstance(module, nn.Dropout):
            module.p = drop_rate
# Likewise for drop path (on drop path / stochastic depth, see:
# https://github.com/huggingface/pytorch-image-models/blob/4b8cfa6c0a355a9b3cb2a77298b240213fb3b921/timm/layers/drop.py#L137)
# https://github.com/facebookresearch/dropout/blob/main/models/vision_transformer.py
def update_drop_path(self, drop_path_rate):
    self.drop_path = drop_path_rate
    # per-block rates increase linearly with depth, as is standard for stochastic depth
    dp_rates = [x.item() for x in torch.linspace(0, drop_path_rate, self.depth)]
    for i in range(self.depth):
        self.blocks[i].drop_path.drop_prob = dp_rates[i]
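
To make the mutation concrete, here is a toy check (my own example, not from the repo) showing that reassigning `module.p` on an existing `nn.Dropout` takes effect immediately, since `nn.Dropout` reads `self.p` at forward time:

import torch
import torch.nn as nn

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(8, 8)
        self.drop = nn.Dropout(p=0.5)

    def forward(self, x):
        return self.drop(self.fc(x))

    # same pattern as the repo's update_dropout
    def update_dropout(self, drop_rate):
        for module in self.modules():
            if isinstance(module, nn.Dropout):
                module.p = drop_rate

net = TinyNet().train()
net.update_dropout(0.0)  # dropout is now a no-op
assert net.drop.p == 0.0
x = torch.randn(4, 8)
assert torch.equal(net(x), net(x))  # no randomness left with p = 0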

# During training, update the dropout rate at every batch
# (the model is wrapped, e.g. in DistributedDataParallel, hence .module)
# https://github.com/facebookresearch/dropout/blob/main/engine.py#L114
model.module.update_dropout(schedules['do'][it])
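
Putting the pieces together, a runnable end-to-end sketch might look like this (the toy model, data, and hyperparameters are my own placeholders, and I use a bare model instead of a DDP-wrapped one; only the `schedules['do'][it]` indexing mirrors the repo's engine.py):

import torch
import torch.nn as nn

# Assumed toy setup, not the repo's:
model = nn.Sequential(nn.Linear(8, 8), nn.Dropout(0.1))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.MSELoss()
epochs, niter_per_ep = 3, 10
schedules = {'do': drop_scheduler(0.1, epochs, niter_per_ep,
                                  cutoff_epoch=1, mode="early", schedule="constant")}

def update_dropout(model, drop_rate):
    for m in model.modules():
        if isinstance(m, nn.Dropout):
            m.p = drop_rate

it = 0  # global step counter across all epochs, indexing into the schedule
for epoch in range(epochs):
    for _ in range(niter_per_ep):
        x = torch.randn(4, 8)
        update_dropout(model, schedules['do'][it])  # set this step's rate
        loss = criterion(model(x), torch.zeros(4, 8))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        it += 1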

Results

   See the paper and the GitHub repo for details.