[PyTorch Study Notes] 4. Deep Learning Strategies


Compiled from Long Liangqu's PyTorch tutorial videos. Video links:
【计算机-AI】PyTorch学这个就够了!
(好课推荐)深度学习与PyTorch入门实战——主讲人龙良曲

25. Visualization with Visdom

  • Tensorboard: Google's visualization tool for TensorFlow
  • TensorboardX: the corresponding PyTorch wrapper
    data must first be moved to the CPU and converted to numpy before it can be visualized (see the sketch after this list)
  • Visdom: from Facebook; efficient and visually polished
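
A minimal sketch of that CPU/numpy conversion (my own example; the tensor name is illustrative):

import torch

x = torch.randn(4, 3, device='cuda' if torch.cuda.is_available() else 'cpu')
x_np = x.detach().cpu().numpy()   # detach from the graph -> move to CPU -> numpy array
print(x_np.shape)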

Installing visdom

  • Installing directly with pip is not recommended; it tends to fail with errors
  • Installing visdom from source is recommended

Installing from source

  1. Source package: fossasia/visdom
  2. After unzipping, open cmd in the visdom-master directory and run pip install -e .
  3. After installation, copy the folders under the py directory of visdom-master into your environment's site-packages directory
  4. Run python -m visdom.server in cmd to start the server, then copy the printed URL into a browser
  5. If the "Checking for scripts" step hangs for a long time, locate the installed visdom files and comment out download_scripts() in server.py
  6. If the visdom page in the browser stays blank/blue, the static script downloads were blocked by the network; manually copy and overwrite the files again

Visualization code

from visdom import Visdom

# loss, global_step, test_loss, correct, test_loader, data and pred below
# come from the surrounding training/testing loop

# lines: single trace
viz = Visdom()
viz.line([0.], [0.], win='train_loss', opts=dict(title='train_loss'))
viz.line([loss.item()], [global_step], win='train_loss', update='append')

# lines: multi-traces
viz.line([[0., 0.]], [0.], win='test', opts=dict(title='test loss&acc.', legend=['loss', 'acc.']))
viz.line([[test_loss, correct / len(test_loader.dataset)]], [global_step], win='test', update='append')

# visualize input images and predicted labels
viz.images(data.view(-1, 1, 28, 28), win='x')
viz.text(str(pred.detach().cpu().numpy()), win='pred', opts=dict(title='pred'))

Visualization result (no handwritten-digit dataset???)

26. Overfitting & Underfitting

  • underfitting: estimated model complexity < ground-truth complexity
    e.g. WGAN
    train acc. is bad
    test acc. is bad as well
  • overfitting: estimated model complexity > ground-truth complexity
    how to detect (a minimal sketch follows this list)
    how to reduce
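
In practice both cases are detected by comparing training and validation performance. A minimal sketch (my own illustration, not the video's code; model, train_loader and val_loader are assumed to exist and to feed a flat MLP as in the MNIST examples below):

import torch

def accuracy(model, loader):
    # fraction of correctly classified samples in a loader
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for data, target in loader:
            pred = model(data.view(data.size(0), -1)).argmax(dim=1)
            correct += (pred == target).sum().item()
            total += target.size(0)
    return correct / total

train_acc = accuracy(model, train_loader)
val_acc = accuracy(model, val_loader)

if train_acc < 0.8 and val_acc < 0.8:      # both poor -> likely underfitting
    print('underfitting: increase capacity or train longer')
elif train_acc - val_acc > 0.05:           # train much better than val -> likely overfitting
    print('overfitting: regularize, add dropout, or use more data')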

27. Train-Val-Test Split

Parameters must never be tuned according to performance on the Test set; only the Validation set results may be used for tuning, otherwise information from the test data leaks into model selection (data contamination).

k-fold cross validation

  • merge the train and val sets
  • randomly sample 1/k of the merged data as the val set (re-sampled each epoch)
  • this lets as much data as possible drive the backward pass and parameter updates, while preventing the model from memorizing a fixed validation set (a k-fold sketch follows the code below)
import torch
from torchvision import datasets, transforms

batch_size = 200    # example batch size

# load data
train_db = datasets.MNIST('../data', train=True, download=True,
                          transform=transforms.Compose([
                              transforms.ToTensor(),
                              transforms.Normalize((0.1307,), (0.3081,))
                          ]))
train_loader = torch.utils.data.DataLoader(
    train_db,
    batch_size=batch_size, shuffle=True
)

test_db = datasets.MNIST('../data', train=False,
                         transform=transforms.Compose([
                             transforms.ToTensor(),
                             transforms.Normalize((0.1307,), (0.3081,))
                         ]))
test_loader = torch.utils.data.DataLoader(
    test_db,
    batch_size=batch_size, shuffle=True
)

print('train', len(train_db), 'test', len(test_db))
# carve a 10k validation set out of the 60k training set
train_db, val_db = torch.utils.data.random_split(train_db, [50000, 10000])
print('db1', len(train_db), 'db2', len(val_db))
train_loader = torch.utils.data.DataLoader(
    train_db,
    batch_size=batch_size, shuffle=True
)
val_loader = torch.utils.data.DataLoader(
    val_db,
    batch_size=batch_size, shuffle=True
)
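
A minimal k-fold style sketch under the same setup (my own illustration; epochs is assumed to be defined): re-sampling a different 1/k of the merged data as the validation set at every epoch approximates the behaviour described above.

k = 6
full_db = datasets.MNIST('../data', train=True, download=True,
                         transform=transforms.Compose([
                             transforms.ToTensor(),
                             transforms.Normalize((0.1307,), (0.3081,))
                         ]))
fold_size = len(full_db) // k    # 60000 // 6 = 10000

for epoch in range(epochs):
    # draw a fresh random train/val split each epoch
    train_db, val_db = torch.utils.data.random_split(
        full_db, [len(full_db) - fold_size, fold_size])
    train_loader = torch.utils.data.DataLoader(train_db, batch_size=batch_size, shuffle=True)
    val_loader = torch.utils.data.DataLoader(val_db, batch_size=batch_size, shuffle=True)
    # ... train on train_loader, select/tune on val_loader ...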

28. Regularization

Occam's Razor

  • More things should not be used than are necessary

Reduce Overfitting

  • More data
  • Constrain model complexity
    shallower networks
    regularization
  • Dropout
  • Data augmentation (a minimal sketch follows this list)
  • Early Stopping
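
A minimal data-augmentation sketch for MNIST-style images (my own example, not from the video): random rotations and shifts enlarge the effective dataset without collecting new samples.

from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomRotation(15),                              # rotate by up to ±15 degrees
    transforms.RandomAffine(degrees=0, translate=(0.1, 0.1)),   # shift by up to 10% in x/y
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,)),
])
# pass `augment` as the transform of the training dataset only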

Regularization (weight decay): add a regularization term to the loss

  • L1-regularization: $\lambda \sum_{i=1}^{n} |\theta_i|$
  • L2-regularization: $\frac{1}{2}\lambda \lVert W \rVert^2$

L2 regularization

optimizer = optim.SGD(net.parameters(), lr=learning_rate, weight_decay=0.01)  # weight_decay is the L2 coefficient λ

L1 regularization (has to be implemented manually)

# sum of absolute values of all parameters
regularization_loss = 0
for param in model.parameters():
    regularization_loss += torch.sum(torch.abs(param))

classify_loss = criteon(logits, target)
loss = classify_loss + 0.01 * regularization_loss   # add the L1 penalty to the loss

optimizer.zero_grad()
loss.backward()
optimizer.step()

29. Momentum and Learning-Rate Decay

momentum

  • take the previous update direction into account as well as the current gradient
  • plain gradient update: $w^{k+1} = w^k - \alpha \nabla f(w^k)$
  • with momentum: $w^{k+1} = w^k - \alpha z^{k+1}$, where $z^{k+1} = \beta z^k + \nabla f(w^k)$ accumulates the history of gradients
  • PyTorch's Adam optimizer has momentum built in; for SGD pass it explicitly:
optimizer = optim.SGD(net.parameters(), lr=learning_rate, momentum=0.78, weight_decay=0.01)

learning-rate decay

# Assuming optimizer uses lr = 0.05 for all groups
# lr = 0.05       if epoch < 30
# lr = 0.005      if 30 <= epoch < 60
# lr = 0.0005     if 60 <= epoch < 90
# ...
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)
for epoch in range(100):
    train(...)
    validate(...)
    scheduler.step()    # step the scheduler once per epoch, after training
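
Another common decay scheme in torch.optim.lr_scheduler is ReduceLROnPlateau, which lowers the learning rate once a monitored metric stops improving. A minimal sketch in the same pseudo-code style as above:

scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode='min', factor=0.1, patience=10)

for epoch in range(100):
    train(...)
    val_loss = validate(...)
    scheduler.step(val_loss)    # pass the monitored metric; lr drops after `patience` stagnant epochs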

30. Early Stopping, Dropout, SGD

early-stopping

  • Use the validation set to select parameters
  • Monitor validation performance
  • Stop at the highest val perf. (a minimal sketch follows this list)
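
A minimal early-stopping sketch (my own illustration, not the video's code; train_one_epoch and evaluate are assumed helpers): keep the best weights seen so far and stop once validation accuracy has not improved for patience epochs.

best_val_acc, patience, bad_epochs = 0.0, 5, 0

for epoch in range(epochs):
    train_one_epoch(model, train_loader, optimizer)   # assumed helper
    val_acc = evaluate(model, val_loader)             # assumed helper

    if val_acc > best_val_acc:
        best_val_acc, bad_epochs = val_acc, 0
        torch.save(model.state_dict(), 'best.pth')    # checkpoint the best model so far
    else:
        bad_epochs += 1
        if bad_epochs >= patience:
            print(f'early stopping at epoch {epoch}')
            break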

dropout

  • Learning less to learn better
  • Each connection has a probability p ∈ [0, 1] of being dropped
  • Dropout is switched off at test time (call .eval())
  • torch.nn.Dropout(p=dropout_prob), where p is the probability of dropping
    tf.nn.dropout(keep_prob), where TensorFlow's argument is the probability of keeping
net_dropped = nn.Sequential(
    nn.Linear(784, 200),
    nn.Dropout(0.5),
    nn.LeakyReLU(inplace=True),
    nn.Linear(200, 200),
    nn.Dropout(0.5),
    nn.LeakyReLU(inplace=True),
    nn.Linear(200, 10),
    nn.LeakyReLU(inplace=True),
)

for epoch in range(epochs):
    # train: .train() enables dropout
    net_dropped.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        ...
    # test: .eval() disables dropout so all connections are used
    net_dropped.eval()
    test_loss = 0
    correct = 0
    for data, target in test_loader:
        ...

Stochastic Gradient Descent (SGD)

  • Stochastic does not mean random: each update is computed on one mini-batch instead of loading the whole dataset at once (a minimal sketch follows)
  • Deterministic (full-batch) gradient descent, by contrast, uses all of the data for every update
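
A minimal sketch of what "stochastic" means here (reusing model, criteon, optimizer and train_loader from the examples above, and assuming a flat MLP input):

for data, target in train_loader:               # one mini-batch at a time
    optimizer.zero_grad()
    loss = criteon(model(data.view(data.size(0), -1)), target)   # loss on this batch only
    loss.backward()                             # gradient w.r.t. this mini-batch
    optimizer.step()                            # one stochastic gradient step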

31. Bayes' Theorem

$P(A|B)=\frac{P(B|A) \times P(A)}{P(B)}$

I couldn't find a tutorial for this section anywhere online :(
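
A quick numeric check of the formula (my own made-up numbers): with $P(A)=0.2$, $P(B|A)=0.5$ and $P(B)=0.25$,

$P(A|B)=\frac{0.5 \times 0.2}{0.25}=0.4$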
