Adversarial Robustness - Theory and Practice
Chapter 1 - Introduction to adversarial robustness
I ran the code from the introduction chapter of Adversarial Robustness - Theory and Practice: it loads a pretrained ResNet50, and after an adversarially chosen perturbation is added, the pig image is misclassified as an airliner.
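For reference, here is a minimal sketch of that experiment, not the tutorial's exact code: the local pig.jpg path, the l_inf budget of 2/255, the step size, and the iteration count are my own assumptions; 341 and 404 are the ImageNet "hog" and "airliner" classes.

import torch
import torch.nn as nn
from torchvision import models, transforms
from PIL import Image

model = models.resnet50(pretrained=True).eval()
preprocess = transforms.Compose([transforms.Resize(224),
                                 transforms.CenterCrop(224),
                                 transforms.ToTensor()])
pig = preprocess(Image.open("pig.jpg").convert("RGB"))[None, :, :, :]  # hypothetical local image

# ImageNet normalization, applied after the perturbation so delta lives in raw pixel space
mean = torch.tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1)
std = torch.tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1)
def norm(x):
    return (x - mean) / std

epsilon = 2.0 / 255                      # l_inf perturbation budget (assumed)
delta = torch.zeros_like(pig, requires_grad=True)
opt = torch.optim.SGD([delta], lr=1e-1)  # step size is a guess
for _ in range(100):
    pred = model(norm(pig + delta))
    # targeted attack: move away from "hog" (341) and toward "airliner" (404)
    loss = (-nn.CrossEntropyLoss()(pred, torch.tensor([341]))
            + nn.CrossEntropyLoss()(pred, torch.tensor([404])))
    opt.zero_grad()
    loss.backward()
    opt.step()
    delta.data.clamp_(-epsilon, epsilon)  # project back onto the l_inf ball

print(model(norm(pig + delta)).max(dim=1)[1].item())  # 404 ("airliner") if the attack succeeds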
Chapter 2 - Linear models
(1) Load the MNIST dataset (the chapter restricts it to the digits 0 and 1, so the task is binary classification).
(2) Train the model normally; the clean test error (test_err) is only about 0.04%.
(3) Set up an adversarial attack with perturbation bound epsilon = 0.2.
(4) Run the adversarial attack; test_err jumps from the earlier 0.04% to roughly 85%.
(5) Then perform robust training. The core line is model(X.view(X.shape[0], -1))[:,0] - epsilon*(2*y.float()-1)*model.weight.norm(1), which replaces each logit with its exact worst case under an l_inf perturbation of size epsilon (see the sketch after this list).
(6) After robust training, no adversarial attack pushes test_err above about 2.5%, while the clean (non-adversarial) test error rises to about 0.3% (higher than the earlier 0.04%). These numbers are after 20 epochs of robust training; in my tests, training for more epochs did not improve them.
In other words, robust training improves resistance to adversarial attacks, but at the cost of a small increase in clean test error.
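Here is a minimal sketch of how the robust-training epoch from step (5) might look, assuming the chapter's binary 0-vs-1 MNIST setup with model = nn.Linear(784, 1) and labels y in {0, 1}; treat it as a paraphrase rather than the notes' exact code.

import torch.nn as nn

def epoch_robust(loader, model, epsilon, opt=None):
    """One pass over the data, training/evaluating on the exact worst-case logit."""
    total_loss, total_err = 0., 0.
    for X, y in loader:
        # worst-case logit under an l_inf perturbation of size epsilon:
        # subtract epsilon*||w||_1 when y == 1, add it when y == 0
        yp = model(X.view(X.shape[0], -1))[:, 0] - epsilon * (2 * y.float() - 1) * model.weight.norm(1)
        loss = nn.BCEWithLogitsLoss()(yp, y.float())
        if opt:
            opt.zero_grad()
            loss.backward()
            opt.step()
        total_err += (((yp > 0) & (y == 0)) | ((yp < 0) & (y == 1))).sum().item()
        total_loss += loss.item() * X.shape[0]
    return total_err / len(loader.dataset), total_loss / len(loader.dataset)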
Chapter 3 - Adversarial examples, solving the inner maximization
1. Untargeted attacks
The main methods are FGSM and PGD. PGD updates the perturbation iteratively, using many more steps than FGSM's single step. But when the gradients are very small, plain gradient-step PGD also performs poorly, which motivates the (normalized) steepest descent method: compared with the plain PGD update, it steps along the sign of the gradient, delta.data = (delta + alpha*delta.grad.detach().sign()).clamp(-epsilon, epsilon). Even this improved PGD is still limited by the possibility of local optima in the inner objective; local optima cannot be avoided entirely, but random restarts mitigate the problem.
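For concreteness, a minimal sketch of an l_inf PGD attack built around the sign update quoted above; the default epsilon, alpha, and num_iter values are my own choices, not necessarily the tutorial's.

import torch
import torch.nn as nn

def pgd_linf(model, X, y, epsilon=0.1, alpha=0.01, num_iter=40):
    """Construct a perturbation delta with ||delta||_inf <= epsilon by gradient ascent on the loss."""
    delta = torch.zeros_like(X, requires_grad=True)
    for _ in range(num_iter):
        loss = nn.CrossEntropyLoss()(model(X + delta), y)
        loss.backward()
        # normalized steepest descent step, projected back onto the l_inf ball
        delta.data = (delta + alpha * delta.grad.detach().sign()).clamp(-epsilon, epsilon)
        delta.grad.zero_()
    return delta.detach()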
2. Targeted attacks (based on the improved PGD, i.e. the (normalized) steepest descent method)
Maximize the loss on the true label while minimizing the loss on the target label; this again amounts to solving the inner optimization problem. Several loss designs are listed below.
(1) loss = (yp[:,y_targ] - yp.gather(1,y[:,None])[:,0]).sum()
Drawback: only the non-zero digits end up fooling the classifier. The reason is that this loss only maximizes the class logit for the target (zero) minus the class logit for the true class; it ignores the other classes' logits, which can still end up larger than the target's. So we can modify the loss as follows.
(2) loss = 2*yp[:,y_targ].sum() - yp.sum()
Drawback: it still does not achieve a 100% success rate. (A sketch of a targeted attack built on this loss follows this list.)
(3) Placeholder; I don't fully understand this one yet.
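The sketch mentioned under (2): a targeted l_inf attack that reuses the sign update from pgd_linf; the function name and defaults are assumptions of mine, not the tutorial's exact code.

import torch
import torch.nn as nn

def pgd_linf_targ(model, X, y_targ, epsilon=0.1, alpha=0.01, num_iter=40):
    """Targeted l_inf PGD: push every input toward class y_targ."""
    delta = torch.zeros_like(X, requires_grad=True)
    for _ in range(num_iter):
        yp = model(X + delta)
        # loss design (2): raise the target logit while pushing every other logit down
        loss = 2 * yp[:, y_targ].sum() - yp.sum()
        loss.backward()
        delta.data = (delta + alpha * delta.grad.detach().sign()).clamp(-epsilon, epsilon)
        delta.grad.zero_()
    return delta.detach()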
3. Combinatorial optimization for the inner maximization
There are methods that propagate interval bounds on the activations, but even slight perturbations make these bounds swing widely, so they are not practical. The approach ultimately used is a mixed-integer linear programming formulation; the code mainly uses cvxpy to build a large set of constraints.
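To give a flavor of what those constraints look like, here is a sketch of the standard mixed-integer encoding of a single ReLU layer in cvxpy; the layer sizes, weights, and pre-activation bounds (l, u) below are placeholders I made up, not the tutorial's code.

import cvxpy as cp
import numpy as np

n, m = 784, 100                                      # layer sizes (assumed)
W1, b1 = np.random.randn(m, n), np.random.randn(m)   # stand-in weights
l, u = -3 * np.ones(m), 3 * np.ones(m)               # assumed pre-activation bounds

z1 = cp.Variable(n)                  # layer input
z2 = cp.Variable(m)                  # layer output, z2 = relu(W1 @ z1 + b1)
v = cp.Variable(m, boolean=True)     # v_i = 1 iff the i-th ReLU is active

constraints = [
    z2 >= 0,
    z2 >= W1 @ z1 + b1,
    z2 <= cp.multiply(u, v),
    z2 <= W1 @ z1 + b1 - cp.multiply(l, 1 - v),
]
# a full MILP would add input constraints such as ||z1 - x||_inf <= epsilon
# and an objective over the output logits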
There is no need to study this optimization material in detail; a truly careful reading could take a couple of years. It is enough to know roughly what it does.
Chapter 4 - Adversarial training, solving the outer minimization
1. Goal of the approach
The goal of the robust optimization formulation, therefore, is to ensure that the model cannot be attacked even if the adversary has full knowledge of the model.
In other words, no matter what attack an adversary uses, we want to have a model that performs well.
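In symbols, this is the min-max (robust optimization) objective: minimize over the model parameters theta the average over (x, y) of max_{||delta|| <= epsilon} loss(h_theta(x + delta), y), i.e. train against the worst-case perturbation inside the allowed set.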
2. Candidate approaches
2.1 Local gradient-based search (providing a lower bound on the objective)
2.2 Exact combinatorial optimization (exactly solving the objective); not practical
2.3 Convex relaxations (providing a provable upper bound on the objective)
After analysis, approach 2.2 is not practical, so the two viable options are the following:
2.1.Using lower bounds, and examples constructed via local search methods, to train an (empirically) adversarially robust classifier.
2.3 Using convex upper bounds, to train a provably robust classifier.
3. Implementation
The basic idea is to simply create and then incorporate adversarial examples into the training process. The question then arises: which adversarial examples should we train on?
4. Code
4.1 Load the MNIST dataset
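A sketch of what this step typically looks like with torchvision; the "../data" path and the batch size of 100 are assumptions.

from torchvision import datasets, transforms
from torch.utils.data import DataLoader

mnist_train = datasets.MNIST("../data", train=True, download=True, transform=transforms.ToTensor())
mnist_test = datasets.MNIST("../data", train=False, download=True, transform=transforms.ToTensor())
train_loader = DataLoader(mnist_train, batch_size=100, shuffle=True)
test_loader = DataLoader(mnist_test, batch_size=100, shuffle=False)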
4.2 Initialize model_cnn
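A plausible definition of model_cnn, mirroring the architecture that section 4.7 shows for model_cnn_robust; treat the layer sizes as a sketch rather than the notes' exact code.

import torch.nn as nn

model_cnn = nn.Sequential(
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1, stride=2), nn.ReLU(),   # 28x28 -> 14x14
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 64, 3, padding=1, stride=2), nn.ReLU(),   # 14x14 -> 7x7
    nn.Flatten(),
    nn.Linear(7 * 7 * 64, 100), nn.ReLU(),
    nn.Linear(100, 10))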
4.3 Define the fgsm and pgd_linf attack functions
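The pgd_linf sketch already appears in the Chapter 3 notes above; a matching single-step fgsm might look like the following (again a sketch, with the default epsilon being my choice).

import torch
import torch.nn as nn

def fgsm(model, X, y, epsilon=0.1):
    """Fast Gradient Sign Method: a single signed-gradient step of size epsilon."""
    delta = torch.zeros_like(X, requires_grad=True)
    loss = nn.CrossEntropyLoss()(model(X + delta), y)
    loss.backward()
    return epsilon * delta.grad.detach().sign()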
4.4 Define the standard training/evaluation function and the adversarial evaluation function
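A sketch of the two epoch helpers used in 4.5 and 4.6 below, assuming the signatures implied by those calls (the attack function is passed in and uses its own keyword defaults); not necessarily the exact implementation.

import torch.nn as nn

def epoch(loader, model, opt=None):
    """Standard training (if opt is given) or evaluation over one pass of the data."""
    total_loss, total_err = 0., 0.
    for X, y in loader:
        yp = model(X)
        loss = nn.CrossEntropyLoss()(yp, y)
        if opt:
            opt.zero_grad()
            loss.backward()
            opt.step()
        total_err += (yp.max(dim=1)[1] != y).sum().item()
        total_loss += loss.item() * X.shape[0]
    return total_err / len(loader.dataset), total_loss / len(loader.dataset)

def epoch_adversarial(loader, model, attack, opt=None, **kwargs):
    """Same as epoch, but each batch is first perturbed by the given attack."""
    total_loss, total_err = 0., 0.
    for X, y in loader:
        delta = attack(model, X, y, **kwargs)
        yp = model(X + delta)
        loss = nn.CrossEntropyLoss()(yp, y)
        if opt:
            opt.zero_grad()
            loss.backward()
            opt.step()
        total_err += (yp.max(dim=1)[1] != y).sum().item()
        total_loss += loss.item() * X.shape[0]
    return total_err / len(loader.dataset), total_loss / len(loader.dataset)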
4.5 Standard training of the CNN (while also measuring clean and adversarial error)
opt = optim.SGD(model_cnn.parameters(), lr=1e-1)
for t in range(10):
    train_err, train_loss = epoch(train_loader, model_cnn, opt)
    test_err, test_loss = epoch(test_loader, model_cnn)
    adv_err, adv_loss = epoch_adversarial(test_loader, model_cnn, pgd_linf)
    if t == 4:
        for param_group in opt.param_groups:
            param_group["lr"] = 1e-2
    print(*("{:.6f}".format(i) for i in (train_err, test_err, adv_err)), sep="\t")
torch.save(model_cnn.state_dict(), "model_cnn.pt")
So as we saw before, the clean error is quite low, but the adversarial error is quite high (and actually goes up as we train the model more). Let’s now do the same thing, but with adversarial training.
4.6 Adversarial training of the CNN
opt = optim.SGD(model_cnn_robust.parameters(), lr=1e-1)
for t in range(10):
    train_err, train_loss = epoch_adversarial(train_loader, model_cnn_robust, pgd_linf, opt)
    test_err, test_loss = epoch(test_loader, model_cnn_robust)
    adv_err, adv_loss = epoch_adversarial(test_loader, model_cnn_robust, pgd_linf)
    if t == 4:
        for param_group in opt.param_groups:
            param_group["lr"] = 1e-2
    print(*("{:.6f}".format(i) for i in (train_err, test_err, adv_err)), sep="\t")
torch.save(model_cnn_robust.state_dict(), "model_cnn_robust.pt")
pretty good!
4.7 Compare the two CNNs
model_cnn_robust = nn.Sequential(nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
nn.Conv2d(32, 32, 3, padding=1, stride=2), nn.ReLU(),
nn.Conv2d(32, 64, 3, padding=