SWA (Stochastic Weight Averaging) Experiment

AI大魔王

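Copyright notice: This is an original article by the author, licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.

Article link: https://blog.youkuaiyun.com/Carlsummer/article/details/119673961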

A paper reports that SWA improves scores, so I ran an experiment to check.

I ran the experiment on the CIFAR-10 dataset.

Principle

Paper: https://arxiv.org/pdf/2012.12645.pdf

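SGD tends to converge to a flat region of the loss. Because the weight space is high-dimensional, most of such a flat region lies near its boundary, so SGD usually only reaches the boundary of the region. SWA averages the weights from several SGD iterates, which lets the solution reach the center of the flat region.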
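To make the averaging step concrete, here is a minimal sketch in plain Python. It assumes a toy setting that is not from the post: the flat region is the unit disc, and four hypothetical SGD snapshots sit on its boundary at arbitrary angles. The equal-weight running average of those snapshots lands well inside the disc.

```python
import math

def running_average(snapshots):
    """Equal-weight running average of a sequence of weight vectors."""
    avg = None
    for n, w in enumerate(snapshots):
        if avg is None:
            avg = list(w)
        else:
            # avg_new = (avg_old * n + w) / (n + 1)
            avg = [(a * n + x) / (n + 1) for a, x in zip(avg, w)]
    return avg

def norm(v):
    return math.hypot(*v)

# Toy assumption: snapshots lie on the boundary of the unit disc.
angles = [0.3, 1.9, 3.4, 5.0]  # radians, chosen arbitrarily
snapshots = [(math.cos(a), math.sin(a)) for a in angles]

swa_weights = running_average(snapshots)
print("snapshot norms:", [round(norm(s), 3) for s in snapshots])
print("averaged norm: ", norm(swa_weights))
```

Every snapshot is at distance 1 from the center, while the averaged point is much closer to it, which is the geometric picture described above.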

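Why move toward the center? The test error is not at its lowest where the training loss is lowest, so the point with the best generalization is not the point with the minimum training loss. With this method the model can converge to a wide minimum, and a wide minimum generalizes better.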
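A small numeric sketch can illustrate why a wide minimum may generalize better. Everything here is a made-up assumption for illustration: a 1-D loss with one sharp and one wide minimum, and a held-out loss modeled as the training loss shifted slightly along the weight axis.

```python
def train_loss(w):
    # Hypothetical 1-D loss: a sharp minimum at w = -2 and a wide one at w = 2.
    sharp = 20.0 * (w + 2.0) ** 2        # steep bowl, lowest value 0.0
    wide = 0.5 * (w - 2.0) ** 2 + 0.1    # shallow bowl, lowest value 0.1
    return min(sharp, wide)

def eval_loss(w, shift=0.3):
    # Assumption: the held-out loss is the training loss shifted in w.
    return train_loss(w - shift)

for name, w in [("sharp", -2.0), ("wide", 2.0)]:
    print(name, "train:", train_loss(w), "held-out:", eval_loss(w))
```

In this toy setup the sharp minimum has the lower training loss but the higher held-out loss, because a small shift of the loss surface moves it far up the steep wall, while the wide minimum barely changes.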

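Visualizing the loss landscape and analyzing the optimization, the goal is to make the model converge to a better local minimum.

How can it converge to this region with better generalization? SWA, which acts somewhat like an optimizer, can find a better local minimum, and so the model generalizes better.

A wide minimum is a minimum of the fit to the dataset, much like finding the parameter values that minimize a given function, and it is not necessarily where the loss is lowest. The loss does not always reflect which fitted function is actually best, so sometimes the loss itself needs to be changed.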

Experiment

The model is:

shufflenet_v2_x0_5

The dataset is CIFAR-10.

First, training without SWA:

https://github.com/carlsummer/python_developer_tools/blob/main/python_developer_tools/cv/classes/demo/train_cifar10.py

Then, training with SWA:

https://github.com/carlsummer/python_developer_tools/blob/main/python_developer_tools/cv/train/%E4%BA%8C%E9%98%B6%E6%AE%B5%E8%AE%AD%E7%BB%83/swa_pytorch.py
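For readers who want the general shape of an SWA training loop without opening the script, here is a minimal sketch built on PyTorch's `torch.optim.swa_utils` (`AveragedModel`, `SWALR`, `update_bn`). It is not the linked script: the tiny model, the synthetic data, and the values of `epochs`, `swa_start`, and `swa_lr` are all assumptions chosen so the example runs in a few seconds.

```python
import torch
from torch import nn
from torch.optim.swa_utils import AveragedModel, SWALR, update_bn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic stand-in data (assumption): 256 samples, 20 features, 10 classes.
data = TensorDataset(torch.randn(256, 20), torch.randint(0, 10, (256,)))
loader = DataLoader(data, batch_size=32, shuffle=True)

# Tiny stand-in model (assumption); the post trains shufflenet_v2_x0_5.
model = nn.Sequential(
    nn.Linear(20, 32), nn.BatchNorm1d(32), nn.ReLU(), nn.Linear(32, 10)
)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

swa_model = AveragedModel(model)               # holds the running weight average
swa_scheduler = SWALR(optimizer, swa_lr=0.05)  # learning rate used in the SWA phase
epochs, swa_start = 6, 3                       # illustrative values only

for epoch in range(epochs):
    model.train()
    for x, y in loader:
        optimizer.zero_grad()
        criterion(model(x), y).backward()
        optimizer.step()
    if epoch >= swa_start:
        swa_model.update_parameters(model)     # fold this epoch's weights into the average
        swa_scheduler.step()

# BatchNorm statistics must be recomputed for the averaged weights.
update_bn(loader, swa_model)
print("snapshots averaged:", int(swa_model.n_averaged))
```

At evaluation time `swa_model` is used in place of `model`. The `update_bn` call matters for networks with BatchNorm layers, since the running statistics collected during training belong to the individual SGD iterates rather than to the averaged weights.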

The experimental results:

The score was 41% without SWA and 69% with SWA.

Recommended library: https://github.com/carlsummer/python_developer_tools

References

https://pytorch.org/docs/master/optim.html?highlight=swa_utils
