Motivation
1、Convolutional neural networks typically encode an input image into a series of intermediate features with decreasing resolutions.
2、The motivation behind this scale-decreased architecture design: "High resolution may be needed to detect the presence of a feature, while its exact position need not be determined with equally high precision."
3、Intuitively, a scale-decreased backbone throws away spatial information by down-sampling, making that information hard for a decoder network to recover.
4、While this structure is suited to classification tasks, it does not perform well for tasks requiring simultaneous recognition and localization (e.g., object detection).
Proposal
We propose SpineNet, a backbone with scale-permuted intermediate features and cross-scale connections that is learned on an object detection task by Neural Architecture Search.
Improvements
1、First, the scales of intermediate feature maps should be able to increase or decrease at any depth, so that the model can retain spatial information as it grows deeper.
2、Second, the connections between feature maps should be able to go across feature scales to facilitate multi-scale feature fusion.
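The two ideas above can be illustrated with a toy cross-scale fusion step. The `resample` and `fuse` helpers below are hypothetical (not the paper's actual resampling ops); they use simple nearest-neighbor resizing to show how feature maps at different scales can be brought to a common resolution and merged:

```python
import numpy as np

def resample(feat, target_hw):
    """Nearest-neighbor resize of an (H, W, C) feature map to target_hw.
    Upsampling repeats rows/columns; downsampling takes a strided subset."""
    h, w, _ = feat.shape
    th, tw = target_hw
    rows = np.arange(th) * h // th
    cols = np.arange(tw) * w // tw
    return feat[rows][:, cols]

def fuse(parents, target_hw):
    """Cross-scale fusion: resample each parent block's output to the
    target scale, then merge by element-wise summation."""
    return sum(resample(p, target_hw) for p in parents)

# Parent feature maps at different scales (e.g. 1/8 and 1/32 of the input):
p1 = np.ones((32, 32, 4))
p2 = np.ones((8, 8, 4))
fused = fuse([p1, p2], (16, 16))  # fuse at an intermediate 1/16 scale
print(fused.shape)  # (16, 16, 4)
```

The point is that connections are no longer restricted to adjacent scales: any parent, finer or coarser, can feed a block once it is resampled to the block's resolution.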
Search Space
The paper's search space has three components:
1、Scale permutations (decide the ordering of blocks): intermediate and output blocks are permuted separately, giving a search space of size (N − 5)!5!.
2、Cross-scale connections (decide the inputs for each block): the parent blocks can be any block with a lower ordering, or a block from the stem network. The search space has a size of
3、Block adjustments: each block may adjust its scale level and block type.
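The permutation count above can be checked numerically. The connection count below is a hypothetical illustration only (it assumes a 2-block stem and that every block picks exactly 2 parents from all earlier blocks), not the paper's exact formula:

```python
from math import factorial, comb, prod

def permutation_space(N):
    """Scale permutations: the (N - 5) intermediate blocks and the
    5 output blocks are permuted separately, giving (N - 5)! * 5!."""
    return factorial(N - 5) * factorial(5)

def connection_space(N, stem=2):
    """Hypothetical cross-scale connection count: block i (0-indexed)
    chooses 2 parents from the stem blocks plus the i earlier blocks."""
    return prod(comb(stem + i, 2) for i in range(N))

print(permutation_space(12))      # 7! * 5! = 604800
print(connection_space(3))        # C(2,2) * C(3,2) * C(4,2) = 18
```

Even these toy numbers show why the search space explodes factorially with depth, which is why the authors resort to Neural Architecture Search rather than manual design.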
Key takeaways
Hmm, the most borrowable idea here is the model's "degrees of freedom".
The architecture in this paper has very high degrees of freedom, but that also makes the search very expensive.
The paper does not report how long the search and training take, which suggests it is not short.
Still, this approach is what NAS ideally ought to look like.