Reading notes on DenseASPP for Semantic Segmentation in Street Scenes

Paper: DenseASPP

Venue: CVPR 2018

Code: PyTorch


Contents

Abstract

1.Introduction

2.DenseASPP

3.DenseASPP in detail

3.1 Atrous convolution and pyramid pooling

3.2 Denser feature pyramid and larger receptive field

3.2.1 Denser feature pyramid

         Denser scale sampling:

         Denser pixel sampling:

3.2.2 Larger receptive field

3.3 Model size control

4.Experimental results

4.1 Implementation details

4.2 Ablation study

1.Feature similarities

2.Visualization of receptive field

3.Illustration of scale diversity


Design idea: DenseASPP is essentially a combination of DenseNet and ASPP (DeepLab).

DenseNet can be viewed as a special case of DenseASPP by setting the dilation rate to 1. DenseASPP therefore inherits the advantages of DenseNet: it alleviates the gradient-vanishing problem and needs substantially fewer parameters.


Abstract

Objects in autonomous driving exhibit very large scale changes, which poses great challenges for high-level feature representation in the sense that multi-scale information must be correctly encoded.

As Figure 1 shows, the apparent size of people varies; in Figure 2, a bus is very close to the camera while a car is very far away.

Atrous convolution was proposed to address this problem. Atrous Spatial Pyramid Pooling (ASPP), from the DeepLab family, was proposed to concatenate multiple atrous-convolved features using different dilation rates into a final feature representation.

But feature resolution in the scale-axis is not dense enough for the autonomous driving scenario.

So the authors propose Densely connected Atrous Spatial Pyramid Pooling (DenseASPP), which connects a set of atrous convolutional layers in a dense way, such that it generates multi-scale features that not only cover a larger scale range, but also cover that scale range densely, without significantly increasing the model size.
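To make "cover that scale range densely" concrete, here is a minimal sketch that counts receptive-field sizes. It relies on two standard facts: a single atrous layer with kernel size k and dilation d has receptive field (k - 1) * d + 1, and stacking stride-1 layers adds receptive fields minus one per extra layer. The dilation rates (3, 6, 12, 18), the kernel size 3, and the helper names `atrous_rf` and `stacked_rf` are assumptions chosen for illustration, not taken from the text above.

```python
from itertools import combinations


def atrous_rf(kernel_size, dilation):
    """Receptive field of one atrous conv layer: (k - 1) * d + 1."""
    return (kernel_size - 1) * dilation + 1


def stacked_rf(rfs):
    """Receptive field of stacked stride-1 layers: sum(R_i) - (n - 1)."""
    return sum(rfs) - (len(rfs) - 1)


# Assumed values, chosen only for illustration.
rates = (3, 6, 12, 18)
kernel_size = 3

# ASPP-style parallel branches: exactly one scale per branch.
parallel_scales = sorted(atrous_rf(kernel_size, d) for d in rates)

# Dense connections: any subset of layers (kept in order) forms a path,
# so each subset contributes the receptive field of its stacked layers.
dense_scales = sorted({
    stacked_rf([atrous_rf(kernel_size, d) for d in subset])
    for n in range(1, len(rates) + 1)
    for subset in combinations(rates, n)
})

print(parallel_scales)  # [7, 13, 25, 37]
print(dense_scales)     # 13 sizes from 7 to 79 in steps of 6
```

Under these assumptions the four parallel branches give four scales with a maximum of 37, while the densely connected version reaches 13 distinct scales up to 79 using the same four layers.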


1.Introduction

High-level features are very useful for segmentation. To extract high-level information, FCN uses multiple pooling layers to increase the receptive field size of an output neuron. But downsampling and pooling reduce the resolution: an increased number of pooling layers leads to a reduced feature map size, which poses serious challenges when up-sampling the segmentation output back to full resolution. On the other hand, we cannot avoid enlarging the receptive field: if we output the segmentation from an early layer with larger resolution, we are not able to make use of higher-level semantics for better reasoning.

This is where atrous convolution comes in. A feature map produced by an atrous convolution can be the same size as the input, but with each output neuron possessing a larger receptive field, and therefore encoding higher-level semantics.
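The following is a small 1-D sketch of that property, written in plain Python rather than with a deep learning library. The function name `atrous_conv1d`, the box kernel, and the impulse input are hypothetical choices for illustration; real segmentation networks use 2-D learned kernels.

```python
def atrous_conv1d(signal, kernel, dilation):
    """1-D atrous (dilated) convolution with zero padding and stride 1.

    Kernel taps sit `dilation` samples apart, so the output keeps the input
    length while each output sees a window of (len(kernel) - 1) * dilation + 1
    input positions.
    """
    k = len(kernel)
    span = (k - 1) * dilation      # distance between first and last tap
    pad = span // 2                # left padding; the rest goes on the right
    padded = [0.0] * pad + list(signal) + [0.0] * (span - pad)
    return [
        sum(kernel[j] * padded[i + j * dilation] for j in range(k))
        for i in range(len(signal))
    ]


signal = [0.0] * 15
signal[7] = 1.0                    # unit impulse in the middle
kernel = [1.0, 1.0, 1.0]           # hypothetical 3-tap box filter

for d in (1, 2, 4):
    out = atrous_conv1d(signal, kernel, d)
    touched = [i for i, v in enumerate(out) if v != 0.0]
    print(d, len(out), touched)
```

The output length stays at 15 for every dilation rate, while the positions influenced by the impulse spread from [6, 7, 8] at rate 1 to [3, 7, 11] at rate 4: the same three weights cover a window of 2 * d + 1 samples.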

But atrous convolution still has drawbacks: 1. It produces a feature map at a single scale. All neurons in the atrous-convolved feature map share the same receptive field size, which means the process of semantic mask generation only makes use of features from a single scale. ASPP can address this point.
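As a rough illustration of how ASPP sidesteps the single-scale limitation, the sketch below runs several dilated branches over the same 1-D input and concatenates their outputs per position. Everything here is a simplification under stated assumptions: the names `dilated_conv1d` and `aspp_1d` are hypothetical, the rates (1, 2, 3) are arbitrary, and all branches share one fixed kernel, whereas a real ASPP module learns separate 2-D weights per branch.

```python
def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded, stride-1 dilated convolution that keeps the input length."""
    k = len(kernel)
    span = (k - 1) * dilation
    pad = span // 2
    padded = [0.0] * pad + list(signal) + [0.0] * (span - pad)
    return [
        sum(kernel[j] * padded[i + j * dilation] for j in range(k))
        for i in range(len(signal))
    ]


def aspp_1d(signal, kernel, rates):
    """Run one dilated branch per rate and concatenate along a channel axis.

    Returns one feature vector per position, with one channel per rate.
    """
    branches = [dilated_conv1d(signal, kernel, d) for d in rates]
    return [list(channels) for channels in zip(*branches)]


signal = [float(i * i) for i in range(10)]   # toy 1-D "feature map"
kernel = [1.0, 1.0, 1.0]                     # assumed shared box filter
rates = (1, 2, 3)                            # illustrative rates only

features = aspp_1d(signal, kernel, rates)
print(len(features), len(features[0]))       # 10 3
print(features[5])                           # [77.0, 83.0, 93.0]
```

Each position now carries three channels, one per dilation rate, so later layers can draw on context gathered at several receptive-field sizes instead of one.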
