Abstract
Compared with generic semantic segmentation, portrait segmentation requires both higher accuracy and faster speed.
The Boundary-Aware Network (BANet) selectively extracts detail information in boundary regions to produce high-quality segmentation results while still running in real time (≥ 25 FPS). BANet also designs a refine loss that supervises the image-level gradient information inside the network.
BANet is an efficient network with only 0.62MB parameters, it achieves 43 fps on 512 × 512 images with high-quality results which are finer than annotations.
1、Background
The loss of fine boundary detail is mainly caused by the following two factors:
On the one hand, the performance of deep-learning models depends heavily on the training data, yet the targets are usually annotated with polygons or generated by KNN-matting, so extremely fine boundary details such as individual hair strands are hard to annotate.
On the other hand, conventional semantic segmentation mainly addresses intra-class consistency and inter-class distinction in complex scenes. Portrait segmentation is a binary classification problem, so conventional semantic segmentation models are not a good fit for it.
2、Method of BANet
In the task of portrait segmentation, no-boundary area needs a large receptive field to make prediction with global context information, while boundary area needs small receptive field to focus on local feature contrast. Hence these two areas need to be treated independently.
In this paper, we propose a boundary attention mechanism and a weighted loss function to deal with boundary area and no-boundary area separately.
2.1、Network Architecture
Semantic Branch: obtains a relatively large receptive field to improve segmentation of non-boundary regions; the channel width is capped at 64.
Fusion Part: adopts the FFM module from BiSeNet, which applies a channel attention mechanism.
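As a rough PyTorch sketch of such a BiSeNet-style FFM (the class name, channel arguments, and exact layer ordering follow BiSeNet's published description and are assumptions rather than details given in this post):

```python
import torch
import torch.nn as nn

class FeatureFusionModule(nn.Module):
    """BiSeNet-style FFM sketch: concatenate two feature maps, fuse them with a
    1x1 conv block, then re-weight the result with channel attention."""
    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),
        )
        # Channel attention: global average pooling followed by two 1x1 convs.
        self.attention = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(out_channels, out_channels, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_channels, out_channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, feat_a, feat_b):
        x = self.fuse(torch.cat([feat_a, feat_b], dim=1))
        w = self.attention(x)      # per-channel weights in [0, 1]
        return x + x * w           # residual channel re-weighting
```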
Boundary Feature Mining Branch: the output of the semantic branch is first projected to a single channel by a 1 × 1 conv and then upsampled to the input image size, which serves as the boundary attention map.
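A minimal PyTorch sketch of this projection-and-upsampling step (the module name, the bilinear upsampling mode, and the final sigmoid are my assumptions; the post only states a 1 × 1 conv followed by upsampling):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BoundaryAttentionHead(nn.Module):
    """Project the semantic-branch output to one channel and upsample it to the
    input resolution to obtain the boundary attention map."""
    def __init__(self, in_channels):
        super().__init__()
        self.proj = nn.Conv2d(in_channels, 1, kernel_size=1)

    def forward(self, semantic_feat, image_size):
        attn = self.proj(semantic_feat)                        # N x 1 x h x w
        attn = F.interpolate(attn, size=image_size,
                             mode='bilinear', align_corners=False)
        return torch.sigmoid(attn)                             # map in [0, 1]
```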

The BA loss guides the boundary attention map to localize the boundary regions.
Extraction of semantic boundary forces the network to learn a feature with strong inter-class distinction ability.
The target of the BA loss does not need to be annotated manually: the Canny edge detector is applied to the portrait annotation (the ground truth), and the detection result is used as the BA loss target. For an implementation, see: Canny
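A minimal OpenCV sketch of how such a BA target could be produced (the Canny thresholds and the optional dilation width are assumptions, not values given in the post):

```python
import cv2
import numpy as np

def make_ba_target(portrait_mask):
    """Run Canny on the ground-truth portrait mask to get the BA-loss target.

    portrait_mask: uint8 array, 0 for background and 255 for the portrait.
    Returns a float32 binary edge map of the same size (1 on boundary pixels).
    """
    edges = cv2.Canny(portrait_mask, threshold1=50, threshold2=150)
    # Optionally thicken the edge so it covers a small boundary neighbourhood.
    edges = cv2.dilate(edges, np.ones((3, 3), np.uint8), iterations=1)
    return (edges > 0).astype(np.float32)
```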
Finally, the input image is concatenated with the attention map, yielding a 4-channel feature map.
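In PyTorch this concatenation is a single `torch.cat` along the channel dimension (the tensor shapes below are purely illustrative):

```python
import torch

image = torch.randn(1, 3, 512, 512)       # RGB input image
attn_map = torch.rand(1, 1, 512, 512)     # boundary attention map
fused_input = torch.cat([image, attn_map], dim=1)   # 1 x 4 x 512 x 512
```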
2.2、Loss Function
To be continued…