第一步:准备数据
人像精细分割数据,可分割出头发丝,为PPM-100开源数据
第二步:搭建模型
MODNet网络结构如图所示,主要包含3个部分:semantic estimation(S分支)、detail prediction(D分支)、semantic-detail fusion(F分支)。
网络结构简单描述一下:
输入一幅图像I,送入三个模块:S、D、F;
S模块:在低分辨率分支进行语义估计,在backbone最后一层输出接上e-ASPP得到语义feature map Sp;
D模块:在高分辨率分支进行细节预测,通过融合来自低分辨率分支的信息得到细节feature map Dp;
F模块:融合来自低分辨率分支和高分辨率分支的信息,得到alpha matte ap;
对S、D、F模块,均使用来自GT的显式监督信息进行监督训练。
第三步:代码
1)损失函数为:L2损失
2)网络代码:
import torch
import torch.nn as nn
import torch.nn.functional as F
from .backbones import SUPPORTED_BACKBONES
#------------------------------------------------------------------------------
# MODNet Basic Modules
#------------------------------------------------------------------------------
class IBNorm(nn.Module):
""" Combine Instance Norm and Batch Norm into One Layer
"""
def __init__(self, in_channels):
super(IBNorm, self).__init__()
in_channels = in_channels
self.bnorm_channels = int(in_channels / 2)
self.inorm_channels = in_channels - self.bnorm_channels
self.bnorm = nn.BatchNorm2d(self.bnorm_channels, affine=True)
self.inorm = nn.InstanceNorm2d(self.inorm_channels, affine=False)
def forward(self, x):
bn_x = self.bnorm(x[:, :self.bnorm_channels, ...].contiguous())
in_x = self.inorm(x[:, self.bnorm_channels:, ...].contiguous())
return torch.cat((bn_x, in_x), 1)
class Conv2dIBNormRelu(nn.Module):
""" Convolution + IBNorm + ReLu
"""
def __init__(self, in_channels, out_channels, kernel_size,
stride=1, padding=0, dilation=1, groups=1, bias=True,
with_ibn=True, with_relu=True):
super(Conv2dIBNormRelu, self).__init