Semantic Segmentation in Practice: Source Code for a Fine-Grained Portrait Segmentation System Based on the MODNet Neural Network

AI街潜水的八角

Last modified: 2024-11-20 09:12:09

Category: Semantic Segmentation in Practice. Tags: neural networks, artificial intelligence, deep learning

First published: 2024-11-19 19:08:10

Article link: https://blog.youkuaiyun.com/u013289254/article/details/143265167

Step 1: Prepare the data

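The data is fine-grained portrait segmentation data, detailed enough to separate individual strands of hair. It comes from the open-source PPM-100 dataset.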
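As a minimal sketch of this preparation step, the helper below pairs each portrait image with its alpha matte by file stem. The function name `pair_images_and_mattes` and the idea of two sibling folders (one for images, one for mattes) are assumptions made for illustration; adjust them to however your copy of the dataset is laid out.

```python
from pathlib import Path


def pair_images_and_mattes(image_dir, matte_dir, exts=(".jpg", ".jpeg", ".png")):
    """Return sorted (image_path, matte_path) pairs that share a file stem.

    Hypothetical helper: images without a matching matte are skipped.
    """
    # Index the mattes by stem so "0001.jpg" can match "0001.png".
    mattes = {
        p.stem: p
        for p in Path(matte_dir).iterdir()
        if p.suffix.lower() in exts
    }
    pairs = []
    for img in sorted(Path(image_dir).iterdir()):
        if img.suffix.lower() in exts and img.stem in mattes:
            pairs.append((img, mattes[img.stem]))
    return pairs
```

Calling `pair_images_and_mattes("image", "matte")` (folder names assumed) would give a list that a training loop or a `Dataset` class can iterate over.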

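Step 2: Build the model

The MODNet architecture is shown in the figure. It consists of three main parts: semantic estimation (the S branch), detail prediction (the D branch), and semantic-detail fusion (the F branch).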

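A brief description of the network structure:

An input image I is fed through three modules: S, D, and F.
S module: performs semantic estimation on the low-resolution branch; an e-ASPP is attached to the output of the backbone's last layer to produce the semantic feature map Sp.
D module: performs detail prediction on the high-resolution branch, fusing information from the low-resolution branch to produce the detail feature map Dp.
F module: fuses information from the low-resolution and high-resolution branches to produce the alpha matte ap.
All three modules (S, D, and F) are trained with explicit supervision derived from the ground truth (GT).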
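To make the data flow between the three modules concrete, here is a toy sketch. It is not the MODNet implementation: the class name `ToyThreeBranchNet`, the layer sizes, and the single strided convolution standing in for the backbone and e-ASPP are all assumptions chosen only to show which branch consumes which input.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToyThreeBranchNet(nn.Module):
    """Hypothetical stand-in for the S / D / F layout described above."""

    def __init__(self, channels=8):
        super().__init__()
        # S branch: one strided conv plays the role of backbone + e-ASPP (assumption).
        self.backbone = nn.Conv2d(3, channels, 3, stride=4, padding=1)
        self.s_head = nn.Conv2d(channels, 1, 1)
        # D branch: works at full resolution on the image plus upsampled S features.
        self.d_conv = nn.Conv2d(3 + channels, channels, 3, padding=1)
        self.d_head = nn.Conv2d(channels, 1, 1)
        # F branch: fuses low-resolution and high-resolution features.
        self.f_conv = nn.Conv2d(2 * channels, 1, 3, padding=1)

    def forward(self, img):
        h, w = img.shape[2:]
        low = F.relu(self.backbone(img))                      # low-resolution features
        pred_semantic = torch.sigmoid(self.s_head(low))       # Sp (coarse)
        low_up = F.interpolate(low, size=(h, w), mode="bilinear", align_corners=False)
        detail_feat = F.relu(self.d_conv(torch.cat([img, low_up], dim=1)))
        pred_detail = torch.sigmoid(self.d_head(detail_feat)) # Dp (fine)
        fused = torch.cat([low_up, detail_feat], dim=1)
        pred_matte = torch.sigmoid(self.f_conv(fused))        # ap (alpha matte)
        return pred_semantic, pred_detail, pred_matte


torch.manual_seed(0)
net = ToyThreeBranchNet()
image = torch.rand(1, 3, 64, 64)
pred_semantic, pred_detail, pred_matte = net(image)
# Semantic output is 4x smaller (16x16); detail and matte match the input (64x64).
print(pred_semantic.shape, pred_detail.shape, pred_matte.shape)
```

Each of the three outputs can then be compared against a target derived from the ground-truth matte, which is what the explicit supervision of S, D, and F refers to.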

Step 3: Code

1) Loss function: L2 loss
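As a minimal sketch, an L2 loss between a predicted matte and its ground truth can be written with `torch.nn.functional.mse_loss`, which averages the squared per-pixel differences. The article does not show how the loss is weighted or applied to each branch, so the function name `l2_matte_loss` and the toy tensors below are assumptions for illustration only.

```python
import torch
import torch.nn.functional as F


def l2_matte_loss(pred_matte: torch.Tensor, gt_matte: torch.Tensor) -> torch.Tensor:
    """Mean squared error between predicted and ground-truth alpha mattes."""
    return F.mse_loss(pred_matte, gt_matte)


# Toy 2x2 mattes with shape (batch, channel, height, width).
pred = torch.tensor([[[[0.0, 0.5], [1.0, 1.0]]]])
gt = torch.tensor([[[[0.0, 1.0], [1.0, 0.0]]]])
loss = l2_matte_loss(pred, gt)
# Squared errors are 0, 0.25, 0, 1, so the mean is 0.3125.
print(loss.item())
```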

2) Network code:

import torch
import torch.nn as nn
import torch.nn.functional as F

from .backbones import SUPPORTED_BACKBONES


#------------------------------------------------------------------------------
#  MODNet Basic Modules
#------------------------------------------------------------------------------

class IBNorm(nn.Module):
    """ Combine Instance Norm and Batch Norm into One Layer
    """

    def __init__(self, in_channels):
        super(IBNorm, self).__init__()
        # Split the channels: the first half goes through BatchNorm, the rest
        # (one extra channel when in_channels is odd) through InstanceNorm.
        self.bnorm_channels = in_channels // 2
        self.inorm_channels = in_channels - self.bnorm_channels

        self.bnorm = nn.BatchNorm2d(self.bnorm_channels, affine=True)
        self.inorm = nn.InstanceNorm2d(self.inorm_channels, affine=False)

    def forward(self, x):
        # Normalize each channel group separately, then rejoin along the channel axis.
        bn_x = self.bnorm(x[:, :self.bnorm_channels, ...].contiguous())
        in_x = self.inorm(x[:, self.bnorm_channels:, ...].contiguous())

        return torch.cat((bn_x, in_x), 1)


class Conv2dIBNormRelu(nn.Module):
    """ Convolution + IBNorm + ReLu
    """

    def __init__(self, in_channels, out_channels, kernel_size, 
                 stride=1, padding=0, dilation=1, groups=1, bias=True, 
                 with_ibn=True, with_relu=True):
        super(Conv2dIBNormRelu, self).__init