Image Super-Resolution Using Very Deep Residual Channel Attention Networks

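This post walks through RCAN, a deep learning model for image super-resolution. RCAN introduces a residual in residual (RIR) structure and a channel attention (CA) mechanism to extract image features effectively and rescale them adaptively. The post covers the RCAN network architecture, how the RIR module works, and how channel attention improves the model.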


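I. Contributions

The paper makes three contributions:

  • It proposes very deep residual channel attention networks (RCAN) for image super-resolution.
  • It proposes the residual in residual (RIR) structure, which uses long and short skip connections to build very deep networks that can still be trained effectively.
  • It uses channel attention (CA) to rescale features channel by channel.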

II. Method

1. Network architecture

[Figure: RCAN network architecture]

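The low-resolution image $I_{LR}$ first passes through a single convolution layer:
$$F_0 = H_{SF}(I_{LR})$$
$F_0$ is the shallow feature. It then passes through the RIR module, which extracts deep features:
$$F_{DF} = H_{RIR}(F_0)$$
Next comes the upscale module:
$$F_{UP} = H_{UP}(F_{DF})$$
followed by the reconstruction layer:
$$I_{SR} = H_{REC}(F_{UP}) = H_{RCAN}(I_{LR})$$
The network is trained with an L1 loss over $N$ training pairs:
$$L(\Theta) = \frac{1}{N}\sum_{i=1}^{N} \left\| H_{RCAN}(I_{LR}^{i}) - I_{HR}^{i} \right\|_1$$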
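
The four stages and the loss described above can be sketched in a few lines of PyTorch. This is only a rough illustration, not the authors' network: `TinySRPipeline` is a hypothetical name, the deep-feature body is a two-conv placeholder standing in for the RIR module, and the sub-pixel (`nn.PixelShuffle`) upsampler and the small sizes are assumptions made for the example.

```python
import torch
import torch.nn as nn


class TinySRPipeline(nn.Module):
    """Hypothetical stand-in for the four stages; not the authors' RCAN."""

    def __init__(self, n_colors=3, n_feat=16, scale=2):
        super().__init__()
        # Shallow feature extraction: one conv layer produces F_0.
        self.head = nn.Conv2d(n_colors, n_feat, 3, padding=1)
        # Placeholder for the RIR module (deep feature extraction).
        self.body = nn.Sequential(
            nn.Conv2d(n_feat, n_feat, 3, padding=1),
            nn.ReLU(),
            nn.Conv2d(n_feat, n_feat, 3, padding=1),
        )
        # Upscale module (assumed): expand channels, then sub-pixel shuffle.
        self.upsample = nn.Sequential(
            nn.Conv2d(n_feat, n_feat * scale ** 2, 3, padding=1),
            nn.PixelShuffle(scale),
        )
        # Reconstruction layer: one conv back to image channels.
        self.tail = nn.Conv2d(n_feat, n_colors, 3, padding=1)

    def forward(self, lr):
        f0 = self.head(lr)
        f_df = f0 + self.body(f0)  # long skip around the deep-feature body
        f_up = self.upsample(f_df)
        return self.tail(f_up)


model = TinySRPipeline(scale=2)
lr = torch.randn(4, 3, 24, 24)  # batch of low-resolution inputs
hr = torch.randn(4, 3, 48, 48)  # matching high-resolution targets
sr = model(lr)
loss = nn.L1Loss()(sr, hr)      # mean absolute error over all elements
print(sr.shape, loss.item())
```

Note that `nn.L1Loss` averages over every element by default, so it differs from a per-image L1 norm averaged over the batch only by a constant factor.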

2. Residual in Residual (RIR)

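As the architecture figure shows, RIR consists of G residual groups (RG) and a long skip connection (LSC). Each RG contains B residual channel attention blocks (RCAB) with a short skip connection (SSC).
RCAB: two convolution layers (with a ReLU in between) + CA + residual connection: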

import torch.nn as nn


## Residual Channel Attention Block (RCAB)
class RCAB(nn.Module):
    # conv: callable (in_channels, out_channels, kernel_size, bias=...) that returns a conv layer
    def __init__(
        self, conv, n_feat, kernel_size, reduction,
        bias=True, bn=False, act=nn.ReLU(True), res_scale=1):

        super(RCAB, self).__init__()
        # body: conv -> (BN) -> act -> conv -> (BN) -> channel attention
        modules_body = []
        for i in range(2):
            modules_body.append(conv(n_feat, n_feat, kernel_size, bias=bias))
            if bn: modules_body.append(nn.BatchNorm2d(n_feat))
            if i == 0: modules_body.append(act)
        modules_body.append(CALayer(n_feat, reduction))
        self.body = nn.Sequential(*modules_body)
        self.res_scale = res_scale

    def forward(self, x):
        res = self.body(x)
        # res_scale is stored but unused; the scaled variant is left commented out
        #res = self.body(x).mul(self.res_scale)
        # residual connection around the block
        res += x
        return res
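
To show how the blocks nest into groups and groups into RIR, here is a minimal, self-contained sketch. The names `Block`, `Group`, and `RIR` are hypothetical, `Block` is a compact stand-in for RCAB, the trailing convolution at the end of each group and of the RIR body is an assumption not stated in this post, and the sizes (2 groups of 3 blocks, 16 features) are chosen only to keep the example small.

```python
import torch
import torch.nn as nn


class Block(nn.Module):
    """Compact RCAB stand-in: conv -> ReLU -> conv -> channel gate, plus residual."""

    def __init__(self, n_feat, reduction):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(n_feat, n_feat, 3, padding=1),
            nn.ReLU(),
            nn.Conv2d(n_feat, n_feat, 3, padding=1),
        )
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(n_feat, n_feat // reduction, 1),
            nn.ReLU(),
            nn.Conv2d(n_feat // reduction, n_feat, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        res = self.body(x)
        return x + res * self.gate(res)


class Group(nn.Module):
    """Residual group: B blocks, a trailing conv (assumed), and a short skip."""

    def __init__(self, n_feat, reduction, n_blocks):
        super().__init__()
        layers = [Block(n_feat, reduction) for _ in range(n_blocks)]
        layers.append(nn.Conv2d(n_feat, n_feat, 3, padding=1))
        self.body = nn.Sequential(*layers)

    def forward(self, x):
        return x + self.body(x)  # short skip connection


class RIR(nn.Module):
    """Residual in residual: G groups, a trailing conv (assumed), and a long skip."""

    def __init__(self, n_feat=16, reduction=4, n_groups=2, n_blocks=3):
        super().__init__()
        layers = [Group(n_feat, reduction, n_blocks) for _ in range(n_groups)]
        layers.append(nn.Conv2d(n_feat, n_feat, 3, padding=1))
        self.body = nn.Sequential(*layers)

    def forward(self, x):
        return x + self.body(x)  # long skip connection


rir = RIR()
f0 = torch.randn(1, 16, 12, 12)  # shallow features
f_df = rir(f0)                   # deep features, same shape as the input
print(f_df.shape)
```

Because every level adds its input back to its output, the feature shape never changes inside RIR, which is what lets the groups and blocks be stacked to an arbitrary depth.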

3. Channel attention (CA)

Channel Attention (CA) Layer
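A channel attention mechanism has to meet two requirements:

  • First, it must be able to learn nonlinear interactions between channels.
  • Second, since multiple channel-wise features can be emphasized at once (as opposed to a one-hot activation), it must learn a non-mutually-exclusive relationship. In practice this means gating each channel independently, for example with a sigmoid, rather than using a softmax that forces the channels to compete.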
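
The second requirement is easier to see with numbers. The sketch below is plain Python with made-up per-channel scores (`logits` is a hypothetical input) and compares an independent sigmoid gate with a softmax: the sigmoid can give two channels a weight near 1 at the same time, while the softmax weights must sum to 1 and therefore split the emphasis.

```python
import math


def sigmoid_gate(logits):
    # Each channel gets its own weight in (0, 1), independent of the others.
    return [1.0 / (1.0 + math.exp(-v)) for v in logits]


def softmax_gate(logits):
    # Weights compete: they are normalized so that they sum to 1.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


logits = [4.0, 4.0, -4.0]  # hypothetical per-channel scores
sig = sigmoid_gate(logits)
soft = softmax_gate(logits)
print([round(v, 3) for v in sig])
print([round(v, 3) for v in soft])
```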

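First, global average pooling squeezes each channel into a single value:
$$z_c = H_{GP}(x_c) = \frac{1}{H \times W}\sum_{i=1}^{H}\sum_{j=1}^{W} x_c(i, j)$$
A gating mechanism then produces the channel weights:
$$s = f\left(W_U\,\delta(W_D z)\right)$$

where $f(\cdot)$ is the sigmoid gate and $\delta(\cdot)$ is ReLU. $W_D$ is a convolution layer that downscales the channels by the reduction ratio $r$, and $W_U$ is a convolution layer that upscales them again by the same ratio $r$.

Finally, the features are rescaled channel by channel:
$$\hat{x}_c = s_c \cdot x_c$$
The code is as follows: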

class CALayer(nn.Module):
    def __init__(self, channel, reduction=16):
        super(CALayer, self).__init__()
        # global average pooling: feature --> point
        self.avg_pool = nn.AdaptiveAvgPool2d(1) 
        # feature channel downscale and upscale --> channel weight
        self.conv_du = nn.Sequential(
                nn.Conv2d(channel, channel // reduction, 1, padding=0, bias=True),  # channel downscaling: C -> C/r
                nn.ReLU(inplace=True),
                nn.Conv2d(channel // reduction, channel, 1, padding=0, bias=True),  # channel upscaling: C/r -> C
                nn.Sigmoid()
        )

    def forward(self, x):
        y = self.avg_pool(x)
        y = self.conv_du(y)
        return x * y
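
To make the tensor shapes in the layer above concrete, the same three steps (pool, gate, rescale) can be written out with plain tensor operations. This is a sketch under assumptions: `w_down` and `w_up` are random stand-ins for the weights of the two 1x1 convolutions, biases are omitted, and the sizes (8 channels, reduction 4) are arbitrary.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
channels, reduction = 8, 4
x = torch.randn(2, channels, 5, 5)  # (N, C, H, W)

# Hypothetical weights of the two 1x1 convs (W_D and W_U), biases omitted.
w_down = torch.randn(channels // reduction, channels, 1, 1)
w_up = torch.randn(channels, channels // reduction, 1, 1)

# Step 1: global average pooling, one value per channel -> (N, C, 1, 1)
z = x.mean(dim=(2, 3), keepdim=True)

# Step 2: gate = sigmoid(W_U * relu(W_D * z)) -> (N, C, 1, 1)
s = torch.sigmoid(F.conv2d(F.relu(F.conv2d(z, w_down)), w_up))

# Step 3: rescale each channel; s broadcasts over H and W
out = x * s

print(z.shape, s.shape, out.shape)
```

The gate has one scalar per channel per image, so every spatial position in a channel is multiplied by the same weight.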