[IQA Technical Series] GMSD Code Walkthrough

Latest recommended article published on 2025-12-16 22:21:16

License: CC 4.0 BY-SA

Tags: #python #IQA #full-reference image quality assessment

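This post walks through the code of the GMSD image quality assessment metric. For a reading of the original paper, see the companion post on the GMSD paper.
The code discussed here comes from the IQA-Pytorch project.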

1. Paper Overview

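Existing FR-IQA models such as FSIM are accurate but computationally expensive, while simpler models such as SSIM are cheap to compute but less accurate. The authors set out to propose an FR-IQA model that combines high prediction accuracy with high computational efficiency. The metric is computed in two steps: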

Local Quality Computation: describes the quality of each local region. Gradients are extracted with 3×3 Prewitt filters:

$\mathbf{h}_x = \begin{bmatrix} 1/3 & 0 & -1/3 \\ 1/3 & 0 & -1/3 \\ 1/3 & 0 & -1/3 \end{bmatrix}, \quad \mathbf{h}_y = \begin{bmatrix} 1/3 & 1/3 & 1/3 \\ 0 & 0 & 0 \\ -1/3 & -1/3 & -1/3 \end{bmatrix}$

Once the gradients in the x and y directions are available, the gradient magnitudes are:

$\mathbf{m}_r(i) = \sqrt{( \mathbf{r} \otimes \mathbf{h}_x )^2 (i) + ( \mathbf{r} \otimes \mathbf{h}_y )^2 (i)}$

$\mathbf{m}_d(i) = \sqrt{( \mathbf{d} \otimes \mathbf{h}_x )^2 (i) + ( \mathbf{d} \otimes \mathbf{h}_y )^2 (i)}$

Here $\mathbf{m}_r$ and $\mathbf{m}_d$ are the gradient magnitudes of the reference image $\mathbf{r}$ and the distorted image $\mathbf{d}$. The local quality estimate is the similarity between the two magnitudes:

$GMS(i) = \frac{2\mathbf{m}_r(i)\mathbf{m}_d(i) + c}{\mathbf{m}_r^2(i) + \mathbf{m}_d^2(i) + c}$

where $c$ is a constant that keeps the ratio numerically stable.

Pooling Strategy: the Gradient Magnitude Similarity Deviation (GMSD) pools the local quality map into a global image quality score. The computation relies on the Gradient Magnitude Similarity Mean (GMSM). The two are defined as:

$GMSM = \frac{1}{N} \sum_{i=1}^{N} GMS(i)$

$GMSD = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \big(GMS(i) - GMSM\big)^2}$

The authors also illustrate the advantage of GMSD over GMSM with a figure:

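In the figure, (a) shows the original image "Fishing", its version corrupted by Gaussian noise (DMOS = 0.4403; GMSM = 0.8853; GMSD = 0.1420), and their gradient similarity map. (b) shows the original image "Flower", its blurred version (DMOS = 0.7785; GMSM = 0.8745; GMSD = 0.1946), and their gradient similarity map. According to the subjective DMOS, where a lower value means less perceived degradation, "Fishing" has much higher quality than "Flower". Their GMSM values are nevertheless almost the same, while their GMSD values differ clearly, so GMSD separates images of different quality more effectively.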
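
The difference between the two pooling formulas can be checked on a toy case. The snippet below is a minimal sketch in plain Python, not code from IQA-Pytorch: the helper names `gmsm` and `gmsd` and both GMS maps are made up for illustration, with values chosen so that the two maps share the same mean.

```python
import math


def gmsm(gms_map):
    """Mean pooling: (1/N) * sum of GMS(i)."""
    return sum(gms_map) / len(gms_map)


def gmsd(gms_map):
    """Deviation pooling: sqrt((1/N) * sum of (GMS(i) - GMSM)^2)."""
    mean = gmsm(gms_map)
    return math.sqrt(sum((v - mean) ** 2 for v in gms_map) / len(gms_map))


# Hypothetical flattened GMS maps (illustrative values, not from the paper).
uniform_map = [0.9, 0.9, 0.9, 0.9]  # mild degradation spread evenly
uneven_map = [1.0, 0.8, 1.0, 0.8]   # same mean, degradation concentrated locally

for name, gms_map in [("uniform", uniform_map), ("uneven", uneven_map)]:
    print(f"{name}: GMSM={gmsm(gms_map):.4f}  GMSD={gmsd(gms_map):.4f}")
```

Both maps have a GMSM of 0.9, but GMSD is 0 for the first map and 0.1 for the second. Mean pooling cannot tell the two apart, whereas deviation pooling responds to how unevenly the degradation is distributed.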

2. Code Structure

The implementation lives in pyiqa/archs/gmsd_arch.py.

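3. Core Code Modules

`GMSD` class

This class takes the configuration parameters and forwards the call to the function that does the computation.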

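# Imports needed by this excerpt (module paths assumed from the IQA-Pytorch package layout)
import torch
import torch.nn as nn
import torch.nn.functional as F

from pyiqa.utils.color_util import to_y_channel
from pyiqa.utils.registry import ARCH_REGISTRY


@ARCH_REGISTRY.register()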
class GMSD(nn.Module):
    r"""Gradient Magnitude Similarity Deviation Metric.
    Args:
        - channels: Number of channels.
        - test_y_channel: bool, whether to use y channel on ycbcr.
    Reference:
        Xue, Wufeng, Lei Zhang, Xuanqin Mou, and Alan C. Bovik.
        "Gradient magnitude similarity deviation: A highly efficient
        perceptual image quality index." IEEE Transactions on Image
        Processing 23, no. 2 (2013): 684-695.
    """

    def __init__(self, channels: int = 3, test_y_channel: bool = True) -> None:
        super(GMSD, self).__init__()
        self.channels = channels
        self.test_y_channel = test_y_channel

    def forward(self, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        r"""Args:
        x: A distortion tensor. Shape :math:`(N, C, H, W)`.
        y: A reference tensor. Shape :math:`(N, C, H, W)`.
        Order of input is important.
        """
        assert x.shape == y.shape, (
            f'Input and reference images should have the same shape, but got {x.shape} and {y.shape}'
        )
        score = gmsd(x, y, channels=self.channels, test_y_channel=self.test_y_channel)

        return score

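The class takes two parameters: the number of input channels, and a flag that selects whether the metric is evaluated on the Y channel only. `forward` checks that both inputs have the same shape and then calls the `gmsd` function to obtain the score.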
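
To see why evaluating on the Y channel reduces the problem to a single channel, here is a minimal sketch of a luma conversion in plain Python. It assumes the common BT.601 luma weights (0.299, 0.587, 0.114) and a [0, 255] output range. The helpers `rgb_to_luma` and `image_to_luma` are hypothetical, and the conversion actually performed by `to_y_channel` in IQA-Pytorch may differ in coefficients, offset, or rounding.

```python
def rgb_to_luma(pixel):
    """Map one (R, G, B) pixel in [0, 1] to a luma value in [0, 255].

    Assumption: BT.601 luma weights. This is an illustration only and is
    not guaranteed to match to_y_channel in IQA-Pytorch.
    """
    r, g, b = pixel
    return 255.0 * (0.299 * r + 0.587 * g + 0.114 * b)


def image_to_luma(rgb_image):
    """Convert an H x W grid of RGB pixels into an H x W single-channel grid."""
    return [[rgb_to_luma(px) for px in row] for row in rgb_image]


# A 2 x 2 RGB image: white, black, pure red, pure green.
rgb_image = [
    [(1.0, 1.0, 1.0), (0.0, 0.0, 0.0)],
    [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
]
luma = image_to_luma(rgb_image)
print(luma)
```

After such a conversion every pixel holds one value instead of three, which is why a Y-channel evaluation only needs a single copy of each gradient filter.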

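`gmsd` function

This function performs the actual computation. It optionally converts both inputs to the Y channel, downsamples them by a factor of 2 with a 2×2 averaging filter, computes the Prewitt gradient magnitudes, builds the GMS map with the stability constant T = 170 (the constant $c$ in the formula), and pools the map with the standard deviation.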

def gmsd(
    x: torch.Tensor,
    y: torch.Tensor,
    T: int = 170,
    channels: int = 3,
    test_y_channel: bool = True,
) -> torch.Tensor:
    r"""GMSD metric.
    Args:
        - x: A distortion tensor. Shape :math:`(N, C, H, W)`.
        - y: A reference tensor. Shape :math:`(N, C, H, W)`.
        - T: A positive constant that supplies numerical stability.
        - channels: Number of channels.
        - test_y_channel: bool, whether to use y channel on ycbcr.
    """
    if test_y_channel:
        x = to_y_channel(x, 255)
        y = to_y_channel(y, 255)
        channels = 1
    else:
        x = x * 255.0
        y = y * 255.0

    dx = (
        (torch.Tensor([[1, 0, -1], [1, 0, -1], [1, 0, -1]]) / 3.0)
        .unsqueeze(0)
        .unsqueeze(0)
        .repeat(channels, 1, 1, 1)
        .to(x)
    )
    dy = (
        (torch.Tensor([[1, 1, 1], [0, 0, 0], [-1, -1, -1]]) / 3.0)
        .unsqueeze(0)
        .unsqueeze(0)
        .repeat(channels, 1, 1, 1)
        .to(x)
    )
    aveKernel = torch.ones(channels, 1, 2, 2).to(x) / 4.0

    Y1 = F.conv2d(x, aveKernel, stride=2, padding=0, groups=channels)
    Y2 = F.conv2d(y, aveKernel, stride=2, padding=0, groups=channels)

    IxY1 = F.conv2d(Y1, dx, stride=1, padding=1, groups=channels)
    IyY1 = F.conv2d(Y1, dy, stride=1, padding=1, groups=channels)
    gradientMap1 = torch.sqrt(IxY1**2 + IyY1**2 + 1e-12)

    IxY2 = F.conv2d(Y2, dx, stride=1, padding=1, groups=channels)
    IyY2 = F.conv2d(Y2, dy, stride=1, padding=1, groups=channels)
    gradientMap2 = torch.sqrt(IxY2**2 + IyY2**2 + 1e-12)

    quality_map = (2 * gradientMap1 * gradientMap2 + T) / (
        gradientMap1**2 + gradientMap2**2 + T
    )
    score = torch.std(quality_map.view(quality_map.shape[0], -1), dim=1)

    return score
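
The steps of `gmsd` can be traced on a tiny grayscale image without PyTorch. The following is a minimal sketch in plain Python, not the library code: the helper names (`downsample_2x2`, `correlate3x3`, `gradient_magnitude`, `toy_gmsd`) and the test images are made up for illustration, pixel values are assumed to be already in [0, 255], and the final pooling uses the 1/N form of the formula from section 1.

```python
import math

PREWITT_X = [[1 / 3, 0.0, -1 / 3], [1 / 3, 0.0, -1 / 3], [1 / 3, 0.0, -1 / 3]]
PREWITT_Y = [[1 / 3, 1 / 3, 1 / 3], [0.0, 0.0, 0.0], [-1 / 3, -1 / 3, -1 / 3]]


def downsample_2x2(img):
    """Average non-overlapping 2x2 blocks (stride 2, no padding)."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [
        [
            (img[2 * i][2 * j] + img[2 * i][2 * j + 1]
             + img[2 * i + 1][2 * j] + img[2 * i + 1][2 * j + 1]) / 4.0
            for j in range(w)
        ]
        for i in range(h)
    ]


def correlate3x3(img, kernel):
    """3x3 cross-correlation with zero padding of 1 (output keeps the input size)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(3):
                for dj in range(3):
                    r, c = i + di - 1, j + dj - 1
                    if 0 <= r < h and 0 <= c < w:  # outside pixels count as zero
                        acc += img[r][c] * kernel[di][dj]
            out[i][j] = acc
    return out


def gradient_magnitude(img):
    """Per-pixel sqrt(gx^2 + gy^2) from the two Prewitt responses."""
    gx = correlate3x3(img, PREWITT_X)
    gy = correlate3x3(img, PREWITT_Y)
    return [[math.sqrt(a * a + b * b) for a, b in zip(ra, rb)] for ra, rb in zip(gx, gy)]


def toy_gmsd(dist, ref, T=170.0):
    """Downsample, take gradient magnitudes, build the GMS map, pool with the std."""
    m_d = gradient_magnitude(downsample_2x2(dist))
    m_r = gradient_magnitude(downsample_2x2(ref))
    gms = [
        (2 * a * b + T) / (a * a + b * b + T)
        for row_d, row_r in zip(m_d, m_r)
        for a, b in zip(row_d, row_r)
    ]
    mean = sum(gms) / len(gms)
    # Population standard deviation (1/N), as in the GMSD formula.
    return math.sqrt(sum((v - mean) ** 2 for v in gms) / len(gms))


# Hypothetical 8x8 test images: a vertical step edge and a flat gray image.
ref = [[0.0] * 4 + [255.0] * 4 for _ in range(8)]
dist = [[128.0] * 8 for _ in range(8)]

print("identical images:", toy_gmsd(ref, ref))
print("edge vs. flat   :", toy_gmsd(dist, ref))
```

Identical inputs give a GMS map of all ones and therefore a score of 0, while the flat image scores above 0 against the edge image. One difference from the listing above is worth keeping in mind: `torch.std` applies Bessel's correction (division by N - 1) by default, whereas this sketch divides by N. For maps with many pixels the two results are very close.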