[PyTorch] Visualization of Feature Maps (4): Saliency Maps

Saliency Maps in PyTorch: Principle and Visualization

License: CC 4.0 BY-SA

Tags: #pytorch #artificial intelligence #python #saliency maps

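This article shows how to implement saliency maps in PyTorch. A saliency map visualizes a deep image classification model by computing how strongly each pixel influences the classification score. The article walks through the full process, from loading the data to generating and displaying the saliency maps, including preprocessing, the model computation, and visualization examples.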

References used for this write-up:

Principle and a simple implementation of Saliency Maps (in PyTorch)
https://github.com/wmn7/ML_Practice/tree/master/2019_07_08/Saliency%20Maps

How Saliency Maps Work

"Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps" (arXiv, 2013)

A saliency map tells us the degree to which each pixel in the image affects the classification score for that image.
To compute it, we compute the gradient of the unnormalized score corresponding to the correct class (which is a scalar)
with respect to the pixels of the image. If the image has shape (3, H, W) then this gradient will also have shape (3, H, W);
for each pixel in the image, this gradient tells us the amount by which the classification score will change if the pixel
changes by a small amount. To compute the saliency map, we take the absolute value of this gradient, then take the maximum value over the 3 input channels; the final saliency map thus has shape (H, W) and all entries are non-negative.
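
The definition quoted above can be written compactly. The notation here is an assumption made for illustration and is not taken from the source: $I$ is the image with shape $(3, H, W)$, $y$ is the correct class, and $s_y(I)$ is the unnormalized score for that class.

```latex
% First-order view: a small perturbation \delta of the image changes the score by
s_y(I + \delta) \approx s_y(I) + \sum_{c,h,w} \frac{\partial s_y(I)}{\partial I_{c,h,w}} \, \delta_{c,h,w}

% Saliency map: absolute gradient, then maximum over the three color channels
M_{h,w} = \max_{c \in \{1,2,3\}} \left| \frac{\partial s_y(I)}{\partial I_{c,h,w}} \right|,
\qquad M \in \mathbb{R}_{\ge 0}^{H \times W}
```

The first line is why the gradient measures sensitivity: a pixel with a large partial derivative changes the score more for the same small change in value.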

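In other words, a saliency map measures how each pixel of the image influences the classifier, or equivalently, which pixels the classifier treats as important.

We compute the gradient of the score with respect to every pixel of the image. If the image has shape (3, H, W), the gradient also has shape (3, H, W); for each pixel,
the gradient tells us how much the correct-class score changes when that pixel changes slightly.

To compute the saliency map, take the absolute value of the gradient and then the maximum over the three color channels.

The resulting saliency map therefore has shape (H, W), i.e. it is a single-channel grayscale image.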
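
The steps above can be sketched in a few lines of PyTorch. This is a minimal illustration under stated assumptions, not the article's own listing: the function name `compute_saliency_maps` and the tiny `toy_model` are hypothetical stand-ins, and random tensors replace real images so the snippet runs without any downloads.

```python
import torch
import torch.nn as nn


def compute_saliency_maps(X, y, model):
    """Return saliency maps of shape (N, H, W) for images X of shape (N, 3, H, W)."""
    model.eval()  # inference behaviour for layers such as dropout
    # Work on a copy that tracks gradients with respect to the pixels.
    X = X.clone().detach().requires_grad_(True)

    scores = model(X)  # (N, num_classes), unnormalized scores
    # Pick the score of the correct class for every image: shape (N,)
    correct = scores.gather(1, y.view(-1, 1)).squeeze(1)
    # Summing allows a single backward pass; for this model each image's
    # score depends only on that image's own pixels.
    correct.sum().backward()

    # Absolute value of the gradient, then maximum over the 3 color channels.
    saliency, _ = X.grad.abs().max(dim=1)
    return saliency


# Toy stand-in classifier (hypothetical; a real run would use a pretrained network).
torch.manual_seed(0)
toy_model = nn.Sequential(
    nn.Conv2d(3, 4, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(4, 10),
)
X = torch.randn(2, 3, 8, 8)   # two random "images"
y = torch.tensor([1, 7])      # arbitrary "correct" labels
saliency = compute_saliency_maps(X, y, toy_model)
print(saliency.shape)  # torch.Size([2, 8, 8])
```

Each returned map can be passed to `plt.imshow` as a grayscale image; with a real classifier, bright regions mark the pixels whose change affects the correct-class score most.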

Now for the code. First load some data: the file imagenet_val_25.npz from the cs231n assignments, which contains 25 images from the ImageNet validation set.

import torch
import torchvision
import torchvision.transforms as T
import matplotlib.pyplot as plt
import numpy as np
import os
from PIL import Image

SQUEEZENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
SQUEEZENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def load_imagenet_val(num=None):
    """Load a handful of validation images from ImageNet.
    Inputs:
    - num: Number of images to load (max of 25)
    Returns:
    - X: numpy array with shape [num, 224, 224, 3]
    - y: numpy array of integer image labels, shape [num]
    - class_names: dict mapping integer label to class name
    """
    imagenet_fn = 'imagenet_val_25.npz'
    if not os.path.isfile(imagenet_fn):
        print('file %s not found' % imagenet_fn)
        print('Run the following:')
        print('cd cs231n/datasets')
        print('bash get_imagenet_val.sh')
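
The listing above breaks off at this point. As a hedged sketch of what the loading step might look like, the helper below reads an `.npz` archive and returns the three values described in the docstring. The function name `load_image_archive` and the archive keys `X`, `y` and `label_map` are assumptions for illustration; the source does not show them. The demo writes a small synthetic archive so it runs without the real dataset.

```python
import os
import tempfile

import numpy as np


def load_image_archive(path, num=None):
    """Hypothetical helper: load images, labels and class names from an .npz file."""
    if not os.path.isfile(path):
        raise FileNotFoundError('file %s not found' % path)
    # allow_pickle is required because the label map is assumed to be a pickled dict.
    with np.load(path, allow_pickle=True) as f:
        X = f['X']                            # assumed key: images, [N, 224, 224, 3]
        y = f['y']                            # assumed key: integer labels, [N]
        class_names = f['label_map'].item()   # assumed key: dict label -> name
    if num is not None:
        X, y = X[:num], y[:num]
    return X, y, class_names


# Demo with a synthetic archive (not the real imagenet_val_25.npz).
demo_path = os.path.join(tempfile.mkdtemp(), 'demo_images.npz')
np.savez(
    demo_path,
    X=np.zeros((5, 224, 224, 3), dtype=np.uint8),
    y=np.arange(5),
    label_map={i: 'class_%d' % i for i in range(5)},
)
X, y, class_names = load_image_archive(demo_path, num=3)
print(X.shape, y.shape, class_names[0])
```

Only load pickled archives from a source you trust, since `allow_pickle=True` can execute arbitrary code stored in the file.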
