Pixel shuffle in the ESPCN super-resolution network: several ways to code it (TensorFlow, PyTorch)

Serrie.

License: CC 4.0 BY-SA

Article link: https://blog.youkuaiyun.com/qq249356520/article/details/95318645

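Method 1:

The source code comes from GitHub: https://github.com/JuheonYi/VESPCN-tensorflow (the ESPCN part).

First, a quick look at how the ESPCN network is built: conv -- conv -- conv -- ps (the code below also appends a final 1x1 convolution).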

    # TensorFlow 1.x API (tf.layers, tf.contrib); assumes `import tensorflow as tf`.
    def network(self, LR):
        # CONV_1: 5x5 convolution, 64 feature maps
        feature_tmp = tf.layers.conv2d(LR, 64, 5, strides=1, padding='SAME', name='CONV_1',
                                kernel_initializer=tf.contrib.layers.xavier_initializer(), reuse=tf.AUTO_REUSE)
        feature_tmp = tf.nn.relu(feature_tmp)

        # CONV_2: 3x3 convolution, 32 feature maps
        feature_tmp = tf.layers.conv2d(feature_tmp, 32, 3, strides=1, padding='SAME', name='CONV_2',
                                kernel_initializer=tf.contrib.layers.xavier_initializer(), reuse=tf.AUTO_REUSE)
        feature_tmp = tf.nn.relu(feature_tmp)

        # CONV_3: 3x3 convolution producing channels * scale * scale feature maps for the shuffle
        feature_out = tf.layers.conv2d(feature_tmp, self.channels*self.scale*self.scale, 3, strides=1, padding='SAME',
                            name='CONV_3', kernel_initializer = tf.contrib.layers.xavier_initializer())

        # Pixel shuffle: trade the scale * scale channels for spatial resolution
        feature_out = PS(feature_out, self.scale, color=False)

        # CONV_OUT: 1x1 convolution down to a single output channel
        feature_out = tf.layers.conv2d(feature_out, 1, 1, strides=1, padding='SAME',
                        name = 'CONV_OUT', kernel_initializer=tf.contrib.layers.xavier_initializer(), reuse=tf.AUTO_REUSE)
        return feature_out

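The PS operation here is the pixel shuffle.

What PS does: it rearranges H * W * (C * r * r) ==> rH * rW * C, that is, it trades the r * r factor in the channel dimension for spatial resolution, enlarging the feature map from H * W to rH * rW.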
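To make the rearrangement concrete, here is a minimal sketch of the index mapping. It assumes the channel layout c * r * r + p * r + q, where (p, q) is the position inside each r x r output block; `source_of` is a hypothetical helper name used only for this illustration.

```python
def source_of(y, x, c, r):
    """Return the (row, col, channel) of the H x W x (C*r*r) input
    that feeds output pixel (y, x) of output channel c."""
    return y // r, x // r, c * r * r + (y % r) * r + (x % r)

r = 2
# The top-left 2x2 block of the output is filled from input pixel (0, 0),
# using its channels 0..3 in row-major order.
for y in range(2):
    for x in range(2):
        print((y, x), "<-", source_of(y, x, 0, r))
# (0, 0) <- (0, 0, 0)
# (0, 1) <- (0, 0, 1)
# (1, 0) <- (0, 0, 2)
# (1, 1) <- (0, 0, 3)
```

In other words, each low-resolution pixel expands into an r x r block, and the r * r channels of that pixel supply the values of the block.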

def PS(X, r, color=False):
    #print("Input X shape:",X.get_shape(),"scale:",r)
    if color:
        Xc = tf.split(X, 3, 3)
        X = tf.concat([_phase_shift(x, r) for x in Xc], 3)   # each x in Xc has r * r channels; every color channel is shuffled separately
    else:
        X = _phase_shift_1dim(X, r)
    #print("output X shape:",X.get_shape())
    return X

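For tf.split, see the TensorFlow API docs: https://www.tensorflow.org/api_docs/python/tf/split, or simply search for it.

In short, the split yields Xc: three tensors (one per color channel), each of shape H * W * (r * r). Each of them is then processed in turn, mixing (shuffling) the r * r values into H and W.

In detail: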

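def _phase_shift(I, r):
    # I is NHWC with r * r channels: [bsize, h, w, r * r]
    bsize, h, w, c = I.get_shape().as_list()
    bsize = tf.shape(I)[0]  # dynamic batch size, in case the static one is None
    X = tf.reshape(I, (bsize, h, w, r, r))
    X = tf.split(X, h, 1)   # split along the h axis into h pieces of size 1
    # tf.squeeze drops the size-1 axis; concatenating the h pieces on axis 2 merges h with the first r axis
    X = tf.concat([tf.squeeze(x, axis=1) for x in X], 2)  # now [bsize, w, h * r, r]
    X = tf.split(X, w, 1)   # split along the w axis into w pieces of size 1
    X = tf.concat([tf.squeeze(x, axis=1) for x in X], 2)  # now [bsize, h * r, w * r]

    return tf.reshape(X, (bsize, h * r, w * r, 1))  # final shape



def _phase_shift_1dim(I, r):
    # Single-channel variant: same steps, output is [bsize, h * r, w * r, 1]
    bsize, h, w, c = I.get_shape().as_list()
    bsize = tf.shape(I)[0]

    X = tf.reshape(I, (bsize, h, w, r, r))
    X = tf.split(X, h, 1)   # axis 1 has length h, so split into h pieces
    X = tf.concat([tf.squeeze(x, axis=1) for x in X], 2)  # [bsize, w, h * r, r]
    X = tf.split(X, w, 1)   # axis 1 now has length w
    X = tf.concat([tf.squeeze(x, axis=1) for x in X], 2)  # [bsize, h * r, w * r]

    return tf.reshape(X, (bsize, h * r, w * r, 1))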

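The key steps are split and concat: together they take the pixels apart and put them back together, so that a spatial dimension of length a becomes r * a, and likewise the other dimension b becomes r * b.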
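The split / squeeze / concat sequence can be checked outside TensorFlow. The following is a rough NumPy mirror of those steps, not the repository's code: `phase_shift_np` and `ps_color_np` are hypothetical names, and a and b stand for the two spatial dimensions of the input.

```python
import numpy as np

def phase_shift_np(I, r):
    """NumPy mirror of the split/squeeze/concat steps for one group of r*r channels."""
    bsize, a, b, _ = I.shape
    X = I.reshape(bsize, a, b, r, r)
    X = np.split(X, a, axis=1)                                      # a pieces of (bsize, 1, b, r, r)
    X = np.concatenate([np.squeeze(x, axis=1) for x in X], axis=2)  # (bsize, b, a*r, r)
    X = np.split(X, b, axis=1)                                      # b pieces of (bsize, 1, a*r, r)
    X = np.concatenate([np.squeeze(x, axis=1) for x in X], axis=2)  # (bsize, a*r, b*r)
    return X.reshape(bsize, a * r, b * r, 1)

def ps_color_np(X, r):
    """Mirror of the color branch: shuffle each of the 3 channel groups, then re-stack them."""
    groups = np.split(X, 3, axis=3)
    return np.concatenate([phase_shift_np(g, r) for g in groups], axis=3)

r, a, b = 2, 3, 5                      # deliberately non-square
I = np.random.rand(1, a, b, r * r)
out = phase_shift_np(I, r)

# Every output pixel should come from channel p*r + q of input pixel (i, j)
for i in range(a):
    for j in range(b):
        for p in range(r):
            for q in range(r):
                assert out[0, i * r + p, j * r + q, 0] == I[0, i, j, p * r + q]

print(out.shape)                                              # (1, 6, 10, 1)
print(ps_color_np(np.zeros((1, a, b, 3 * r * r)), r).shape)   # (1, 6, 10, 3)
```

Using a non-square input here is intentional: it makes it visible that the first split must use the length of axis 1 and the second split the length of the other spatial axis.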

Method 2:

From: https://github.com/drakelevy/ESPCN-TensorFlow

The shuffle operation is as follows:

import numpy as np

def shuffle(input_image, ratio):
    # input_image: (H, W, channels * ratio * ratio) -> output: (H * ratio, W * ratio, channels)
    shape = input_image.shape
    height = int(shape[0]) * ratio
    width = int(shape[1]) * ratio
    channels = int(shape[2]) // ratio // ratio
    shuffled = np.zeros((height, width, channels), dtype=np.uint8)
    for i in range(0, height):
        for j in range(0, width):
            for k in range(0, channels):
                # every output pixel is a stack of channels (three for an RGB image)
                shuffled[i,j,k] = input_image[i // ratio, j // ratio, k * ratio * ratio + (i % ratio) * ratio + (j % ratio)]
    return shuffled

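Simple and brute-force: the new image is assembled directly from the input, with every output pixel set individually. Think of it in plain Python terms: a three-dimensional array, indexed along each of its axes in turn.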
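The triple loop is easy to read but slow on large images. As a sketch, the same mapping can be written with one reshape and one transpose; `shuffle_vectorized` is a hypothetical name, and the function assumes the same channel ordering as the loop (k * ratio * ratio + (i % ratio) * ratio + (j % ratio)) and the same uint8 output.

```python
import numpy as np

def shuffle_vectorized(input_image, ratio):
    """Loop-free sketch of the same pixel shuffle on an (H, W, C*ratio*ratio) array."""
    h, w, c_in = input_image.shape
    channels = c_in // (ratio * ratio)
    # Split the channel axis into (k, row offset, column offset)
    x = input_image.reshape(h, w, channels, ratio, ratio)
    # Reorder to (h, row offset, w, column offset, k)
    x = x.transpose(0, 3, 1, 4, 2)
    # Merge (h, row offset) and (w, column offset) into the enlarged spatial axes
    return x.reshape(h * ratio, w * ratio, channels).astype(np.uint8)

img = np.arange(3 * 4 * 12).reshape(3, 4, 12)   # H=3, W=4, 3 channels * 2 * 2
print(shuffle_vectorized(img, 2).shape)         # (6, 8, 3)
```

Both versions produce the same array for integer-valued inputs in the 0..255 range; the vectorized one simply leaves the index arithmetic to NumPy.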

In PyTorch, an official pixel shuffle module is provided:

CLASS torch.nn.PixelShuffle(upscale_factor)
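A short usage sketch, assuming PyTorch is installed. Note that PyTorch tensors are channels-first, so the input is (N, C * r * r, H, W) and the output is (N, C, H * r, W * r), unlike the channels-last TensorFlow code above.

```python
import torch
import torch.nn as nn

r = 2
ps = nn.PixelShuffle(upscale_factor=r)

# N=1, C*r*r=4, H=2, W=2
x = torch.arange(1 * 4 * 2 * 2, dtype=torch.float32).reshape(1, 4, 2, 2)
y = ps(x)
print(y.shape)  # torch.Size([1, 1, 4, 4])

# Output pixel (1, 2) comes from input pixel (1 // r, 2 // r),
# channel (1 % r) * r + (2 % r)
assert y[0, 0, 1, 2] == x[0, (1 % r) * r + (2 % r), 1 // r, 2 // r]
```

The module has no learnable parameters; it only rearranges elements, so it can be placed directly after the last convolution of an ESPCN-style network.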