tf.nn.conv2d

shirleycyy

License: CC 4.0 BY-SA

Category: Tensorflow. Tags: tensorflow

Article link: https://blog.youkuaiyun.com/shirleycyy/article/details/79608599

Experiment 1: the input is a 4*4 image with 1 channel, the kernel is 2*2, the padding mode is VALID, and the output is a 3*3 image.

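import tensorflow as tf  # TensorFlow 1.x API (tf.Session)

# 4*4 single-channel image, written as a (1, 4, 4) constant
a = tf.constant([
    [[1.0, 2.0, 3.0, 4.0],
     [5.0, 6.0, 7.0, 8.0],
     [8.0, 7.0, 6.0, 5.0],
     [4.0, 3.0, 2.0, 1.0]]
])
print(a.shape)
# conv2d expects the input as [batch, height, width, in_channels]
a = tf.reshape(a, [1, 4, 4, 1])
print(a.shape)
# 2*2 kernel, written as a (1, 2, 2) constant
b = tf.constant([[[1.0, 4.0],
                  [2.0, 3.0]]])
print(b.shape)
# conv2d expects the filter as [height, width, in_channels, out_channels]
b = tf.reshape(b, [2, 2, 1, 1])
print(b.shape)
output = tf.nn.conv2d(a, b, strides=[1, 1, 1, 1], padding='VALID')
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    image = sess.run(a)
    print(image)
    kernel = sess.run(b)
    print(kernel)
    print(sess.run(output))
    print(output.shape)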

Start with an image:

1	2	3	4
5	6	7	8
8	7	6	5
4	3	2	1

The shape of this image is (1,4,4).

Reshape the image into a 4-D tensor of shape (1,4,4,1).

It prints as

[[[[ 1.]
   [ 2.]
   [ 3.]
   [ 4.]]

  [[ 5.]
   [ 6.]
   [ 7.]
   [ 8.]]

  [[ 8.]
   [ 7.]
   [ 6.]
   [ 5.]]

  [[ 4.]
   [ 3.]
   [ 2.]
   [ 1.]]]]

The kernel

1	4
2	3

Its shape is (1,2,2); reshape it to (2,2,1,1).

It prints as

[[[[ 1.]]
  [[ 4.]]]

 [[[ 2.]]
  [[ 3.]]]]

The result of the convolution

37	47	57
66	66	66
53	43	33

It prints as a tensor of shape (1,3,3,1):

[[[[ 37.]
   [ 47.]
   [ 57.]]

  [[ 66.]
   [ 66.]
   [ 66.]]

  [[ 53.]
   [ 43.]
   [ 33.]]]]
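The numbers above can be checked by hand. The following is a minimal pure-Python sketch, not TensorFlow's implementation: the helper name `conv2d_valid` is made up for illustration, and it assumes that the operation slides the kernel over the image without flipping it (cross-correlation), which is what the 3*3 result above corresponds to.

```python
def conv2d_valid(image, kernel, stride=1):
    """Slide `kernel` over `image` with no padding and no kernel flip."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = (len(image) - k_h) // stride + 1
    out_w = (len(image[0]) - k_w) // stride + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            # Multiply the window element-wise with the kernel and sum up.
            for di in range(k_h):
                for dj in range(k_w):
                    total += image[i * stride + di][j * stride + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [8, 7, 6, 5],
         [4, 3, 2, 1]]
kernel = [[1, 4],
          [2, 3]]

result = conv2d_valid(image, kernel)
for row in result:
    print(row)
```

For example, the top-left output is 1*1 + 2*4 + 5*2 + 6*3 = 37.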

Experiment 2: increase the number of image channels to 2. The kernel now has in_channels 2 and out_channels 1. The output is a single 3*3 image.

import tensorflow as tf  # TensorFlow 1.x API (tf.Session)

# The values are written so that the reshape to (1, 4, 4, 2) below
# interleaves the two channels pixel by pixel.
a = tf.constant([
    [[1.0, 2.0, 2.0, 4.0],
     [3.0, 6.0, 4.0, 8.0],
     [5.0, 7.0, 6.0, 5.0],
     [7.0, 3.0, 8.0, 1.0]],
    [[8.0, 3.0, 7.0, 1.0],
     [6.0, 7.0, 5.0, 5.0],
     [4.0, 2.0, 3.0, 4.0],
     [2.0, 6.0, 1.0, 8.0]]
])
print(a.shape)
a = tf.reshape(a, [1, 4, 4, 2])
print(a.shape)
# Both input channels use the 2*2 kernel [[1, 4], [2, 3]],
# so each kernel value is written twice (once per input channel).
b = tf.constant([[[1.0, 1.0],
                  [4.0, 4.0]],
                 [[2.0, 2.0],
                  [3.0, 3.0]]
                 ])
print(b.shape)
b = tf.reshape(b, [2, 2, 2, 1])
print(b.shape)
output = tf.nn.conv2d(a, b, strides=[1, 1, 1, 1], padding='VALID')
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    image = sess.run(a)
    print('image:')
    print(image)
    kernel = sess.run(b)
    print('filter:')
    print(kernel)
    print('output:')
    print(sess.run(output))
    print(output.shape)

The following error came up along the way:

TypeError: list indices must be integers, not tuple

It was caused by a missing comma.
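As a hedged illustration of how a missing comma leads to this error: the exact literal that triggered it is not shown in the post, so the nested list below is an assumption. Without the comma, Python reads the second row as an index into the first row. Note that Python 3 words the message slightly differently ("integers or slices").

```python
import warnings

# A nested list literal with the comma between the two rows left out.
source = "[[1.0, 2.0] [3.0, 4.0]]"

with warnings.catch_warnings():
    # Newer Python versions also warn about this pattern at compile time.
    warnings.simplefilter("ignore")
    try:
        eval(source)
        message = None
    except TypeError as err:
        message = str(err)

print(message)

# With the comma restored, the literal is an ordinary 2*2 nested list.
fixed = eval("[[1.0, 2.0], [3.0, 4.0]]")
print(fixed)
```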

First, the input image has two channels:

1	2	3	4
5	6	7	8
8	7	6	5
4	3	2	1

2	4	6	8
7	5	3	1
3	1	7	5
2	4	6	8

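The constant holding this image has shape (2,4,4). Note that the values of the two channels are interleaved in it.

Reshape it into a 4-D tensor of shape (1,4,4,2).

It prints as follows. Note that each column here is one channel.

image:

[[[[ 1. 2.]
   [ 2. 4.]
   [ 3. 6.]
   [ 4. 8.]]

  [[ 5. 7.]
   [ 6. 5.]
   [ 7. 3.]
   [ 8. 1.]]

  [[ 8. 3.]
   [ 7. 1.]
   [ 6. 7.]
   [ 5. 5.]]

  [[ 4. 2.]
   [ 3. 4.]
   [ 2. 6.]
   [ 1. 8.]]]]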
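To see why the interleaved constant ends up as the two channel images shown earlier, here is a small pure-Python sketch. It assumes the reshape is row-major (the last axis varies fastest), so element (h, w, c) of the (4,4,2) image sits at flat position (h*4 + w)*2 + c. The variable names are made up for illustration.

```python
# The (2, 4, 4) constant exactly as written in the experiment code.
nested = [
    [[1, 2, 2, 4], [3, 6, 4, 8], [5, 7, 6, 5], [7, 3, 8, 1]],
    [[8, 3, 7, 1], [6, 7, 5, 5], [4, 2, 3, 4], [2, 6, 1, 8]],
]

# Flatten in reading order, which is the order reshape preserves.
flat = [v for plane in nested for row in plane for v in row]

H, W, C = 4, 4, 2
# Pick out each channel: element (h, w, c) is flat[(h * W + w) * C + c].
channels = [
    [[flat[(h * W + w) * C + c] for w in range(W)] for h in range(H)]
    for c in range(C)
]

for c, channel in enumerate(channels, start=1):
    print("channel", c)
    for row in channel:
        print(row)
```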

The kernel is

1	4
2	3

Writing it directly as

b = tf.constant([[[1.0, 4.0],
                  [2.0, 3.0]]])

raises the following error:

ValueError: Cannot reshape a tensor with 4 elements to shape [2,2,2,1] (8 elements) for 'Reshape_1' (op: 'Reshape') with input shapes:

This is because reshaping the kernel to (2,2,2,1) requires 8 elements, but the b created above has only 4, so elements have to be added.

Change it to

b = tf.constant([[[1.0, 1.0],
                  [4.0, 4.0]],
                 [[2.0, 2.0],
                  [3.0, 3.0]]
                 ])
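The element count and the layout of the corrected kernel can be sketched in plain Python. This assumes a row-major reshape to (height, width, in_channels, out_channels); the helper `kernel_slice` is hypothetical and only serves to pull out the 2*2 kernel seen by each input channel.

```python
# Flat values in the order they are written in the corrected tf.constant.
flat = [1.0, 1.0, 4.0, 4.0, 2.0, 2.0, 3.0, 3.0]

KH, KW, CIN, COUT = 2, 2, 2, 1
needed = KH * KW * CIN * COUT
print(needed, len(flat))  # a (2,2,2,1) kernel needs 8 values; 4 are not enough


def kernel_slice(c_in, c_out=0):
    """Return the 2*2 kernel applied to input channel `c_in`."""
    return [
        [flat[((kh * KW + kw) * CIN + c_in) * COUT + c_out] for kw in range(KW)]
        for kh in range(KH)
    ]


print(kernel_slice(0))
print(kernel_slice(1))
```

Both slices come out as the same 2*2 kernel, which is why every value appears twice in the constant.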

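The result of the convolution is the sum of the results obtained by convolving each channel of the image with the kernel.

Channel 1 with the kernel: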

37	47	57
66	66	66
53	43	33

Channel 2 with the kernel:

47	47	47
36	40	36
23	55	63

Adding the two together gives the result of the convolution:

84	94	104
102	106	102
76	98	96
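The per-channel results and their sum can be reproduced with a short pure-Python sketch. The helper `correlate_valid` is an illustrative name, not a TensorFlow function, and it assumes the same no-flip sliding window as in Experiment 1.

```python
def correlate_valid(image, kernel):
    """Slide `kernel` over `image` with stride 1 and no padding."""
    k_h, k_w = len(kernel), len(kernel[0])
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(k_h) for dj in range(k_w))
            for j in range(len(image[0]) - k_w + 1)
        ]
        for i in range(len(image) - k_h + 1)
    ]


channel_1 = [[1, 2, 3, 4], [5, 6, 7, 8], [8, 7, 6, 5], [4, 3, 2, 1]]
channel_2 = [[2, 4, 6, 8], [7, 5, 3, 1], [3, 1, 7, 5], [2, 4, 6, 8]]
kernel = [[1, 4], [2, 3]]  # the same kernel is used for both channels

out_1 = correlate_valid(channel_1, kernel)
out_2 = correlate_valid(channel_2, kernel)

# With one output channel, the per-channel results are simply added.
total = [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(out_1, out_2)]
for row in total:
    print(row)
```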

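Experiment 3: switch the padding mode of the two experiments above from VALID to SAME

The padding formulas, with W the input size, S the stride, and F the kernel size, are:

new_height = new_width = W / S (rounded up)

The number of pixels to pad along the height is

pad_needed_height = (new_height - 1) × S + F - W

The number of pixels added above the input matrix is

pad_top = pad_needed_height / 2 (integer division, so the fractional part is dropped)

The number of pixels added below is

pad_down = pad_needed_height - pad_top

Likewise, the number of pixels to pad along the width, and the numbers added on the left and right, are

pad_needed_width = (new_width - 1) × S + F - W

pad_left = pad_needed_width / 2 (integer division)

pad_right = pad_needed_width - pad_left

So the calculation goes:

new_height = new_width = 4/1 = 4

pad_needed_height = (4-1)*1+2-4 = 1

The number of pixels added above the matrix is pad_top = 1/2 = 0

The number of pixels added below is pad_down = 1

Likewise, the number of pixels added to the left of the matrix is pad_left = 0

The number of pixels added to the right is pad_right = 1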
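The formulas above translate directly into a few lines of Python. This is a minimal sketch with a hypothetical helper name, `same_padding`; clamping the padding at zero is an addition that is not in the formulas above and only matters when the stride is larger than the kernel.

```python
import math


def same_padding(size, kernel, stride):
    """Return (output size, padding before, padding after) along one axis."""
    out = math.ceil(size / stride)
    # Clamp at 0 (assumption, see the text): large strides could go negative.
    pad_needed = max((out - 1) * stride + kernel - size, 0)
    pad_before = pad_needed // 2          # top or left
    pad_after = pad_needed - pad_before   # bottom or right
    return out, pad_before, pad_after


# W = 4, F = 2, S = 1, as in the experiments.
result = same_padding(4, 2, 1)
print(result)
```

When the padding is odd, the extra pixel goes to the bottom and right, which is why only one row and one column of zeros are added in this experiment.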

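For the single-channel image from Experiment 1, padding one row at the bottom and one column on the right turns it into a 5*5 image:

1	2	3	4	0
5	6	7	8	0
8	7	6	5	0
4	3	2	1	0
0	0	0	0	0

The kernel is still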

1	4
2	3

The result of the convolution is 4*4:

37	47	57	20
66	66	66	18
53	43	33	7
16	11	6	1

It prints as

output:

[[[[ 37.]
   [ 47.]
   [ 57.]
   [ 20.]]

  [[ 66.]
   [ 66.]
   [ 66.]
   [ 18.]]

  [[ 53.]
   [ 43.]
   [ 33.]
   [ 7.]]

  [[ 16.]
   [ 11.]
   [ 6.]
   [ 1.]]]]
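A hedged pure-Python sketch of the SAME case follows. It is restricted to stride 1, where the total padding per axis is F - 1, and the function name `conv2d_same_stride1` is made up for illustration.

```python
def conv2d_same_stride1(image, kernel):
    """Zero-pad so the output keeps the input size, then slide the kernel."""
    k_h, k_w = len(kernel), len(kernel[0])
    h, w = len(image), len(image[0])
    pad_h, pad_w = k_h - 1, k_w - 1        # total padding for stride 1
    top, left = pad_h // 2, pad_w // 2     # any odd remainder goes bottom/right
    padded = [[0] * (w + pad_w) for _ in range(h + pad_h)]
    for i in range(h):
        for j in range(w):
            padded[i + top][j + left] = image[i][j]
    return [
        [
            sum(padded[i + di][j + dj] * kernel[di][dj]
                for di in range(k_h) for dj in range(k_w))
            for j in range(w)
        ]
        for i in range(h)
    ]


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [8, 7, 6, 5],
         [4, 3, 2, 1]]
kernel = [[1, 4],
          [2, 3]]

result = conv2d_same_stride1(image, kernel)
for row in result:
    print(row)
```

The top-left 3*3 block equals the VALID result; the last row and column come from windows that overlap the zero padding.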

For Experiment 2, apply the same operation to both channels and add the results:

84	94	104	30
102	106	102	29
76	98	96	28
34	39	44	9

output:

[[[[ 84.]
   [ 94.]
   [ 104.]
   [ 30.]]

  [[ 102.]
   [ 106.]
   [ 102.]
   [ 29.]]

  [[ 76.]
   [ 98.]
   [ 96.]
   [ 28.]]

  [[ 34.]
   [ 39.]
   [ 44.]
   [ 9.]]]]

Experiment 4: apply pooling to the images from Experiments 1 and 2, using max pooling here

1. First, the single-channel image

1	2	3	4
5	6	7	8
8	7	6	5
4	3	2	1

With padding=VALID (2*2 window, stride 1)

the pooled image is

6	7	8
8	7	8
8	7	6

pooling:

[[[[ 6.]
   [ 7.]
   [ 8.]]

  [[ 8.]
   [ 7.]
   [ 8.]]

  [[ 8.]
   [ 7.]
   [ 6.]]]]
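The VALID max-pooling result can be verified with a minimal pure-Python sketch; `max_pool_valid` is an illustrative helper, not the TensorFlow op.

```python
def max_pool_valid(image, size=2, stride=1):
    """Take the maximum of every size*size window, with no padding."""
    out_h = (len(image) - size) // stride + 1
    out_w = (len(image[0]) - size) // stride + 1
    return [
        [
            max(image[i * stride + di][j * stride + dj]
                for di in range(size) for dj in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [8, 7, 6, 5],
         [4, 3, 2, 1]]

pooled = max_pool_valid(image)
for row in pooled:
    print(row)
```

For example, the top-left output is max(1, 2, 5, 6) = 6.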

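With padding=SAME, the image is padded by one row at the bottom and one column on the right (shown as zeros here; since every pixel is positive, the padding never wins the max):

1	2	3	4	0
5	6	7	8	0
8	7	6	5	0
4	3	2	1	0
0	0	0	0	0

The pooled image is

6	7	8	8
8	7	8	8
8	7	6	5
4	3	2	1

pooling:

[[[[ 6.]
   [ 7.]
   [ 8.]
   [ 8.]]

  [[ 8.]
   [ 7.]
   [ 8.]
   [ 8.]]

  [[ 8.]
   [ 7.]
   [ 6.]
   [ 5.]]

  [[ 4.]
   [ 3.]
   [ 2.]
   [ 1.]]]]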
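A sketch of the SAME case for stride 1 follows. One choice here is an assumption of this sketch rather than something stated in the post: the padding cells are filled with negative infinity so that they can never be selected as the maximum. For this all-positive image that gives the same numbers as padding with zeros.

```python
NEG_INF = float("-inf")


def max_pool_same_stride1(image, size=2):
    """Max pooling that keeps the input size (stride 1 only)."""
    h, w = len(image), len(image[0])
    pad = size - 1            # total padding per axis for stride 1
    before = pad // 2         # any odd remainder goes to the bottom/right
    padded = [[NEG_INF] * (w + pad) for _ in range(h + pad)]
    for i in range(h):
        for j in range(w):
            padded[i + before][j + before] = image[i][j]
    return [
        [
            max(padded[i + di][j + dj]
                for di in range(size) for dj in range(size))
            for j in range(w)
        ]
        for i in range(h)
    ]


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [8, 7, 6, 5],
         [4, 3, 2, 1]]

pooled = max_pool_same_stride1(image)
for row in pooled:
    print(row)
```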

2. Next, the two-channel image

1	2	3	4
5	6	7	8
8	7	6	5
4	3	2	1

2	4	6	8
7	5	3	1
3	1	7	5
2	4	6	8

import tensorflow as tf  # TensorFlow 1.x API (tf.Session)

# Same interleaved two-channel image as in Experiment 2
a = tf.constant([
    [[1.0, 2.0, 2.0, 4.0],
     [3.0, 6.0, 4.0, 8.0],
     [5.0, 7.0, 6.0, 5.0],
     [7.0, 3.0, 8.0, 1.0]],
    [[8.0, 3.0, 7.0, 1.0],
     [6.0, 7.0, 5.0, 5.0],
     [4.0, 2.0, 3.0, 4.0],
     [2.0, 6.0, 1.0, 8.0]]
])
print(a.shape)
a = tf.reshape(a, [1, 4, 4, 2])
print(a.shape)

# 2*2 pooling window, stride 1 in both spatial directions
pooling = tf.nn.max_pool(a, ksize=[1, 2, 2, 1], strides=[1, 1, 1, 1], padding='SAME')

init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    image = sess.run(a)
    print('image:')
    print(image)
    print('pooling:')
    print(sess.run(pooling))
    print(pooling.shape)

The pooling window is 2*2 and the stride is 1.

With padding=VALID,

the pooled image also has two channels.
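As a rough sketch of why the channel count is preserved: max pooling works on each channel independently, so two input channels give two pooled channels. The helper below is illustrative pure Python, not the TensorFlow op, and the values it prints are computed by this sketch rather than taken from the post.

```python
def max_pool_valid(image, size=2, stride=1):
    """Take the maximum of every size*size window, with no padding."""
    out_h = (len(image) - size) // stride + 1
    out_w = (len(image[0]) - size) // stride + 1
    return [
        [
            max(image[i * stride + di][j * stride + dj]
                for di in range(size) for dj in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


channel_1 = [[1, 2, 3, 4], [5, 6, 7, 8], [8, 7, 6, 5], [4, 3, 2, 1]]
channel_2 = [[2, 4, 6, 8], [7, 5, 3, 1], [3, 1, 7, 5], [2, 4, 6, 8]]

# Pool each channel on its own; nothing is summed across channels.
pooled = [max_pool_valid(channel) for channel in (channel_1, channel_2)]
for c, channel in enumerate(pooled, start=1):
    print("channel", c)
    for row in channel:
        print(row)
```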