Experiment 1: The input is a 4*4 image with 1 channel, the kernel is 2*2, the padding mode is VALID, and the output is a 3*3 image.
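# Written for the TensorFlow 1.x graph API (tf.Session).
import tensorflow as tf

# 4*4 single-channel image, written as a (1,4,4) constant
a = tf.constant([
    [[1.0, 2.0, 3.0, 4.0],
     [5.0, 6.0, 7.0, 8.0],
     [8.0, 7.0, 6.0, 5.0],
     [4.0, 3.0, 2.0, 1.0]]
])
print(a.shape)
# conv2d expects the input as [batch, height, width, in_channels]
a = tf.reshape(a, [1, 4, 4, 1])
print(a.shape)
# 2*2 kernel, written as a (1,2,2) constant
b = tf.constant([[[1.0, 4.0],
                  [2.0, 3.0]]])
print(b.shape)
# conv2d expects the kernel as [height, width, in_channels, out_channels]
b = tf.reshape(b, [2, 2, 1, 1])
print(b.shape)
output = tf.nn.conv2d(a, b, strides=[1, 1, 1, 1], padding='VALID')
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    image = sess.run(a)
    print(image)
    kernel = sess.run(b)
    print(kernel)
    print(sess.run(output))
    print(output.shape)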
First, we have an image:
1 | 2 | 3 | 4 |
5 | 6 | 7 | 8 |
8 | 7 | 6 | 5 |
4 | 3 | 2 | 1 |
The shape of this image is (1,4,4).
Reshape the image into a 4-D tensor of shape (1,4,4,1),
which is represented as
[[[[ 1.]
[ 2.]
[ 3.]
[ 4.]]
[[ 5.]
[ 6.]
[ 7.]
[ 8.]]
[[ 8.]
[ 7.]
[ 6.]
[ 5.]]
[[ 4.]
[ 3.]
[ 2.]
[ 1.]]]]
The kernel:
1 | 4 |
2 | 3 |
Its shape is (1,2,2); reshape it to (2,2,1,1),
which is represented as
[[[[ 1.]]
[[ 4.]]]
[[[ 2.]]
[[ 3.]]]]
The result of the convolution:
37 | 47 | 57 |
66 | 66 | 66 |
53 | 43 | 33 |
Represented as a (1,3,3,1) tensor:
[[[[ 37.]
[ 47.]
[ 57.]]
[[ 66.]
[ 66.]
[ 66.]]
[[ 53.]
[ 43.]
[ 33.]]]]
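The arithmetic behind these nine numbers can be checked without TensorFlow. The following is a minimal NumPy sketch, not TensorFlow's implementation: `conv2d_valid` is a hypothetical helper that slides the kernel over the image with stride 1 and no padding, multiplies each window by the kernel element by element, and sums the products.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Stride-1, no-padding sliding window (hypothetical helper)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Element-wise product of the current window and the kernel, then sum.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.array([[1, 2, 3, 4],
                  [5, 6, 7, 8],
                  [8, 7, 6, 5],
                  [4, 3, 2, 1]], dtype=float)
kernel = np.array([[1, 4],
                   [2, 3]], dtype=float)

result = conv2d_valid(image, kernel)
print(result)
```

For example, the top-left output is 1*1 + 2*4 + 5*2 + 6*3 = 37. Note that the kernel is not flipped, which is why the values match the table above.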
Experiment 2: Increase the number of image channels to 2. The kernel now has in_channels = 2 and out_channels = 1. The output is a single 3*3 image.
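# Written for the TensorFlow 1.x graph API (tf.Session).
import tensorflow as tf

# (2,4,4) constant. The values of the two channels are interleaved so that
# the reshape to (1,4,4,2) below produces the two channels described in the text.
a = tf.constant([
    [[1.0, 2.0, 2.0, 4.0],
     [3.0, 6.0, 4.0, 8.0],
     [5.0, 7.0, 6.0, 5.0],
     [7.0, 3.0, 8.0, 1.0]],
    [[8.0, 3.0, 7.0, 1.0],
     [6.0, 7.0, 5.0, 5.0],
     [4.0, 2.0, 3.0, 4.0],
     [2.0, 6.0, 1.0, 8.0]]
])
print(a.shape)
a = tf.reshape(a, [1, 4, 4, 2])
print(a.shape)
# Each pair holds the channel-1 and channel-2 weight of one kernel position.
# Both channels use the kernel 1 4 / 2 3, which is what the results below assume.
b = tf.constant([[[1.0, 1.0],
                  [4.0, 4.0]],
                 [[2.0, 2.0],
                  [3.0, 3.0]]
                 ])
print(b.shape)
# [height, width, in_channels, out_channels]
b = tf.reshape(b, [2, 2, 2, 1])
print(b.shape)
output = tf.nn.conv2d(a, b, strides=[1, 1, 1, 1], padding='VALID')
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    image = sess.run(a)
    print('image:')
    print(image)
    kernel = sess.run(b)
    print('filter:')
    print(kernel)
    print('output:')
    print(sess.run(output))
    print(output.shape)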
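The following error appeared along the way:
TypeError: list indices must be integers, not tuple
It was caused by a missing comma between two adjacent lists, which makes Python read the second list as an index into the first.
First, the input image has two channels (the first four rows are channel 1, the last four rows are channel 2):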
1 | 2 | 3 | 4 |
5 | 6 | 7 | 8 |
8 | 7 | 6 | 5 |
4 | 3 | 2 | 1 |
2 | 4 | 6 | 8 |
7 | 5 | 3 | 1 |
3 | 1 | 7 | 5 |
2 | 4 | 6 | 8 |
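The constant for this image has shape (2,4,4). Note that the values of the two channels are interleaved in it: reshape keeps the order of the values, so the constant is written as channel-1 value, channel-2 value, and so on, rather than as two separate 4*4 planes.
Reshape it into a 4-D tensor of shape (1,4,4,2).
It is represented as follows. Note that each column here corresponds to one channel.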
image:
[[[[ 1. 2.]
[ 2. 4.]
[ 3. 6.]
[ 4. 8.]]
[[ 5. 7.]
[ 6. 5.]
[ 7. 3.]
[ 8. 1.]]
[[ 8. 3.]
[ 7. 1.]
[ 6. 7.]
[ 5. 5.]]
[[ 4. 2.]
[ 3. 4.]
[ 2. 6.]
[ 1. 8.]]]]
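The interleaving is easy to get wrong, so here is a small NumPy sketch that reproduces it. It assumes that `np.reshape` regroups values in the same row-major order as the `tf.reshape` call in the listing; the names `nhwc`, `channel_1`, `channel_2` and `planar` are only illustrative.

```python
import numpy as np

# Same literal as the TensorFlow constant: shape (2, 4, 4).
a = np.array([
    [[1, 2, 2, 4], [3, 6, 4, 8], [5, 7, 6, 5], [7, 3, 8, 1]],
    [[8, 3, 7, 1], [6, 7, 5, 5], [4, 2, 3, 4], [2, 6, 1, 8]],
], dtype=float)

# reshape keeps the flat order and only regroups it, so each consecutive
# pair of values becomes the two channels of one pixel.
nhwc = a.reshape(1, 4, 4, 2)

channel_1 = nhwc[0, :, :, 0]
channel_2 = nhwc[0, :, :, 1]
print(channel_1)
print(channel_2)

# Alternative: start from one 4*4 plane per channel and move the channel
# axis to the end with transpose instead of interleaving by hand.
planar = np.stack([channel_1, channel_2])            # shape (2, 4, 4)
from_planar = np.transpose(planar, (1, 2, 0))[np.newaxis]
print(np.array_equal(from_planar, nhwc))
```

If the image is available as separate planes, the transpose route avoids having to interleave the literal manually.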
The kernel is
1 | 4 |
2 | 3 |
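If the kernel is written directly as
b = tf.constant([[[1.0, 4.0],
                  [2.0, 3.0]]
                 ])
the following error occurs:
ValueError: Cannot reshape a tensor with 4 elements to shape [2,2,2,1] (8 elements) for 'Reshape_1' (op: 'Reshape') with input shapes:
This is because reshaping the kernel to (2,2,2,1) requires 8 elements, but this b has only 4, so elements have to be added.
Change it to
b = tf.constant([[[1.0, 1.0],
                  [4.0, 4.0]],
                 [[2.0, 2.0],
                  [3.0, 3.0]]
                 ])
so that each kernel position carries one weight per input channel. The convolution result is then the sum of the results of convolving each channel of the image with the kernel.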
Result of channel 1 with the kernel:
37 | 47 | 57 |
66 | 66 | 66 |
53 | 43 | 33 |
Result of channel 2 with the kernel:
47 | 47 | 47 |
36 | 40 | 36 |
23 | 55 | 63 |
Adding the two gives the result of the convolution:
84 | 94 | 104 |
102 | 106 | 102 |
76 | 98 | 96 |
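The per-channel sum can be reproduced with a short NumPy sketch. It is an illustration under stated assumptions, not TensorFlow's code: `conv2d_valid` is a hypothetical helper, and the kernel is stored as a (2,2,2) array whose last axis is the input channel, mirroring the 8-element constant above.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Stride-1, no-padding sliding window (hypothetical helper)."""
    kh, kw = kernel.shape
    out = np.zeros((image.shape[0] - kh + 1, image.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

channel_1 = np.array([[1, 2, 3, 4], [5, 6, 7, 8],
                      [8, 7, 6, 5], [4, 3, 2, 1]], dtype=float)
channel_2 = np.array([[2, 4, 6, 8], [7, 5, 3, 1],
                      [3, 1, 7, 5], [2, 4, 6, 8]], dtype=float)

# Kernel with layout [height, width, in_channels]; both channels hold 1 4 / 2 3.
kernel = np.array([[[1, 1], [4, 4]],
                   [[2, 2], [3, 3]]], dtype=float)

out_1 = conv2d_valid(channel_1, kernel[:, :, 0])
out_2 = conv2d_valid(channel_2, kernel[:, :, 1])
total = out_1 + out_2   # one output channel = sum over input channels
print(out_1)
print(out_2)
print(total)
```

Giving the second channel different weights only requires changing the second value of each pair in `kernel`.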
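Experiment 3: Change the padding mode of the two experiments above from VALID to SAME
According to the padding formulas (W is the input size, F the kernel size, S the stride):
new_height = new_width = W / S (rounded up)
The number of pixels to pad along the height is
pad_needed_height = (new_height - 1) × S + F - W
The number of pixels added above the input matrix is
pad_top = pad_needed_height / 2 (integer division, so the fractional part is dropped)
The number of pixels added below is
pad_down = pad_needed_height - pad_top
Likewise, the number of pixels to pad along the width, and the numbers added on the left and right, are
pad_needed_width = (new_width - 1) × S + F - W
pad_left = pad_needed_width / 2 (rounded down)
pad_right = pad_needed_width - pad_left
So the calculation is:
new_height = new_width = 4/1 = 4
pad_needed_height = (4-1)*1+2-4 = 1
Pixels added above the matrix: pad_top = 1/2 = 0
Pixels added below: pad_down = 1
Likewise, pixels added on the left: pad_left = 0
Pixels added on the right: pad_right = 1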
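The formulas translate directly into a few lines of Python. This is a sketch for one spatial dimension; `same_padding` and its parameter names are hypothetical, and the clamp at zero is an addition that is not in the formulas above (it guards cases where the computed padding would be negative).

```python
import math

def same_padding(size, kernel, stride):
    """Return (output size, padding before, padding after) for one dimension."""
    out = math.ceil(size / stride)
    # Total padding needed so that `out` windows fit; clamped at zero (assumption).
    pad_needed = max((out - 1) * stride + kernel - size, 0)
    pad_before = pad_needed // 2           # top or left: integer division
    pad_after = pad_needed - pad_before    # bottom or right gets the extra pixel
    return out, pad_before, pad_after

# The case used in this experiment: W = 4, F = 2, S = 1.
print(same_padding(4, 2, 1))
```

For W = 4, F = 2, S = 1 this gives an output size of 4 with no padding on the top/left side and one pixel on the bottom/right side, matching the calculation above.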
For the single-channel image from Experiment 1, padding turns it into a 5*5 image:
1 | 2 | 3 | 4 | 0 |
5 | 6 | 7 | 8 | 0 |
8 | 7 | 6 | 5 | 0 |
4 | 3 | 2 | 1 | 0 |
0 | 0 | 0 | 0 | 0 |
The kernel is still
1 | 4 |
2 | 3 |
The result of the convolution is 4*4:
37 | 47 | 57 | 20 |
66 | 66 | 66 | 18 |
53 | 43 | 33 | 7 |
16 | 11 | 6 | 1 |
Represented as
output:
[[[[ 37.]
[ 47.]
[ 57.]
[ 20.]]
[[ 66.]
[ 66.]
[ 66.]
[ 18.]]
[[ 53.]
[ 43.]
[ 33.]
[ 7.]]
[[ 16.]
[ 11.]
[ 6.]
[ 1.]]]]
For Experiment 2, apply the same operation to each of the two channels and add the results:
84 | 94 | 104 | 30 |
102 | 106 | 102 | 29 |
76 | 98 | 96 | 28 |
34 | 39 | 44 | 9 |
output:
[[[[ 84.]
[ 94.]
[ 104.]
[ 30.]]
[[ 102.]
[ 106.]
[ 102.]
[ 29.]]
[[ 76.]
[ 98.]
[ 96.]
[ 28.]]
[[ 34.]
[ 39.]
[ 44.]
[ 9.]]]]
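Both SAME results can be checked with a NumPy sketch that pads first and then runs the stride-1 window. It hard-codes the padding derived above (one zero row at the bottom, one zero column on the right) and a 2*2 kernel; `conv2d_same_2x2` is a hypothetical helper, not a TensorFlow function.

```python
import numpy as np

def conv2d_same_2x2(image, kernel):
    """Zero-pad bottom and right by one pixel, then slide a 2*2 kernel with stride 1."""
    padded = np.pad(image, ((0, 1), (0, 1)), mode="constant")
    out = np.zeros(image.shape)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i + 2, j:j + 2] * kernel)
    return out

channel_1 = np.array([[1, 2, 3, 4], [5, 6, 7, 8],
                      [8, 7, 6, 5], [4, 3, 2, 1]], dtype=float)
channel_2 = np.array([[2, 4, 6, 8], [7, 5, 3, 1],
                      [3, 1, 7, 5], [2, 4, 6, 8]], dtype=float)
kernel = np.array([[1, 4], [2, 3]], dtype=float)

same_1 = conv2d_same_2x2(channel_1, kernel)        # Experiment 1 with SAME
same_2 = conv2d_same_2x2(channel_2, kernel)
same_total = same_1 + same_2                       # Experiment 2 with SAME
print(same_1)
print(same_total)
```

The top-left 3*3 block of each result equals the VALID result, since those windows never touch the padding.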
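Experiment 4: Apply pooling to the images from Experiments 1 and 2; max pooling is used here, with a 2*2 window and a stride of 1
1. First, the single-channel image
1 | 2 | 3 | 4 |
5 | 6 | 7 | 8 |
8 | 7 | 6 | 5 |
4 | 3 | 2 | 1 |
With padding=VALID,
the pooled image is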
6 | 7 | 8 |
8 | 7 | 8 |
8 | 7 | 6 |
pooling:
[[[[ 6.]
[ 7.]
[ 8.]]
[[ 8.]
[ 7.]
[ 8.]]
[[ 8.]
[ 7.]
[ 6.]]]]
With padding=SAME, the padded image is
1 | 2 | 3 | 4 | 0 |
5 | 6 | 7 | 8 | 0 |
8 | 7 | 6 | 5 | 0 |
4 | 3 | 2 | 1 | 0 |
0 | 0 | 0 | 0 | 0 |
The pooled image is
6 | 7 | 8 | 8 |
8 | 7 | 8 | 8 |
8 | 7 | 6 | 5 |
4 | 3 | 2 | 1 |
pooling:
[[[[ 6.]
[ 7.]
[ 8.]
[ 8.]]
[[ 8.]
[ 7.]
[ 8.]
[ 8.]]
[[ 8.]
[ 7.]
[ 6.]
[ 5.]]
[[ 4.]
[ 3.]
[ 2.]
[ 1.]]]]
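The two pooled images can be reproduced with a minimal NumPy sketch. `max_pool_2x2` is a hypothetical helper for a 2*2 window with stride 1. For SAME it pads with zeros, as in the walk-through above, which is only safe because every value here is non-negative; it is not meant as a statement about how TensorFlow treats padded cells internally.

```python
import numpy as np

def max_pool_2x2(image, padding="VALID"):
    """2*2 max pooling with stride 1 (hypothetical helper)."""
    if padding == "SAME":
        # One zero row at the bottom and one zero column on the right.
        image = np.pad(image, ((0, 1), (0, 1)), mode="constant")
    out = np.empty((image.shape[0] - 1, image.shape[1] - 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = image[i:i + 2, j:j + 2].max()
    return out

image = np.array([[1, 2, 3, 4],
                  [5, 6, 7, 8],
                  [8, 7, 6, 5],
                  [4, 3, 2, 1]], dtype=float)

pooled_valid = max_pool_2x2(image, "VALID")
pooled_same = max_pool_2x2(image, "SAME")
print(pooled_valid)
print(pooled_same)
```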
2. Next, the two-channel image (the first four rows are channel 1, the last four rows are channel 2)
1 | 2 | 3 | 4 |
5 | 6 | 7 | 8 |
8 | 7 | 6 | 5 |
4 | 3 | 2 | 1 |
2 | 4 | 6 | 8 |
7 | 5 | 3 | 1 |
3 | 1 | 7 | 5 |
2 | 4 | 6 | 8 |
# Written for the TensorFlow 1.x graph API (tf.Session).
import tensorflow as tf

# (2,4,4) constant with the two channels interleaved, as in Experiment 2
a = tf.constant([
    [[1.0, 2.0, 2.0, 4.0],
     [3.0, 6.0, 4.0, 8.0],
     [5.0, 7.0, 6.0, 5.0],
     [7.0, 3.0, 8.0, 1.0]],
    [[8.0, 3.0, 7.0, 1.0],
     [6.0, 7.0, 5.0, 5.0],
     [4.0, 2.0, 3.0, 4.0],
     [2.0, 6.0, 1.0, 8.0]]
])
print(a.shape)
a = tf.reshape(a, [1, 4, 4, 2])
print(a.shape)
# 2*2 window, stride 1; ksize and strides are given per NHWC dimension
pooling = tf.nn.max_pool(a, ksize=[1, 2, 2, 1], strides=[1, 1, 1, 1], padding='SAME')
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    image = sess.run(a)
    print('image:')
    print(image)
    print('pooling:')
    print(sess.run(pooling))
    print(pooling.shape)
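The pooling window is 2*2 and the stride is 1.
With padding=VALID (set padding='VALID' in the call above),
the pooled image also has two channels (the first three rows are channel 1, the last three rows are channel 2):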
6 | 7 | 8 |
8 | 7 | 8 |
8 | 7 | 6 |
7 | 6 | 8 |
7 | 7 | 7 |
4 | 7 | 8 |
pooling:
[[[[ 6. 7.]
[ 7. 6.]
[ 8. 8.]]
[[ 8. 7.]
[ 7. 7.]
[ 8. 7.]]
[[ 8. 4.]
[ 7. 7.]
[ 6. 8.]]]]
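With padding=SAME,
new_height = new_width = 4/1 = 4
pad_needed_height = (4-1)*1+2-4 = 1
Pixels added above the matrix: pad_top = 1/2 = 0
Pixels added below: pad_down = 1
Likewise, pixels added on the left: pad_left = 0
Pixels added on the right: pad_right = 1
So the two-channel image after padding is (the first five rows are channel 1, the last five rows are channel 2)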
1 | 2 | 3 | 4 | 0 |
5 | 6 | 7 | 8 | 0 |
8 | 7 | 6 | 5 | 0 |
4 | 3 | 2 | 1 | 0 |
0 | 0 | 0 | 0 | 0 |
2 | 4 | 6 | 8 | 0 |
7 | 5 | 3 | 1 | 0 |
3 | 1 | 7 | 5 | 0 |
2 | 4 | 6 | 8 | 0 |
0 | 0 | 0 | 0 | 0 |
The pooled image is (the first four rows are channel 1, the last four rows are channel 2)
6 | 7 | 8 | 8 |
8 | 7 | 8 | 8 |
8 | 7 | 6 | 5 |
4 | 3 | 2 | 1 |
7 | 6 | 8 | 8 |
7 | 7 | 7 | 5 |
4 | 7 | 8 | 8 |
4 | 6 | 8 | 8 |
pooling:
[[[[ 6. 7.]
[ 7. 6.]
[ 8. 8.]
[ 8. 8.]]
[[ 8. 7.]
[ 7. 7.]
[ 8. 7.]
[ 8. 5.]]
[[ 8. 4.]
[ 7. 7.]
[ 6. 8.]
[ 5. 8.]]
[[ 4. 4.]
[ 3. 6.]
[ 2. 8.]
[ 1. 8.]]]]
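Pooling acts on each channel independently, which a vectorised NumPy sketch makes visible. It works directly on the (1,4,4,2) layout and takes the maximum of the four shifted views that make up every 2*2 window. `max_pool_2x2_nhwc` is a hypothetical helper, and the zero padding for SAME again relies on all values being non-negative.

```python
import numpy as np

# Same interleaved literal as the listing, reshaped to [batch, height, width, channels].
a = np.array([
    [[1, 2, 2, 4], [3, 6, 4, 8], [5, 7, 6, 5], [7, 3, 8, 1]],
    [[8, 3, 7, 1], [6, 7, 5, 5], [4, 2, 3, 4], [2, 6, 1, 8]],
], dtype=float).reshape(1, 4, 4, 2)

def max_pool_2x2_nhwc(x, padding):
    """2*2 max pooling with stride 1 over the height and width axes."""
    if padding == "SAME":
        # Pad only height (bottom) and width (right); batch and channels are untouched.
        x = np.pad(x, ((0, 0), (0, 1), (0, 1), (0, 0)), mode="constant")
    # The four corners of every 2*2 window, as shifted views of the same array.
    corners = [x[:, :-1, :-1], x[:, :-1, 1:], x[:, 1:, :-1], x[:, 1:, 1:]]
    return np.maximum.reduce(corners)

valid = max_pool_2x2_nhwc(a, "VALID")   # shape (1, 3, 3, 2)
same = max_pool_2x2_nhwc(a, "SAME")     # shape (1, 4, 4, 2)
print(valid[0, :, :, 1])
print(same[0, :, :, 1])
```

Because the channel axis is never mixed, the number of output channels always equals the number of input channels, unlike convolution, where the channels are summed.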