VGG: Origin and Core Idea
VGG was proposed by the Visual Geometry Group at Oxford (you can see where the name comes from). The network was presented as part of ILSVRC 2014, and its main contribution was to show that increasing network depth can, to a certain extent, improve the network's final performance. VGG comes in two configurations, VGG16 and VGG19; there is no essential difference between them, only the depth.
One improvement of VGG16 over AlexNet is replacing the large kernels used in AlexNet (11x11, 7x7, 5x5) with stacks of several consecutive 3x3 kernels. For a given receptive field (the local region of the input image that influences an output unit), stacking small kernels is preferable to using a single large kernel: the additional non-linear layers make the network deeper, which lets it learn more complex patterns, and they do so at a lower cost in parameters.
Put simply, VGG uses three 3x3 kernels in place of a 7x7 kernel and two 3x3 kernels in place of a 5x5 kernel. The goal is to keep the receptive field unchanged while increasing the depth of the network, which improves its performance to a certain extent.
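To make the parameter savings concrete, here is a minimal back-of-the-envelope check (assuming C input channels, C output channels, and no bias term): a single 7x7 convolution needs 49·C² weights, while three stacked 3x3 convolutions need only 3·9·C² = 27·C². The value C = 256 below is an arbitrary illustrative choice.

C = 256
params_7x7 = 7 * 7 * C * C              # 3,211,264 weights for one 7x7 conv
params_3x3_stack = 3 * (3 * 3 * C * C)  # 1,769,472 weights for three stacked 3x3 convs
print(params_7x7, params_3x3_stack)     # the stack uses roughly 45% fewer weights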
Below is a reimplementation in code:
import tensorflow as tf
from tensorflow.keras.layers import Conv2D, MaxPool2D, Flatten, Dense, Dropout
from tensorflow.keras.regularizers import l2

inputs = tf.keras.Input(shape=(224,224,3))
# Two 3x3 convolutions with 64 filters each, with L2 regularization; shape after pooling: (112,112,64)
x = Conv2D(64,(3,3),strides=1,padding='same',
kernel_regularizer=l2(0.0002),
activation='relu')(inputs)
x = Conv2D(64,(3,3),strides=1,padding='same',
kernel_regularizer=l2(0.0002),
activation='relu')(x)
x = MaxPool2D((2,2))(x)
# Two 3x3 convolutions with 128 filters each; shape after pooling: (56,56,128)
x = Conv2D(128,(3,3),strides=1,padding='same',
kernel_regularizer=l2(0.0002),
activation='relu')(x)
x = Conv2D(128,(3,3),strides=1,padding='same',
kernel_regularizer=l2(0.0002),
activation='relu')(x)
x = MaxPool2D((2,2))(x)
# Three 3x3 convolutions with 256 filters each; shape after pooling: (28,28,256)
x = Conv2D(256,(3,3),strides=1,padding='same',
kernel_regularizer=l2(0.0002),
activation='relu')(x)
x = Conv2D(256,(3,3),strides=1,padding='same',
kernel_regularizer=l2(0.0002),
activation='relu')(x)
x = Conv2D(256,(3,3),strides=1,padding='same',
kernel_regularizer=l2(0.0002),
activation='relu')(x)
x = MaxPool2D((2,2))(x)
# Three 3x3 convolutions with 512 filters each; shape after pooling: (14,14,512)
x = Conv2D(512,(3,3),strides=1,padding='same',
kernel_regularizer=l2(0.0002),
activation='relu')(x)
x = Conv2D(512,(3,3),strides=1,padding='same',
kernel_regularizer=l2(0.0002),
activation='relu')(x)
x = Conv2D(512,(3,3),strides=1,padding='same',
kernel_regularizer=l2(0.0002),
activation='relu')(x)
x = MaxPool2D((2,2))(x)
# Three 3x3 convolutions with 512 filters each; shape after pooling: (7,7,512)
x = Conv2D(512,(3,3),strides=1,padding='same',
kernel_regularizer=l2(0.0002),
activation='relu')(x)
x = Conv2D(512,(3,3),strides=1,padding='same',
kernel_regularizer=l2(0.0002),
activation='relu')(x)
x = Conv2D(512,(3,3),strides=1,padding='same',
kernel_regularizer=l2(0.0002),
activation='relu')(x)
x = MaxPool2D((2,2))(x)
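# Classifier head: flatten, two 4096-unit fully connected layers with dropout, then a 1000-way softmax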
x = Flatten()(x)
x = Dense(4096, activation='relu')(x)
x = Dropout(0.5)(x)
x = Dense(4096, activation='relu')(x)
x = Dropout(0.5)(x)
predictions = Dense(1000,activation='softmax')(x)
model = tf.keras.Model(inputs,predictions)
model.summary()
Model: "model"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_5 (InputLayer) [(None, 224, 224, 3)] 0
_________________________________________________________________
conv2d_28 (Conv2D) (None, 224, 224, 64) 1792
_________________________________________________________________
conv2d_29 (Conv2D) (None, 224, 224, 64) 36928
_________________________________________________________________
max_pooling2d_11 (MaxPooling (None, 112, 112, 64) 0
_________________________________________________________________
conv2d_30 (Conv2D) (None, 112, 112, 128) 73856
_________________________________________________________________
conv2d_31 (Conv2D) (None, 112, 112, 128) 147584
_________________________________________________________________
max_pooling2d_12 (MaxPooling (None, 56, 56, 128) 0
_________________________________________________________________
conv2d_32 (Conv2D) (None, 56, 56, 256) 295168
_________________________________________________________________
conv2d_33 (Conv2D) (None, 56, 56, 256) 590080
_________________________________________________________________
conv2d_34 (Conv2D) (None, 56, 56, 256) 590080
_________________________________________________________________
max_pooling2d_13 (MaxPooling (None, 28, 28, 256) 0
_________________________________________________________________
conv2d_35 (Conv2D) (None, 28, 28, 512) 1180160
_________________________________________________________________
conv2d_36 (Conv2D) (None, 28, 28, 512) 2359808
_________________________________________________________________
conv2d_37 (Conv2D) (None, 28, 28, 512) 2359808
_________________________________________________________________
max_pooling2d_14 (MaxPooling (None, 14, 14, 512) 0
_________________________________________________________________
conv2d_38 (Conv2D) (None, 14, 14, 512) 2359808
_________________________________________________________________
conv2d_39 (Conv2D) (None, 14, 14, 512) 2359808
_________________________________________________________________
conv2d_40 (Conv2D) (None, 14, 14, 512) 2359808
_________________________________________________________________
max_pooling2d_15 (MaxPooling (None, 7, 7, 512) 0
_________________________________________________________________
flatten_2 (Flatten) (None, 25088) 0
_________________________________________________________________
dense_3 (Dense) (None, 4096) 102764544
_________________________________________________________________
dropout_2 (Dropout) (None, 4096) 0
_________________________________________________________________
dense_4 (Dense) (None, 4096) 16781312
_________________________________________________________________
dropout_3 (Dropout) (None, 4096) 0
_________________________________________________________________
dense_5 (Dense) (None, 1000) 4097000
=================================================================
Total params: 138,357,544
Trainable params: 138,357,544
Non-trainable params: 0
_________________________________________________________________
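The total of 138,357,544 parameters matches the commonly quoted size of VGG16. Once defined, the model can be compiled and trained like any other Keras functional model; the optimizer, loss, and dataset names below are illustrative assumptions, not part of the original post.

# Illustrative training setup: optimizer, loss, and learning rate are assumed choices.
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
# model.fit(train_dataset, validation_data=val_dataset, epochs=...)  # train_dataset/val_dataset are hypothetical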