1. Overview
Style transfer is a rather interesting application of CNNs; what makes it special is that the object being trained is an image rather than the network weights. For a detailed introduction, see the following two blog posts:
In short, we are given two images: a content image and a style image. The goal is to find an image whose content is as close as possible to the content image and whose style is as close as possible to the style image. The measures are the content loss and the style loss; their weighted sum is the total loss, which we train to minimize.
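The weighted combination described above can be sketched in a few lines. This is only an illustration of how the two losses are combined (the weight values match the hyperparameters chosen later in `__init__`); the actual losses are computed from VGG feature maps:

```python
def total_loss(content_loss, style_loss, content_w=0.01, style_w=1.0):
    # Weighted sum: the weights trade off content fidelity against style fidelity.
    # A small content weight and a large style weight is the usual choice,
    # since the raw style loss tends to be numerically much larger.
    return content_w * content_loss + style_w * style_loss

# Hypothetical loss values, purely for illustration:
loss = total_loss(content_loss=500.0, style_loss=80.0)
```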
This is the second major assignment of Stanford's TensorFlow course. It consists of three files:
style_transfer.py: the main file, used to build and train the model. To be completed.
load_vgg.py: loads the pre-trained VGG model. To be completed.
utils.py: basic utility functions. No modification needed.
Full reference code: style_transfer
2. Implementation
STEP 0: Initialize
def __init__(self, content_img, style_img, img_width, img_height):
    '''
    img_width and img_height are the dimensions we expect from the generated image.
    We will resize input content image and input style image to match this dimension.
    Feel free to alter any hyperparameter here and see how it affects your training.
    '''
    self.img_width = img_width
    self.img_height = img_height
    self.content_img = utils.get_resized_image(content_img, img_width, img_height)
    self.style_img = utils.get_resized_image(style_img, img_width, img_height)
    self.initial_img = utils.generate_noise_image(self.content_img, img_width, img_height)

    ###############################
    ## TO DO
    ## create global step (gstep) and hyperparameters for the model
    self.content_layer = 'conv4_2'
    self.style_layers = ['conv1_1', 'conv2_1', 'conv3_1', 'conv4_1', 'conv5_1']
    self.content_w = 0.01
    self.style_w = 1
    self.style_layer_w = [0.5, 1.0, 1.5, 3.0, 4.0]
    self.gstep = tf.Variable(0, dtype=tf.int32,
                             trainable=False, name='global_step')
    self.lr = 2.0
    ###############################
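`utils.generate_noise_image` is provided by the course and need not be written. As a rough idea of what such a helper typically does (the `noise_ratio` blending below is an assumption, not necessarily the course's exact implementation), it mixes white noise with the content image so the optimization starts near the content image rather than from pure noise:

```python
import numpy as np

def generate_noise_image(content_image, img_width, img_height, noise_ratio=0.6):
    # Uniform white noise blended with the content image; initializing the
    # generated image near the content image usually converges faster than
    # starting from pure random noise.
    noise = np.random.uniform(-20.0, 20.0,
                              (1, img_height, img_width, 3)).astype(np.float32)
    return noise * noise_ratio + content_image * (1.0 - noise_ratio)
```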
STEP 1: Define Inference
We use a CNN model in which each convolutional layer is followed by a ReLU, and average pooling is used instead of max pooling.
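In the TensorFlow graph the pooling is just `tf.nn.avg_pool`; to make the operation itself concrete, here is a small NumPy sketch of 2x2 average pooling with stride 2 (the helper name and shapes are illustrative, not part of the assignment). Average pooling produces smoother feature maps than max pooling, which the original neural style transfer paper (Gatys et al.) found gives slightly more pleasing results:

```python
import numpy as np

def avg_pool_2x2(x):
    # 2x2 average pooling with stride 2 on an (H, W, C) feature map.
    # Odd trailing rows/columns are trimmed, mirroring 'VALID' padding.
    h, w, c = x.shape
    x = x[:h - h % 2, :w - w % 2]
    return x.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))
```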
def conv2d_relu(self, prev_layer, layer_idx, layer_name):
    """ Return the Conv2D layer with RELU using the weights,
    biases from the VGG model at 'layer_idx'.
    Don't forget to apply relu to the output from the convolution.
    Inputs:
        prev_layer: the output tensor from the previous layer
        layer_idx: the index to current layer in vgg_layers
        layer_name: the string that is the name of the current layer.
                    It's used to specify variable_scope.
    Note that you first need to obtain W and b from the corresponding VGG layer
    using the function _weights() defined above.
    W and b returned from _weights() are numpy arrays, so you have