【Keras-DCGAN】MNIST / CIFAR-10

最新推荐文章于 2024-03-28 15:03:15 发布

苏堤春不晓

最新推荐文章于 2024-03-28 15:03:15 发布

阅读量2k

点赞数 3

CC 4.0 BY-SA版权

分类专栏： PyTorch/Keras/Caffe/TensroFlow 文章标签： GAN MLP CNN CIFAR-10

本文链接：https://blog.youkuaiyun.com/bryant_meng/article/details/81210656

PyTorch/Keras/Caffe/TensroFlow 专栏收录该内容

63 篇文章

订阅专栏

原文
在这里插入图片描述
本博客是 One Day One GAN [DAY 2] 的 learning notes！GAN 是用 CNN 搭建的！！！

DCGAN：《Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks》
把 GAN 中的 MLP 换成了 CNN，应用在人脸生成中！

1 GAN 的介绍

有关 GAN 的介绍可以参考【Keras-MLP-GAN】MNIST，更多有关 GAN 的小实验，可以参考【Programming】中的 3.5 小节！

2 DCGAN for MNIST

在这里插入图片描述

2.1 导入必要的库

from keras.datasets import mnist
from keras.layers import Input, Dense, Reshape, Flatten, Dropout
from keras.layers import BatchNormalization, Activation, ZeroPadding2D
from keras.layers.advanced_activations import LeakyReLU
from keras.layers.convolutional import UpSampling2D, Conv2D
from keras.models import Sequential, Model
from keras.optimizers import Adam

import matplotlib.pyplot as plt
import matplotlib
matplotlib.use('Agg')
import sys

import numpy as np

2.2 搭建 generator

输入 100 维的 noisy，输出（28，28，1）的图片

100 → 128*7*7 reshape 成（7，7，128）→上采样（14，14，128）→ conv（14，14，128）→上采样（28，28，128）→ conv（28，28，64）→ conv（28，28，1）

# build_generator(self)
model = Sequential()

model.add(Dense(128 * 7 * 7, activation="relu", input_dim=100)) # 7,7,128
model.add(Reshape((7, 7, 128)))

model.add(UpSampling2D()) # 14,14,128

model.add(Conv2D(128, kernel_size=3, padding="same"))
model.add(BatchNormalization(momentum=0.8))
model.add(Activation("relu"))

model.add(UpSampling2D()) # 28,28,128

model.add(Conv2D(64, kernel_size=3, padding="same")) 
model.add(BatchNormalization(momentum=0.8))
model.add(Activation("relu"))

model.add(Conv2D(1, kernel_size=3, padding="same"))
model.add(Activation("tanh"))

model.summary()

noise = Input(shape=(100,))
img = model(noise)        
        
generator = Model(noise, img)

output

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_1 (Dense)              (None, 6272)              633472    
_________________________________________________________________
reshape_1 (Reshape)          (None, 7, 7, 128)         0         
_________________________________________________________________
up_sampling2d_1 (UpSampling2 (None, 14, 14, 128)       0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 14, 14, 128)       147584    
_________________________________________________________________
batch_normalization_1 (Batch (None, 14, 14, 128)       512       
_________________________________________________________________
activation_1 (Activation)    (None, 14, 14, 128)       0         
_________________________________________________________________
up_sampling2d_2 (UpSampling2 (None, 28, 28, 128)       0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 28, 28, 64)        73792     
_________________________________________________________________
batch_normalization_2 (Batch (None, 28, 28, 64)        256       
_________________________________________________________________
activation_2 (Activation)    (None, 28, 28, 64)        0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 28, 28, 1)         577       
_________________________________________________________________
activation_3 (Activation)    (None, 28, 28, 1)         0         
=================================================================
Total params: 856,193
Trainable params: 855,809
Non-trainable params: 384
_________________________________________________________________

2.3 搭建 discriminator

输入（28，28，1），输出概率值
（28，28，1）→ conv（14，14，32）→ conv（7，7，64）→ padding （8，8，64）→ conv（4，4，128）→ conv（4，4，256）→ flatten 4*4*256 → 1

# build_discriminator
model = Sequential()

model.add(Conv2D(32, kernel_size=3, strides=2, input_shape=(28,28,1), padding="same")) # 14,14,32
model.add(LeakyReLU(alpha=0.2))
model.add(Dropout(0.25))

model.add(Conv2D(64, kernel_size=3, strides=2, padding="same"))  # 7,7,64
model.add(ZeroPadding2D(padding=((0,1),(0,1)))) # 8,8,64
model.add(BatchNormalization(momentum=0.8))
model.add(LeakyReLU(alpha=0.2))
model.add(Dropout(0.25))

model.add(Conv2D(128, kernel_size=3, strides=2, padding="same")) # 4,4,128
model.add(BatchNormalization(momentum=0.8))
model.add(LeakyReLU(alpha=0.2))
model.add(Dropout(0.25))

model.add(Conv2D(256, kernel_size=3, strides=1, padding="same")) # 4,4,256
model.add(BatchNormalization(momentum=0.8))
model.add(LeakyReLU(alpha=0.2))
model.add(Dropout(0.25))

model.add(Flatten()) # 4*4*256
model.add(Dense(1, activation='sigmoid'))

model.summary()

img = Input(shape=(28,28,1)) # 输入 （28，28，1）
validity = model(img) # 输出二分类结果

discriminator = Model(img, validity)

output

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_4 (Conv2D)            (None, 14, 14, 32)        320       
_________________________________________________________________
leaky_re_lu_1 (LeakyReLU)    (None, 14, 14, 32)        0         
_________________________________________________________________
dropout_1 (Dropout)          (None, 14, 14, 32)        0         
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 7, 7, 64)          18496     
_________________________________________________________________
zero_padding2d_1 (ZeroPaddin (None, 8, 8, 64)          0         
_________________________________________________________________
batch_normalization_3 (Batch (None, 8, 8, 64)          256       
_________________________________________________________________
leaky_re_lu_2 (LeakyReLU)    (None, 8, 8, 64)          0         
_________________________________________________________________
dropout_2 (Dropout)          (None, 8, 8, 64)          0         
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 4, 4, 128)         73856     
_________________________________________________________________
batch_normalization_4 (Batch (None, 4, 4, 128)         512       
_________________________________________________________________
leaky_re_lu_3 (LeakyReLU)    (None, 4, 4, 128)         0         
_________________________________________________________________
dropout_3 (Dropout)          (None, 4, 4, 128)         0         
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 4, 4, 256)         295168    
_________________________________________________________________
batch_normalization_5 (Batch (None, 4, 4, 256)         1024      
_________________________________________________________________
leaky_re_lu_4 (LeakyReLU)    (None, 4, 4, 256)         0         
_________________________________________________________________
dropout_4 (Dropout)          (None, 4, 4, 256)         0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 4096)              0         
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 4097      
=================================================================
Total params: 393,729
Trainable params: 392,833
Non-trainable params: 896
_________________________________________________________________

2.4 compile 模型，对学习过程进行配置

optimizer = Adam(0.0002, 0.5)

# discriminator
discriminator.compile(loss='binary_crossentropy',
                      optimizer=optimizer,
                      metrics=['accuracy'])


# The combined model  (stacked generator and discriminator)
z = Input(shape=(100,))
img = generator(z)
validity = discriminator(img)
# For the combined model we will only train the generator
discriminator.trainable = False

# Trains the generator to fool the discriminator
combined = Model(z, validity)
combined.summary()
combined.compile(loss='binary_crossentropy', 
                 optimizer=optimizer)

output

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_3 (InputLayer)         (None, 100)               0         
_________________________________________________________________
model_1 (Model)              (None, 28, 28, 1)         856193    
_________________________________________________________________
model_2 (Model)              (None, 1)                 393729    
=================================================================
Total params: 1,249,922
Trainable params: 855,809
Non-trainable params: 394,113
_________________________________________________________________

2.5 保存生成的图片

def sample_images(epoch):
    r, c = 5, 5
    noise = np.random.normal(0, 1, (r * c, 100))
    gen_imgs = generator.predict(noise)

    # Rescale images 0 - 1
    gen_imgs = 0.5 * gen_imgs + 0.5

    fig, axs = plt.subplots(r, c)
    cnt = 0
    for i in range(r):
        for j in range(c):
            axs[i,j].imshow(gen_imgs[cnt, :,:,0], cmap='gray')
            axs[i,j].axis('off')
            cnt += 1
    fig.savefig("images/%d.png" % epoch)
    plt.close()

2.6 训练

batch_size = 32
sample_interval = 50
# Load the dataset
(X_train, _), (_, _) = mnist.load_data() # (60000,28,28)
# Rescale -1 to 1
X_train = X_train / 127.5 - 1.
X_train = np.expand_dims(X_train, axis=3)  # (60000,28,28,1)
# Adversarial ground truths
valid = np.ones((batch_size, 1))
fake = np.zeros((batch_size, 1))

for epoch in range(4001):
    # ---------------------
    #  Train Discriminator
    # ---------------------

    # Select a random batch of images
    idx = np.random.randint(0, X_train.shape[0], batch_size) # 0-60000 中随机抽  
    imgs = X_train[idx]
    noise = np.random.normal(0, 1, (batch_size, 100))# 生成标准的高斯分布噪声

    # Generate a batch of new images
    gen_imgs = generator.predict(noise)

    # Train the discriminator
    d_loss_real = discriminator.train_on_batch(imgs, valid) #真实数据对应标签１
    d_loss_fake = discriminator.train_on_batch(gen_imgs, fake) #生成的数据对应标签０
    d_loss = 0.5 * np.add(d_loss_real, d_loss_fake)

    # ---------------------
    #  Train Generator
    # ---------------------
    noise = np.random.normal(0, 1, (batch_size, 100))

    # Train the generator (to have the discriminator label samples as valid)
    g_loss = combined.train_on_batch(noise, valid)

    # Plot the progress
    if epoch % 50==0:
        print ("%d [D loss: %f, acc.: %.2f%%] [G loss: %f]" % (epoch, d_loss[0], 100*d_loss[1], g_loss))

    # If at save interval => save generated image samples
    if epoch % sample_interval == 0:
        sample_images(epoch)

每隔 50 iteration 打印一次结果，保存一次图片（代码中的 epoch 理解为 iteration）
output

0 [D loss: 1.110515, acc.: 34.38%] [G loss: 0.947779]
50 [D loss: 0.857457, acc.: 56.25%] [G loss: 1.294673]
100 [D loss: 0.849929, acc.: 48.44%] [G loss: 1.071479]
………………
3900 [D loss: 0.715688, acc.: 56.25%] [G loss: 1.014293]
3950 [D loss: 0.772328, acc.: 45.31%] [G loss: 1.014349]
4000 [D loss: 0.596769, acc.: 71.88%] [G loss: 1.198742]

2.7 结果展示

iteration 0
在这里插入图片描述

iteration 50
在这里插入图片描述

iteration 100
在这里插入图片描述

iteration 3900
在这里插入图片描述

iteration 3950
在这里插入图片描述

iteration 4000
在这里插入图片描述

3 DCGAN for CIFAR-10

哈哈，用这套代码改改输入玩玩 cifar-10 看看，MNIST 图片是（28，28，1），cifar 则是（32，32，3）

3.1 导入必要的库

这里要导入 cifar10

from keras.datasets import cifar10
from keras.layers import Input, Dense, Reshape, Flatten, Dropout
from keras.layers import BatchNormalization, Activation, ZeroPadding2D
from keras.layers.advanced_activations import LeakyReLU
from keras.layers.convolutional import UpSampling2D, Conv2D
from keras.models import Sequential, Model
from keras.optimizers import Adam

import matplotlib.pyplot as plt
import matplotlib
matplotlib.use('Agg')
import sys

import numpy as np

3.2 搭建 generator

注意 feature map 的 size 即可

# build_generator(self)
model = Sequential()

model.add(Dense(128 * 8 * 8, activation="relu", input_dim=100)) # 8,8,128
model.add(Reshape((8, 8, 128)))

model.add(UpSampling2D()) # 16,16,128

model.add(Conv2D(128, kernel_size=3, padding="same"))
model.add(BatchNormalization(momentum=0.8))
model.add(Activation("relu"))

model.add(UpSampling2D()) # 32,32,128

model.add(Conv2D(64, kernel_size=3, padding="same")) 
model.add(BatchNormalization(momentum=0.8))
model.add(Activation("relu"))

model.add(Conv2D(3, kernel_size=3, padding="same"))
model.add(Activation("tanh"))

model.summary()

noise = Input(shape=(100,))
img = model(noise)        
        
generator = Model(noise, img)

output

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_1 (Dense)              (None, 8192)              827392    
_________________________________________________________________
reshape_1 (Reshape)          (None, 8, 8, 128)         0         
_________________________________________________________________
up_sampling2d_1 (UpSampling2 (None, 16, 16, 128)       0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 16, 16, 128)       147584    
_________________________________________________________________
batch_normalization_1 (Batch (None, 16, 16, 128)       512       
_________________________________________________________________
activation_1 (Activation)    (None, 16, 16, 128)       0         
_________________________________________________________________
up_sampling2d_2 (UpSampling2 (None, 32, 32, 128)       0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 32, 32, 64)        73792     
_________________________________________________________________
batch_normalization_2 (Batch (None, 32, 32, 64)        256       
_________________________________________________________________
activation_2 (Activation)    (None, 32, 32, 64)        0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 32, 32, 3)         1731      
_________________________________________________________________
activation_3 (Activation)    (None, 32, 32, 3)         0         
=================================================================
Total params: 1,051,267
Trainable params: 1,050,883
Non-trainable params: 384
_________________________________________________________________

3.3 搭建 discriminator

注意输入的改变即可

# build_discriminator
model = Sequential()

model.add(Conv2D(32, kernel_size=3, strides=2, input_shape=(32,32,3), padding="same")) # 16,16,32
model.add(LeakyReLU(alpha=0.2))
model.add(Dropout(0.25))

model.add(Conv2D(64, kernel_size=3, strides=2, padding="same"))  # 8,8,64
model.add(BatchNormalization(momentum=0.8))
model.add(LeakyReLU(alpha=0.2))
model.add(Dropout(0.25))

model.add(Conv2D(128, kernel_size=3, strides=2, padding="same")) # 4,4,128
model.add(BatchNormalization(momentum=0.8))
model.add(LeakyReLU(alpha=0.2))
model.add(Dropout(0.25))

model.add(Conv2D(256, kernel_size=3, strides=1, padding="same")) # 4,4,256
model.add(BatchNormalization(momentum=0.8))
model.add(LeakyReLU(alpha=0.2))
model.add(Dropout(0.25))

model.add(Flatten()) # 4*4*256
model.add(Dense(1, activation='sigmoid'))

model.summary()

img = Input(shape=(32,32,3)) # 输入 （28，28，1）
validity = model(img) # 输出二分类结果

discriminator = Model(img, validity)

output

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_4 (Conv2D)            (None, 16, 16, 32)        896       
_________________________________________________________________
leaky_re_lu_1 (LeakyReLU)    (None, 16, 16, 32)        0         
_________________________________________________________________
dropout_1 (Dropout)          (None, 16, 16, 32)        0         
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 8, 8, 64)          18496     
_________________________________________________________________
batch_normalization_3 (Batch (None, 8, 8, 64)          256       
_________________________________________________________________
leaky_re_lu_2 (LeakyReLU)    (None, 8, 8, 64)          0         
_________________________________________________________________
dropout_2 (Dropout)          (None, 8, 8, 64)          0         
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 4, 4, 128)         73856     
_________________________________________________________________
batch_normalization_4 (Batch (None, 4, 4, 128)         512       
_________________________________________________________________
leaky_re_lu_3 (LeakyReLU)    (None, 4, 4, 128)         0         
_________________________________________________________________
dropout_3 (Dropout)          (None, 4, 4, 128)         0         
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 4, 4, 256)         295168    
_________________________________________________________________
batch_normalization_5 (Batch (None, 4, 4, 256)         1024      
_________________________________________________________________
leaky_re_lu_4 (LeakyReLU)    (None, 4, 4, 256)         0         
_________________________________________________________________
dropout_4 (Dropout)          (None, 4, 4, 256)         0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 4096)              0         
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 4097      
=================================================================
Total params: 394,305
Trainable params: 393,409
Non-trainable params: 896
_________________________________________________________________

3.4 compile 模型，对学习过程进行配置

optimizer = Adam(0.0002, 0.5)

# discriminator
discriminator.compile(loss='binary_crossentropy',
                      optimizer=optimizer,
                      metrics=['accuracy'])


# The combined model  (stacked generator and discriminator)
z = Input(shape=(100,))
img = generator(z)
validity = discriminator(img)
# For the combined model we will only train the generator
discriminator.trainable = False

# Trains the generator to fool the discriminator
combined = Model(z, validity)
combined.summary()
combined.compile(loss='binary_crossentropy', 
                 optimizer=optimizer)

output

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_3 (InputLayer)         (None, 100)               0         
_________________________________________________________________
model_1 (Model)              (None, 32, 32, 3)         1051267   
_________________________________________________________________
model_2 (Model)              (None, 1)                 394305    
=================================================================
Total params: 1,445,572
Trainable params: 1,050,883
Non-trainable params: 394,689
_________________________________________________________________

3.5 保存生成的图片

这里要注意修改一下，因为 MNIST 产生的是黑白，CIFAR-10 则产生的是彩色

def sample_images(epoch):
    r, c = 5, 5
    noise = np.random.normal(0, 1, (r * c, 100))
    gen_imgs = generator.predict(noise)

    # Rescale images 0 - 1
    gen_imgs = (0.5 * gen_imgs + 0.5)

    fig, axs = plt.subplots(r, c)
    cnt = 0
    for i in range(r):
        for j in range(c):
            axs[i,j].imshow(gen_imgs[cnt, :,:,:])# 注意这里的改变
            axs[i,j].axis('off')
            cnt += 1
    fig.savefig("images/%d.png" % epoch)
    plt.close()

3.6 训练

训练 50001 个 iteration，每 500 个 iteration 输出一次结果

batch_size = 32
sample_interval = 500
# Load the dataset
(X_train,_),(_,_)=cifar10.load_data()#(50000, 32, 32, 3)
# Rescale -1 to 1
X_train = X_train / 127.5 - 1.
# Adversarial ground truths
valid = np.ones((batch_size, 1))
fake = np.zeros((batch_size, 1))

for epoch in range(50001):
    # ---------------------
    #  Train Discriminator
    # ---------------------

    # Select a random batch of images
    idx = np.random.randint(0, X_train.shape[0], batch_size) # 0-60000 中随机抽  
    imgs = X_train[idx]
    noise = np.random.normal(0, 1, (batch_size, 100))# 生成标准的高斯分布噪声

    # Generate a batch of new images
    gen_imgs = generator.predict(noise)

    # Train the discriminator
    d_loss_real = discriminator.train_on_batch(imgs, valid) #真实数据对应标签１
    d_loss_fake = discriminator.train_on_batch(gen_imgs, fake) #生成的数据对应标签０
    d_loss = 0.5 * np.add(d_loss_real, d_loss_fake)

    # ---------------------
    #  Train Generator
    # ---------------------
    noise = np.random.normal(0, 1, (batch_size, 100))

    # Train the generator (to have the discriminator label samples as valid)
    g_loss = combined.train_on_batch(noise, valid)

    # Plot the progress
    if epoch % 500==0:
        print ("%d [D loss: %f, acc.: %.2f%%] [G loss: %f]" % (epoch, d_loss[0], 100*d_loss[1], g_loss))

    # If at save interval => save generated image samples
    if epoch % sample_interval == 0:
        sample_images(epoch)

output

0 [D loss: 1.069107, acc.: 43.75%] [G loss: 0.424518]
500 [D loss: 0.858720, acc.: 45.31%] [G loss: 1.105853]
1000 [D loss: 0.807555, acc.: 46.88%] [G loss: 0.961553]
……
49000 [D loss: 0.835431, acc.: 35.94%] [G loss: 0.684724]
49500 [D loss: 0.745767, acc.: 48.44%] [G loss: 0.881849]
50000 [D loss: 0.651077, acc.: 57.81%] [G loss: 0.958367]