DCGAN Study Notes
2019.12.10 - 2019.12.12
- Reading the CelebA dataset
- How GANs work
- The DCGAN architecture
This is a project from the official PyTorch website and makes a good entry point for understanding GANs: DCGAN TUTORIAL
Reading the dataset
```python
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
import torchvision.datasets as datasets
import torchvision.utils as vutils

dataroot = './celeba'   # root folder containing the CelebA images
image_size = 64

dataset = datasets.ImageFolder(root=dataroot,
                               transform=transforms.Compose([
                                   transforms.Resize(image_size),
                                   transforms.CenterCrop(image_size),
                                   transforms.ToTensor(),
                                   transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
                               ]))
dataloader = torch.utils.data.DataLoader(dataset, batch_size=128, shuffle=True)
```
A brief overview of GANs
A Generative Adversarial Network (GAN) is a generative model built from two neural networks: a generator and a discriminator. The DCGAN architecture additionally specifies:
- BatchNorm in both the generator and the discriminator, except at the generator's output layer and the discriminator's input layer (it normalizes its input to zero mean and unit variance).
- Generator activations: Tanh at the output layer, ReLU in all other layers.
- Discriminator activations: Sigmoid at the output layer, LeakyReLU in all other layers.
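The rules above can be sketched as a pair of `nn.Sequential` models. This is a minimal sketch in the spirit of the PyTorch tutorial, not its exact code; `nz` (latent size), `ngf` and `ndf` (feature-map widths) are conventional names chosen here as assumptions.

```python
import torch
import torch.nn as nn

nz, ngf, ndf = 100, 64, 64   # assumed hyperparameters: latent size, generator/discriminator widths

generator = nn.Sequential(
    nn.ConvTranspose2d(nz, ngf * 8, 4, 1, 0, bias=False),      # 1x1 -> 4x4
    nn.BatchNorm2d(ngf * 8), nn.ReLU(True),
    nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False), # 4x4 -> 8x8
    nn.BatchNorm2d(ngf * 4), nn.ReLU(True),
    nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False), # 8x8 -> 16x16
    nn.BatchNorm2d(ngf * 2), nn.ReLU(True),
    nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False),     # 16x16 -> 32x32
    nn.BatchNorm2d(ngf), nn.ReLU(True),
    nn.ConvTranspose2d(ngf, 3, 4, 2, 1, bias=False),           # 32x32 -> 64x64
    nn.Tanh(),                                                 # output layer: Tanh, no BatchNorm
)

discriminator = nn.Sequential(
    nn.Conv2d(3, ndf, 4, 2, 1, bias=False),                    # input layer: no BatchNorm
    nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(ndf, ndf * 2, 4, 2, 1, bias=False),
    nn.BatchNorm2d(ndf * 2), nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(ndf * 2, ndf * 4, 4, 2, 1, bias=False),
    nn.BatchNorm2d(ndf * 4), nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(ndf * 4, ndf * 8, 4, 2, 1, bias=False),
    nn.BatchNorm2d(ndf * 8), nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(ndf * 8, 1, 4, 1, 0, bias=False),
    nn.Sigmoid(),                                              # output layer: Sigmoid
)

z = torch.randn(2, nz, 1, 1)      # a batch of 2 latent vectors
fake = generator(z)               # shape (2, 3, 64, 64), values in (-1, 1)
score = discriminator(fake)       # shape (2, 1, 1, 1), values in (0, 1)
```

Note how the Tanh output of the generator lines up with the [-1, 1] normalization applied to the real images, so the discriminator sees real and fake samples on the same scale.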
PS: the zero-padding scheme of nn.ConvTranspose2d(inplanes, outplanes, kernel_size, stride, padding) is worth noting. Unlike nn.Conv2d, which simply pads `padding` zeros around the input matrix, transposed convolution pads in two places. Between the rows and columns of the input elements:
$$padding_1 = (H_{in} - 1) \times (stride - 1)$$

where $H_{in}$ is the input size along that dimension (here, the height).
Around the outside of the matrix (on both sides of each dimension):

$$padding_2 = dilation \times (kernel\_size - 1) - padding$$

Here dilation is 1 throughout, so it can be ignored.
In general, applying the ordinary convolution size formula (with stride 1) to the padded matrix gives the output size:

$$H_{out} = (H_{in} - 1) \times (stride - 1) + 2 \times (kernel\_size - 1) - 2 \times padding + H_{in} - kernel\_size + 1 = (H_{in} - 1) \times stride - 2 \times padding + kernel\_size$$
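The simplified formula can be checked numerically against PyTorch. A small sketch (the helper name `convtranspose2d_out` is mine, not part of the PyTorch API), assuming dilation=1 and output_padding=0 as in the derivation above:

```python
import torch
import torch.nn as nn

def convtranspose2d_out(h_in, kernel_size, stride, padding):
    """Output size of nn.ConvTranspose2d per the formula above (dilation=1, output_padding=0)."""
    return (h_in - 1) * stride - 2 * padding + kernel_size

# The DCGAN generator's upsampling layers use kernel_size=4, stride=2, padding=1,
# which exactly doubles the spatial size:
print(convtranspose2d_out(32, 4, 2, 1))   # 64

# Cross-check against PyTorch itself (shapes only; the weights are random):
layer = nn.ConvTranspose2d(8, 8, kernel_size=4, stride=2, padding=1)
out = layer(torch.zeros(1, 8, 32, 32))
print(out.shape[-1])                      # 64
```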