Using the CIFAR-10 dataset, this article builds a Residual_Network in three frameworks as a running example and compares the similarities and differences between the frameworks.
Dataset format
PyTorch dataset format
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms

# Download and construct CIFAR-10 dataset.
# ToTensor() is needed so that indexing returns a torch tensor rather than a PIL image.
train_dataset = torchvision.datasets.CIFAR10(root='../../data/',
                                             train=True,
                                             transform=transforms.ToTensor(),
                                             download=True)

# Fetch one data pair (read data from disk).
image, label = train_dataset[0]
print(image.size())                # torch.Size([3, 32, 32])
print(label)                       # 6
print(train_dataset.data.shape)    # (50000, 32, 32, 3)
# type(train_dataset.targets) == list
print(len(train_dataset.targets))  # 50000

# Data loader (this provides queues and threads in a very simple way).
train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                           batch_size=64,
                                           shuffle=True)
"""
# 演示DataLoader返回的数据结构
# When iteration starts, queue and thread start to load data from files.
data_iter = iter(train_loader)
# Mini-batch images and labels.
images, labels = data_iter.next()
print(images.shape) # torch.Size([100, 3, 32, 32])
print(labels.shape)
# torch.Size([100]) 可见经过DataLoader后,labels由list变成了pytorch内置的tensor格式
"""
# In practice, the loader is usually consumed like this.
# Actual usage of the data loader is as below.
for images, labels in train_loader:
    # Training code should be written here.
    pass
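Apart from the built-in torchvision datasets, the same DataLoader machinery accepts any torch.utils.data.Dataset subclass. Below is a minimal sketch, assuming in-memory NHWC ndarrays; the class name ArrayDataset and the shapes are illustrative, not from the original post.

import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader

class ArrayDataset(Dataset):
    """A hypothetical Dataset wrapping in-memory ndarrays (NHWC uint8 images + int labels)."""
    def __init__(self, images, labels):
        self.images = images  # e.g. shape (N, 32, 32, 3)
        self.labels = labels  # e.g. shape (N,)

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        # Convert HWC uint8 -> CHW float32 in [0, 1], matching what ToTensor() does.
        img = torch.from_numpy(self.images[idx]).permute(2, 0, 1).float() / 255.0
        return img, int(self.labels[idx])

# Usage: any Dataset plugs into DataLoader exactly like the CIFAR-10 one above.
dummy = ArrayDataset(np.zeros((10, 32, 32, 3), dtype=np.uint8),
                     np.zeros(10, dtype=np.int64))
loader = DataLoader(dummy, batch_size=4, shuffle=True)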
Keras dataset format
import keras
from keras.datasets import cifar10

(train_x, train_y), (test_x, test_y) = cifar10.load_data()
print(train_x.shape)  # ndarray type: (50000, 32, 32, 3)
print(train_y.shape)  # (50000, 1)
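Note the layout difference this exposes: Keras/TensorFlow arrays are channels-last (N, H, W, C), while PyTorch tensors are channels-first (N, C, H, W). A minimal conversion sketch, using a small dummy array in place of the real data:

import numpy as np
import torch

# A small dummy batch standing in for cifar10.load_data() output (NHWC uint8).
train_x = np.zeros((8, 32, 32, 3), dtype=np.uint8)

# NHWC ndarray -> NCHW float tensor, the layout PyTorch convolutions expect.
x_torch = torch.from_numpy(train_x).permute(0, 3, 1, 2).float() / 255.0
print(x_torch.shape)  # torch.Size([8, 3, 32, 32])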
Differences in the data format fed to the network
"""
1: pytorch 都是内置torch.xxTensor输入网络,而keras的则是原生ndarray类型
2: 对于multi-class的其中一种loss,即cross-entropy loss 而言,
pytorch的api为 CorssEntropyLoss, 但y_true不能用one-hoe编码!这与keras,tensorflow 都不同。tensorflow相应的api为softmax_cross_entropy
他们的api都仅限于multi-class classification
3*: 其实上面提到的api都属于categorical cross-entropy loss,
又叫 softmax loss,是函数内部先进行了 softmax 激活,再经过cross-entropy loss。
这个loss是cross-entropy loss的变种,
cross-entropy loss又叫logistic loss 或 multinomial logistic loss。
实现这种loss的函数不包括激活函数,需要自定义。
pytorch对应的api为BCEloss(仅限于 binary classification),
tensorflow 对应的api为 log_loss。
cross-entropy loss的第二个变种是 binary cross-entropy loss 又叫 sigmoid cross- entropy loss。
函数内部先进行了sigmoid激活,再经过cross-entropy loss。
pytorch对应的api为BCEWithLogitsLoss,
tensorflow对应的api为sigmoid_cross_entropy
"""
# pytorch
criterion = nn.CrossEntropyLoss()
...
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        images = images.to(device)
        labels = labels.to(device)

        # Forward pass
        outputs = model(images)
        # For the multi-class cross-entropy loss,
        # y_true must not be one-hot encoded.
        loss = criterion(outputs, labels)
        ...
# keras
# For the multi-class cross-entropy loss,
# y_true must be one-hot encoded.
train_y = keras.utils.to_categorical(train_y, 10)
test_y = keras.utils.to_categorical(test_y, 10)  # validation targets need the same encoding
...
model.fit_generator(datagen.flow(train_x, train_y, batch_size=128),
                    validation_data=(test_x, test_y),
                    epochs=epochs, steps_per_epoch=steps_per_epoch, verbose=1)
...
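The datagen object used above is never defined in the snippet; a typical choice would be Keras's ImageDataGenerator. The augmentation settings below are illustrative guesses, not the original author's configuration:

from keras.preprocessing.image import ImageDataGenerator

# Illustrative augmentation settings; the original post does not show its datagen.
datagen = ImageDataGenerator(rotation_range=15,
                             width_shift_range=0.1,
                             height_shift_range=0.1,
                             horizontal_flip=True)
# fit() is only required for statistics-based options (e.g. featurewise centering).
datagen.fit(train_x)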
Overall workflow
Keras workflow
from keras.optimizers import Adam

model = myModel()
model.compile(optimizer=Adam(0.001), loss="categorical_crossentropy", metrics=["accuracy"])
model.fit_generator(datagen.flow(train_x, train_y, batch_size=128),
                    validation_data=(test_x, test_y),
                    epochs=epochs, steps_per_epoch=steps_per_epoch, verbose=1, workers=4)

# Evaluate the accuracy on the test dataset.
accuracy = model.evaluate(x=test_x, y=test_y, batch_size=128)

# Save the entire network.
model.save("cifar10model.h5")
"""
# https://blog.youkuaiyun.com/jiandanjinxin/article/details/77152530
# 使用
# keras.models.load_model("cifar10model.h5")
# 只保存architecture
# json_string = model.to_json()
# open('my_model_architecture.json','w').write(json_string)
# 使用
# from keras.models import model_from_json
#model = model_from_json(open('my_model_architecture.json').read())
# 只保存weights
# model.save_weights('my_model_weights.h5')
#需要在代码中初始化一个完全相同的模型
# model.load_weights('my_model_weights.h5')
#需要加载权重到不同的网络结构(有些层一样)中,例如fine-tune或transfer-learning,可以通过层名字来加载模型
# model.load_weights('my_model_weights.h5', by_name=True)
"""
PyTorch workflow
model = myModel()

# Loss and optimizer
criterion = nn.CrossEntropyLoss()
# The original snippet omitted the optimizer definition; Adam is an illustrative choice.
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        images = images.to(device)
        labels = labels.to(device)

        # Forward pass
        outputs = model(images)
        loss = criterion(outputs, labels)

        # Backward and optimize
        # Clear the gradients accumulated in the previous iteration.
        optimizer.zero_grad()
        # Backward pass: compute the gradients.
        loss.backward()
        # Update the weight parameters.
        optimizer.step()
# model.eval() puts the model into evaluation mode; dropout and batch
# normalization behave differently at training time and at test time.
# In eval() mode, PyTorch freezes BN and Dropout: instead of computing batch
# statistics, it uses the values learned during training.
# Otherwise, if the test batch_size is small, the BN layers can easily cause,
# for example, severe color distortion in generated images.
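A minimal evaluation sketch showing model.eval() in action, paired with torch.no_grad() to disable gradient tracking; test_loader is assumed to be built like train_loader but from the test split:

model.eval()               # switch BN / Dropout to inference behavior
with torch.no_grad():      # no gradients are needed during evaluation
    correct, total = 0, 0
    for images, labels in test_loader:  # test_loader: assumed, built like train_loader
        images = images.to(device)
        labels = labels.to(device)
        outputs = model(images)
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
print('Test accuracy: {:.2f}%'.format(100 * correct / total))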