This is the third example in the Examples section of the official Caffe site: http://caffe.berkeleyvision.org/gathered/examples/cifar10.html
This example reproduces the results of Alex Krizhevsky's cuda-convnet; the model definition, parameters, and training procedure all follow the settings used in cuda-convnet.
The dataset: CIFAR-10 contains 60,000 32×32 color images in 10 classes, 6,000 images per class, split into 50,000 training images and 10,000 test images. The classes are: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck.
Now on to training.
1. Get the data:
From the Caffe root directory, run
./data/cifar10/get_cifar10.sh
./examples/cifar10/create_cifar10.sh
Let's look at what these two scripts do.
get_cifar10.sh: downloads the data
#!/usr/bin/env sh
# This scripts downloads the CIFAR10 (binary version) data and unzips it.
DIR="$( cd "$(dirname "$0")" ; pwd -P )"
cd $DIR
echo "Downloading..."
wget --no-check-certificate http://www.cs.toronto.edu/~kriz/cifar-10-binary.tar.gz # download the data
echo "Unzipping..."
tar -xf cifar-10-binary.tar.gz && rm -f cifar-10-binary.tar.gz # extract the data and delete the archive
mv cifar-10-batches-bin/* . && rm -rf cifar-10-batches-bin
# Creation is split out because leveldb sometimes causes segfault
# and needs to be re-created.
echo "Done."
create_cifar10.sh: converts the data to LMDB and computes the image mean. Running it produces the training and test LMDB databases plus a binary mean file.
#!/usr/bin/env sh
# This script converts the cifar data into lmdb format.
EXAMPLE=examples/cifar10
DATA=data/cifar10
DBTYPE=lmdb # convert to lmdb
echo "Creating $DBTYPE..."
rm -rf $EXAMPLE/cifar10_train_$DBTYPE $EXAMPLE/cifar10_test_$DBTYPE
# convert_cifar_data.bin is built from convert_cifar_data.cpp; it takes three arguments:
# the input data directory, the output directory, and the db format (lmdb or leveldb).
./build/examples/cifar10/convert_cifar_data.bin $DATA $EXAMPLE $DBTYPE
echo "Computing image mean..."
# compute_image_mean (built from tools/compute_image_mean.cpp) computes the per-pixel
# mean over the training set; it takes the db backend, the input database, and the
# output binary mean file.
./build/tools/compute_image_mean -backend=$DBTYPE \
  $EXAMPLE/cifar10_train_$DBTYPE $EXAMPLE/mean.binaryproto
echo "Done."
2. The model:
The network definition. Note that this is the test-time definition (its name is CIFAR10_quick_test, and it ends in a plain Softmax); the solver below trains with cifar10_quick_train_test.prototxt.
name: "CIFAR10_quick_test"
input: "data"
input_shape {
dim: 1 # num (batch size)
dim: 3 # number of channels
dim: 32 # image height
dim: 32 # image width
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
param {
lr_mult: 1 # learning-rate multiplier for the weights w
}
param {
lr_mult: 2 # learning-rate multiplier for the bias b
}
convolution_param {
num_output: 32
pad: 2 # pad the input with a 2-pixel border
kernel_size: 5
stride: 1
}
}
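As a sanity check on these numbers, the standard convolution output-size formula gives

output = (input + 2 * pad - kernel_size) / stride + 1 = (32 + 2*2 - 5) / 1 + 1 = 32

so with a padding of 2 the 32×32 spatial size is preserved, and conv1 outputs 32 feature maps of size 32×32.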
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1"
top: "pool1"
pooling_param {
pool: MAX #Max Pooling
kernel_size: 3
stride: 2
}
}
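Pooling uses the same formula, except that Caffe rounds the division up rather than down:

output = ceil((32 - 3) / 2) + 1 = 15 + 1 = 16

so pool1 halves the spatial size, producing 32 feature maps of size 16×16.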
layer {
name: "relu1"
type: "ReLU"
bottom: "pool1"
top: "pool1"
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 32
pad: 2
kernel_size: 5
stride: 1
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "pool2"
type: "Pooling"
bottom: "conv2"
top: "pool2"
pooling_param {
pool: AVE # average pooling
kernel_size: 3
stride: 2
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "pool2"
top: "conv3"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
stride: 1
}
}
layer {
name: "relu3"
type: "ReLU" # 使用ReLU激励函数,这里需要注意的是,本层的bottom和top都是conv3.
bottom: "conv3"
top: "conv3"
}
layer {
name: "pool3"
type: "Pooling"
bottom: "conv3"
top: "pool3"
pooling_param {
pool: AVE
kernel_size: 3
stride: 2
}
}
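Following the sizes through the rest of the network: conv2 and conv3 preserve the spatial size (pad 2, kernel 5, stride 1, just like conv1), while each pooling layer halves it, so the feature maps shrink 32 → 16 → 8 → 4. pool3 therefore outputs 64 × 4 × 4 = 1024 values, which is the input dimension of the fully connected layer ip1 below.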
layer {
name: "ip1"
type: "InnerProduct"
bottom: "pool3"
top: "ip1"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 64
}
}
layer {
name: "ip2"
type: "InnerProduct"
bottom: "ip1"
top: "ip2"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 10
}
}
layer {
name: "prob"
type: "Softmax"
bottom: "ip2"
top: "prob"
}
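Since the definition above ends in a plain Softmax and has no labels, it is only usable for inference. In the training definition (cifar10_quick_train_test.prototxt, which the solver below points to), the data comes from the LMDBs created in step 1, and the network instead ends in a loss layer plus an accuracy layer, roughly as follows (a sketch following the standard Caffe example; check your own copy of the file for the exact contents):

layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  include { phase: TEST }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}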
The solver definition
# reduce the learning rate after 8 epochs (4000 iters) by a factor of 10
# The train/test net protocol buffer definition
net: "examples/cifar10/cifar10_quick_train_test.prototxt" #使用网络模型
# test_iter specifies how many forward passes the test should carry out.
# In the case of CIFAR10, we have test batch size 100 and 100 test iterations,
# covering the full 10,000 testing images.
test_iter: 100 # number of test iterations
# Carry out testing every 500 training iterations.
test_interval: 500 # test once every 500 training iterations
# The base learning rate, momentum and the weight decay of the network.
base_lr: 0.001 # base learning rate
momentum: 0.9
weight_decay: 0.004
# The learning rate policy
lr_policy: "fixed"
# Display every 100 iterations
display: 100
# The maximum number of iterations
max_iter: 4000
# snapshot intermediate results
snapshot: 4000 # save intermediate results every 4000 iterations
snapshot_format: HDF5 # snapshot file format
snapshot_prefix: "examples/cifar10/cifar10_quick"
# solver mode: CPU or GPU
solver_mode: GPU
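A quick check of the numbers in the comments (assuming the training batch size of 100 used by the standard cifar10_quick_train_test.prototxt): 4000 iterations × 100 images per batch = 400,000 images, i.e. 8 passes (epochs) over the 50,000 training images, and test_iter 100 × test batch size 100 covers all 10,000 test images. Also note that lr_policy is "fixed", so this solver never lowers the learning rate itself; the factor-of-10 reduction mentioned in the first comment happens in a second training stage, shown below.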
On the effect of the learning rate, Simon Haykin's Neural Networks and Learning Machines (p. 86 of the Chinese edition) says: backpropagation provides an approximation to the trajectory in weight space computed by the method of steepest descent. The smaller the learning-rate parameter, the smaller the change in the synaptic weights from one iteration to the next, and the smoother the trajectory in weight space; this improvement, however, comes at the cost of slower learning. On the other hand, if the learning rate is too large, learning speeds up, but the large changes in the weights may make the network unstable.
3. Training
Run
./examples/cifar10/train_quick.sh
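As with the other scripts, it is worth looking at what train_quick.sh runs. Roughly, following the standard Caffe example (your copy may differ slightly between versions), it trains with the solver above for 4000 iterations, then resumes from the HDF5 snapshot using a second solver, cifar10_quick_solver_lr1.prototxt, whose base_lr is lowered by a factor of 10 to 0.0001:

#!/usr/bin/env sh
TOOLS=./build/tools

# stage 1: 4000 iterations at base_lr 0.001
$TOOLS/caffe train --solver=examples/cifar10/cifar10_quick_solver.prototxt

# stage 2: resume from the snapshot with the learning rate reduced by 10x
$TOOLS/caffe train --solver=examples/cifar10/cifar10_quick_solver_lr1.prototxt \
  --snapshot=examples/cifar10/cifar10_quick_iter_4000.solverstate.h5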
The console output has the same form as in the MNIST training example, so it is not repeated here.
The final test accuracy is about 75%.