Because of the needs of my job, I recently started studying deep learning in earnest, using Caffe as the framework. To get a quicker feel for the overall workflow of training a model with Caffe, I began with the MNIST example that ships with Caffe.
Preparing the Data
Because the Linux machine I use has no internet access, I could not write a script to download the dataset there; I had to download it elsewhere first and then copy it into my own directory on the server.
Download link: the MNIST dataset (http://yann.lecun.com/exdb/mnist/)
The downloaded files are the following four archives: train-images-idx3-ubyte.gz, train-labels-idx1-ubyte.gz, t10k-images-idx3-ubyte.gz and t10k-labels-idx1-ubyte.gz.
Files whose names start with train belong to the training set and those starting with t10k to the test set; names containing images hold the image data, and names containing labels hold the corresponding labels.
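For reference, on a machine that does have network access the same four archives can be fetched with wget; this is roughly what Caffe's own data/mnist/get_mnist.sh helper does:
#!/usr/bin/env sh
# Download the four MNIST archives (a sketch of what data/mnist/get_mnist.sh does).
for fname in train-images-idx3-ubyte train-labels-idx1-ubyte \
             t10k-images-idx3-ubyte t10k-labels-idx1-ubyte
do
    wget http://yann.lecun.com/exdb/mnist/${fname}.gz
done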
Once the files are downloaded they need to be decompressed. The Linux command is as follows:
$ gzip -d train-images-idx3-ubyte.gz
Decompress each of the four files in turn this way (or all at once, as shown below). After decompression you are left with four uncompressed files: train-images-idx3-ubyte, train-labels-idx1-ubyte, t10k-images-idx3-ubyte and t10k-labels-idx1-ubyte.
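All four archives can also be decompressed in a single command (assuming they sit in the current directory):
$ gzip -d train-images-idx3-ubyte.gz train-labels-idx1-ubyte.gz \
    t10k-images-idx3-ubyte.gz t10k-labels-idx1-ubyte.gz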
Next, the data has to be converted into LMDB, a data format Caffe can consume. The shell script for this is as follows:
#!/usr/bin/env sh
# this script converts the mnist data into lmdb or leveldb format,
# depending on the value assigned to $BACKEND.
EXAMPLE=examples/mnist
DATA=examples/mnist
BUILD=build/examples/mnist
BACKEND="lmdb"
echo "creating ${BACKEND}..."
rm -rf $EXAMPLE/mnist_train_${BACKEND}
rm -rf $EXAMPLE/mnist_test_${BACKEND}
$BUILD/convert_mnist_data.bin $DATA/train-images-idx3-ubyte \
  $DATA/train-labels-idx1-ubyte $EXAMPLE/mnist_train_${BACKEND} --backend=${BACKEND}
$BUILD/convert_mnist_data.bin $DATA/t10k-images-idx3-ubyte \
  $DATA/t10k-labels-idx1-ubyte $EXAMPLE/mnist_test_${BACKEND} --backend=${BACKEND}
echo "done"
EXAMPLE is the directory where the LMDB output will be written, DATA is where the raw files are stored, and BUILD is the directory holding convert_mnist_data.bin, the conversion tool that ships with Caffe; change these to the correct paths for your own setup.
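Assuming the script is saved as examples/mnist/create_mnist.sh (the name Caffe itself uses; any name works), it is run from the Caffe root directory, and each resulting LMDB directory should contain a data.mdb and a lock.mdb file:
$ cd <caffe-root>
$ sh examples/mnist/create_mnist.sh
$ ls examples/mnist/mnist_train_lmdb examples/mnist/mnist_test_lmdb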
With that, the data preparation is done.
Defining the Network Structure
Caffe already ships with a network definition for this task, the lenet_train_test.prototxt file. Let's look at its contents:
name: "LeNet"
layer{
name: "mnist"
type: "Data"
top: "data"
top: "label"
include {
phase: TRAIN
}
transform_param {
scale: 0.00390625
}
data_param {
source: "examples/mnist/mnist_train_lmdb"
batch_size: 64
backend: LMDB
}
}
layer {
name: "mnist"
type: "Data"
top: "data"
top: "label"
include {
phase: TEST
}
transform_param {
scale: 0.00390625
}
data_param {
source: "examples/mnist/mnist_test_lmdb"
batch_size: 100
backend: LMDB
}
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 20
kernel_size: 5
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 50
kernel_size: 5
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "xavier"
}
}
}
layer {
name: "pool2"
type: "Pooling"
bottom: "conv2"
top: "pool2"
pooling_param {
pool: MAX
kernel_size: 2
stride:2
}
}
layer {
name: "ip1"
type: "InnerProduct"
bottom: "pool2"
top: "ip1"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 500
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "ip1"
top: "ip1"
}
layer {
name: "ip2"
type: "InnerProduct"
bottom: "ip1"
top: "ip2"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 10
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "accuracy"
type: "Accuracy"
bottom: "ip2"
bottom: "label"
top: "accuracy"
include {
phase: TEST
}
}
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "ip2"
bottom: "label"
top: "loss"
}
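In short, the data flows through conv1 → pool1 → conv2 → pool2 → ip1 → ReLU → ip2, with SoftmaxWithLoss producing the training loss and the Accuracy layer reporting test accuracy; the scale: 0.00390625 in the data layers is just 1/256, which normalizes the 0-255 pixel values to roughly [0, 1). If pydot and graphviz happen to be available, Caffe's bundled draw_net.py can render this structure as a picture, for example:
$ python python/draw_net.py examples/mnist/lenet_train_test.prototxt lenet.png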
Configuring the Solver Parameters
Caffe also ships with a solver file for this task, lenet_solver.prototxt. Here is what it contains:
# the train/test net protocol buffer definition
net: "examples/mnist/lenet_train_test.prototxt"
# test_iter specifies how many forward passes the test should carry out.
# in the case of MNIST, we have test batch size 100 and 100 test iterations,
# covering the full 10000 testing images.
test_iter: 100
# carry out testing every 500 training iterations.
test_interval: 500
# the base learning rate, momentum and the weight decay of the network.
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
# the learning rate policy
lr_policy: "inv"
gamma: 0.0001
power: 0.75
# display every 100 iterations
display: 100
# the maximum number of iterations
max_iter: 10000
# snapshot intermediate results
snapshot: 5000
snapshot_prefix: "examples/mnist/lenet"
# solver mode: CPU or GPU
solver_mode: GPU
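The "inv" learning-rate policy decays the rate as base_lr * (1 + gamma * iter)^(-power). Purely as an illustration (not part of the Caffe workflow), a one-off awk command can print how it decays over the 10000 iterations; with these settings the rate drops from 0.01 to roughly 0.006 by the end:
$ awk 'BEGIN { base_lr = 0.01; gamma = 0.0001; power = 0.75;
               for (iter = 0; iter <= 10000; iter += 2500)
                   printf "iter %5d  lr %.6f\n", iter, base_lr * (1 + gamma * iter) ^ (-power) }'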
Training
Because I ran the code on a GPU server cluster, I launched the training script from the Caffe root directory with the following command:
$ srun -p K15G12 -J MNIST -c 4 --gres=gpu:1 sh examples/mnist/train_lenet.sh
The content of the train_lenet.sh script is as follows:
#!/usr/bin/env sh
./build/tools/caffe train --solver=examples/mnist/lenet_solver.prototxt
Finally, the following files appear in the ./examples/mnist folder: lenet_iter_5000.caffemodel, lenet_iter_5000.solverstate, lenet_iter_10000.caffemodel and lenet_iter_10000.solverstate (a snapshot at iteration 5000 and the final one at iteration 10000).
The .caffemodel files are the trained models for handwritten digit recognition.
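Since the snapshots also include the full solver state (.solverstate), two follow-up commands are handy; both are standard invocations of the caffe binary, with paths assuming the default example locations used above:
# score the final model on the 10000 test images (100 test iterations x batch size 100)
$ ./build/tools/caffe test --model=examples/mnist/lenet_train_test.prototxt \
    --weights=examples/mnist/lenet_iter_10000.caffemodel --iterations=100 --gpu=0
# resume an interrupted run from the iteration-5000 snapshot
$ ./build/tools/caffe train --solver=examples/mnist/lenet_solver.prototxt \
    --snapshot=examples/mnist/lenet_iter_5000.solverstate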
Note
Because Caffe was already set up on the server, I simply copied someone else's Caffe directory, and it kept failing when I tried to run it. It turned out that ./build/tools/caffe was a symlink into that person's Caffe build, which I had no permission to use, so I needed to recompile and relink it myself.
So, in the Caffe root directory, I first ran make clean to delete the existing build directory, then ran make to recompile and relink; after that everything ran normally.
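For reference, the rebuild boils down to these commands in the Caffe root directory (the -j flag just parallelizes compilation and is optional; make test and make runtest can be run afterwards to verify the build):
$ make clean
$ make all -j8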
Afterword
Having only just started with Linux and Caffe, there is a lot I am still unclear about and a lot of knowledge left to learn. Hitting problem after problem along the way really made me want to explode, but it is precisely by working through those problems that I grow, step by step. A reminder to myself: do not run from problems and challenges; the road of growth is long and the load is heavy.