Getting Started with Caffe and Problems Encountered Along the Way

CaspianR

Category: Machine Learning. Tags: Caffe, Deep Learning

Article link: https://blog.youkuaiyun.com/renjunsong0/article/details/53543723

Installing Caffe

This article uses Ubuntu 16.04 with Anaconda (Python 2.7) already installed.
Use apt-get to install the dependencies Caffe needs.

sudo apt-get install protobuf-compiler libprotobuf-dev

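Protocol Buffers (protobuf) is a format developed by Google for serializing structured data so it can be moved between memory and disk. The Caffe source code uses it heavily to store and pass weights and model parameters.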

sudo apt-get install libhdf5-serial-dev libleveldb-dev liblmdb-dev

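HDF5 is a data format for storing and distributing scientific data efficiently.
LMDB and LevelDB are memory-mapped database managers; in Caffe their main job is data management. HDF5 can also be chosen as the data format, but LMDB is the usual choice.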

sudo apt-get install libsnappy-dev libopencv-dev libatlas-base-dev

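Snappy is a C++ library for compression and decompression. It is faster than ZIP, but the compressed files are roughly 20% to 100% larger.
OpenCV needs little introduction; here it mainly handles some of the processing in the data layer.
BLAS (Basic Linear Algebra Subprograms) handles numerical computation on the CPU, such as matrix multiplication. The libatlas-base-dev package provides ATLAS, one BLAS implementation.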

sudo apt-get install libgflags-dev libgoogle-glog-dev

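GFLAGS parses command-line arguments. GLOG is a library developed by Google for application logging; it makes it easy to inspect the intermediate output produced while Caffe trains and to decide from that output how to adjust parameters to control convergence.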

sudo apt-get install --no-install-recommends libboost-all-dev

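Boost is a widely used C++ library. Caffe mainly uses its smart pointers, which carry built-in reference counting and so avoid the memory leaks or double frees that can happen when pointers are shared. Pycaffe uses Boost.Python to connect C++ and Python.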

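Because I already had Anaconda installed, I changed the Python paths in Makefile.config. In addition, during the build some libraries that were clearly installed could not be found; creating symbolic links with ln from where those libraries actually live into /usr/lib fixed this.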

# Anaconda Python distribution is quite popular. Include path:
# Verify anaconda location, sometimes it's in root.
ANACONDA_HOME := $(HOME)/anaconda2
PYTHON_INCLUDE := $(ANACONDA_HOME)/include \
        $(ANACONDA_HOME)/include/python2.7 \
        $(ANACONDA_HOME)/lib/python2.7/site-packages/numpy/core/include

Once the preparation is done, run make -j in the Caffe root directory.
Then Bingo!

Personally, I find apt-get more convenient than downloading every dependency and installing it by hand =.=

Data Preparation

The MNIST Dataset

Data Conversion

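Caffe ships an MNIST example under examples/mnist, and the download script for the data lives under data/mnist. Run it from the Caffe root directory:

./data/mnist/get_mnist.sh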

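This fetches the dataset. The downloaded files are raw binary and must be converted to LevelDB or LMDB before Caffe can read them. Running the following script from the root directory converts the data to LMDB: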

./examples/mnist/create_mnist.sh

This creates two directories, mnist_train_lmdb and mnist_test_lmdb. The script calls convert_mnist_data.bin, an executable that was built when Caffe was compiled.

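The LeNet-5 Model

The model is described in the lenet_train_test.prototxt file; it is worth reading through yourself.

Next we train this model on the MNIST dataset. Because there are many parameters, a script is used again: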

#! /usr/bin/env sh
./build/tools/caffe train --solver=examples/mnist/lenet_solver.prototxt

The training hyperparameters are defined in lenet_solver.prototxt:

# The train/test net protocol buffer definition
net: "examples/mnist/lenet_train_test.prototxt"
# test_iter specifies how many forward passes the test should carry out.
# In the case of MNIST, we have test batch size 100 and 100 test iterations,
# covering the full 10,000 testing images.
test_iter: 100
# Carry out testing every 500 training iterations.
test_interval: 500
# The base learning rate, momentum and the weight decay of the network.
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
# The learning rate policy (how the learning rate decays during training)
lr_policy: "inv"
gamma: 0.0001
power: 0.75
# Display every 100 iterations (print a log line to the screen)
display: 100
# The maximum number of iterations
max_iter: 10000
# snapshot intermediate results: every 5000 iterations, save the current
# model parameters to disk
snapshot: 5000
snapshot_prefix: "examples/mnist/lenet"
# solver mode: CPU or GPU
solver_mode: CPU
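
To get a feel for what lr_policy "inv" does with the values above, here is a minimal Python sketch. It assumes the usual definition of the policy, base_lr * (1 + gamma * iter) ^ (-power); the function name inv_learning_rate is made up for this illustration and is not part of Caffe.

```python
def inv_learning_rate(iteration, base_lr=0.01, gamma=0.0001, power=0.75):
    """Learning rate at a given iteration under the "inv" policy.

    Assumed formula: base_lr * (1 + gamma * iteration) ** (-power).
    The defaults are the values from lenet_solver.prototxt.
    """
    return base_lr * (1.0 + gamma * iteration) ** (-power)


if __name__ == "__main__":
    # Show how slowly the rate decays over the 10,000-iteration run.
    for it in (0, 500, 5000, 10000):
        print("iter %5d  lr = %.6f" % (it, inv_learning_rate(it)))
```

With these settings the rate starts at base_lr and only falls to a little under 60% of it by the last iteration, so the decay is gentle.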

Then run the network from the root directory, and training begins:

./examples/mnist/train_lenet.sh

Training information then appears on the screen, and you can judge how well the model is doing from the trend of the loss and similar output.
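
If the training output is saved to a file, the loss trend can be pulled out with a few lines of Python. This is only a rough sketch: it assumes log lines containing text of the form "Iteration 100, loss = 0.23" (some builds add timing information in parentheses after the iteration number), and the helper names extract_losses and is_decreasing_overall are invented for this example.

```python
import re

# Assumed line shape: "... Iteration <n>[ (timing info)], loss = <value>".
LOSS_RE = re.compile(r"Iteration (\d+)\b.*?, loss = ([0-9.eE+-]+)")


def extract_losses(lines):
    """Return (iteration, loss) pairs found in an iterable of log lines."""
    pairs = []
    for line in lines:
        match = LOSS_RE.search(line)
        if match:
            pairs.append((int(match.group(1)), float(match.group(2))))
    return pairs


def is_decreasing_overall(pairs):
    """Crude check: the last reported loss is lower than the first."""
    return len(pairs) >= 2 and pairs[-1][1] < pairs[0][1]
```

A typical use would be extract_losses(open("train.log")), where train.log is a hypothetical file holding the captured output; lines that do not report a loss are ignored.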

After Training

After training finishes, several new files appear:

lenet_iter_10000.caffemodel         lenet_iter_10000.solverstate        
lenet_iter_5000.caffemodel           lenet_iter_5000.solverstate

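These are the models saved by the snapshot setting.
During training you can also watch the accuracy of these models, provided you configured testing in the solver. We can now use the trained models for prediction: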

# "test" runs prediction only (forward passes, no training)
./build/tools/caffe.bin test \
-model examples/mnist/lenet_train_test.prototxt \
-weights examples/mnist/lenet_iter_10000.caffemodel \
-iterations 100

The test set is also specified in lenet_train_test.prototxt, so this command produces the prediction results.

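Your Own Data

Data Conversion

To train on your own data, first convert it to LMDB format.
Put the jpg files for all classes into one folder. The images do not need to share a size: the convert_imageset call used for the conversion resizes them through its resize flags, here to 256*256.
Create a train.txt that maps each image to its class: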

jpg/C_1.jpg 0
jpg/C_2.jpg 0
jpg/C_3.jpg 0
jpg/B_1.jpg 1
jpg/B_2.jpg 1
jpg/B_3.jpg 1

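There is a pitfall here. I used Python to pair the image file names with their classes, but convert_imageset then failed with a "can't open such file or dir" error. After a lot of back and forth I found that the file name and the class label must be separated by exactly one space. When I searched at the time I could not find this documented anywhere; I have not noticed any other format requirements.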
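
As an illustration, a list file with the single-space separator could be generated along the following lines. This is a sketch under assumptions rather than the script I actually used: the helper names and the prefix-to-label mapping (file names starting with "C_" are class 0, "B_" class 1) are hypothetical and simply mirror the sample entries above.

```python
import os


def build_list_lines(image_dir, prefix_to_label, root):
    """Return "relative/path label" lines for every .jpg in image_dir.

    prefix_to_label maps a file-name prefix (e.g. "C_") to an integer class.
    Paths are written relative to root, with ONE space before the label.
    """
    lines = []
    for name in sorted(os.listdir(image_dir)):
        if not name.lower().endswith(".jpg"):
            continue  # skip anything that is not a jpg image
        for prefix, label in prefix_to_label.items():
            if name.startswith(prefix):
                rel = os.path.relpath(os.path.join(image_dir, name), root)
                lines.append("%s %d" % (rel.replace(os.sep, "/"), label))
                break
    return lines


def write_list_file(path, lines):
    """Write one entry per line, ending the file with a newline."""
    with open(path, "w") as handle:
        handle.write("\n".join(lines) + "\n")
```

For example, build_list_lines("examples/myexample/jpg", {"C_": 0, "B_": 1}, "examples/myexample") would yield entries such as jpg/C_1.jpg 0, which write_list_file can then save as train.txt.
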
Then write a script called creat_train_lmdb.sh:

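DATA=examples/myexample
rm -rf $DATA/img_train_lmdb
# --shuffle: read the image files in random order
# --resize_height/--resize_width: resize every image to 256x256
# First argument: the folder that the paths in train.txt are relative to,
# so a full image path is /home/yourname/caffe/examples/myexample/jpg/**.jpg
~/caffe/build/tools/convert_imageset --shuffle \
    --resize_height=256 --resize_width=256 \
    /home/yourname/caffe/examples/myexample/ \
    $DATA/train.txt $DATA/img_train_lmdb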

Running the script builds your own training set; repeat the same steps for the test set.

sudo sh examples/myexample/creat_train_lmdb.sh

Building the Model

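Building a complete CNN entirely by yourself is not realistic, so I take an existing model and fine-tune it; LeNet is used again here.
First, create your own lenet_solver.prototxt, which holds parameters such as the learning rate.
Then edit lenet_train_test.prototxt.
In the data layers, change the sources of the training and test data: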

name: "LeNet"
layer {
  name: "test"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "examples/myexample/img_train_lmdb"
    batch_size: 64
    backend: LMDB
  }
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 5           # change this to the number of classes you want to distinguish
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}
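
Two of the numbers in this file can be sanity-checked with a short Python sketch. The scale value 0.00390625 is exactly 1/256, which brings 8-bit pixel values into the range [0, 1), and num_output has to be larger than the biggest label in train.txt because the labels start at 0. The helper required_num_output is a hypothetical name used only for this illustration.

```python
def required_num_output(list_lines):
    """Smallest valid num_output for "path label" lines: max label + 1."""
    labels = set()
    for line in list_lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        # The label is the last whitespace-separated field on the line.
        labels.add(int(line.rsplit(None, 1)[1]))
    if not labels:
        raise ValueError("no labelled entries found")
    return max(labels) + 1


# The transform_param scale is exactly 1/256.
SCALE = 1.0 / 256
```

Applied to the six sample entries of train.txt shown earlier, which use labels 0 and 1, required_num_output gives 2; a dataset with five classes labelled 0 to 4 gives the 5 used in the ip2 layer.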