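Fine-tuning workflow:
1. Prepare the dataset (training, validation, and test sets).
2. Convert the data and generate the dataset's mean file.
3. Change the number of output classes and the name of the last layer, raise the learning rate of the last layer's parameters, and adjust the solver configuration.
4. Load the pretrained model parameters and start training.
5. Pick some images and test.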
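Preparing the dataset
Sort the images into their folders and store the ground truth in matching txt files. Split your dataset into training, validation, and test sets, and put the images of each set into its own folder. Then generate three txt files, one per set, listing the images and their ground-truth labels, as shown below. (I am doing single-character recognition, so the classes are the digits 0-9.)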
0000000.jpg 0
0000035.jpg 7
0000054.jpg 1
0000071.jpg 0
0000074.jpg 1
0000080.jpg 0
0000083.jpg 0
0000090.jpg 0
0000100.jpg 0
0000103.jpg 0
0000161.jpg 0
0000173.jpg 0
0000195.jpg 3
0000210.jpg 0
0000221.jpg 0
0000231.jpg 0
0000252.jpg 0
0000283.jpg 4
Each line has two fields, the image name and its class, separated by a space.
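Since every later step depends on these list files being well formed, it can help to validate them before converting anything. The following is a minimal sketch, not part of the original workflow: the helper names `parse_list_line` and `check_labels` are hypothetical, and the check assumes the 10 digit classes used in this article.

```python
def parse_list_line(line):
    """Split one 'filename label' line into (filename, int label)."""
    # Split on the last run of whitespace so a filename with spaces still parses.
    filename, label = line.strip().rsplit(None, 1)
    return filename, int(label)


def check_labels(lines, num_classes=10):
    """Return the parsed entries, raising ValueError on an out-of-range label."""
    entries = []
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        filename, label = parse_list_line(line)
        if not 0 <= label < num_classes:
            raise ValueError("label %d out of range in line: %r" % (label, line))
        entries.append((filename, label))
    return entries


sample = ["0000000.jpg 0", "0000035.jpg 7", "0000283.jpg 4"]
print(check_labels(sample))
# [('0000000.jpg', 0), ('0000035.jpg', 7), ('0000283.jpg', 4)]
```

Labels must start at 0 and be smaller than the num_output of the last layer, which is why the range check uses the class count.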
The code that splits the dataset is as follows:
# -*- coding: utf-8 -*-
__author__ = 'XYZ'
import os
import random

DataPath = "E:\\XYZ\\digital number recognization\\watermeter_data\\singleCharacters\\"
fileListPath = "E:\\XYZ\\digital number recognization\\watermeter_data\\SingleCharactersFileList\\"
DataSavedPath = 'E:\\XYZ\\digital number recognization\\watermeter_data\\CharactersSavedPath\\'
trainval_percent = 0.8  # share of all images used for training + validation
train_percent = 0.8     # share of the trainval images used for training
labels = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

if not os.path.exists(fileListPath):
    os.makedirs(fileListPath)
if not os.path.exists(DataSavedPath):
    os.makedirs(DataSavedPath)

# Source images are named "<name>_<label>.jpg"; keep the part before the extension.
names = []
for tmp in os.listdir(DataPath):
    names.append(tmp.split(".")[0])

totalSize = len(names)
trainval_names = random.sample(names, int(trainval_percent * totalSize))
train_names = random.sample(trainval_names, int(train_percent * len(trainval_names)))
# Sets make the membership tests fast; the lists keep their order.
trainval_set = set(trainval_names)
train_set = set(train_names)
test_names = [tmp for tmp in names if tmp not in trainval_set]
valid_names = [tmp for tmp in trainval_names if tmp not in train_set]


def write_list(list_path, subset_names):
    # Write "<name>.jpg <label>" lines and move each image to DataSavedPath
    # under its label-free name.
    with open(list_path, 'w') as list_file:
        for tmp in subset_names:
            name = tmp.split("_")[0]
            label = tmp.split("_")[1]
            list_file.write(name + '.jpg ' + label + '\n')
            img_path = DataPath + tmp + ".jpg"
            new_path = DataSavedPath + name + ".jpg"
            os.rename(img_path, new_path)


write_list(fileListPath + "train.txt", train_names)
write_list(fileListPath + "test.txt", test_names)
write_list(fileListPath + "valid.txt", valid_names)
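With both percentages set to 0.8, the split works out to 64% training, 16% validation, and 20% test. The sketch below shows the same partitioning logic in isolation so the proportions are easy to check. The function `split_names` and the fixed seed are my own assumptions for repeatability, not something the script above does.

```python
import random


def split_names(names, trainval_percent=0.8, train_percent=0.8, seed=0):
    """Return (train, valid, test) lists that partition `names`."""
    rng = random.Random(seed)  # fixed seed: an assumption, for repeatable splits
    shuffled = list(names)
    rng.shuffle(shuffled)
    n_trainval = int(trainval_percent * len(shuffled))
    n_train = int(train_percent * n_trainval)
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_trainval]
    test = shuffled[n_trainval:]
    return train, valid, test


train, valid, test = split_names(["%07d_0" % i for i in range(100)])
print(len(train), len(valid), len(test))  # 64 16 20
```

Slicing one shuffled list guarantees the three sets are disjoint, which is the property the membership checks in the script are there to enforce.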
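The resulting data directory structure is as follows:
[Figure: directory layout of the prepared dataset]
Converting the data and generating the mean file
This step can be done with the scripts from the examples that ship with Caffe.
Create your own folder, watermeter, under caffe-root/examples/.
Copy caffe-root/examples/imagenet/create_imagenet.sh to caffe-root/examples/watermeter and rename it create_watermeter.sh. Modify it as follows: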
#!/usr/bin/env sh
# Create the watermeter lmdb inputs
# N.B. set the paths to the watermeter train, val and test data dirs
set -e
EXAMPLE=examples/watermeter # lmdb saved path
DATA=~/DataDir/watermeter_characters/ # image path
TOOLS=build/tools
TRAIN_DATA_ROOT=~/DataDir/watermeter_characters/train/
VAL_DATA_ROOT=~/DataDir/watermeter_characters/valid/
TEST_DATA_ROOT=~/DataDir/watermeter_characters/test/
# Set RESIZE=true to resize the images to 256x256. Leave as false if images have
# already been resized using another tool.
RESIZE=true
if $RESIZE; then
  RESIZE_HEIGHT=256
  RESIZE_WIDTH=256
else
  RESIZE_HEIGHT=0
  RESIZE_WIDTH=0
fi
if [ ! -d "$TRAIN_DATA_ROOT" ]; then
  echo "Error: TRAIN_DATA_ROOT is not a path to a directory: $TRAIN_DATA_ROOT"
  echo "Set the TRAIN_DATA_ROOT variable in create_watermeter.sh to the path" \
       "where the watermeter training data is stored."
  exit 1
fi
if [ ! -d "$VAL_DATA_ROOT" ]; then
  echo "Error: VAL_DATA_ROOT is not a path to a directory: $VAL_DATA_ROOT"
  echo "Set the VAL_DATA_ROOT variable in create_watermeter.sh to the path" \
       "where the watermeter validation data is stored."
  exit 1
fi
echo "Creating train lmdb..."
GLOG_logtostderr=1 $TOOLS/convert_imageset \
--resize_height=$RESIZE_HEIGHT \
--resize_width=$RESIZE_WIDTH \
--shuffle \
$TRAIN_DATA_ROOT \
$DATA/train.txt \
$EXAMPLE/watermeter_train_lmdb #changed
echo "Creating val lmdb..."
GLOG_logtostderr=1 $TOOLS/convert_imageset \
--resize_height=$RESIZE_HEIGHT \
--resize_width=$RESIZE_WIDTH \
--shuffle \
$VAL_DATA_ROOT \
$DATA/valid.txt \
$EXAMPLE/watermeter_val_lmdb
echo "Creating test lmdb..."
GLOG_logtostderr=1 $TOOLS/convert_imageset \
--resize_height=$RESIZE_HEIGHT \
--resize_width=$RESIZE_WIDTH \
--shuffle \
$TEST_DATA_ROOT \
$DATA/test.txt \
$EXAMPLE/watermeter_test_lmdb
echo "Done."
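From top to bottom, the variables mean the following: EXAMPLE is the path where the converted lmdb data is stored, DATA is the directory holding the original data (including the train.txt, valid.txt, and test.txt list files), and TOOLS is the directory containing the tools that actually perform the conversion, i.e. build/tools. The three ..._DATA_ROOT variables are the directories of the training, validation, and test sets. In other words, these are the image folders and annotation files produced by the data preparation step. Run the script from the Caffe root directory.
Because the image mean is computed later, all images must be resized to the same size. Setting the RESIZE variable to true does this.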
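Before running the conversion, it is worth confirming that every file named in a list file really exists under the matching ..._DATA_ROOT directory, since a mismatch is cheaper to catch now than after the lmdb has been built. This is a minimal sketch under that assumption. The helper `missing_images` is hypothetical and not part of Caffe.

```python
import os


def missing_images(list_file, data_root):
    """Return filenames listed in `list_file` that are absent from `data_root`."""
    missing = []
    with open(list_file) as f:
        for line in f:
            if not line.strip():
                continue  # ignore blank lines
            # Each line is "<filename> <label>"; keep only the filename.
            filename = line.strip().rsplit(None, 1)[0]
            if not os.path.isfile(os.path.join(data_root, filename)):
                missing.append(filename)
    return missing


# Example call, using the paths from create_watermeter.sh:
# root = os.path.expanduser("~/DataDir/watermeter_characters/")
# print(missing_images(root + "train.txt", root + "train/"))
```

An empty list means the list file and the image folder agree.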
Run the following command from the Caffe root directory:
./examples/watermeter/create_watermeter.sh
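After the script finishes, three folders are generated under the EXAMPLE folder: watermeter_train_lmdb, watermeter_val_lmdb, and watermeter_test_lmdb.
Next, generate the mean file. Machine learning algorithms generally subtract the mean from the data, and this mean file is used during network training. As before, copy caffe-root/examples/imagenet/make_imagenet_mean.sh to the caffe-root/examples/watermeter folder and rename it make_watermeter_mean.sh. Its contents are as follows: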
#!/usr/bin/env sh
# Compute the mean image from the watermeter training lmdb
EXAMPLE=~/CaffeDir/caffe/examples/watermeter
DATA=~/DataDir/watermeter_characters/
TOOLS=build/tools
$TOOLS/compute_image_mean $EXAMPLE/watermeter_train_lmdb \
$DATA/watermeter_mean.binaryproto
echo "Done."
Run from the Caffe root directory:
./examples/watermeter/make_watermeter_mean.sh
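DATA specifies where the mean file is stored, and the file name is given in the compute_image_mean call. Running the script from the Caffe root directory produces the watermeter_mean.binaryproto file.
Copy watermeter_mean.binaryproto to the caffe-root/examples/watermeter directory.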
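To make the role of the mean file concrete, here is a small pure-Python sketch of a per-pixel mean and its subtraction. It is only an illustration of the idea: it uses tiny single-channel images as nested lists, and the functions `mean_image` and `subtract_mean` are hypothetical names, not Caffe APIs. It also shows why all images must share one size, which is the reason for RESIZE=true above.

```python
def mean_image(images):
    """Per-pixel mean of equally sized images given as nested lists [row][col]."""
    height, width = len(images[0]), len(images[0][0])
    for img in images:
        if len(img) != height or any(len(row) != width for row in img):
            raise ValueError("all images must have the same size")
    n = float(len(images))
    return [[sum(img[r][c] for img in images) / n for c in range(width)]
            for r in range(height)]


def subtract_mean(image, mean):
    """Return image - mean, element by element."""
    return [[p - m for p, m in zip(row, mean_row)]
            for row, mean_row in zip(image, mean)]


imgs = [[[0, 10], [20, 30]], [[10, 30], [20, 50]]]
mean = mean_image(imgs)
print(mean)                          # [[5.0, 20.0], [20.0, 40.0]]
print(subtract_mean(imgs[0], mean))  # [[-5.0, -10.0], [0.0, -10.0]]
```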
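Modifying the network
This article fine-tunes with Caffe, using CaffeNet as the example. In Caffe, the network structure is ultimately defined in a .prototxt file. You can define the network in code, but in the end a .prototxt file still has to be generated and executed.
Create a character_classification folder under caffe-root/models, copy the three files deploy.prototxt, solver.prototxt, and train_val.prototxt from caffe-root/models/bvlc_reference_caffenet into it, and modify them.
deploy.prototxt is modified as follows: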
name: "CaffeNet"
layer {
name: "data"
type: "Input"
top: "data"
input_param { shape: { dim: 10 dim: 3 dim: 227 dim: 227 } }
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
convolution_param {
engine: CAFFE
num_output: 96
kernel_size: 11
stride: 4
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "norm1"
type: "LRN"
bottom: "pool1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "norm1"
top: "conv2"
convolution_param {
engine: CAFFE
num_output: 256
pad: 2
kernel_size: 5
group: 2
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "pool2"
type: "Pooling"
bottom: "conv2"
top: "pool2"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "norm2"
type: "LRN"
bottom: "pool2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
convolution_param {
engine: CAFFE
num_output: 384
pad: 1
kernel_size: 3
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
convolution_param {
engine: CAFFE
num_output: 384
pad: 1
kernel_size: 3
group: 2
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
convolution_param {
engine: CAFFE
num_output: 256
pad: 1
kernel_size: 3
group: 2
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "fc6"
type: "InnerProduct"
bottom: "pool5"
top: "fc6"
inner_product_param {
num_output: 4096
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc6"
top: "fc6"
}
layer {
name: "drop6"
type: "Dropout"
bottom: "fc6"
top: "fc6"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc7"
type: "InnerProduct"
bottom: "fc6"
top: "fc7"
inner_product_param {
num_output: 4096
}
}
layer {
name: "relu7"
type: "ReLU"
bottom: "fc7"
top: "fc7"
}
layer {
name: "drop7"
type: "Dropout"
bottom: "fc7"
top: "fc7"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc8-watermeter" # name of the last fully connected layer; must match train_val.prototxt
type: "InnerProduct"
bottom: "fc7"
top: "fc8-watermeter"
inner_product_param {
num_output: 10 # number of classes
}
}
layer {
name: "prob"
type: "Softmax"
bottom: "fc8-watermeter" # name of the last fully connected layer; must match train_val.prototxt
top: "prob"
}
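The 227x227 input in deploy.prototxt determines the size of every feature map that follows. As a rough check, the sketch below traces the spatial size through the convolution and pooling layers listed above with the usual formula (size + 2*pad - kernel) / stride + 1. The helper names are hypothetical. ReLU and LRN layers are left out because they do not change the size, and the integer division is exact for these particular layers.

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Output width/height of a convolution or (exactly dividing) pooling layer."""
    return (size + 2 * pad - kernel) // stride + 1


def caffenet_shapes(input_size=227):
    """Trace the spatial size through the conv/pool layers of deploy.prototxt."""
    # (name, kernel_size, stride, pad) taken from the layer definitions above.
    layers = [
        ("conv1", 11, 4, 0), ("pool1", 3, 2, 0),
        ("conv2", 5, 1, 2), ("pool2", 3, 2, 0),
        ("conv3", 3, 1, 1), ("conv4", 3, 1, 1), ("conv5", 3, 1, 1),
        ("pool5", 3, 2, 0),
    ]
    size, trace = input_size, []
    for name, kernel, stride, pad in layers:
        size = conv_out(size, kernel, stride, pad)
        trace.append((name, size))
    return trace


for name, size in caffenet_shapes():
    print(name, size)
print("fc6 inputs:", 256 * caffenet_shapes()[-1][1] ** 2)  # 256 * 6 * 6 = 9216
```

Only the last layer changes for fine-tuning: fc7 still outputs 4096 values, and fc8-watermeter maps them to the 10 classes.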
train_val.prototxt is modified as follows:
name: "CaffeNet"
layer {
name: "data"
type: "Data"
top: "data"
top: "label"
include {
phase: TRAIN
}
transform_param {
mirror: true
crop_size: 227
mean_file: "/home/wupengfei/DataDir/watermeter_characters/watermeter_mean.binaryproto" # image mean file generated in the previous step; use the full path
}
data_param {
source: "examples/watermeter/watermeter_train_lmdb" # lmdb folder of the training data
batch_size: 128
backend: LMDB
}
}
layer {
name: "data"
type: "Data"
top: "data"
top: "label"
include {
phase: TEST
}
transform_param {
mirror: false
crop_size: 227
mean_file: "/home/wupengfei/DataDir/watermeter_characters/watermeter_mean.binaryproto" # full path of the mean file
}
data_param {
source: "examples/watermeter/watermeter_test_lmdb" # lmdb path of the test data
batch_size: 64
backend: LMDB
}
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
engine: CAFFE
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "norm1"
type: "LRN"
bottom: "pool1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "norm1"
top: "conv2"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
engine: CAFFE
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "pool2"
type: "Pooling"
bottom: "conv2"
top: "pool2"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "norm2"
type: "LRN"
bottom: "pool2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
engine: CAFFE
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
engine: CAFFE
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
engine: CAFFE
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "fc6"
type: "InnerProduct"
bottom: "pool5"
top: "fc6"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 4096
weight_filler {
type: "gaussian"
std: 0.005
}
bias_filler {
type: "constant"
value: 1
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc6"
top: "fc6"
}
layer {
name: "drop6"
type: "Dropout"
bottom: "fc6"
top: "fc6"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc7"
type: "InnerProduct"
bottom: "fc6"
top: "fc7"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 4096
weight_filler {
type: "gaussian"
std: 0.005
}
bias_filler {
type: "constant"
value: 1
}
}
}
layer {
name: "relu7"
type: "ReLU"
bottom: "fc7"
top: "fc7"
}
layer {
name: "drop7"
type: "Dropout"
bottom: "fc7"
top: "fc7"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc8-watermeter" # this layer is fine-tuned, so it must be renamed; otherwise an error is raised
type: "InnerProduct"
bottom: "fc7"
top: "fc8-watermeter" # renamed
param {
lr_mult: 10
decay_mult: 1
}
param {
lr_mult: 20
decay_mult: 0
}
inner_product_param {
num_output: 10 # number of classes
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "fc8-watermeter" # renamed layer
bottom: "label"
top: "loss"
}
layer {
name: "accuracy"
type: "Accuracy"
bottom: "fc8-watermeter" # renamed layer
bottom: "label"
top: "accuracy"
include {
phase: TEST
}
}
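The lr_mult values are what "raising the learning rate of the last layer" means in practice: the rate applied to each parameter blob is the solver's base_lr multiplied by that blob's lr_mult. The sketch below works this out for the values used in this article (base_lr: 0.001 in solver.prototxt). The names `LR_MULT` and `effective_lr` are hypothetical and only serve the illustration.

```python
BASE_LR = 0.001  # base_lr from solver.prototxt

# (weight lr_mult, bias lr_mult) as set in train_val.prototxt
LR_MULT = {
    "conv1..fc7": (1, 2),
    "fc8-watermeter": (10, 20),
}


def effective_lr(base_lr, lr_mult):
    """Learning rate actually applied to a parameter blob: base_lr * lr_mult."""
    return base_lr * lr_mult


for layer, (w_mult, b_mult) in LR_MULT.items():
    print("%-15s weights %.4f  biases %.4f"
          % (layer, effective_lr(BASE_LR, w_mult), effective_lr(BASE_LR, b_mult)))
```

The new layer starts from random weights, so it learns ten times faster than the pretrained layers, which only need small adjustments. Setting lr_mult to 0 on a layer would freeze it entirely.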
solver.prototxt is modified as follows:
net: "models/character_classification/train_val.prototxt" # path to the model definition
test_iter: 100
test_interval: 1000
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
stepsize: 5000
display: 20
max_iter: 3000 # maximum number of iterations
momentum: 0.9
weight_decay: 0.0005
snapshot: 5000
snapshot_prefix: "models/character_classification/caffenet_watermeter_train" # path prefix for the saved model parameters
solver_mode: GPU
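Two of these solver settings interact in ways that are easy to miss. With lr_policy "step", the learning rate is base_lr * gamma^floor(iter / stepsize), so with stepsize 5000 and max_iter 3000 the rate never actually drops during this run. And each test pass evaluates test_iter * batch_size images. The sketch below just does that arithmetic. The function names are hypothetical.

```python
def step_lr(iteration, base_lr=0.001, gamma=0.1, stepsize=5000):
    """Learning rate under lr_policy "step": base_lr * gamma^floor(iter/stepsize)."""
    return base_lr * gamma ** (iteration // stepsize)


def images_per_test(test_iter=100, batch_size=64):
    """Number of images evaluated in one test pass."""
    return test_iter * batch_size


for it in (0, 2999, 5000, 10000):
    print(it, step_lr(it))
print(images_per_test())  # 6400
```

If your test set is much smaller than 6400 images, the same images are evaluated more than once per test pass, so you may want to lower test_iter.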
Training
./build/tools/caffe train -solver models/character_classification/solver.prototxt -weights models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel -gpu 0
Testing on selected images
Converting mean.binaryproto to npy format so that it can be loaded with NumPy:
import caffe
import numpy as np
MEAN_PROTO_PATH = 'mean.binaryproto' # path of the protobuf mean file to convert
MEAN_NPY_PATH = 'mean.npy' # path of the converted numpy mean file
blob = caffe.proto.caffe_pb2.BlobProto() # create a protobuf blob
data = open(MEAN_PROTO_PATH, 'rb').read() # read the contents of mean.binaryproto
blob.ParseFromString(data) # parse the file contents into the blob
array = np.array(caffe.io.blobproto_to_array(blob)) # convert the mean in the blob to numpy; shape is (mean_number, channel, height, width)
mean_npy = array[0] # an array can hold several means, so pick one of them by index
np.save(MEAN_NPY_PATH, mean_npy)
The test program is as follows:
import os
import sys
import numpy as np
caffe_root = '/home/wupengfei/CaffeDir/caffe/'
sys.path.insert(0, caffe_root + 'python')  # make pycaffe importable before importing it
import caffe

MODEL_FILE = caffe_root + 'models/character_classification/deploy.prototxt'
caffemodel = caffe_root + 'models/character_classification/caffenet_watermeter_train_iter_3000.caffemodel'
synset_words = caffe_root + 'data/watermeter_test/words.txt'
labels = np.loadtxt(synset_words, str, delimiter='\t')
caffe.set_mode_gpu()
net = caffe.Net(MODEL_FILE, caffemodel, caffe.TEST)
# This loads the ImageNet mean shipped with Caffe. To match training, the mean.npy
# converted from watermeter_mean.binaryproto can be loaded here instead.
mu = np.load(caffe_root + 'python/caffe/imagenet/ilsvrc_2012_mean.npy')
mu = mu.mean(1).mean(1)  # average over pixels to get one mean value per channel
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2, 0, 1))     # HxWxC -> CxHxW
transformer.set_mean('data', mu)                 # subtract the per-channel mean
transformer.set_raw_scale('data', 255)           # rescale from [0, 1] to [0, 255]
transformer.set_channel_swap('data', (2, 1, 0))  # RGB -> BGR
img_root = caffe_root + 'data/watermeter_test/'
images = os.listdir(img_root)
for img in images:
    if img.split('.')[-1] == 'jpg':
        img_path = img_root + img
        input_image = caffe.io.load_image(img_path)
        net.blobs['data'].data[...] = transformer.preprocess('data', input_image)
        out = net.forward()
        prob = net.blobs['prob'].data[0].flatten()
        top_k = prob.argsort()[-1:-6:-1]  # indices of the five highest probabilities
        print(img, " class:", labels[top_k[0]], prob[top_k[0]])
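The argsort()[-1:-6:-1] slice in the test program picks the indices of the five largest probabilities, highest first. For readers who find the slice opaque, here is the same selection written out in plain Python. This is a sketch with made-up probabilities. The `labels` list stands in for the contents of words.txt, which I assume holds one digit per line, and tie-breaking may differ from NumPy's.

```python
def top_k_indices(probs, k=5):
    """Indices of the k largest probabilities, highest first."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return order[:k]


# Made-up softmax output for the 10 digit classes.
probs = [0.01, 0.02, 0.05, 0.70, 0.03, 0.04, 0.06, 0.02, 0.04, 0.03]
labels = [str(d) for d in range(10)]  # hypothetical words.txt: one digit per line
best = top_k_indices(probs, k=3)
print([(labels[i], probs[i]) for i in best])
# [('3', 0.7), ('6', 0.06), ('2', 0.05)]
```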
Errors and solutions
1. Check failed: error == cudaSuccess (2 vs. 0) out of memory
Reduce batch_size.
2. Check failed: status == CUDNN_STATUS_SUCCESS (3 vs. 0) CUDNN_STATUS_BAD_PARAM
Add engine: CAFFE to convolution_param in the convolution layers.
3. "Incorrect data field size"
This error can occur when generating the mean file, so set RESIZE to true when converting the data.
4. libcaffe.so.1.0.0 symbol cudnnSetActivationDescriptor, version libcudnn.so.7 not defined in file libcudnn.so.7 with link time reference
Reinstall cuDNN: http://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html
5. Unknown database backend
Set the OpenCV, LevelDB, and LMDB flags back to 1 in Makefile.config:
USE_OPENCV := 1
USE_LEVELDB := 1
USE_LMDB := 1
Then recompile Caffe:
make clean
make all
make test
make runtest
make pycaffe
6. make pycaffe fails with:
python/caffe/_caffe.hpp:8:31: fatal error: numpy/arrayobject.h: No such file or directory
Locate numpy/arrayobject.h on your machine with "find / -name arrayobject.h", then update PYTHON_INCLUDE in Makefile.config accordingly.
It may be under /usr/local/lib/python2.7 rather than /usr/lib/python2.7.
7. Python can't import the _caffe module
Make sure you have run
make pycaffe
8. I1220 14:47:21.014974 402 solver.cpp:449] Snapshotting to binary proto file ./snapshots/split_iter_2500.caffemodel
F1220 14:47:23.816285 402 io.cpp:67] Check failed: proto.SerializeToOstream(&output)
Check whether there is any free space left on the disk.