I've been playing with Caffe recently. Following the online tutorials I managed to train a network step by step, but then had no idea how to actually use the trained model, and a quick web search didn't turn up a clear answer either. The question is probably just too basic: the official site documents it in detail, and I simply hadn't read it all the way through when I was getting started.
The official tutorial on training an ImageNet model with Caffe is at:
http://caffe.berkeleyvision.org/gathered/examples/imagenet.html
Read it through carefully; at the very end of that page there is a link explaining how to use the trained model from Python, which is what the rest of this post walks through.
1 Setup
1.1 Set up Python, numpy, and matplotlib
# set up Python environment: numpy for numerical routines, and matplotlib for plotting
import numpy as np
import matplotlib.pyplot as plt
# display plots in this notebook
%matplotlib inline
# set display defaults
plt.rcParams['figure.figsize'] = (10, 10) # large images
plt.rcParams['image.interpolation'] = 'nearest' # don't interpolate: show square pixels
plt.rcParams['image.cmap'] = 'gray' # use grayscale output rather than a (potentially misleading) color heatmap
1.2 Import caffe
# The caffe module needs to be on the Python path;
# we'll add it here explicitly.
import sys
caffe_root = '../' # this file should be run from {caffe_root}/examples (otherwise change this line)
sys.path.insert(0, caffe_root + 'python')
import caffe
# If you get "No module named _caffe", either you have not built pycaffe or you have the wrong path.</span>
1.3 其它
如果需要的话可以下载参考模型("CaffeNet", AlexNet的一个变体)
import os
if os.path.isfile(caffe_root + 'models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel'):
    print 'CaffeNet found.'
else:
    print 'Downloading pre-trained CaffeNet model...'
    !../scripts/download_model_binary.py ../models/bvlc_reference_caffenet
2 Load the network and set up input preprocessing
2.1 Set Caffe to CPU mode and load the network
caffe.set_mode_cpu()
model_def = caffe_root + 'models/bvlc_reference_caffenet/deploy.prototxt'
model_weights = caffe_root + 'models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel'
net = caffe.Net(model_def,      # defines the structure of the model
                model_weights,  # contains the trained weights
                caffe.TEST)     # use test mode (e.g., don't perform dropout)
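With the network loaded, it is easy to inspect its structure from Python before going any further. A minimal sketch (not part of the original tutorial) that prints each layer's activation and weight shapes from the net object created above:
# illustrative: list activation blob shapes and learned parameter shapes of the loaded net
for layer_name, blob in net.blobs.iteritems():
    print layer_name, blob.data.shape           # activation shape, e.g. (batch, channels, h, w)
for layer_name, param in net.params.iteritems():
    print layer_name, param[0].data.shape       # weight shape (param[1] holds the biases)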
2.2 Set up input preprocessing
The official tutorial uses caffe.io.Transformer for this step. It is independent of the rest of Caffe, though, so any other preprocessing code would work equally well (a manual numpy sketch is shown after the transformer setup below).
The default CaffeNet is configured for BGR images. Values are expected to be in the range [0, 255] with the mean ImageNet pixel value subtracted from them, and the channel dimension is expected to come first (outermost).
matplotlib, however, loads images as RGB with values in [0, 1] and the channel dimension last (innermost), so the input has to be transformed accordingly.
# load the mean ImageNet image (as distributed with Caffe) for subtraction
mu = np.load(caffe_root + 'python/caffe/imagenet/ilsvrc_2012_mean.npy')
mu = mu.mean(1).mean(1) # average over pixels to obtain the mean (BGR) pixel values
print 'mean-subtracted values:', zip('BGR', mu)
# create transformer for the input called 'data'
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2,0,1)) # move image channels to outermost dimension
transformer.set_mean('data', mu) # subtract the dataset-mean value in each channel
transformer.set_raw_scale('data', 255) # rescale from [0, 1] to [0, 255]
transformer.set_channel_swap('data', (2,1,0)) # swap channels from RGB to BGR
Output:
mean-subtracted values: [('B', 104.0069879317889), ('G', 116.66876761696767), ('R', 122.6789143406786)]
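As mentioned above, the Transformer is not required; the same preprocessing can be written directly with numpy. A rough sketch of an equivalent manual routine, using a hypothetical helper name manual_preprocess and assuming the input is an RGB float image in [0, 1] that has already been resized to 227x227 (caffe.io.resize_image could handle the resizing step):
# illustrative manual equivalent of transformer.preprocess('data', image)
# assumes `img` is an (H, W, 3) RGB float array in [0, 1], already resized to 227x227
def manual_preprocess(img, mu):
    x = img * 255.0                 # rescale from [0, 1] to [0, 255]
    x = x[:, :, ::-1]               # swap channels from RGB to BGR
    x = x.transpose((2, 0, 1))      # move channels to the first dimension: (3, H, W)
    x = x - mu.reshape(3, 1, 1)     # subtract the per-channel (BGR) mean
    return x.astype(np.float32)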
3 CPU classification
Now we're ready to perform classification. Even though we only classify a single image here, we set a batch size of 50 to demonstrate batching.
# set the size of the input (we can skip this if we're happy
# with the default; we can also change it later, e.g., for different batch sizes)
net.blobs['data'].reshape(50,        # batch size
                          3,         # 3-channel (BGR) images
                          227, 227)  # image size is 227x227
Load an image and run the preprocessing set up above:
image = caffe.io.load_image(caffe_root + 'examples/images/cat.jpg')
transformed_image = transformer.preprocess('data', image)
plt.imshow(image)
Output:
<matplotlib.image.AxesImage at 0x7f09693a8c90>
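As a quick sanity check (not in the original tutorial), the array shapes before and after preprocessing can be printed; after the transform the preprocessed image should match the (3, 227, 227) input shape configured above:
# illustrative check of the preprocessing result
print 'raw image shape:        ', image.shape             # (H, W, 3), RGB floats in [0, 1]
print 'transformed image shape:', transformed_image.shape # (3, 227, 227), BGR, mean-subtracted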
Now classify it:
# copy the image data into the memory allocated for the net
net.blobs['data'].data[...] = transformed_image
### perform classification
output = net.forward()
output_prob = output['prob'][0] # the output probability vector for the first image in the batch
print 'predicted class is:', output_prob.argmax()
Output:
predicted class is: 281
The network gives us a probability vector; the most probable class here is #281. Is that correct? Let's check the ImageNet labels.
# load ImageNet labels
labels_file = caffe_root + 'data/ilsvrc12/synset_words.txt'
if not os.path.exists(labels_file):
    !../data/ilsvrc12/get_ilsvrc_aux.sh
labels = np.loadtxt(labels_file, str, delimiter='\t')
print 'output label:', labels[output_prob.argmax()]
Output:
output label: n02123045 tabby, tabby cat
So the prediction is correct. But what about the other high-probability classes?
# sort top five predictions from softmax output
top_inds = output_prob.argsort()[::-1][:5] # reverse sort and take five largest items
print 'probabilities and labels:'
zip(output_prob[top_inds], labels[top_inds])
Output:
4 Switching to GPU mode
Let's see how long that classification took, and compare it against GPU mode:
%timeit net.forward()
1 loop, best of 3: 1.42 s per loop
Even for a batch of 50 images that's only a moment, but how about GPU mode?
caffe.set_device(0) # if we have multiple GPUs, pick the first one
caffe.set_mode_gpu()
net.forward() # run once before timing to set up memory
%timeit net.forward()
10 loops, best of 3: 70.2 ms per loop
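The timings above push the full 50-image batch through the net even though only one slot of the batch was actually filled. As a rough sketch of how several images could be classified in a single forward pass: image_paths below is just an illustrative list (fish-bike.jpg ships with standard Caffe checkouts, but any images will do), and the rest reuses the net, transformer, and labels set up earlier:
# illustrative sketch: fill several slots of the 50-image batch and read one prediction per image
image_paths = [caffe_root + 'examples/images/cat.jpg',
               caffe_root + 'examples/images/fish-bike.jpg']
for i, path in enumerate(image_paths):
    img = caffe.io.load_image(path)
    net.blobs['data'].data[i, ...] = transformer.preprocess('data', img)
probs = net.forward()['prob']                   # shape: (50, 1000)
for i, path in enumerate(image_paths):
    print path, '->', labels[probs[i].argmax()]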