Learning Caffe: ImageNet Classification
1. Installing Caffe
See Alten Li's Caffe installation guide [1].
2. ImageNet Classification
The code comes from Caffe's notebook examples [2]. Before importing Caffe, insert Caffe's path into sys.path. Note that the examples folder contains a pycaffe subfolder (presumably generated by running "make pycaffe" during installation) but no python folder; the python folder sits directly under the caffe root. So the import only works if the root folder is set to '/home/parallel/caffe/'.
```python
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

# Make sure that caffe is on the python path:
caffe_root = '/home/parallel/caffe/'  # this file is expected to be in {caffe_root}/examples
import sys
sys.path.insert(0, caffe_root + 'python')

import caffe

plt.rcParams['figure.figsize'] = (10, 10)
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

caffe.set_mode_cpu()
net = caffe.Net(caffe_root + 'models/bvlc_reference_caffenet/deploy.prototxt',
                caffe_root + 'models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel',
                caffe.TEST)

# input preprocessing: 'data' is the name of the input blob == net.inputs[0]
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2, 0, 1))
transformer.set_mean('data', np.load(caffe_root + 'python/caffe/imagenet/ilsvrc_2012_mean.npy').mean(1).mean(1))  # mean pixel
transformer.set_raw_scale('data', 255)  # the reference model operates on images in [0,255] range instead of [0,1]
transformer.set_channel_swap('data', (2, 1, 0))  # the reference model has channels in BGR order instead of RGB

# set net to batch size of 50
net.blobs['data'].reshape(50, 3, 227, 227)
net.blobs['data'].data[...] = transformer.preprocess('data', caffe.io.load_image(caffe_root + 'examples/images/cat.jpg'))
out = net.forward()
print("Predicted class is #{}.".format(out['prob'][0].argmax()))
plt.imshow(transformer.deprocess('data', net.blobs['data'].data[0]))
```
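For reference, the four Transformer calls above amount to the following NumPy operations (a sketch, not Caffe's actual implementation; `preprocess_by_hand` is a name of my own, and the step order differs slightly from Caffe's internals but gives the same result):

```python
import numpy as np

def preprocess_by_hand(img_rgb, mean_bgr):
    """Replicate the Transformer steps manually.
    Assumes img_rgb is an HxWx3 float RGB image in [0, 1],
    as caffe.io.load_image returns."""
    x = img_rgb * 255.0                # set_raw_scale: model expects [0, 255]
    x = x[:, :, ::-1]                  # set_channel_swap: RGB -> BGR
    x = x.transpose(2, 0, 1)           # set_transpose: HxWxC -> CxHxW
    x = x - mean_bgr.reshape(3, 1, 1)  # set_mean: subtract per-channel mean pixel
    return x
```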
```python
# load labels
imagenet_labels_filename = caffe_root + 'data/ilsvrc12/synset_words.txt'
try:
    labels = np.loadtxt(imagenet_labels_filename, str, delimiter='\t')
except:
    !../data/ilsvrc12/get_ilsvrc_aux.sh
    labels = np.loadtxt(imagenet_labels_filename, str, delimiter='\t')

# sort top k predictions from softmax output
top_k = net.blobs['prob'].data[0].flatten().argsort()[-1:-6:-1]
print(labels[top_k])
```
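The slice `[-1:-6:-1]` reads the last five entries of the ascending argsort in reverse, i.e. the top-5 class indices in descending probability. A toy example of the same trick:

```python
import numpy as np

probs = np.array([0.05, 0.5, 0.1, 0.2, 0.15])
# argsort is ascending: [0, 2, 4, 3, 1]; the reversed tail gives the largest first
top_3 = probs.argsort()[-1:-4:-1]  # indices of the 3 largest values, descending
print(top_3)  # [1 3 4]
```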
```python
# CPU mode
net.forward()  # call once for allocation
%timeit net.forward()
# 1 loops, best of 3: 6.45 s per loop
```
```python
# GPU mode
caffe.set_device(0)
caffe.set_mode_gpu()
net.forward()  # call once for allocation
%timeit net.forward()
# 1 loops, best of 3: 223 ms per loop
```
The network architecture diagram is drawn with graphviz [3].
net.blobs is a dictionary mapping each feature-map name k to its contents v, whose shape is (batch size, output channels, height, width); net.params holds the trained network parameters [4], each with the preset weight shape (output channels, input channels, filter height, filter width). The blob shapes below follow from the filter sizes in net.params:
- data: a batch of 50 images, 3 channels (RGB), image width and height 227
- conv1: (227-11)/4+1 = 55
- pool1: (55-3)/2+1 = 27
- conv2: (27-5+4)/1+1 = 27
- pool2: (27-3)/2+1 = 13
- conv3: (13-3+2)/1+1 = 13
- conv4: (13-3+2)/1+1 = 13
- conv5: (13-3+2)/1+1 = 13
- pool5: (13-3)/2+1 = 6
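All of these follow one formula, output = (input + 2*pad - kernel)/stride + 1 (Caffe rounds pooling up and convolution down, but every division here is exact, so floor suffices). A quick check, with pads and strides as in the CaffeNet deploy.prototxt:

```python
def conv_out(size, kernel, stride=1, pad=0):
    # output size of a conv/pool layer: floor((size + 2*pad - kernel) / stride) + 1
    return (size + 2 * pad - kernel) // stride + 1

print(conv_out(227, 11, stride=4))  # conv1 -> 55
print(conv_out(55, 3, stride=2))    # pool1 -> 27
print(conv_out(27, 5, pad=2))       # conv2 -> 27
print(conv_out(27, 3, stride=2))    # pool2 -> 13
print(conv_out(13, 3, pad=1))       # conv3/conv4/conv5 -> 13
print(conv_out(13, 3, stride=2))    # pool5 -> 6; fc6 input = 256 * 6 * 6 = 9216
```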
```python
[(k, v.data.shape) for k, v in net.blobs.items()]
```

```
[('data', (50, 3, 227, 227)),
 ('conv1', (50, 96, 55, 55)),
 ('pool1', (50, 96, 27, 27)),
 ('norm1', (50, 96, 27, 27)),
 ('conv2', (50, 256, 27, 27)),
 ('pool2', (50, 256, 13, 13)),
 ('norm2', (50, 256, 13, 13)),
 ('conv3', (50, 384, 13, 13)),
 ('conv4', (50, 384, 13, 13)),
 ('conv5', (50, 256, 13, 13)),
 ('pool5', (50, 256, 6, 6)),
 ('fc6', (50, 4096)),
 ('fc7', (50, 4096)),
 ('fc8', (50, 1000)),
 ('prob', (50, 1000))]
```
```python
[(k, v[0].data.shape) for k, v in net.params.items()]
```

```
[('conv1', (96, 3, 11, 11)),
 ('conv2', (256, 48, 5, 5)),
 ('conv3', (384, 256, 3, 3)),
 ('conv4', (384, 192, 3, 3)),
 ('conv5', (256, 192, 3, 3)),
 ('fc6', (4096, 9216)),
 ('fc7', (4096, 4096)),
 ('fc8', (1000, 4096))]
```
vis_square normalizes the data and tiles the n inputs into a single image for display.
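vis_square itself is defined earlier in the example notebook; here is a sketch of its tiling logic, returning the mosaic as an array instead of plotting it (`vis_square_array` is my own name for this variant):

```python
import numpy as np

def vis_square_array(data, padsize=1, padval=0):
    """Tile an (n, height, width) or (n, height, width, channels) array into one
    roughly square mosaic, as the notebook's vis_square does before plt.imshow."""
    data = data.astype(float)
    data = (data - data.min()) / (data.max() - data.min())  # normalize to [0, 1]
    n = int(np.ceil(np.sqrt(data.shape[0])))                # tiles per row/column
    # pad the tile count up to n**2 and add a border to each tile
    padding = ((0, n ** 2 - data.shape[0]), (0, padsize), (0, padsize)) \
              + ((0, 0),) * (data.ndim - 3)
    data = np.pad(data, padding, mode='constant', constant_values=padval)
    # interleave rows and columns of tiles, then collapse into one image
    data = data.reshape((n, n) + data.shape[1:]).transpose(
        (0, 2, 1, 3) + tuple(range(4, data.ndim + 1)))
    return data.reshape((n * data.shape[1], n * data.shape[3]) + data.shape[4:])
```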
net.params stores the parameters of the intermediate layers. conv1's weights have shape (96, 3, 11, 11) and live in net.params['conv1'][0].data; after transposing, the filters have shape (96, 11, 11, 3), matching vis_square's expected data layout of (n, height, width, channels).
```python
filters = net.params['conv1'][0].data
vis_square(filters.transpose(0, 2, 3, 1))
```
conv2's weights have shape (256, 48, 5, 5) and live in net.params['conv2'][0].data; only the first 48 filters are shown here, with each filter's 48 input channels flattened into separate 5x5 tiles. Note that conv1 has 96 output channels while conv2's filters have only 48 input channels: conv2 is a grouped convolution (group: 2 in the prototxt), so each half of its filters sees only half of conv1's outputs; no channels are discarded.
```python
filters = net.params['conv2'][0].data
vis_square(filters[:48].reshape(48 ** 2, 5, 5))
```
net.blobs stores the output data of the intermediate layers; here we display conv1's first 36 feature maps. By the blob layout, data[0, :36] is the first 36 output channels for the first image in the batch.
```python
feat = net.blobs['conv1'].data[0, :36]
vis_square(feat, padval=1)
```
The first 36 feature maps of conv2:
```python
feat = net.blobs['conv2'].data[0, :36]
vis_square(feat, padval=1)
```
```python
feat = net.blobs['conv3'].data[0]
vis_square(feat, padval=0.5)

feat = net.blobs['conv4'].data[0]
vis_square(feat, padval=0.5)

feat = net.blobs['conv5'].data[0]
vis_square(feat, padval=0.5)

feat = net.blobs['pool5'].data[0]
vis_square(feat, padval=1)
```

```python
feat = net.blobs['fc6'].data[0]
plt.subplot(2, 1, 1)
plt.plot(feat.flat)
plt.subplot(2, 1, 2)
_ = plt.hist(feat.flat[feat.flat > 0], bins=100)
```

```python
feat = net.blobs['fc7'].data[0]
plt.subplot(2, 1, 1)
plt.plot(feat.flat)
plt.subplot(2, 1, 2)
_ = plt.hist(feat.flat[feat.flat > 0], bins=100)
```

```python
feat = net.blobs['prob'].data[0]
plt.plot(feat.flat)
```
3. Reference Links
Source: http://blog.youkuaiyun.com/shadow_guo/article/details/50359532