This post shows how to use an already-trained model to solve a real problem. If you don't yet know how to train on your own data, see Part 2 of the Caffe Installation and Debugging series, "Training on Your Own Data".
In the previous posts we already trained a model; today we will use that model to run detection on an image and on a video and see how it does.
First I record a short video (a phone camera is fine) and save it as caffe/examples/videos/RMinfantry_videos/video1.mp4; this is the video we will run detection on in a moment. For the image, we can simply reuse one from the training data, so there is no need to take a new photo.
1. Writing the Python Script
In Part 1 of the series (installing Caffe), we tried camera-based detection at the end of the post, but the script the SSD author provides is very long, and it gives us no access to intermediate data. So here we write our own script to run instead. The code is as follows:
import os
import sys

# caffe's python package must be importable before "import caffe",
# so the path is inserted first
caffe_root = os.getcwd()
sys.path.insert(0, os.path.join(caffe_root, 'python'))
import caffe
import cv2

deploy = 'models/VGGNet/VOC0712/SSD_RMinfantry_300x300/deploy.prototxt'
weight = 'models/VGGNet/VOC0712/SSD_RMinfantry_300x300/VGG_SSD_RMinfantry_300x300_iter_20000.caffemodel'
video_full_path = '/home/kangyi/caffe/examples/videos/RMinfantry_videos/video1.mp4'
testImg = '/home/kangyi/data/RMinfantry/JPEGImages/1.jpg'
image_size = 300

def net_init():
    caffe.set_device(0)
    caffe.set_mode_gpu()
    net = caffe.Net(deploy, weight, caffe.TEST)
    return net

def testImage(image, net):
    transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
    # change blob layout from H x W x C to C x H x W
    transformer.set_transpose('data', (2, 0, 1))
    # rescale pixel values from [0, 1] back to [0, 255]
    transformer.set_raw_scale('data', 255)
    # change channel order from RGB to BGR
    transformer.set_channel_swap('data', (2, 1, 0))
    # reshape the input blob
    net.blobs['data'].reshape(1, 3, image_size, image_size)
    # input data and preprocess:
    # if image is a string it is a picture path, otherwise it is a frame matrix from a video
    if isinstance(image, str):
        net.blobs['data'].data[...] = transformer.preprocess('data', caffe.io.load_image(image))
    else:
        net.blobs['data'].data[...] = transformer.preprocess('data', caffe.io.load_image_from_frame(image))
    # testing the model is just a forward pass
    out = net.forward()
    # each detection_out row is [image_id, label, confidence, xmin, ymin, xmax, ymax],
    # with coordinates normalized to [0, 1]; take the first detection
    xmin = out['detection_out'][0][0][0][3]
    ymin = out['detection_out'][0][0][0][4]
    xmax = out['detection_out'][0][0][0][5]
    ymax = out['detection_out'][0][0][0][6]
    xcenter = (xmin + xmax) / 2
    ycenter = (ymin + ymax) / 2
    print(out)
    print(xcenter)
    print(ycenter)
    print('-------------------------------------------')

if __name__ == "__main__":
    net = net_init()
    cap = cv2.VideoCapture(video_full_path)
    if cap.isOpened():
        print('Read the video successfully')
        frame_count = 1
        success = True
        while success:
            success, frame = cap.read()  # frame is a BGR image
            if not success:  # guard against the empty read at the end of the video
                break
            frameRGB = frame[:, :, (2, 1, 0)]
            print('-----------------------------------Read a new frame: ' + str(success) + ' ' + str(frame_count))
            testImage(frameRGB, net)  # the input needs an RGB image
            cv2.imshow('video', frame)
            frame_count = frame_count + 1
            if cv2.waitKey(1) & 0xFF == ord('q'):
                break
        cap.release()
    else:
        print('Can not read the video')

'''
if __name__ == "__main__":
    net = net_init()
    testImage(testImg, net)  # detect a single image instead of a video
'''
With this script we can detect both still images and video. It is short and simple, so I will only briefly explain how it was written.
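For reference, the detection_out blob that SSD produces has shape 1 x 1 x N x 7, where each of the N rows is [image_id, label, confidence, xmin, ymin, xmax, ymax] with coordinates normalized to [0, 1]. A minimal sketch of the indexing used in the script, with a fake array (all values here are made up for illustration):

```python
import numpy as np

# Fake detection_out blob: 1 x 1 x N x 7, one row per detection.
# Row layout: [image_id, label, confidence, xmin, ymin, xmax, ymax]
out = {'detection_out': np.array([[[
    [0, 1, 0.92, 0.10, 0.20, 0.50, 0.60],
    [0, 1, 0.35, 0.60, 0.10, 0.80, 0.40],
]]], dtype=np.float32)}

# Same indexing as in the script: take the first detection's box
det = out['detection_out'][0][0][0]
xmin, ymin, xmax, ymax = det[3], det[4], det[5], det[6]
xcenter = (xmin + xmax) / 2
ycenter = (ymin + ymax) / 2
print(xcenter, ycenter)  # roughly (0.3, 0.4)
```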
Much of the credit goes to the Caffe framework, which exposes both a Python interface and a MATLAB interface. Here I use the Python interface; the relevant Python packages live in caffe/python, which is exactly why, when installing Caffe, we had to add home/(server name)/caffe/python to the Python path.
One thing to watch out for in the code: images and video frames read by OpenCV are in BGR order, which matches Caffe's internal order, but images loaded through Caffe's Python interface (caffe.io.load_image, built on skimage) come back in RGB order. The Transformer is therefore configured for RGB input and swaps the channels back to BGR itself, and that is why the main loop converts each OpenCV frame from BGR to RGB before passing it to testImage().
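The swap itself is just a reindexing of the channel axis; a quick numpy check:

```python
import numpy as np

# A 1x2 "image" with one pure-blue and one pure-red pixel, in BGR order
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reindexing the last axis with (2, 1, 0) flips BGR <-> RGB,
# exactly like frame[:, :, (2, 1, 0)] in the main loop
rgb = bgr[:, :, (2, 1, 0)]

print(rgb[0, 0])  # the blue pixel is now [0, 0, 255] in RGB order
print(rgb[0, 1])  # the red pixel is now [255, 0, 0]
```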
The code also calls a function named load_image_from_frame(). I wrote this one myself, because I found that Caffe's Python interface has no function for reading a frame from a video stream. It goes directly into the caffe/python/caffe/io.py file; the code is as follows:
def load_image_from_frame(frame, color=True):
    """
    Load a frame from a video or camera.

    Parameters
    ----------
    frame : matrix, from a video or a camera,
        with channels in R, G, B order.
    color : boolean, optional; True (default) keeps three channels,
        False keeps grayscale (mirrors caffe.io.load_image).

    Returns
    -------
    image : an image with type np.float32 in range [0, 1]
        of size (H x W x 3) in RGB or
        of size (H x W x 1) in grayscale.
    """
    img = skimage.img_as_float(np.array(frame)).astype(np.float32)
    if img.ndim == 2:
        img = img[:, :, np.newaxis]
        if color:
            img = np.tile(img, (1, 1, 3))
    elif img.shape[2] == 4:
        img = img[:, :, :3]
    return img
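For a uint8 frame, the skimage.img_as_float call above amounts to dividing by 255, producing float32 values in [0, 1] as the docstring promises. A quick sanity check of that scaling in pure numpy (using a made-up one-pixel frame):

```python
import numpy as np

# A fake uint8 video frame (H x W x 3), as OpenCV would deliver it
frame = np.array([[[0, 128, 255]]], dtype=np.uint8)

# For uint8 input, skimage.img_as_float(frame) is equivalent to
# frame / 255.0, giving float values in [0, 1]
img = frame.astype(np.float32) / 255.0

print(img.min(), img.max())  # 0.0 and 1.0
```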
2. Running the Script
Open a terminal:
cd /home/(server name)/caffe
python (path to the script above)
3. Additional Notes
If you want to watch the loaded video while it runs, enable the cv2.imshow() line. Note that I have not drawn a box around the detected target in the video; I'll leave that step for later.
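When that step does get done, the main work is just scaling the normalized detection coordinates back to pixel units. A minimal sketch (the frame size and box coordinates here are made up for illustration):

```python
# Made-up values for illustration: a 640x480 frame and one normalized box
frame_w, frame_h = 640, 480
xmin, ymin, xmax, ymax = 0.25, 0.50, 0.75, 1.00  # as returned in detection_out

# Scale the normalized [0, 1] coordinates back to pixel coordinates
x1, y1 = int(xmin * frame_w), int(ymin * frame_h)
x2, y2 = int(xmax * frame_w), int(ymax * frame_h)

print((x1, y1), (x2, y2))  # (160, 240) (480, 480)

# With the BGR frame at hand, OpenCV could then draw the box:
# cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
```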