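OpenCV DNN: SSD-Based Object Detection in Python
1. Overview
The OpenCV DNN module supports the widely used SSD object detection model as well as its mobile variant, MobileNet-SSD. The latter in particular can run in real time on edge devices. The MobileNet-SSD model trained with Caffe detects 20 object classes (the label list also contains a background entry at index 0).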
I have uploaded the trained model to Baidu Cloud:
Link: https://pan.baidu.com/s/1zvIw1rkRvYqk33xwyAMjhg
Extraction code: n90t
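To run a prediction, the model needs an image as input. The network only accepts four-dimensional input, so the Mat object read from disk has to be converted into a four-dimensional tensor. OpenCV provides the following API for this: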
Mat cv::dnn::blobFromImage(
InputArray image,
double scalefactor = 1.0,
const Size & size = Size(),
const Scalar & mean = Scalar(),
bool swapRB = false,
bool crop = false,
int ddepth = CV_32F
)
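image: the input image
scalefactor: multiplier applied to the pixel values after mean subtraction; defaults to 1.0
size: the spatial size of the input the network accepts
mean: the mean of the training dataset, subtracted from each channel
swapRB: whether to swap the Red and Blue channels
crop: whether to crop the image after resizing
ddepth: the data type (depth) of the output blob; defaults to CV_32F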
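To make the parameters above more concrete, here is a minimal NumPy sketch of the arithmetic behind the conversion. It is not OpenCV's implementation: the function name `blob_from_image_sketch` is hypothetical, the resize and crop steps are skipped, and the image is assumed to be already at the network's input size. It only illustrates the channel swap, the mean subtraction followed by scaling, and the reordering from height x width x channels to a four-dimensional NCHW tensor.

```python
import numpy as np


def blob_from_image_sketch(image_bgr, scalefactor=1.0, mean=(0.0, 0.0, 0.0), swap_rb=False):
    """Illustrative approximation only; assumes image_bgr already has the target size."""
    img = image_bgr.astype(np.float32)
    if swap_rb:
        img = img[:, :, ::-1]  # reverse the channel axis: BGR -> RGB
    # Subtract the per-channel mean first, then apply the scale factor
    img = (img - np.array(mean, dtype=np.float32)) * scalefactor
    # HWC -> CHW, then add a leading batch dimension to get NCHW
    blob = img.transpose(2, 0, 1)[np.newaxis, ...]
    return np.ascontiguousarray(blob)


# A synthetic all-white 300x300 image stands in for a real photo
image = np.full((300, 300, 3), 255, dtype=np.uint8)
blob = blob_from_image_sketch(image, 0.007843, (127.5, 127.5, 127.5), swap_rb=True)
print(blob.shape)
print(blob.min(), blob.max())
```

With a mean of 127.5 and a scale factor of 0.007843 (roughly 1/127.5), pixel values in [0, 255] end up in approximately [-1, 1], which are the values the script in this post passes to blobFromImage.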
Once the network is loaded, the key API for inference is:
Mat cv::dnn::Net::forward(
const String & outputName = String()
)
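The parameter defaults to an empty string, in which case the forward pass runs through the whole network.
For an object detection network:
This API returns a four-dimensional tensor. The first two dimensions are 1. The last two dimensions are the number of detected boxes and the fields of each box: its coordinates, object class, score, and so on. In the code in this post, each row is read as index 1 for the class, index 2 for the score, and indices 3 to 6 for left, top, right, and bottom. Pay particular attention to the coordinates: they are floating-point ratios of the image size, not pixel values, so they must be converted to pixel coordinates before the box rectangle can be drawn.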
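The conversion from ratios to pixels can be sketched on its own, without loading a model. The example below is a minimal sketch using a hand-made tensor in place of a real network output. The helper name `decode_detections` and the fake values are hypothetical, and clamping the ratios to [0, 1] is my own assumption (a detector can return coordinates slightly outside the image), not something the API does for you.

```python
import numpy as np


def decode_detections(out, width, height, threshold=0.5):
    """Turn a (1, 1, N, 7) detection tensor into a list of pixel-coordinate boxes."""
    boxes = []
    for row in out[0, 0]:
        class_id, score = int(row[1]), float(row[2])
        if score <= threshold:
            continue  # skip low-confidence detections
        # Clamp each ratio to [0, 1] (assumption), then scale by the image size
        x1, y1, x2, y2 = (min(1.0, max(0.0, float(v))) for v in row[3:7])
        boxes.append((class_id, score,
                      int(x1 * width), int(y1 * height),
                      int(x2 * width), int(y2 * height)))
    return boxes


# Hand-made stand-in for a network output: two candidate boxes, 7 fields each
fake_out = np.array([[[[0, 12, 0.9, 0.1, 0.2, 0.5, 0.8],
                       [0, 7, 0.3, 0.0, 0.0, 1.0, 1.0]]]], dtype=np.float32)
for class_id, score, left, top, right, bottom in decode_detections(fake_out, 640, 480):
    print(class_id, round(score, 2), left, top, right, bottom)
```

Only the first row passes the 0.5 threshold. For a 640x480 image its ratios (0.1, 0.2, 0.5, 0.8) become the pixel box (64, 96, 320, 384).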
2. Code
import cv2 as cv
# Model paths
model_bin = "D:/opencv_tutorial/data/models/ssd/MobileNetSSD_deploy.caffemodel"
config_text = "D:/opencv_tutorial/data/models/ssd/MobileNetSSD_deploy.prototxt"
# Class labels (index 0 is the background class)
objName = ["background",
           "aeroplane", "bicycle", "bird", "boat",
           "bottle", "bus", "car", "cat", "chair",
           "cow", "diningtable", "dog", "horse",
           "motorbike", "person", "pottedplant",
           "sheep", "sofa", "train", "tvmonitor"]
# Load the model
net = cv.dnn.readNetFromCaffe(config_text, model_bin)
# Read the test image
image = cv.imread("D:/vsprojects/images/dog.jpg")
if image is None:
    raise FileNotFoundError("could not read the test image")
h = image.shape[0]
w = image.shape[1]
# Get all layer names and look up the last layer
layerNames = net.getLayerNames()
lastLayerId = net.getLayerId(layerNames[-1])
lastLayer = net.getLayer(lastLayerId)
print(lastLayer.type)
# Run detection
blobImage = cv.dnn.blobFromImage(image, 0.007843, (300, 300), (127.5, 127.5, 127.5), True, False)
net.setInput(blobImage)
cvOut = net.forward()
print(cvOut)
for detection in cvOut[0, 0, :, :]:
    score = float(detection[2])
    objIndex = int(detection[1])
    if score > 0.5:
        # Coordinates are ratios of the image size; scale them to pixels
        left = detection[3] * w
        top = detection[4] * h
        right = detection[5] * w
        bottom = detection[6] * h
        # Draw the box and its label
        cv.rectangle(image, (int(left), int(top)), (int(right), int(bottom)), (255, 0, 0), thickness=2)
        cv.putText(image, "score:%.2f, %s" % (score, objName[objIndex]),
                   (int(left) - 10, int(top) - 5), cv.FONT_HERSHEY_SIMPLEX, 0.7, (0, 0, 255), 2, 8)
# Show the result
cv.imshow('mobilenet-ssd-demo', image)
cv.imwrite("D:/Pedestrian.png", image)
cv.waitKey(0)
cv.destroyAllWindows()
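One small detail in the script above: the label is drawn at (left - 10, top - 5), which can fall outside the image when a box touches the top or left edge. A possible way to handle this is sketched below. The helper `label_origin` and its `min_margin` parameter are hypothetical, not part of OpenCV. It keeps the same offsets but moves the text just inside the box when there is no room above it.

```python
def label_origin(left, top, x_offset=10, y_offset=5, min_margin=15):
    """Pick a text origin near the box's top-left corner that stays inside the image."""
    x = max(0, left - x_offset)  # never start left of the image
    y = top - y_offset           # default: just above the box
    if y < min_margin:
        # No room above the box, so place the text just below its top edge
        y = top + min_margin
    return x, y


print(label_origin(100, 200))
print(label_origin(3, 2))
```

The returned pair could be passed as the origin argument of cv.putText in place of (int(left) - 10, int(top) - 5).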