OpenCV DNN: SSD-Based Object Detection in Python

一只小金毛zy

Article link: https://blog.youkuaiyun.com/qq_39071739/article/details/103728046

1. Overview

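The OpenCV DNN module supports the widely used SSD object detection model as well as its mobile variant, MobileNet-SSD. The latter is light enough to run in real time on edge devices. The Caffe-trained MobileNet-SSD model used here detects 20 object classes.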

I have uploaded the trained model to Baidu Cloud:
Link: https://pan.baidu.com/s/1zvIw1rkRvYqk33xwyAMjhg
Extraction code: n90t

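To run a prediction with the model, you first read an image as input. The network accepts only four-dimensional input, so the Mat object you read must be converted into a 4D tensor (a blob). OpenCV provides the following API for this:
Mat cv::dnn::blobFromImage(
    InputArray image,
    double scalefactor = 1.0,
    const Size & size = Size(),
    const Scalar & mean = Scalar(),
    bool swapRB = false,
    bool crop = false,
    int ddepth = CV_32F
)
image: the input image
scalefactor: multiplier applied to the pixel values after mean subtraction; defaults to 1.0
size: the spatial size of the input that the network accepts
mean: the mean of the training dataset, subtracted from each channel
swapRB: whether to swap the Red and Blue channels
crop: whether to crop the image after resizing
ddepth: the data type (depth) of the output blob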
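
The arithmetic behind these parameters can be illustrated without OpenCV. The sketch below is not OpenCV's implementation: `to_blob` is a hypothetical helper that works on a tiny nested-list image, skips resizing and cropping, and only shows mean subtraction, scaling, the optional Red/Blue swap, and the reordering into a [batch][channel][height][width] layout. It uses the scalefactor and mean that this article passes for MobileNet-SSD (0.007843 is approximately 1/127.5), which map pixel values from [0, 255] to roughly [-1, 1].

```python
# Illustrative sketch only: mimics the per-pixel arithmetic of blobFromImage
# on a nested-list image (rows of [B, G, R] pixels). Resizing and cropping
# are deliberately left out. to_blob is a hypothetical helper, not an OpenCV API.

def to_blob(image, scalefactor=1.0, mean=(0.0, 0.0, 0.0), swap_rb=False):
    """Return a nested list shaped [1][channels][height][width]."""
    height = len(image)
    width = len(image[0])
    channels = len(image[0][0])
    order = list(range(channels))
    if swap_rb:
        # Swap the first and third channels (B <-> R).
        order[0], order[2] = order[2], order[0]
    planes = []
    for out_c, src_c in enumerate(order):
        # Subtract the mean first, then multiply by the scale factor.
        plane = [[(image[y][x][src_c] - mean[out_c]) * scalefactor
                  for x in range(width)]
                 for y in range(height)]
        planes.append(plane)
    return [planes]  # leading batch dimension of size 1


# One row with two BGR pixels.
image = [[[0, 127.5, 255], [255, 255, 255]]]
blob = to_blob(image, scalefactor=0.007843, mean=(127.5, 127.5, 127.5), swap_rb=True)
print(len(blob), len(blob[0]), len(blob[0][0]), len(blob[0][0][0]))  # 1 3 1 2
print(round(blob[0][0][0][0], 3), round(blob[0][2][0][0], 3))        # 1.0 -1.0
```

In this toy input the first pixel has B=0 and R=255, so after the swap the first output channel holds the value close to 1 and the last holds the value close to -1.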

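Once the network is loaded, the key API for inference is:
Mat cv::dnn::Net::forward(
    const String & outputName = String()
)
outputName defaults to an empty string, in which case the forward pass runs through the whole network and returns the output of its last layer.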

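For an object detection network:
This API returns a 4D tensor. The first two dimensions are both 1. The third is the number of detected boxes, and the fourth holds the information for each box: image index, class index, score, and the box coordinates (left, top, right, bottom). Note that these coordinates are floating-point ratios of the image width and height, not pixel values, so they must be converted to pixel coordinates before a box can be drawn.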
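
The conversion from ratios to pixels can be sketched in plain Python. This is a minimal sketch that assumes each row is laid out as [image_id, class_id, score, left, top, right, bottom], which matches the indices used in this article's code. `decode_detections` is a hypothetical helper, and clamping the box to the image bounds is an addition that the article's code does not perform.

```python
# Hypothetical helper: turns normalized detection rows into pixel boxes.
# Assumed row layout: [image_id, class_id, score, left, top, right, bottom].

def decode_detections(rows, width, height, threshold=0.5):
    """Return (class_id, score, left, top, right, bottom) tuples in pixels."""
    boxes = []
    for row in rows:
        score = float(row[2])
        if score <= threshold:
            continue  # skip low-confidence detections
        class_id = int(row[1])
        # Scale the ratios by the image size, then clamp to the image bounds
        # (the clamping is an addition made for this sketch).
        left = min(max(int(row[3] * width), 0), width - 1)
        top = min(max(int(row[4] * height), 0), height - 1)
        right = min(max(int(row[5] * width), 0), width - 1)
        bottom = min(max(int(row[6] * height), 0), height - 1)
        boxes.append((class_id, score, left, top, right, bottom))
    return boxes


# Synthetic rows for a 400x200 image; the second row is below the threshold.
rows = [
    [0, 12, 0.92, 0.25, 0.5, 0.75, 1.1],
    [0, 7, 0.30, 0.1, 0.1, 0.2, 0.2],
]
boxes = decode_detections(rows, width=400, height=200)
print(boxes)  # [(12, 0.92, 100, 100, 300, 199)]
```

With a real network output, the rows would come from the last two dimensions of the returned tensor, for example `cvOut[0, 0]` in the code of this article.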

2. Code

import cv2 as cv

# Model paths (adjust them to where you saved the files)
model_bin = "D:/opencv_tutorial/data/models/ssd/MobileNetSSD_deploy.caffemodel"
config_text = "D:/opencv_tutorial/data/models/ssd/MobileNetSSD_deploy.prototxt"
# Class names; index 0 is the background class
objName = ["background",
           "aeroplane", "bicycle", "bird", "boat",
           "bottle", "bus", "car", "cat", "chair",
           "cow", "diningtable", "dog", "horse",
           "motorbike", "person", "pottedplant",
           "sheep", "sofa", "train", "tvmonitor"]

# Load the model
net = cv.dnn.readNetFromCaffe(config_text, model_bin)
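# Read the test image
image = cv.imread("D:/vsprojects/images/dog.jpg")
if image is None:
    # cv.imread returns None instead of raising when the file cannot be read
    raise FileNotFoundError("could not read the test image")
h = image.shape[0]
w = image.shape[1]

# Get all layer names, then look up the last layer and print its type
layerNames = net.getLayerNames()
lastLayerId = net.getLayerId(layerNames[-1])
lastLayer = net.getLayer(lastLayerId)
print(lastLayer.type)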

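# Run detection
# scalefactor 0.007843 is approximately 1/127.5; together with a mean of 127.5
# it maps pixel values from [0, 255] to roughly [-1, 1]
# Note: swapRB is True here as in the original post; Caffe models are often
# trained on BGR input, so try False if the detections look wrong
blobImage = cv.dnn.blobFromImage(image, 0.007843, (300, 300), (127.5, 127.5, 127.5), True, False)
net.setInput(blobImage)
cvOut = net.forward()
print(cvOut.shape)
# Each row: image index, class index, score, left, top, right, bottom
for detection in cvOut[0, 0, :, :]:
    score = float(detection[2])
    objIndex = int(detection[1])
    if score > 0.5:
        # Convert the normalized coordinates to pixel coordinates
        left = int(detection[3] * w)
        top = int(detection[4] * h)
        right = int(detection[5] * w)
        bottom = int(detection[6] * h)

        # Draw the box and its label, keeping the label inside the image
        cv.rectangle(image, (left, top), (right, bottom), (255, 0, 0), thickness=2)
        cv.putText(image, "score:%.2f, %s" % (score, objName[objIndex]),
                   (max(left - 10, 0), max(top - 5, 15)),
                   cv.FONT_HERSHEY_SIMPLEX, 0.7, (0, 0, 255), 2, 8)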

# Show and save the result
cv.imshow('mobilenet-ssd-demo', image)
cv.imwrite("D:/Pedestrian.png", image)
cv.waitKey(0)
cv.destroyAllWindows()