YOLO: Publishing Detection Results

Latest recommended article published on 2025-02-22 00:08:12


License: CC 4.0 BY-SA

Tags: #YOLO


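This article covers how to use a YOLOv5 model for visual detection in a ROS 2 environment: loading the model, extracting bounding boxes, confidences, and classes, and publishing both the detection results and an image annotated with them.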

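The first thing to complete is the result publishing function pub_result. We need to extract the bounding boxes, confidences, and classes from the detection result, put them into a Detection2DArray message, and publish it. The complete code is as follows: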

# Required at the top of the module:
# from vision_msgs.msg import Detection2D, Detection2DArray, ObjectHypothesisWithPose

def pub_result(self, result, header):
    """
    Publish the YOLOv5 recognition results.

    Args:
        result: YOLOv5 recognition result data
        header: ROS message header
    """
    result_msg = Detection2DArray()
    result_msg.header = header  # keep the header in sync with the input image

    predictions = result.pred[0]  # result tensor, one row per detection
    boxes = predictions[:, :4]  # box coordinates: x1, y1, x2, y2
    scores = predictions[:, 4]  # confidence
    categories = predictions[:, 5]  # class id

    for index in range(len(categories)):
        name = result.names[int(categories[index])]  # look up the name by class id
        score = round(scores[index].item(), 2)  # keep two decimal places
        x1, y1, x2, y2 = map(int, boxes[index])  # convert to int for convenience

        detection2d = Detection2D()
        detection2d.id = str(index)  # number detections by index
        # Convert the corner points to center-and-size bounding box form
        detection2d.bbox.center.position.x = (x1 + x2) / 2.0
        detection2d.bbox.center.position.y = (y1 + y2) / 2.0
        detection2d.bbox.size_x = float(x2 - x1)
        detection2d.bbox.size_y = float(y2 - y1)
        # Store the class name and confidence
        obj_pose = ObjectHypothesisWithPose()
        obj_pose.hypothesis.class_id = name
        obj_pose.hypothesis.score = score
        detection2d.results.append(obj_pose)
        result_msg.detections.append(detection2d)

    self.yolo_result_pub.publish(result_msg)
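
The corner-to-center conversion inside the loop can be checked in isolation. The following is a minimal sketch in plain Python; the helper names `corners_to_center_size` and `center_size_to_corners` are hypothetical and not part of the library. The inverse function is the kind of step a subscriber might need if it wants to draw the boxes again from the published message.

```python
def corners_to_center_size(x1, y1, x2, y2):
    """Convert corner form (x1, y1, x2, y2) to (cx, cy, width, height)."""
    return (x1 + x2) / 2.0, (y1 + y2) / 2.0, float(x2 - x1), float(y2 - y1)


def center_size_to_corners(cx, cy, width, height):
    """Inverse conversion: recover the two corner points from center and size."""
    half_w, half_h = width / 2.0, height / 2.0
    return cx - half_w, cy - half_h, cx + half_w, cy + half_h


box = corners_to_center_size(100, 80, 200, 160)
print(box)                           # (150.0, 120.0, 100.0, 80.0)
print(center_size_to_corners(*box))  # (100.0, 80.0, 200.0, 160.0)
```

Because the corners are truncated to integers before the conversion, the published center can land on a half pixel (for example 150.5) while the sizes are always whole numbers.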

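A Detection2DArray is made up of Detection2D messages, so each loop iteration creates a Detection2D object, converts the box's top-left and bottom-right corner coordinates into center-and-size form, stores the class name and confidence in an ObjectHypothesisWithPose, and appends detection2d to the result array. At the end of the function, the result message is published.

Next, complete the annotated-image publishing function pub_result_with_image. We again extract the bounding boxes, confidences, and classes from the detection result, then draw them on the original image and publish it. The complete code is as follows: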

# Required at the top of the module:
# import cv2

def pub_result_with_image(self, result, image, header):
    """
    Publish an image annotated with the recognition results.

    Args:
        result: YOLOv5 recognition result data
        image: image to draw the recognition results on
        header: ROS message header
    """
    predictions = result.pred[0]  # result tensor, one row per detection
    boxes = predictions[:, :4]  # box coordinates: x1, y1, x2, y2
    scores = predictions[:, 4]  # confidence
    categories = predictions[:, 5]  # class id

    for index in range(len(categories)):
        name = result.names[int(categories[index])]  # look up the name by class id
        score = round(scores[index].item(), 2)  # keep two decimal places
        x1, y1, x2, y2 = map(int, boxes[index])  # convert to int for convenience
        cv2.rectangle(image, (x1, y1), (x2, y2),
                      (0, 255, 0), 2)  # draw the box on the original image
        cv2.putText(image, f"{name}:{score}", (x1, y1),  # label with class name and score
                    cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 255, 0), 1)

    # "rgb8" assumes the image is in RGB channel order; use "bgr8" for a BGR image
    result_img_msg = self.bridge.cv2_to_imgmsg(image, encoding="rgb8", header=header)
    self.result_img_pub.publish(result_img_msg)
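
Both functions unpack each prediction row in the same way, so that step could be factored into a helper. The sketch below uses plain Python lists in place of tensors so it runs without PyTorch; `unpack_prediction` and `label_origin` are hypothetical names, not part of the library. `label_origin` also illustrates one possible refinement: `cv2.putText` takes the bottom-left corner of the text, so a label anchored at `(x1, y1)` falls outside the image when the box touches the top edge. The default text height of 24 pixels is a rough assumption for font scale 1.0; real code could measure it with `cv2.getTextSize`.

```python
def unpack_prediction(row, names):
    """Split one row [x1, y1, x2, y2, confidence, class_id] into usable parts."""
    x1, y1, x2, y2 = (int(v) for v in row[:4])  # truncate coordinates to int
    score = round(float(row[4]), 2)             # keep two decimal places
    name = names[int(row[5])]                   # look up the name by class id
    return name, score, (x1, y1, x2, y2)


def label_origin(x1, y1, text_height=24, margin=4):
    """Bottom-left anchor for a label: above the box, or inside it if no room."""
    y = y1 - margin
    if y - text_height < 0:  # the text would be cut off by the top of the image
        y = y1 + text_height + margin
    return x1, y


names = {0: "person"}                          # assumed class-id-to-name table
row = [100.6, 80.2, 200.9, 160.4, 0.8731, 0.0]
name, score, box = unpack_prediction(row, names)
print(f"{name}:{score}", box)                  # person:0.87 (100, 80, 200, 160)
print(label_origin(box[0], box[1]))            # (100, 76)
print(label_origin(50, 10))                    # (50, 38)
```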

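This code resembles the result publishing code: both extract the results first and then iterate over the class array. The difference is that this function draws each result on the image inside the loop. Once drawing is finished, it calls cv2_to_imgmsg to convert the image into an Image message and publishes it. With this code in place, the open-source library is essentially complete. Next, we try it out.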
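
One way to inspect the published results is a small listener that logs each detection. This is only a sketch under stated assumptions: the topic name `yolo_result` and the node name are hypothetical (the article does not show the publisher's topic), and the field layout follows the one used in pub_result above. The ROS imports live inside `main` so that the formatting function `summarize` can be exercised without a ROS installation, here with a stand-in object built from `SimpleNamespace`.

```python
from types import SimpleNamespace


def summarize(detection):
    """Format one Detection2D-like object as a single log line."""
    hyp = detection.results[0].hypothesis
    box = detection.bbox
    return (f"#{detection.id} {hyp.class_id} {hyp.score:.2f} "
            f"center=({box.center.position.x:.1f}, {box.center.position.y:.1f}) "
            f"size={box.size_x:.0f}x{box.size_y:.0f}")


def main(topic="yolo_result"):  # hypothetical topic name, adjust to your setup
    # Imported here so summarize() stays usable without ROS 2 installed
    import rclpy
    from vision_msgs.msg import Detection2DArray

    rclpy.init()
    node = rclpy.create_node("yolo_result_listener")  # hypothetical node name

    def on_result(msg):
        for det in msg.detections:
            node.get_logger().info(summarize(det))

    node.create_subscription(Detection2DArray, topic, on_result, 10)
    try:
        rclpy.spin(node)
    except KeyboardInterrupt:
        pass
    finally:
        node.destroy_node()
        rclpy.shutdown()


# Stand-in for a Detection2D message, mirroring the fields set in pub_result
fake = SimpleNamespace(
    id="0",
    bbox=SimpleNamespace(
        center=SimpleNamespace(position=SimpleNamespace(x=150.0, y=120.0)),
        size_x=100.0, size_y=80.0),
    results=[SimpleNamespace(
        hypothesis=SimpleNamespace(class_id="person", score=0.87))],
)
print(summarize(fake))
```

In a sourced ROS 2 environment with the detection node running, calling `main()` with the actual topic name would log one line per detection.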
