COCO keypoint evaluation metric

COCO keypoint evaluation uses object keypoint similarity (OKS) as its similarity measure, analogous to the IoU in object detection. OKS is computed from the Euclidean distances between predicted and ground-truth keypoints, accounting for object scale and keypoint visibility, and is used to assess the accuracy of keypoint detection.


http://cocodataset.org/#keypoints-eval

1. Keypoint Evaluation

This page describes the keypoint evaluation metrics used by COCO. The evaluation code provided here can be used to obtain results on the publicly available COCO validation set. It computes multiple metrics described below. To obtain results on the COCO test set, for which ground-truth annotations are hidden, generated results must be uploaded to the evaluation server. The exact same evaluation code, described below, is used to evaluate results on the test set.

1.1. Evaluation Overview

The COCO keypoint task requires simultaneously detecting objects and localizing their keypoints (object locations are not given at test time). As the task of simultaneous detection and keypoint estimation is relatively new, we chose to adopt a novel metric inspired by object detection metrics. For simplicity, we refer to this task as keypoint detection and the prediction algorithm as the keypoint detector. We suggest reviewing the evaluation metrics for object detection before proceeding.

The core idea behind evaluating keypoint detection is to mimic the evaluation metrics used for object detection, namely average precision (AP) and average recall (AR) and their variants. At the heart of these metrics is a similarity measure between ground truth objects and predicted objects. In the case of object detection, the IoU serves as this similarity measure (for both boxes and segments). Thresholding the IoU defines matches between the ground truth and predicted objects and allows computing precision-recall curves. To adopt AP/AR for keypoint detection, we only need to define an analogous similarity measure. We do so by defining an object keypoint similarity (OKS) which plays the same role as the IoU.
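To make the thresholding step concrete, here is a minimal sketch of greedy matching between detections and ground truth under a similarity threshold. It is a simplified illustration, not the actual COCOeval matching code; the function name and inputs are hypothetical.

```python
def match_detections(sims, threshold=0.5):
    """Greedy matching sketch. sims[d][g] is the similarity (IoU or OKS)
    between detection d (rows sorted by descending detector score) and
    ground truth g. Returns, per detection, the matched gt index or -1."""
    matched_gt = set()
    matches = []
    for d_sims in sims:
        best_g, best_sim = -1, threshold  # require sim >= threshold
        for g, sim in enumerate(d_sims):
            if g not in matched_gt and sim >= best_sim:
                best_g, best_sim = g, sim
        if best_g >= 0:
            matched_gt.add(best_g)  # each gt can match at most once
        matches.append(best_g)
    return matches

# Two detections, two ground truths:
sims = [[0.9, 0.2], [0.6, 0.4]]
# Detection 0 takes gt 0 (sim 0.9); detection 1 finds gt 0 taken and
# gt 1 below threshold, so it is a false positive (-1).
```

Matched detections count as true positives and unmatched ones as false positives; sweeping over detection scores then yields the precision-recall curve.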

1.2. Object Keypoint Similarity

For each object, ground truth keypoints have the form [x1,y1,v1,...,xk,yk,vk], where x,y are the keypoint locations and v is a visibility flag defined as v=0: not labeled, v=1: labeled but not visible, and v=2: labeled and visible. Each ground truth object also has a scale s which we define as the square root of the object segment area. For details on the ground truth format please see the download page.
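The flat [x1,y1,v1,...,xk,yk,vk] layout and the scale definition above can be sketched as follows. This is an illustrative helper, not part of the official API; the dict keys `keypoints` and `area` follow the COCO annotation format.

```python
import math

def parse_gt_keypoints(ann):
    """Split a COCO-style ground-truth annotation into (x, y, v) triplets
    and compute the object scale s = sqrt(segment area)."""
    kps = ann["keypoints"]  # flat list [x1, y1, v1, ..., xk, yk, vk]
    triplets = [tuple(kps[i:i + 3]) for i in range(0, len(kps), 3)]
    s = math.sqrt(ann["area"])
    return triplets, s

# Three keypoints: visible (v=2), labeled-but-occluded (v=1), unlabeled (v=0).
ann = {"keypoints": [120, 80, 2, 130, 85, 1, 0, 0, 0], "area": 2500}
triplets, s = parse_gt_keypoints(ann)  # s == 50.0
```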

For each object, the keypoint detector must output keypoint locations and an object-level confidence. Predicted keypoints for an object should have the same form as the ground truth: [x1,y1,v1,...,xk,yk,vk]. However, the detector's predicted vi are not currently used during evaluation; that is, the keypoint detector is not required to predict per-keypoint visibilities or confidences.

We define the object keypoint similarity (OKS) as:

OKS = Σᵢ [exp(−dᵢ² / (2s²κᵢ²)) · δ(vᵢ > 0)] / Σᵢ [δ(vᵢ > 0)]

The di are the Euclidean distances between each corresponding ground truth and detected keypoint and the vi are the visibility flags of the ground truth (the detector's predicted vi are not used). To compute OKS, we pass the di through an unnormalized Gaussian with standard deviation sκi, where s is the object scale and κi is a per-keypoint constant that controls falloff. For each keypoint this yields a keypoint similarity that ranges between 0 and 1. These similarities are averaged over all labeled keypoints (keypoints for which vi>0). Predicted keypoints that are not labeled (vi=0) do not affect the OKS. Perfect predictions will have OKS=1 and predictions for which all keypoints are off by more than a few standard deviations sκi will have OKS~0. The OKS is analogous to the IoU. Given the OKS, we can compute AP and AR just as the IoU allows us to compute these metrics for box/segment detection.
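The formula above can be implemented directly. The sketch below assumes keypoints as (x, y, v) triplets; the κi values passed in are placeholders for illustration, not the official COCO per-keypoint constants.

```python
import math

def compute_oks(gt_kps, dt_kps, s, kappas):
    """Object keypoint similarity.
    gt_kps, dt_kps: lists of (x, y, v) triplets (predicted v is ignored);
    s: object scale (sqrt of segment area);
    kappas: per-keypoint falloff constants (illustrative values only).
    Only keypoints labeled in the ground truth (v > 0) contribute."""
    num, den = 0.0, 0
    for (gx, gy, gv), (dx, dy, _), k in zip(gt_kps, dt_kps, kappas):
        if gv > 0:  # skip unlabeled keypoints (v = 0)
            d2 = (gx - dx) ** 2 + (gy - dy) ** 2  # squared Euclidean distance
            num += math.exp(-d2 / (2 * s ** 2 * k ** 2))  # unnormalized Gaussian
            den += 1
    return num / den if den else 0.0

gt = [(120, 80, 2), (130, 85, 1), (0, 0, 0)]
dt = [(120, 80, 0), (130, 85, 0), (50, 50, 0)]  # exact on labeled keypoints
oks = compute_oks(gt, dt, s=50.0, kappas=[0.1, 0.1, 0.1])  # -> 1.0
```

Note that the unlabeled third keypoint has no effect on the score, and a perfect prediction on the labeled keypoints gives OKS = 1 as stated above.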
