[Object Tracking] SORT Code Walkthrough

慕容紫英问情

Last modified: 2025-02-18 16:38:25

First published: 2025-02-18 16:19:02

Article link: https://blog.youkuaiyun.com/weixin_43148969/article/details/145699396

Multi-Object Tracking: SORT

This post is a quick read-through of sort.py, the reference implementation of SORT (A Simple, Online and Realtime Tracker).

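Introduction
SORT is a barebones implementation of a visual multiple object tracking framework built on rudimentary data association and state estimation techniques. It is designed for online tracking applications, where only past and current frames are available, and it produces object identities on the fly. Although this minimalistic tracker does not handle occlusion or objects that re-enter the scene, it is intended as a baseline and testbed for developing future trackers.
SORT was first described in the paper "Simple Online and Realtime Tracking". At the time of its initial publication, it was ranked the best open-source multiple object tracker on the MOT benchmark.
Note: much of SORT's accuracy comes from the quality of the detections it is given.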

Dependencies

To install the required dependencies, run:

$ pip install -r requirements.txt

Demo

To run the tracker with the provided detections:

$ cd path/to/sort
$ python sort.py

To display the results you need to:

1. Download the 2D MOT 2015 benchmark dataset.
2. Create a symbolic link to the dataset:

$ ln -s /path/to/MOT2015_challenge/data/2DMOT2015 mot_benchmark

3. Run the demo with the --display flag:

$ python sort.py --display
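
The provided detections are plain-text files. As a minimal sketch, assuming each line follows the layout `frame, id, left, top, width, height, score, ...` (this layout and the helper name `parse_mot_detection` are assumptions for illustration and are not shown in this post), one line could be turned into the [x1,y1,x2,y2,score] corner format like this:

```python
def parse_mot_detection(line):
    """Parse one comma-separated detection line.

    Assumed layout (hypothetical, see the note above):
    frame, id, left, top, width, height, score, ...
    """
    fields = [float(v) for v in line.strip().split(",")]
    frame = int(fields[0])
    left, top, width, height, score = fields[2:7]
    # Convert [left, top, width, height] into corner format [x1, y1, x2, y2]
    return frame, [left, top, left + width, top + height, score]


# Made-up sample line, used only to exercise the helper
sample = "1,-1,100.0,50.0,40.0,80.0,0.9,-1,-1,-1"
frame, det = parse_mot_detection(sample)
print(frame, det)
```

The conversion matters because the helper functions in the listing below take boxes in [x1,y1,x2,y2] form rather than as a corner plus width and height.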

sort.py source code

"""
    SORT: A Simple, Online and Realtime Tracker
    Copyright (C) 2016-2020 Alex Bewley alex@bewley.ai

    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with this program.  If not, see <http://www.gnu.org/licenses/>.
"""
from __future__ import print_function

import os
import numpy as np
import matplotlib
# Use the TkAgg backend to display figures
matplotlib.use('TkAgg')
import matplotlib.pyplot as plt
import matplotlib.patches as patches
from skimage import io

import glob
import time
import argparse
# Import the Kalman filter
from filterpy.kalman import KalmanFilter

# Fix the random seed so results are reproducible
np.random.seed(0)


def linear_assignment(cost_matrix):
    """
    线性分配函数，用于解决检测框和跟踪器之间的匹配问题
    :param cost_matrix: 代价矩阵
    :return: 匹配结果的索引数组
    """
    try:
        import lap
        # 使用 lap 库的 lapjv 算法进行线性分配
        _, x, y = lap.lapjv(cost_matrix, extend_cost=True)
        return np.array([[y[i], i] for i in x if i >= 0])
    except ImportError:
        # 如果没有安装 lap 库，使用 scipy 的 linear_sum_assignment 算法
        from scipy.optimize import linear_sum_assignment
        x, y = linear_sum_assignment(cost_matrix)
        return np.array(list(zip(x, y)))
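
To see what the assignment step is meant to produce, here is a small dependency-free sketch. It is not the solver used above: it simply tries every permutation, which is only practical for tiny matrices, whereas lapjv and linear_sum_assignment solve the same problem efficiently. The helper name `brute_force_assignment`, the sample IoU values, the choice of negated IoU as the cost, and the reading of rows as detections and columns as trackers are all assumptions for illustration.

```python
from itertools import permutations


def brute_force_assignment(cost):
    """Return (row, col) pairs that minimize total cost for a square matrix.

    Hypothetical helper: exhaustive search, for illustration only.
    """
    n = len(cost)
    best_cols = min(
        permutations(range(n)),
        key=lambda cols: sum(cost[r][c] for r, c in enumerate(cols)),
    )
    return [(r, c) for r, c in enumerate(best_cols)]


# Made-up IoU values: rows are detections, columns are trackers (assumed)
iou = [
    [0.8, 0.1, 0.0],
    [0.2, 0.0, 0.7],
    [0.0, 0.6, 0.1],
]
# Minimizing the negated IoU is the same as maximizing total overlap
cost = [[-v for v in row] for row in iou]
matches = brute_force_assignment(cost)
print(matches)
```

With these values the best pairing links detection 0 to tracker 0, detection 1 to tracker 2, and detection 2 to tracker 1, for a total IoU of 2.1.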


def iou_batch(bb_test, bb_gt):
    """
    计算两组边界框之间的交并比（IOU）
    :param bb_test: 测试边界框，格式为 [x1,y1,x2,y2]
    :param bb_gt: 真实边界框，格式为 [x1,y1,x2,y2]
    :return: IOU 矩阵
    """
    # 扩展维度，方便进行广播操作
    bb_gt = np.expand_dims(bb_gt, 0)
    bb_test = np.expand_dims(bb_test, 1)

    # 计算交集的左上角和右下角坐标
    xx1 = np.maximum(bb_test[..., 0], bb_gt[..., 0])
    yy1 = np.maximum(bb_test[..., 1], bb_gt[..., 1])
    xx2 = np.minimum(bb_test[..., 2], bb_gt[..., 2])
    yy2 = np.minimum(bb_test[..., 3], bb_gt[..., 3])
    # 计算交集的宽度和高度
    w = np.maximum(0., xx2 - xx1)
    h = np.maximum(0., yy2 - yy1)
    # 计算交集的面积
    wh = w * h
    # 计算交并比
    o = wh / ((bb_test[..., 2] - bb_test[..., 0]) * (bb_test[..., 3] - bb_test[..., 1])
              + (bb_gt[..., 2] - bb_gt[..., 0]) * (bb_gt[..., 3] - bb_gt[..., 1]) - wh)
    return o
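
The arithmetic inside iou_batch can be checked by hand on a single pair of boxes. The sketch below is a scalar, dependency-free version of the same formula, written only for that purpose; the helper name `iou_single` and the sample boxes are assumptions for illustration.

```python
def iou_single(a, b):
    """IoU of two boxes in [x1, y1, x2, y2] format (hypothetical scalar helper)."""
    # Corners of the intersection rectangle
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp at zero so disjoint boxes give an empty intersection
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


box_a = [0, 0, 10, 10]
box_b = [5, 5, 15, 15]
# Intersection is 5 x 5 = 25, union is 100 + 100 - 25 = 175
overlap = iou_single(box_a, box_b)
print(overlap)
```

iou_batch evaluates this same expression for every (test, ground-truth) pair at once, using the two added axes to broadcast.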



def convert_bbox_to_z(bbox):
    """
    将边界框从 [x1,y1,x2,y2] 格式转换为 [x,y,s,r] 格式
    :param bbox: 边界框，格式为 [x1,y1,x2,y2]
    :return: 转换后的向量，[x,y,s,r] 格式，x,y 是框的中心，s 是尺度/面积，r 是宽高比
    """
    w = bbox[2] - bbox[0]
    h = bbox[3] - bbox[1]
    x = bbox[0] + w / 2.
    y = bbox[1] + h / 2.
    s = w * h  # 尺度即面积
    r = w / float(h)
    return np.array([x, y, s, r]).reshape((4, 1))


def convert_x_to_bbox(x, score=None):
    """
    将边界框从 [x,y,s,r] 格式转换为 [x1,y1,x2,y2] 格式
    :param x: 边界框，格式为 [x,y,s,r]
    :param score: 检测分数
    :return: 转换后的边界框，格式为 [x1,y1,x2,y2] 或 [x1,y1,x2,y2,score]
    """
    w = np.sqrt(x[2] * x[3])
    h = x[2] / w
    if score is None:
        return np.array([x[0] - w / 2., x[1] - h / 2., x[0] + w / 2., x[1] + h / 2.]).reshape((1, 4))
    else:
        return np.array([x[0] - w / 2., x[1] - h / 2., x[0] + w / 2., x[1] + h / 2., score]).reshape((1, 5))
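
The two conversion functions are inverses of each other: since s = w * h and r = w / h, the width is recovered as sqrt(s * r) and the height as s / w. The sketch below replays that round trip with plain Python lists so the numbers can be followed by hand; the helper names `to_z` and `to_box` and the sample box are assumptions for illustration, not part of sort.py.

```python
import math


def to_z(box):
    """[x1, y1, x2, y2] -> [cx, cy, s, r] (hypothetical list-based helper)."""
    w = box[2] - box[0]
    h = box[3] - box[1]
    return [box[0] + w / 2.0, box[1] + h / 2.0, w * h, w / float(h)]


def to_box(z):
    """[cx, cy, s, r] -> [x1, y1, x2, y2] (hypothetical list-based helper)."""
    w = math.sqrt(z[2] * z[3])  # s * r = (w * h) * (w / h) = w ** 2
    h = z[2] / w                # s / w = h
    return [z[0] - w / 2.0, z[1] - h / 2.0, z[0] + w / 2.0, z[1] + h / 2.0]


box = [10.0, 20.0, 50.0, 100.0]  # width 40, height 80
z = to_z(box)                    # center (30, 60), area 3200, ratio 0.5
restored = to_box(z)
print(z, restored)
```

Note that the forward conversion divides by the height, so a box with zero height would raise an error or produce an invalid ratio.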


class KalmanBoxTracker(object):
    """
    该类表示以边界框形式观察到的单个跟踪对象的内部状态
    """
    # 跟踪器计数
    count = 0

    def __init__(self, bbox):
        """
        使用初始边界框初始化跟踪器
        :param bbox: 初始边界框，格式为 [x1,y1,x2,y2]
        """
        # 定义恒速模型
        self.kf = KalmanFilter(dim_x=7, dim_z=4)
        # 状态转移矩阵
        self.kf.F = np.array([[1, 0, 0, 0, 1, 0, 0], [0, 1, 0, 0, 0, 1,
