YOLO + DeepSort: A Complete Walkthrough of Video Object Detection and Tracking

Original article, last modified 2025-08-12 11:06:06

License: CC 4.0 BY-SA

Tags:

#YOLO #object-detection #object-tracking #deep-learning #artificial-intelligence #computer-vision

First published 2025-08-12 11:05:09

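After watching demo videos that "recognize every object in a video and keep tracking it", you might assume a very complex system sits behind them. In fact, pairing a YOLO model with the DeepSort algorithm is enough to detect objects and keep following them from frame to frame.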

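If you would rather not spend time on the up-front chores of environment setup, model downloads, and dependency installation, you can use the Coovally platform. It ships with mainstream and recent detection and tracking models built in, including YOLOv3/v4/v5/v7/v8/11, Faster R-CNN, RetinaNet, DETR, DeepSort, and Mask R-CNN, which can be loaded in one click and combined freely.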

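I. How Do YOLO and DeepSort Work Together?

YOLO acts as the detector: it runs on every frame and outputs the bounding box, class, and confidence score of each detected object.
DeepSort acts as the tracker: it takes YOLO's detections, combines them with its tracking history, and assigns each object a unique ID so that it can be followed across frames.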

Each track holds:

A predicted bounding box (produced by the Kalman filter)
A unique track ID
A motion model (Kalman filter)
An appearance feature vector (embedding)
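
The division of labour described above can be sketched as a per-frame loop. This is only an illustration of the call order (detect, then predict, then update), not the DeepSort implementation: `Detection`, `TrackRecord`, `SingleTargetTracker`, `track_video`, and `fake_detect` are hypothetical names invented here, and the placeholder tracker does no real association.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x, y, w, h), top-left corner


@dataclass
class Detection:
    box: Box
    label: str
    confidence: float


@dataclass
class TrackRecord:
    track_id: int
    predicted_box: Box
    embedding: List[float] = field(default_factory=list)


class SingleTargetTracker:
    """Placeholder tracker (assumption): follows only the most confident detection."""

    def __init__(self):
        self.tracks: List[TrackRecord] = []
        self.calls: List[str] = []

    def predict(self):
        # A real tracker would advance every track's Kalman filter here.
        self.calls.append("predict")

    def update(self, detections):
        self.calls.append("update")
        if not detections:
            return
        best = max(detections, key=lambda d: d.confidence)
        if not self.tracks:
            self.tracks.append(TrackRecord(track_id=1, predicted_box=best.box))
        else:
            self.tracks[0].predicted_box = best.box


def track_video(frames, detect, tracker):
    """Run the detector on each frame, then predict and update the tracker."""
    per_frame = []
    for frame in frames:
        detections = detect(frame)
        tracker.predict()
        tracker.update(detections)
        per_frame.append([(t.track_id, t.predicted_box) for t in tracker.tracks])
    return per_frame


def fake_detect(frame_index):
    """Stand-in for a YOLO call: one 'person' box drifting to the right."""
    return [Detection((10.0 + 5 * frame_index, 20.0, 40.0, 80.0), "person", 0.9)]


tracker = SingleTargetTracker()
output = track_video(range(3), fake_detect, tracker)
for frame_index, tracks in enumerate(output):
    print(frame_index, tracks)
```

In a real pipeline `fake_detect` would be replaced by a YOLO inference call and the placeholder tracker by DeepSort; the loop shape is the part that carries over.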

Since YOLO is already widely known, the rest of this article focuses on DeepSort.

II. The Core Idea of DeepSort

DeepSort comes from the paper "Simple Online and Realtime Tracking with a Deep Association Metric" and is an improved version of the SORT algorithm.

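SORT: relies on a Kalman filter (to predict object positions) plus the Hungarian algorithm (to match objects between frames based on IoU).
Weakness: ID switches are common when objects are occluded.
DeepSort's improvement: matching considers not only motion information but also appearance features (embeddings), which significantly reduces ID switches.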
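
To make the SORT-style matching step concrete, here is a minimal sketch of IoU-based assignment between track boxes and detection boxes. It is not the SORT or DeepSort code: the optimal assignment is found by brute force over permutations as a stand-in for the Hungarian algorithm (only practical for a handful of boxes), and `iou`, `match_by_iou`, and the `min_iou` threshold of 0.3 are assumptions made for this example.

```python
from itertools import permutations


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_by_iou(track_boxes, det_boxes, min_iou=0.3):
    """Minimum-cost one-to-one assignment with cost = 1 - IoU (brute force)."""
    n_t, n_d = len(track_boxes), len(det_boxes)
    cost = [[1.0 - iou(t, d) for d in det_boxes] for t in track_boxes]
    k = min(n_t, n_d)
    if n_t <= n_d:
        candidates = ([(i, p[i]) for i in range(k)]
                      for p in permutations(range(n_d), k))
    else:
        candidates = ([(p[j], j) for j in range(k)]
                      for p in permutations(range(n_t), k))
    best_pairs, best_cost = [], float("inf")
    for pairs in candidates:
        total = sum(cost[i][j] for i, j in pairs)
        if total < best_cost:
            best_pairs, best_cost = pairs, total
    # Reject assigned pairs whose overlap is too small to be the same object.
    matches = [(i, j) for i, j in best_pairs if cost[i][j] <= 1.0 - min_iou]
    matched_t = {i for i, _ in matches}
    matched_d = {j for _, j in matches}
    unmatched_tracks = [i for i in range(n_t) if i not in matched_t]
    unmatched_dets = [j for j in range(n_d) if j not in matched_d]
    return matches, unmatched_tracks, unmatched_dets


tracks = [(0, 0, 10, 10), (20, 20, 30, 30)]
dets = [(21, 21, 31, 31), (1, 1, 11, 11), (100, 100, 110, 110)]
matches, unmatched_tracks, unmatched_dets = match_by_iou(tracks, dets)
print(matches, unmatched_tracks, unmatched_dets)
```

Because only box overlap is used, two people crossing paths can easily swap IDs here; that is the gap the appearance embedding is meant to close.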

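Two key kinds of information

1. Appearance features

A pre-trained CNN extracts a fixed-length vector, typically 128-D or 256-D (the original paper uses 128-D), that describes the object's appearance.
In every frame, the image crop of each detected object is fed through the CNN to extract its embedding.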
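
How embeddings are compared can be illustrated with a nearest-neighbour cosine distance: a detection is scored against every stored embedding of a track and the smallest distance is kept. The sketch below uses toy 3-D vectors instead of real CNN outputs, and the names `cosine_distance` and `nearest_gallery_distance` are chosen for this example rather than taken from the DeepSort code.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; 0 means same direction, 1 means orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an all-zero embedding as maximally dissimilar
    return 1.0 - dot / (norm_a * norm_b)


def nearest_gallery_distance(gallery, query):
    """Smallest cosine distance between the query and any stored embedding."""
    return min(cosine_distance(g, query) for g in gallery)


# Toy galleries: embeddings previously collected for two tracks.
gallery_a = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]]
gallery_b = [[0.0, 1.0, 0.0]]
query = [1.0, 0.05, 0.0]  # embedding of a new detection

dist_a = nearest_gallery_distance(gallery_a, query)
dist_b = nearest_gallery_distance(gallery_b, query)
print(dist_a, dist_b)  # the query is far closer to track A than to track B
```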

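2. Motion information

A Kalman filter predicts each object's next state (position plus velocity).
Even when an object is partially occluded, its position can be estimated from its past trajectory.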
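
The prediction step can be sketched with a one-dimensional constant-velocity Kalman filter. This is a deliberate simplification: DeepSort's filter tracks the box center, aspect ratio, and height together with their velocities, whereas the example below tracks a single coordinate. The function name `kalman_predict` and the noise values `q_pos` and `q_vel` are assumptions for illustration.

```python
def kalman_predict(state, cov, dt=1.0, q_pos=0.01, q_vel=0.01):
    """One predict step for state (position, velocity) with 2x2 covariance.

    Implements x' = F x and P' = F P F^T + Q for F = [[1, dt], [0, 1]]
    and a diagonal process noise Q = diag(q_pos, q_vel).
    """
    p, v = state
    (p00, p01), (p10, p11) = cov
    new_state = (p + v * dt, v)
    new_cov = (
        (p00 + dt * (p01 + p10) + dt * dt * p11 + q_pos, p01 + dt * p11),
        (p10 + dt * p11, p11 + q_vel),
    )
    return new_state, new_cov


# An object at x = 100 moving 5 px per frame, then occluded for three frames.
state, cov = (100.0, 5.0), ((1.0, 0.0), (0.0, 1.0))
first_state, first_cov = kalman_predict(state, cov)
state, cov = first_state, first_cov
for _ in range(2):
    state, cov = kalman_predict(state, cov)

print(state)      # position keeps advancing along the last known velocity
print(cov[0][0])  # position uncertainty grows with every unobserved frame
```

The growing covariance is what lets the tracker accept a detection that reappears some distance from the predicted position after an occlusion.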

III. Core Methods of the Tracker Class

In DeepSort, Tracker is the main entry-point class; it ties together several modules (such as detection.py, iou_matching.py, and linear_assignment.py).

predict()

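Purpose: advances the state prediction of every track by one time step. It is normally called at the start of each frame, before update.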


# deep_sort/deep_sort/tracker.py
def predict(self):
    """Propagate track state distributions one time step forward.
    This function should be called once every time step, before `update`.
    """
    for track in self.tracks:
        track.predict(self.kf)

update(detections)

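Purpose: updates the track set. It applies matched detections to their tracks, marks unmatched tracks as missed, creates new tracks for unmatched detections, drops deleted tracks, and refreshes the distance metric with the features of confirmed tracks.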

# deep_sort/deep_sort/tracker.py
def update(self, detections):
    # Run matching cascade.
    matches, unmatched_tracks, unmatched_detections = \
        self._match(detections)
    # Update track set.
    for track_idx, detection_idx in matches:
        self.tracks[track_idx].update(
            self.kf, detections[detection_idx])
    for track_idx in unmatched_tracks:
        self.tracks[track_idx].mark_missed()
    for detection_idx in unmatched_detections:
        self._initiate_track(detections[detection_idx])
    self.tracks = [t for t in self.tracks if not t.is_deleted()]
    # Update distance metric.
    active_targets = [t.track_id for t in self.tracks if t.is_confirmed()]
    features, targets = [], []
    for track in self.tracks:
        if not track.is_confirmed():
            continue
        features += track.features
        targets += [track.track_id for _ in track.features]
        track.features = []
    self.metric.partial_fit(
        np.asarray(features), np.asarray(targets), active_targets)
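
The calls to `is_confirmed()`, `mark_missed()`, and `is_deleted()` imply a small per-track state machine driven by `n_init` and `max_age`. The sketch below is one plausible reading of that lifecycle, not a copy of the library's Track class: the class name `TrackLifecycle`, the state names, and the default values 3 and 30 are assumptions.

```python
from enum import Enum


class TrackState(Enum):
    TENTATIVE = 1
    CONFIRMED = 2
    DELETED = 3


class TrackLifecycle:
    """Hypothetical stand-in showing only the state transitions of a track."""

    def __init__(self, n_init=3, max_age=30):
        self.state = TrackState.TENTATIVE
        self.hits = 1               # the detection that created the track
        self.time_since_update = 0
        self.n_init = n_init        # hits needed before a track is confirmed
        self.max_age = max_age      # misses tolerated once confirmed

    def predict(self):
        self.time_since_update += 1

    def update(self):
        self.hits += 1
        self.time_since_update = 0
        if self.state is TrackState.TENTATIVE and self.hits >= self.n_init:
            self.state = TrackState.CONFIRMED

    def mark_missed(self):
        # A tentative track is dropped at its first miss; a confirmed one
        # survives until it has gone unmatched for more than max_age frames.
        if self.state is TrackState.TENTATIVE:
            self.state = TrackState.DELETED
        elif self.time_since_update > self.max_age:
            self.state = TrackState.DELETED


track = TrackLifecycle(n_init=3, max_age=2)
history = [track.state.name]
for matched in [True, True, False, False, False]:
    track.predict()
    if matched:
        track.update()
    else:
        track.mark_missed()
    history.append(track.state.name)
print(history)
```

Requiring several consecutive hits before confirmation keeps one-off false detections from becoming visible tracks, while `max_age` decides how long an occluded object keeps its ID.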

_match(detections)

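Purpose: matches the current detections to existing tracks. Confirmed tracks are matched first through the matching cascade, using the appearance distance gated by the motion model. The remaining candidates (unconfirmed tracks plus confirmed tracks that were missed for exactly one frame) are then matched by IoU as a fallback.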


# deep_sort/deep_sort/tracker.py
def _match(self, detections):
    def gated_metric(tracks, dets, track_indices, detection_indices):
        features = np.array([dets[i].feature for i in detection_indices])
        targets = np.array([tracks[i].track_id for i in track_indices])
        cost_matrix = self.metric.distance(features, targets)
        cost_matrix = linear_assignment.gate_cost_matrix(
            self.kf, cost_matrix, tracks, dets, track_indices,
            detection_indices)
        return cost_matrix
    # Split track set into confirmed and unconfirmed tracks.
    confirmed_tracks = [
        i for i, t in enumerate(self.tracks) if t.is_confirmed()]
    unconfirmed_tracks = [
        i for i, t in enumerate(self.tracks) if not t.is_confirmed()]
    matches_a, unmatched_tracks_a, unmatched_detections = \
        linear_assignment.matching_cascade(
            gated_metric, self.metric.matching_threshold, self.max_age,
            self.tracks, detections, confirmed_tracks)
    # Associate remaining tracks together with unconfirmed tracks using IOU.
    iou_track_candidates = unconfirmed_tracks + [
        k for k in unmatched_tracks_a if
        self.tracks[k].time_since_update == 1]
    unmatched_tracks_a = [
        k for k in unmatched_tracks_a if
        self.tracks[k].time_since_update != 1]
    matches_b, unmatched_tracks_b, unmatched_detections = \
        linear_assignment.min_cost_matching(
            iou_matching.iou_cost, self.max_iou_distance, self.tracks,
            detections, iou_track_candidates, unmatched_detections)
    matches = matches_a + matches_b
    unmatched_tracks = list(set(unmatched_tracks_a + unmatched_tracks_b))
    return matches, unmatched_tracks, unmatched_detections
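
The gating step inside `gated_metric` can be pictured as follows: any track-detection pair whose detection lies implausibly far from the Kalman prediction has its appearance cost replaced by a very large value, so the assignment step will not choose it. The sketch below simplifies the covariance to a diagonal and is not the library's `gate_cost_matrix`; the helper names, the large cost of 1e5, and the threshold 9.4877 (the 95% chi-square quantile for 4 degrees of freedom) are stated here as assumptions.

```python
CHI2_95_4DOF = 9.4877  # assumed gate: 95% chi-square quantile, 4 degrees of freedom
INFTY_COST = 1e5       # assumed "never match" cost


def squared_mahalanobis_diag(mean, variances, measurement):
    """Squared Mahalanobis distance under a diagonal covariance (simplification)."""
    return sum((z - mu) ** 2 / var
               for mu, var, z in zip(mean, variances, measurement))


def gate_costs(cost_row, track_mean, track_var, measurements,
               threshold=CHI2_95_4DOF):
    """Overwrite appearance costs for detections outside the motion gate."""
    gated = []
    for cost, z in zip(cost_row, measurements):
        d2 = squared_mahalanobis_diag(track_mean, track_var, z)
        gated.append(INFTY_COST if d2 > threshold else cost)
    return gated


# One track predicted at (cx, cy, aspect, height) with per-component variances.
track_mean = (100.0, 200.0, 0.5, 80.0)
track_var = (25.0, 25.0, 0.01, 25.0)
measurements = [
    (103.0, 198.0, 0.5, 82.0),  # close to the prediction
    (160.0, 200.0, 0.5, 80.0),  # 60 px away: looks similar but cannot be this track
]
appearance_costs = [0.2, 0.1]
gated = gate_costs(appearance_costs, track_mean, track_var, measurements)
print(gated)
```

Note that the second detection has the lower appearance cost, yet gating removes it from consideration; this is how motion information vetoes look-alike objects elsewhere in the frame.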

_initiate_track(detection)

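Purpose: initializes a new track from an unmatched detection. The Kalman state is seeded from the detection's box in (center x, center y, aspect ratio, height) form, and the track receives the next free ID.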


# deep_sort/deep_sort/tracker.py
def _initiate_track(self, detection):
    mean, covariance = self.kf.initiate(detection.to_xyah())
    self.tracks.append(Track(
        mean, covariance, self._next_id, self.n_init, self.max_age,
        detection.feature))
    self._next_id += 1
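
The `to_xyah()` call converts a detection box into the (center x, center y, aspect ratio, height) layout that the Kalman filter is initialized with. A minimal sketch of that conversion, assuming the input box is stored as (top-left x, top-left y, width, height); the function name `tlwh_to_xyah` is chosen for this example.

```python
def tlwh_to_xyah(tlwh):
    """Convert (top-left x, top-left y, width, height) to
    (center x, center y, aspect ratio = width / height, height)."""
    x, y, w, h = tlwh
    if h <= 0:
        raise ValueError("box height must be positive")
    return (x + w / 2.0, y + h / 2.0, w / h, float(h))


box = (10.0, 20.0, 40.0, 80.0)
print(tlwh_to_xyah(box))
```

Tracking aspect ratio and height instead of width and height means a person walking toward the camera changes mostly one state variable (height), which suits a constant-velocity model.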