1 Project Background
Person Re-Identification for Smart Cities
In the public-safety domain, person re-identification (ReID) helps quickly screen for suspicious individuals, supports rapid-response security mechanisms, and enables precise crackdowns on crime, such as tracking down ticket scalpers, thereby helping prevent public-safety incidents. In crowded public areas such as airports and railway stations, the same technology can quickly locate lost children and elderly people. It thus plays an invaluable role in protecting the public and maintaining a safe environment.
In intelligent transportation, person ReID can link people to people, and even people to vehicles, helping an intelligent traffic system close the full scheduling loop of people, vehicles, and roads. The same capability carries over to the era of autonomous driving.
Person ReID is also a key building block of the smart city itself. Beyond counting foot traffic, it enables full-scene trajectory reconstruction, person matching, and person retrieval, making it easier to manage and dispatch terminal resources in real time and saving considerable manpower and material resources.
So let's dive straight into the project!
2 Project Approach
ByteTrack (ByteTrack: Multi-Object Tracking by Associating Every Detection Box) is a tracker in the tracking-by-detection paradigm. Its authors propose a simple and efficient data-association method called BYTE. The biggest difference from earlier trackers is that it does not simply discard low-score detections; as the paper title says, it associates every detection box. Using the similarity between detection boxes and track trajectories, it keeps the high-score detections while mining real objects (hard samples that are occluded, blurred, etc.) out of the low-score detections and filtering away the background, which reduces missed detections and improves track continuity. It runs at 30 FPS on a single V100 and improves on every metric. In my own demo tests, ByteTrack is markedly better than DeepSORT under occlusion. Note, however, that because ByteTrack does not use appearance features for matching, tracking quality depends heavily on detection quality: a strong detector yields good tracking, while a weak detector degrades tracking severely.
The core of ByteTrack is BYTE, which means you can plug in any detection algorithm of your own and simply feed its results to the tracker, much like DeepSORT. Compared with JDE and FairMOT, this design is simpler for engineering deployment.
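To make the BYTE idea concrete, here is a minimal, illustrative Python sketch of the two-round association. It uses greedy IoU matching and made-up thresholds; the real implementation additionally uses Kalman-predicted track boxes and Hungarian assignment:

```python
def iou(a, b):
    # a, b: boxes as (x1, y1, x2, y2)
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def byte_associate(tracks, dets, high_thresh=0.6, match_thresh=0.3):
    # dets: list of (box, score). BYTE's key move: split detections by score
    # and give unmatched tracks a second chance against the low-score boxes.
    high = [d for d in dets if d[1] >= high_thresh]
    low = [d for d in dets if d[1] < high_thresh]
    matches, unmatched = [], list(range(len(tracks)))
    for pool in (high, low):  # round 1: high-score, round 2: low-score
        for det in pool:
            best, best_iou = -1, match_thresh
            for ti in unmatched:
                v = iou(tracks[ti], det[0])
                if v > best_iou:
                    best, best_iou = ti, v
            if best >= 0:
                matches.append((best, det))
                unmatched.remove(best)
    return matches, unmatched
```

In this sketch an occluded pedestrian whose detection score drops below `high_thresh` is still recovered in the second round instead of being thrown away, which is exactly why BYTE reduces missed detections under occlusion.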
3 Dataset
MOT17-half train is built from the first half of the frames (images and annotations) of each of the 7 train sequences in MOT17. For validation, accuracy can be evaluated on MOT17-half val, which consists of the second half of the frames of each sequence. The dataset can be downloaded from this link and should be extracted under the dataset/mot/ folder.
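The half-split convention above can be sketched in a couple of lines; `split_half` below is a hypothetical helper, assuming the frame IDs of one sequence are listed in order:

```python
def split_half(frame_ids):
    # MOT17-half convention: first half of each sequence goes to train,
    # second half goes to val.
    mid = len(frame_ids) // 2
    return frame_ids[:mid], frame_ids[mid:]
```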
4 Code Implementation
4.1 Environment Setup
# Unzip the project files into work/
! unzip /home/aistudio/data/data181683/PaddleDetection-release-2.5.zip -d work/
# Rename the project directory
! cd work/ && mv PaddleDetection-release-2.5 PaddleDetection
# Install dependencies
! cd work/PaddleDetection && pip install -r requirements.txt -i https://mirror.baidu.com/pypi/simple
4.2 Dataset Preparation
# Download the dataset
! cd work/PaddleDetection/dataset/mot && wget https://bj.bcebos.com/v1/paddledet/data/mot/MOT17.zip
--2022-12-14 10:26:36-- https://bj.bcebos.com/v1/paddledet/data/mot/MOT17.zip
Resolving bj.bcebos.com (bj.bcebos.com)... 100.67.200.6
Connecting to bj.bcebos.com (bj.bcebos.com)|100.67.200.6|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2388186946 (2.2G) [application/zip]
Saving to: 'MOT17.zip'
MOT17.zip 100%[===================>] 2.22G 72.3MB/s in 29s
2022-12-14 10:27:05 (78.0 MB/s) - 'MOT17.zip' saved [2388186946/2388186946]
# Unzip the dataset
! cd work/PaddleDetection/dataset/mot && unzip MOT17.zip -d ./
4.3 Model Training
# Start training
! cd work/PaddleDetection && python tools/train.py -c configs/mot/bytetrack/detector/ppyoloe_crn_l_36e_640x640_mot17half.yml --eval --amp
4.4 Model Evaluation and Prediction
Note:
- scaled indicates whether the coordinates in the model output have already been scaled back to the original image. Set it to False when the detector is JDE YOLOv3, and True when using a general-purpose detector; the default is False.
- Tracking results are saved under {output_dir}/mot_results/, with one txt file per video sequence. Each line of a txt file is frame,id,x1,y1,w,h,score,-1,-1,-1. {output_dir} can be set via --output_dir and defaults to output.
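Given that txt format, a results file can be loaded with a short helper (a sketch only; `parse_mot_results` is not part of PaddleDetection):

```python
import csv
import io

def parse_mot_results(text):
    # Each line: frame, id, x1, y1, w, h, score, -1, -1, -1
    rows = []
    for line in csv.reader(io.StringIO(text)):
        if not line:
            continue
        frame, tid = int(line[0]), int(line[1])
        x1, y1, w, h, score = map(float, line[2:7])
        rows.append({"frame": frame, "id": tid,
                     "box": (x1, y1, x1 + w, y1 + h),  # convert w,h to x2,y2
                     "score": score})
    return rows
```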
# Start evaluation
! cd work/PaddleDetection && CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/bytetrack/bytetrack_ppyoloe_pplcnet.yml --scaled=True
Warning: import ppdet from source directory without installing, run 'python setup.py install' to install ppdet firstly
W1214 10:39:26.566156 5295 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 11.2
W1214 10:39:26.569891 5295 gpu_resources.cc:91] device: 0, cuDNN Version: 8.2.
[12/14 10:39:28] ppdet.utils.download INFO: Downloading ppyoloe_crn_l_36e_640x640_mot17half.pdparams from https://bj.bcebos.com/v1/paddledet/models/mot/ppyoloe_crn_l_36e_640x640_mot17half.pdparams
100%|████████████████████████████████| 207958/207958 [00:02<00:00, 75516.46KB/s]
[12/14 10:39:31] ppdet.utils.checkpoint INFO: Finish resuming model weights: /home/aistudio/.cache/paddle/weights/ppyoloe_crn_l_36e_640x640_mot17half.pdparams
[12/14 10:39:31] ppdet.utils.download INFO: Downloading deepsort_pplcnet.pdparams from https://bj.bcebos.com/v1/paddledet/models/mot/deepsort_pplcnet.pdparams
100%|██████████████████████████████████| 35909/35909 [00:00<00:00, 64703.48KB/s]
[12/14 10:39:32] ppdet.utils.checkpoint INFO: Finish resuming model weights: /home/aistudio/.cache/paddle/weights/deepsort_pplcnet.pdparams
[12/14 10:39:32] ppdet.engine.tracker INFO: Evaluate seq: MOT17-02-SDP-half
[12/14 10:39:32] ppdet.engine.tracker INFO: Found 300 inference images in total.
100%|█████████████████████████████████████████| 300/300 [00:36<00:00, 8.27it/s]
MOT results save in output/mot_results/MOT17-02-SDP-half.txt
[12/14 10:40:09] ppdet.metrics.mot_metrics INFO: In MOT16/17 dataset the valid_label of ground truth is '1', in other dataset it should be '0' for single classs MOT.
[12/14 10:40:09] ppdet.engine.tracker INFO: Evaluate seq: MOT17-04-SDP-half
[12/14 10:40:09] ppdet.engine.tracker INFO: Found 525 inference images in total.
100%|█████████████████████████████████████████| 525/525 [01:13<00:00, 7.19it/s]
MOT results save in output/mot_results/MOT17-04-SDP-half.txt
[12/14 10:41:22] ppdet.metrics.mot_metrics INFO: In MOT16/17 dataset the valid_label of ground truth is '1', in other dataset it should be '0' for single classs MOT.
[12/14 10:41:23] ppdet.engine.tracker INFO: Evaluate seq: MOT17-05-SDP-half
[12/14 10:41:23] ppdet.engine.tracker INFO: Found 419 inference images in total.
100%|█████████████████████████████████████████| 419/419 [00:33<00:00, 12.37it/s]
MOT results save in output/mot_results/MOT17-05-SDP-half.txt
[12/14 10:41:57] ppdet.metrics.mot_metrics INFO: In MOT16/17 dataset the valid_label of ground truth is '1', in other dataset it should be '0' for single classs MOT.
[12/14 10:41:58] ppdet.engine.tracker INFO: Evaluate seq: MOT17-09-SDP-half
[12/14 10:41:58] ppdet.engine.tracker INFO: Found 263 inference images in total.
100%|█████████████████████████████████████████| 263/263 [00:24<00:00, 10.56it/s]
MOT results save in output/mot_results/MOT17-09-SDP-half.txt
[12/14 10:42:23] ppdet.metrics.mot_metrics INFO: In MOT16/17 dataset the valid_label of ground truth is '1', in other dataset it should be '0' for single classs MOT.
[12/14 10:42:23] ppdet.engine.tracker INFO: Evaluate seq: MOT17-10-SDP-half
[12/14 10:42:23] ppdet.engine.tracker INFO: Found 327 inference images in total.
100%|█████████████████████████████████████████| 327/327 [00:37<00:00, 8.78it/s]
MOT results save in output/mot_results/MOT17-10-SDP-half.txt
[12/14 10:43:00] ppdet.metrics.mot_metrics INFO: In MOT16/17 dataset the valid_label of ground truth is '1', in other dataset it should be '0' for single classs MOT.
[12/14 10:43:01] ppdet.engine.tracker INFO: Evaluate seq: MOT17-11-SDP-half
[12/14 10:43:01] ppdet.engine.tracker INFO: Found 450 inference images in total.
100%|█████████████████████████████████████████| 450/450 [00:45<00:00, 9.97it/s]
MOT results save in output/mot_results/MOT17-11-SDP-half.txt
[12/14 10:43:46] ppdet.metrics.mot_metrics INFO: In MOT16/17 dataset the valid_label of ground truth is '1', in other dataset it should be '0' for single classs MOT.
[12/14 10:43:46] ppdet.engine.tracker INFO: Evaluate seq: MOT17-13-SDP-half
[12/14 10:43:46] ppdet.engine.tracker INFO: Found 375 inference images in total.
100%|█████████████████████████████████████████| 375/375 [00:39<00:00, 9.52it/s]
MOT results save in output/mot_results/MOT17-13-SDP-half.txt
[12/14 10:44:26] ppdet.metrics.mot_metrics INFO: In MOT16/17 dataset the valid_label of ground truth is '1', in other dataset it should be '0' for single classs MOT.
[12/14 10:44:26] ppdet.engine.tracker INFO: Time elapsed: 287.57 seconds, FPS: 9.25
IDF1 IDP IDR Rcll Prcn GT MT PT ML FP FN IDs FM MOTA MOTP IDt IDa IDm
MOT17-02-SDP-half 34.5% 48.0% 26.9% 44.3% 78.9% 53 9 23 21 1175 5526 124 170 31.2% 0.202 21 102 3
MOT17-04-SDP-half 68.5% 72.6% 64.9% 75.1% 84.1% 69 28 32 9 3429 6019 97 249 60.5% 0.197 19 74 3
MOT17-05-SDP-half 62.7% 70.4% 56.4% 70.6% 88.1% 71 25 38 8 319 986 64 90 59.2% 0.206 17 50 5
MOT17-09-SDP-half 57.4% 68.2% 49.5% 68.4% 94.2% 22 9 11 2 121 911 36 55 62.9% 0.172 9 29 2
MOT17-10-SDP-half 55.1% 57.2% 53.1% 73.2% 78.9% 36 14 20 2 1159 1588 94 188 52.0% 0.233 23 68 4
MOT17-11-SDP-half 52.7% 54.9% 50.6% 73.0% 79.3% 44 13 22 9 860 1218 55 85 52.8% 0.156 2 54 1
MOT17-13-SDP-half 53.7% 48.3% 60.4% 77.1% 61.7% 44 22 18 4 1511 723 90 134 26.3% 0.241 17 69 6
OVERALL 58.3% 63.7% 53.8% 68.5% 81.2% 339 120 164 55 8574 16971 560 971 51.6% 0.200 108 446 24
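As a rough sanity check on the OVERALL row above, MOTA can be recomputed from its components via MOTA = 1 - (FP + FN + IDs) / num_gt, recovering num_gt from FN and the (rounded) recall:

```python
def mota(fp, fn, ids, num_gt):
    # CLEAR MOT accuracy: fraction of ground-truth boxes not lost to
    # false positives, misses, or identity switches.
    return 1.0 - (fp + fn + ids) / num_gt

# OVERALL row: Rcll = 68.5%, FN = 16971  ->  num_gt = FN / (1 - Rcll)
num_gt = 16971 / (1 - 0.685)
score = 100 * mota(8574, 16971, 560, num_gt)
# close to the reported 51.6% (recall in the table is rounded)
print(round(score, 1))
```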
# Download the test video
! cd work/PaddleDetection/dataset/mot && wget https://bj.bcebos.com/v1/paddledet/data/mot/demo/mot17_demo.mp4
# Run inference: detect pedestrians with ppyoloe, ReID enabled
! cd work/PaddleDetection && CUDA_VISIBLE_DEVICES=0 python tools/infer_mot.py -c configs/mot/bytetrack/bytetrack_ppyoloe_pplcnet.yml --video_file=dataset/mot/mot17_demo.mp4 --scaled=True --save_videos
Warning: import ppdet from source directory without installing, run 'python setup.py install' to install ppdet firstly
W1214 10:57:08.547623 8887 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 11.2
W1214 10:57:08.551120 8887 gpu_resources.cc:91] device: 0, cuDNN Version: 8.2.
[12/14 10:57:10] ppdet.utils.checkpoint INFO: Finish resuming model weights: /home/aistudio/.cache/paddle/weights/ppyoloe_crn_l_36e_640x640_mot17half.pdparams
[12/14 10:57:11] ppdet.utils.checkpoint INFO: Finish resuming model weights: /home/aistudio/.cache/paddle/weights/deepsort_pplcnet.pdparams
[12/14 10:57:21] ppdet.data.source.mot INFO: Length of the video: 200 frames.
[12/14 10:57:21] ppdet.engine.tracker INFO: Starting tracking video dataset/mot/mot17_demo.mp4
100%|█████████████████████████████████████████| 200/200 [00:30<00:00, 6.49it/s]
ffmpeg version 2.8.15-0ubuntu0.16.04.1 Copyright (c) 2000-2018 the FFmpeg developers
built with gcc 5.4.0 (Ubuntu 5.4.0-6ubuntu1~16.04.10) 20160609
configuration: --prefix=/usr --extra-version=0ubuntu0.16.04.1 --build-suffix=-ffmpeg --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --cc=cc --cxx=g++ --enable-gpl --enable-shared --disable-stripping --disable-decoder=libopenjpeg --disable-decoder=libschroedinger --enable-avresample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libmodplug --enable-libmp3lame --enable-libopenjpeg --enable-libopus --enable-libpulse --enable-librtmp --enable-libschroedinger --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxvid --enable-libzvbi --enable-openal --enable-open
libavutil 54. 31.100 / 54. 31.100
libavcodec 56. 60.100 / 56. 60.100
libavformat 56. 40.101 / 56. 40.101
libavdevice 56. 4.100 / 56. 4.100
libavfilter 5. 40.101 / 5. 40.101
libavresample 2. 1. 0 / 2. 1. 0
libswscale 3. 1.101 / 3. 1.101
libswresample 1. 2.101 / 1. 2.101
libpostproc 53. 3.100 / 53. 3.100
[mjpeg @ 0xd44720] Changeing bps to 8
Input #0, image2, from 'output/mot_outputs/mot17_demo/%05d.jpg':
Duration: 00:00:08.00, start: 0.000000, bitrate: N/A
Stream #0:0: Video: mjpeg, yuvj420p(pc, bt470bg/unknown/unknown), 1920x1080 [SAR 1:1 DAR 16:9], 25 fps, 25 tbr, 25 tbn, 25 tbc
No pixel format specified, yuvj420p for H.264 encoding chosen.
Use -pix_fmt yuv420p for compatibility with outdated media players.
[libx264 @ 0xd475c0] using SAR=1/1
[libx264 @ 0xd475c0] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 AVX2 LZCNT BMI2
[libx264 @ 0xd475c0] profile High, level 4.0
[libx264 @ 0xd475c0] 264 - core 148 r2643 5c65704 - H.264/MPEG-4 AVC codec - Copyleft 2003-2015 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=34 lookahead_threads=5 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to 'output/mot_outputs/mot17_demo/../mot17_demo_vis.mp4':
Metadata:
encoder : Lavf56.40.101
Stream #0:0: Video: h264 (libx264) ([33][0][0][0] / 0x0021), yuvj420p(pc), 1920x1080 [SAR 1:1 DAR 16:9], q=-1--1, 25 fps, 12800 tbn, 25 tbc
Metadata:
encoder : Lavc56.60.100 libx264
Stream mapping:
Stream #0:0 -> #0:0 (mjpeg (native) -> h264 (libx264))
Press [q] to stop, [?] for help
frame= 200 fps= 17 q=-1.0 Lsize= 8580kB time=00:00:07.92 bitrate=8874.3kbits/s
video:8577kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.036778%
[libx264 @ 0xd475c0] frame I:2 Avg QP:23.04 size:101572
[libx264 @ 0xd475c0] frame P:111 Avg QP:24.90 size: 56100
[libx264 @ 0xd475c0] frame B:87 Avg QP:26.68 size: 27028
[libx264 @ 0xd475c0] consecutive B-frames: 13.0% 87.0% 0.0% 0.0%
[libx264 @ 0xd475c0] mb I I16..4: 12.6% 80.3% 7.0%
[libx264 @ 0xd475c0] mb P I16..4: 7.4% 27.1% 1.9% P16..4: 37.1% 12.4% 5.1% 0.0% 0.0% skip: 9.0%
[libx264 @ 0xd475c0] mb B I16..4: 2.0% 5.5% 0.4% B16..8: 39.4% 9.5% 2.2% direct: 4.3% skip:36.5% L0:45.2% L1:45.0% BI: 9.8%
[libx264 @ 0xd475c0] 8x8 transform intra:74.1% inter:81.5%
[libx264 @ 0xd475c0] coded y,uvDC,uvAC intra: 46.0% 50.2% 7.4% inter: 22.7% 28.7% 4.3%
[libx264 @ 0xd475c0] i16 v,h,dc,p: 27% 38% 12% 23%
[libx264 @ 0xd475c0] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 30% 26% 23% 5% 3% 3% 3% 4% 4%
[libx264 @ 0xd475c0] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 42% 20% 10% 5% 4% 6% 4% 5% 4%
[libx264 @ 0xd475c0] i8c dc,h,v,p: 47% 24% 27% 3%
[libx264 @ 0xd475c0] Weighted P-Frames: Y:1.8% UV:0.0%
[libx264 @ 0xd475c0] ref P L0: 57.9% 16.0% 18.3% 7.8% 0.1%
[libx264 @ 0xd475c0] ref B L0: 82.5% 17.5%
[libx264 @ 0xd475c0] kb/s:8781.65
[12/14 10:58:04] ppdet.engine.tracker INFO: Save video in output/mot_outputs/mot17_demo/../mot17_demo_vis.mp4
MOT results save in output/mot_results/mot17_demo.txt
# Run inference: detect pedestrians with ppyoloe, ReID disabled
! cd work/PaddleDetection && CUDA_VISIBLE_DEVICES=0 python tools/infer_mot.py -c configs/mot/bytetrack/bytetrack_ppyoloe.yml --video_file=dataset/mot/mot17_demo.mp4 --scaled=True --save_videos
Warning: import ppdet from source directory without installing, run 'python setup.py install' to install ppdet firstly
W1214 10:50:46.245635 7314 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 11.2
W1214 10:50:46.249289 7314 gpu_resources.cc:91] device: 0, cuDNN Version: 8.2.
[12/14 10:50:48] ppdet.utils.checkpoint INFO: Finish resuming model weights: /home/aistudio/.cache/paddle/weights/ppyoloe_crn_l_36e_640x640_mot17half.pdparams
[12/14 10:50:59] ppdet.data.source.mot INFO: Length of the video: 200 frames.
[12/14 10:50:59] ppdet.engine.tracker INFO: Starting tracking video dataset/mot/mot17_demo.mp4
100%|█████████████████████████████████████████| 200/200 [00:17<00:00, 11.37it/s]
ffmpeg version 2.8.15-0ubuntu0.16.04.1 Copyright (c) 2000-2018 the FFmpeg developers
built with gcc 5.4.0 (Ubuntu 5.4.0-6ubuntu1~16.04.10) 20160609
configuration: --prefix=/usr --extra-version=0ubuntu0.16.04.1 --build-suffix=-ffmpeg --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --cc=cc --cxx=g++ --enable-gpl --enable-shared --disable-stripping --disable-decoder=libopenjpeg --disable-decoder=libschroedinger --enable-avresample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libmodplug --enable-libmp3lame --enable-libopenjpeg --enable-libopus --enable-libpulse --enable-librtmp --enable-libschroedinger --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxvid --enable-libzvbi --enable-openal --enable-open
libavutil 54. 31.100 / 54. 31.100
libavcodec 56. 60.100 / 56. 60.100
libavformat 56. 40.101 / 56. 40.101
libavdevice 56. 4.100 / 56. 4.100
libavfilter 5. 40.101 / 5. 40.101
libavresample 2. 1. 0 / 2. 1. 0
libswscale 3. 1.101 / 3. 1.101
libswresample 1. 2.101 / 1. 2.101
libpostproc 53. 3.100 / 53. 3.100
[mjpeg @ 0x1eb4720] Changeing bps to 8
Input #0, image2, from 'output/mot_outputs/mot17_demo/%05d.jpg':
Duration: 00:00:08.00, start: 0.000000, bitrate: N/A
Stream #0:0: Video: mjpeg, yuvj420p(pc, bt470bg/unknown/unknown), 1920x1080 [SAR 1:1 DAR 16:9], 25 fps, 25 tbr, 25 tbn, 25 tbc
No pixel format specified, yuvj420p for H.264 encoding chosen.
Use -pix_fmt yuv420p for compatibility with outdated media players.
[libx264 @ 0x1eb75c0] using SAR=1/1
[libx264 @ 0x1eb75c0] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 AVX2 LZCNT BMI2
[libx264 @ 0x1eb75c0] profile High, level 4.0
[libx264 @ 0x1eb75c0] 264 - core 148 r2643 5c65704 - H.264/MPEG-4 AVC codec - Copyleft 2003-2015 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=34 lookahead_threads=5 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to 'output/mot_outputs/mot17_demo/../mot17_demo_vis.mp4':
Metadata:
encoder : Lavf56.40.101
Stream #0:0: Video: h264 (libx264) ([33][0][0][0] / 0x0021), yuvj420p(pc), 1920x1080 [SAR 1:1 DAR 16:9], q=-1--1, 25 fps, 12800 tbn, 25 tbc
Metadata:
encoder : Lavc56.60.100 libx264
Stream mapping:
Stream #0:0 -> #0:0 (mjpeg (native) -> h264 (libx264))
Press [q] to stop, [?] for help
frame= 200 fps= 15 q=-1.0 Lsize= 8599kB time=00:00:07.92 bitrate=8894.6kbits/s
video:8596kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.036694%
[libx264 @ 0x1eb75c0] frame I:2 Avg QP:23.07 size:101562
[libx264 @ 0x1eb75c0] frame P:111 Avg QP:24.91 size: 56251
[libx264 @ 0x1eb75c0] frame B:87 Avg QP:26.70 size: 27067
[libx264 @ 0x1eb75c0] consecutive B-frames: 13.0% 87.0% 0.0% 0.0%
[libx264 @ 0x1eb75c0] mb I I16..4: 12.7% 80.3% 7.0%
[libx264 @ 0x1eb75c0] mb P I16..4: 7.4% 27.0% 1.9% P16..4: 37.1% 12.5% 5.1% 0.0% 0.0% skip: 9.0%
[libx264 @ 0x1eb75c0] mb B I16..4: 2.0% 5.6% 0.4% B16..8: 39.4% 9.6% 2.2% direct: 4.3% skip:36.4% L0:45.2% L1:44.9% BI: 9.9%
[libx264 @ 0x1eb75c0] 8x8 transform intra:73.9% inter:81.3%
[libx264 @ 0x1eb75c0] coded y,uvDC,uvAC intra: 46.0% 50.2% 7.5% inter: 22.7% 28.7% 4.3%
[libx264 @ 0x1eb75c0] i16 v,h,dc,p: 27% 38% 12% 23%
[libx264 @ 0x1eb75c0] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 30% 26% 23% 5% 3% 3% 3% 4% 4%
[libx264 @ 0x1eb75c0] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 42% 20% 10% 5% 4% 5% 4% 5% 4%
[libx264 @ 0x1eb75c0] i8c dc,h,v,p: 47% 24% 27% 3%
[libx264 @ 0x1eb75c0] Weighted P-Frames: Y:1.8% UV:0.0%
[libx264 @ 0x1eb75c0] ref P L0: 57.9% 15.9% 18.3% 7.8% 0.1%
[libx264 @ 0x1eb75c0] ref B L0: 82.5% 17.5%
[libx264 @ 0x1eb75c0] kb/s:8801.77
[12/14 10:51:30] ppdet.engine.tracker INFO: Save video in output/mot_outputs/mot17_demo/../mot17_demo_vis.mp4
MOT results save in output/mot_results/mot17_demo.txt
4.5 Model Export
# 1. Export the detection model
# Export the PP-YOLOE pedestrian detection model
! cd work/PaddleDetection && CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/bytetrack/detector/ppyoloe_crn_l_36e_640x640_mot17half.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/ppyoloe_crn_l_36e_640x640_mot17half.pdparams
Warning: import ppdet from source directory without installing, run 'python setup.py install' to install ppdet firstly
[12/14 12:57:34] ppdet.utils.download INFO: Downloading ppyoloe_crn_l_36e_640x640_mot17half.pdparams from https://paddledet.bj.bcebos.com/models/mot/ppyoloe_crn_l_36e_640x640_mot17half.pdparams
100%|████████████████████████████████| 207958/207958 [00:02<00:00, 74853.99KB/s]
[12/14 12:57:38] ppdet.utils.checkpoint INFO: ['yolo_head.anchor_points', 'yolo_head.stride_tensor'] in pretrained weight is not used in the model, and its will not be loaded
[12/14 12:57:38] ppdet.utils.checkpoint INFO: Finish loading model weights: /home/aistudio/.cache/paddle/weights/ppyoloe_crn_l_36e_640x640_mot17half.pdparams
loading annotations into memory...
Done (t=0.19s)
creating index...
index created!
[12/14 12:57:39] ppdet.engine INFO: Export inference config file to output_inference/ppyoloe_crn_l_36e_640x640_mot17half/infer_cfg.yml
[12/14 12:57:49] ppdet.engine INFO: Export model and saved in output_inference/ppyoloe_crn_l_36e_640x640_mot17half
# 2. Export the ReID model (optional; not required by default)
# Export the PPLCNet ReID model
! cd work/PaddleDetection && CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/deepsort/reid/deepsort_pplcnet.yml -o reid_weights=https://paddledet.bj.bcebos.com/models/mot/deepsort/deepsort_pplcnet.pdparams
Warning: import ppdet from source directory without installing, run 'python setup.py install' to install ppdet firstly
[12/14 12:15:25] ppdet.utils.download INFO: Downloading deepsort_pplcnet.pdparams from https://paddledet.bj.bcebos.com/models/mot/deepsort/deepsort_pplcnet.pdparams
100%|██████████████████████████████████| 35909/35909 [00:02<00:00, 16700.24KB/s]
[12/14 12:15:28] ppdet.utils.checkpoint INFO: Finish resuming model weights: /home/aistudio/.cache/paddle/weights/deepsort_pplcnet.pdparams
[12/14 12:15:28] ppdet.data.source.category WARNING: anno_file 'None' is None or not set or not exist, please recheck TrainDataset/EvalDataset/TestDataset.anno_path, otherwise the default categories will be used by metric_type.
[12/14 12:15:28] ppdet.data.source.category WARNING: metric_type: MOT, load default categories of pedestrian MOT.
[12/14 12:15:28] ppdet.engine INFO: Export inference config file to output_inference/deepsort_pplcnet/infer_cfg.yml
[12/14 12:15:30] ppdet.engine INFO: Export model and saved in output_inference/deepsort_pplcnet
4.6 Python Inference with the Exported Models
! cd work/PaddleDetection && python deploy/pptracking/python/mot_sde_infer.py --model_dir=output_inference/ppyoloe_crn_l_36e_640x640_mot17half/ --tracker_config=deploy/pptracking/python/tracker_config.yml --video_file=dataset/mot/mot17_demo.mp4 --device=GPU --save_mot_txts
----------- Running Arguments -----------
batch_size: 1
camera_id: -1
cpu_threads: 1
device: GPU
do_break_in_counting: False
do_entrance_counting: False
draw_center_traj: False
enable_mkldnn: False
image_dir: None
image_file: None
model_dir: output_inference/ppyoloe_crn_l_36e_640x640_mot17half/
mtmct_cfg: None
mtmct_dir: None
output_dir: output
region_polygon: []
region_type: horizontal
reid_batch_size: 50
reid_model_dir: None
run_benchmark: False
run_mode: paddle
save_images: False
save_mot_txt_per_img: False
save_mot_txts: True
scaled: False
secs_interval: 2
skip_frame_num: -1
threshold: 0.5
tracker_config: deploy/pptracking/python/tracker_config.yml
trt_calib_mode: False
trt_max_shape: 1280
trt_min_shape: 1
trt_opt_shape: 640
use_dark: True
use_gpu: False
video_file: dataset/mot/mot17_demo.mp4
------------------------------------------
----------- Model Configuration -----------
Model Arch: YOLO
Transform Order:
--transform op: Resize
--transform op: NormalizeImage
--transform op: Permute
--------------------------------------------
fps: 25, frame_count: 200
Tracking frame: 0
Tracking frame: 10
Tracking frame: 20
Tracking frame: 30
Tracking frame: 40
Tracking frame: 50
Tracking frame: 60
Tracking frame: 70
Tracking frame: 80
Tracking frame: 90
Tracking frame: 100
Tracking frame: 110
Tracking frame: 120
Tracking frame: 130
Tracking frame: 140
Tracking frame: 150
Tracking frame: 160
Tracking frame: 170
Tracking frame: 180
Tracking frame: 190
MOT results save in output/mot17_demo.txt
Flow statistic save in output/mot17_demo_flow_statistic.txt
5 Results
The rendered result videos can be found under output/.
| ByteTrack | PPYOLOE+ReID | PPYOLOE | Deploy |
|---|---|---|---|
| FPS | 6.49 | 11.37 | 25 |
6 Summary and Next Steps
PaddleDetection provides ByteTrack training, ReID selection, and inference deployment. This project is an application case on the MOT17 dataset: it compares tracking speed and quality with ReID enabled versus disabled, and shows that the deployed ByteTrack reaches an inference speed adequate for production use. Deployment on an Nvidia Jetson NX will be added in a follow-up.
About the Authors
Project author: 袁靖, AI Studio handle: JimYuan336, Profile
PaddlePaddle mentor: 韩磊, AI Studio handle: ninetailskim, Profile