github:https://github.com/Whiffe/Multi-person-collaborative-labeling-of-spatio-temporal-behavior-datasets-with-training
Gitee: https://gitee.com/YFwinston/Multi-person-collaborative-labeling-of-spatio-temporal-behavior-datasets-with-training
conda install x264 ffmpeg -c conda-forge -y
apt-get update
apt-get install git -y
apt-get update
apt-get install zip
apt-get install unzip
cd /home
git clone https://gitee.com/YFwinston/MPCLST.git
cd /home/MPCLST/Dataset
bash addDatasetXX.sh Dataset01
Upload the first batch of videos to /home/MPCLST/Dataset/Dataset01/videos.
I created a studentVideo folder under /user-data.
(Optional)
cd /user-data/studentVideo/
mkdir -p ./MVideo/video01
mkdir -p ./MVideo_crop/video_crop01
Then upload the first batch of videos to /user-data/studentVideo/MVideo/video01.
Note: I process the videos in groups of 5.
(Optional)
cp /user-data/studentVideo/MVideo/video01/* /home/MPCLST/Dataset/Dataset01/videos/
Write the following into /home/MPCLST/Dataset/Dataset01/cutVideos.txt:
10001.mp4 6 35 2135
10002.mp4 1340 2165 2710 2746
10003.mp4 530 581 1980 2115 2160 2205
10004.mp4 25 310 360 550 605 1200
10005.mp4 30 140 540 1202 1625 2074
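Each line is a video file name followed by frame positions that appear to delimit the segments to keep; cutVideos.sh defines their exact meaning. A minimal parsing sketch in Python, under that assumption:

def parse_cut_list(path):
    """Parse a cut list file into {video_name: [positions...]}."""
    cuts = {}
    with open(path) as f:
        for line in f:
            parts = line.split()
            if not parts:
                continue
            cuts[parts[0]] = [int(p) for p in parts[1:]]
    return cuts

# e.g. parse_cut_list("cutVideos.txt")
# -> {'10001.mp4': [6, 35, 2135], '10002.mp4': [1340, 2165, 2710, 2746], ...}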
Cutting the videos
cd /home/MPCLST/Dataset
rm -r ./Dataset01/video_crop/*
bash cutVideos.sh ./Dataset01/videos ./Dataset01/video_crop ./Dataset01/cutVideos.txt
Extracting frames
cd /home/MPCLST/Dataset
rm -r ./Dataset01/frames/*
bash cut_frames.sh ./Dataset01/video_crop ./Dataset01/frames
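cut_frames.sh drives the extraction; a rough single-video equivalent in Python, assuming 30 fps output named <video>_%06d.jpg (a naming pattern inferred from the next step):

import os
import subprocess

def extract_frames(video_path, out_root, fps=30):
    # Writes <out_root>/<video>/<video>_000001.jpg, _000002.jpg, ...
    name = os.path.splitext(os.path.basename(video_path))[0]
    out_dir = os.path.join(out_root, name)
    os.makedirs(out_dir, exist_ok=True)
    pattern = os.path.join(out_dir, f"{name}_%06d.jpg")
    subprocess.run(
        ["ffmpeg", "-i", video_path, "-r", str(fps), "-q:v", "2", pattern],
        check=True)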
Consolidating and reducing frames
The directory structure of the frames folder produced by frame extraction is inconvenient for the later YOLOv5 detection step, so I put all of the images into a single folder (choose_frames_all).
At the same time, not every image needs to be detected and annotated: for a 15-second video, we detect and annotate x_000001.jpg, x_000031.jpg, x_000061.jpg, x_000091.jpg, x_000121.jpg, x_000151.jpg, x_000181.jpg, x_000211.jpg, x_000241.jpg, x_000271.jpg, x_000301.jpg, x_000331.jpg, x_000361.jpg, and so on, i.e. every 30th frame.
cd /home/MPCLST/Dataset
rm -r ./Dataset01/choose_frames_all/*
python choose_frames_all.py --DatasetXX_dir Dataset01
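choose_frames_all.py implements this consolidation; a minimal sketch of the same idea, assuming each frame file name ends in a 6-digit index:

import os
import shutil

def choose_frames_all(frames_dir, out_dir, step=30):
    """Copy every 30th frame (x_000001, x_000031, ...) into one flat folder."""
    os.makedirs(out_dir, exist_ok=True)
    for video in sorted(os.listdir(frames_dir)):
        video_dir = os.path.join(frames_dir, video)
        for img in sorted(os.listdir(video_dir)):
            idx = int(os.path.splitext(img)[0].split("_")[-1])
            if idx % step == 1:
                shutil.copy(os.path.join(video_dir, img),
                            os.path.join(out_dir, img))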
Reduction without consolidation
The consolidation and reduction above serve YOLOv5 detection; the reduction without consolidation here serves VIA annotation.
cd /home/MPCLST/Dataset
rm -r ./Dataset01/choose_frames/*
python choose_frames.py --DatasetXX_dir Dataset01
Installing YOLOv5 and DeepSORT
cd /home/MPCLST/yolovDeepsort
pip install -r requirements.txt
pip install opencv-python-headless==4.1.2.30
mkdir -p /root/.config/Ultralytics/
cp /user-data/yolov5File/crowdhuman_vbody_yolov5m.pt /home/MPCLST/yolovDeepsort/yolov5/crowdhuman_vbody_yolov5m.pt
cp /user-data/yolov5File/Arial.ttf /root/.config/Ultralytics/Arial.ttf
Running detection on choose_frames_all
cd /home/MPCLST/yolovDeepsort
rm -r ./yolov5/runs/*
python ./yolov5/detect.py --source ../Dataset/Dataset01/choose_frames_all/ --save-txt --save-conf --weights ./yolov5/crowdhuman_vbody_yolov5m.pt --hide-labels --line-thickness 2
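With --save-txt --save-conf, YOLOv5 writes one txt file per image whose lines are "class cx cy w h conf", all normalized to [0, 1]. A small helper to turn such a file back into pixel boxes:

def yolo_txt_to_xyxy(txt_path, img_w, img_h):
    """Convert YOLOv5 --save-txt --save-conf lines into pixel
    [x1, y1, x2, y2, conf] boxes."""
    boxes = []
    with open(txt_path) as f:
        for line in f:
            _cls, cx, cy, w, h, conf = (float(v) for v in line.split())
            boxes.append([(cx - w / 2) * img_w, (cy - h / 2) * img_h,
                          (cx + w / 2) * img_w, (cy + h / 2) * img_h, conf])
    return boxes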
Filtering abnormal boxes
cd /home/MPCLST/yolovDeepsort/yolov5
mkdir -p ./runs/detect/newExp/
mkdir -p ./runs/detect/visualize
rm -r ./runs/detect/newExp/*
rm -r ./runs/detect/visualize/*
python filter.py --label_dir ./runs/detect/exp/labels --image_dir ../../Dataset/Dataset01/choose_frames_all/ --newExp_dir ./runs/detect/newExp/
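The actual criteria live in filter.py; as an illustration only, an "abnormal" person box is typically one with an implausibly small area or an extreme aspect ratio (the thresholds below are illustrative, not the script's):

def is_abnormal(w, h, min_area=0.001, max_ratio=4.0):
    """Flag a normalized YOLO box as abnormal: tiny area or extreme aspect ratio."""
    if w <= 0 or h <= 0:
        return True
    return w * h < min_area or max(w / h, h / w) > max_ratio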
Converting YOLOv5 results to VIA
(Optional)
cd /home/MPCLST/yolovDeepsort/yolov5
python yolo2via.py --yoloLabel_dir ./runs/detect/newExp/ --image_dir ../../Dataset/Dataset01/choose_frames_all/
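yolo2via.py writes the detections into a VIA project. For illustration, assuming the VIA3 JSON layout (where a rectangle is encoded as shape id 2 followed by x, y, width, height), a single box's metadata entry looks roughly like this:

metadata_entry = {
    "vid": "1",                    # id of the image the box belongs to
    "flg": 0,
    "z": [],                       # no temporal coordinate for still images
    "xy": [2, 100, 50, 80, 160],   # 2 = rectangle, then x, y, width, height
    "av": {},                      # attribute values, filled in during labeling
}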
Generating dense_proposals_train.pkl
cd /home/MPCLST/yolovDeepsort/mywork
python dense_proposals_train.py
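The resulting pkl follows the AVA dense-proposals convention: a dict mapping "videoID,timestamp" keys to an N x 5 array of [x1, y1, x2, y2, score] boxes normalized to [0, 1]. A sketch of writing one entry (the key naming is assumed from that convention):

import pickle
import numpy as np

# One entry: all person boxes for video 10001 at timestamp 0002.
proposals = {
    "10001,0002": np.array([[0.12, 0.05, 0.48, 0.93, 0.99]]),
}
with open("dense_proposals_example.pkl", "wb") as f:
    pickle.dump(proposals, f)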
Importing into VIA
choose_frames_middle
The choose_frames folder under Dataset/Dataset01 contains 15 images for each 15-second video, but the final annotation files do not cover the first 2 or the last 2 of them. So we create a choose_frames_middle folder holding, per video, the images without those first 2 and last 2.
cd /home/MPCLST/Dataset/
python choose_frames_middle.py --DatasetXX_dir Dataset01
Generating the VIA annotation files
The custom actions are defined in yolovDeepsort/mywork/dense_proposals_train_to_via.py.
cd /home/MPCLST/yolovDeepsort/mywork/
python dense_proposals_train_to_via.py --DatasetXX_dir Dataset01
Removing the VIA default values
The annotation options come with default values, which would interfere with our labeling, so they need to be removed.
I tried many times to drop the default options while generating the VIA annotation file, but never got it to work, so instead we operate on the generated VIA JSON file directly, removing (or modifying) the default values. Adjust them manually to your needs; the adjustment is done as follows:
cd /home/MPCLST/Dataset
python chang_via_json.py --DatasetXX_dir Dataset01
Downloading choose_frames_middle and annotating in VIA
cd /home/MPCLST/Dataset
zip -r choose_frames_middle.zip ./Dataset01/choose_frames_middle
Extracting the uploaded, finished annotation JSON files
Note that I name every finished annotation file <videoName>_finish.json; e.g. for video 1000204, the finished file is named 1000204_finish.json.
cd /home/MPCLST/Dataset/
python json_extract.py --DatasetXX_dir Dataset01
DeepSORT
dense_proposals_train_deepsort.py
Because DeepSORT needs two frames fed in ahead of time before it can start assigning person IDs from the third frame, and dense_proposals_train.pkl starts at the third frame (i.e. frames 0 and 1 are missing), frames 0 and 1 need to be added.
cd /home/MPCLST/yolovDeepsort/mywork
python dense_proposals_train_deepsort.py
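A sketch of the padding idea, assuming the "video,timestamp" key layout from above and an assumed output file name (dense_proposals_train_deepsort.py is authoritative):

import pickle
from collections import defaultdict

with open("dense_proposals_train.pkl", "rb") as f:
    proposals = pickle.load(f)

# Group the "video,timestamp" keys by video and find each video's first frame.
by_video = defaultdict(list)
for key in proposals:
    video, ts = key.rsplit(",", 1)
    by_video[video].append(int(ts))

# Copy the first frame's boxes onto the two missing leading timestamps.
for video, stamps in by_video.items():
    first = min(stamps)
    for ts in (first - 2, first - 1):
        proposals[f"{video},{ts:04d}"] = proposals[f"{video},{first:04d}"]

with open("dense_proposals_train_deepsort.pkl", "wb") as f:
    pickle.dump(proposals, f)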
Next, use DeepSORT to associate person IDs.
Feed the images, together with the coordinates detected by YOLOv5, into DeepSORT for tracking.
cd /home/MPCLST/yolovDeepsort/
#wget https://drive.google.com/drive/folders/1xhG0kRH1EX5B9_Iz8gQJb7UNnn_riXi6 -O ./deep_sort_pytorch/deep_sort/deep/checkpoint/ckpt.t7
cp /user-data/yolov5File/ckpt.t7 ./deep_sort_pytorch/deep_sort/deep/checkpoint/ckpt.t7
python yolov5_to_deepsort.py --source ../Dataset/Dataset01/frames
ckpt.t7 can be downloaded separately and then uploaded to the AI platform.
The result is written to /home/MPCLST/Dataset/train_personID.csv.
Merging actions with personID
At this point there are two files:
1. train_personID.csv: contains coordinates and personID
2. train_without_personID.csv: contains coordinates and actions
They now need to be joined together.
cd /home/MPCLST/Dataset/
python train_temp.py
Final result: /home/MPCLST/Dataset/train_temp.csv
After the run you will notice that some IDs are -1: these are boxes DeepSORT could not assign, because the person appears for the first time or is on screen too briefly for DeepSORT to produce an ID.
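train_temp.py implements the join; a sketch of the idea, with column names assumed for illustration. The fillna(-1) mirrors the -1 IDs described above:

import pandas as pd

# Column names here are assumptions; train_temp.py defines the real schema.
keys = ["video", "timestamp", "x1", "y1", "x2", "y2"]
person = pd.read_csv("train_personID.csv")            # keys + personID
actions = pd.read_csv("train_without_personID.csv")   # keys + action
merged = actions.merge(person, on=keys, how="left")
merged["personID"] = merged["personID"].fillna(-1).astype(int)
merged.to_csv("train_temp.csv", index=False)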
Fixing train_temp.csv
The -1 entries in train_temp.csv need to be corrected.
cd /home/MPCLST/Dataset/
python train.py --DatasetXX_dir Dataset01
The result is in /home/MPCLST/Dataset/Dataset01/annotations/train.csv
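train.py performs the actual correction. Its strategy is not described here, so the following is only one plausible sketch: give every remaining -1 row a fresh, unused ID.

import pandas as pd

df = pd.read_csv("train_temp.csv")
next_id = int(df["personID"].max()) + 1
for i in df.index[df["personID"] == -1]:
    df.loc[i, "personID"] = next_id   # each unmatched box gets a fresh ID
    next_id += 1
df.to_csv("train.csv", index=False)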
Generating the other annotation files
included_timestamps.txt
Then write the following into included_timestamps.txt:
02
03
04
05
06
07
08
09
10
11
12
13
action_list.pbtxt
item {
name: "eye Invisible"
id: 1
}
item {
name: "eye ohter"
id: 2
}
item {
name: "open eyes"
id: 3
}
item {
name: "close eyes"
id: 4
}
item {
name: "lip Invisible"
id: 5
}
item {
name: "lip ohter"
id: 6
}
item {
name: "open mouth"
id: 7
}
item {
name: "close mouth"
id: 8
}
item {
name: "body Invisible"
id: 9
}
item {
name: "body other"
id: 10
}
item {
name: "body sit"
id: 11
}
item {
name: "body side Sit"
id: 12
}
item {
name: "body stand"
id: 13
}
item {
name: "body lying down"
id: 14
}
item {
name: "body bend over"
id: 15
}
item {
name: "body squat"
id: 16
}
item {
name: "body rely"
id: 17
}
item {
name: "body lie flat"
id: 18
}
item {
name: "body lateral"
id: 19
}
item {
name: "left hand invisible"
id: 20
}
item {
name: "left hand other"
id: 21
}
item {
name: "left hand palm grip"
id: 22
}
item {
name: "left hand palm spread"
id: 23
}
item {
name: "left hand palm Point"
id: 24
}
item {
name: "left hand applause"
id: 25
}
item {
name: "left hand write"
id: 26
}
item {
name: "left arm invisible"
id: 27
}
item {
name: "left arm other"
id: 28
}
item {
name: "left arm flat"
id: 29
}
item {
name: "left arm droop"
id: 30
}
item {
name: "left arm forward"
id: 31
}
item {
name: "left arm flexion"
id: 32
}
item {
name: "left arm raised"
id: 33
}
item {
name: "left handed behavior object invisible"
id: 34
}
item {
name: "left handed behavior object other"
id: 35
}
item {
name: "left handed behavior object book "
id: 36
}
item {
name: "left handed behavior object exercise book"
id: 37
}
item {
name: "left handed behavior object spare head"
id: 38
}
item {
name: "left handed behavior object electronic equipment"
id: 39
}
item {
name: "left handed behavior object electronic pointing at others"
id: 40
}
item {
name: "left handed behavior object chalk"
id: 41
}
item {
name: "left handed behavior object no interaction"
id: 42
}
item {
name: "right hand invisible"
id: 43
}
item {
name: "right hand other"
id: 44
}
item {
name: "right hand palm grip"
id: 45
}
item {
name: "right hand palm spread"
id: 46
}
item {
name: "right hand palm Point"
id: 47
}
item {
name: "right hand applause"
id: 48
}
item {
name: "right hand write"
id: 49
}
item {
name: "right arm invisible"
id: 50
}
item {
name: "right arm other"
id: 51
}
item {
name: "right arm flat"
id: 52
}
item {
name: "right arm droop"
id: 53
}
item {
name: "right arm forward"
id: 54
}
item {
name: "right arm flexion"
id: 55
}
item {
name: "right arm raised"
id: 56
}
item {
name: "right handed behavior object invisible"
id: 57
}
item {
name: "right handed behavior object other"
id: 58
}
item {
name: "right handed behavior object book "
id: 59
}
item {
name: "right handed behavior object exercise book"
id: 60
}
item {
name: "right handed behavior object spare head"
id: 61
}
item {
name: "right handed behavior object electronic equipment"
id: 62
}
item {
name: "right handed behavior object electronic pointing at others"
id: 63
}
item {
name: "right handed behavior object chalk"
id: 64
}
item {
name: "leg invisible"
id: 65
}
item {
name: "leg other"
id: 66
}
item {
name: "leg stand"
id: 67
}
item {
name: "leg run"
id: 68
}
item {
name: "leg walk"
id: 69
}
item {
name: "leg jump"
id: 70
}
item {
name: "leg Kick"
id: 71
}
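A quick sanity check for the list above: parse the pbtxt with a regex (a sketch, not the protobuf library) and confirm the ids run 1..71 without gaps:

import re

with open("action_list.pbtxt") as f:
    text = f.read()
names = re.findall(r'name:\s*"([^"]*)"', text)
ids = [int(i) for i in re.findall(r'id:\s*(\d+)', text)]
assert ids == list(range(1, len(names) + 1)), "ids must run 1..N without gaps"
print(f"{len(names)} actions OK")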
dense_proposals_train.pkl
cd /home/MPCLST/
cp ./yolovDeepsort/mywork/avaMin_dense_proposals_train.pkl ./Dataset/Dataset01/annotations/dense_proposals_train.pkl
rawframes
The extracted video frames are named in a way that does not match what training expects, so the images copied into rawframes need to be renamed.
For example:
Original name: rawframes/1/1_000001.jpg
Target name: rawframes/1/img_00001.jpg
cp -r /home/MPCLST/Dataset/Dataset01/frames/* /home/MPCLST/Dataset/Dataset01/rawframes
cd /home/MPCLST/Dataset/
python change_raw_frames.py --DatasetXX_dir Dataset01
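change_raw_frames.py does the renaming; a sketch of the same transformation for one video folder, assuming the <video>_<6 digits>.jpg input pattern:

import os

def rename_video_dir(video_dir):
    """Rename <video>_000001.jpg style frames to img_00001.jpg style."""
    for img in sorted(os.listdir(video_dir)):
        stem, ext = os.path.splitext(img)
        idx = int(stem.split("_")[-1])
        os.rename(os.path.join(video_dir, img),
                  os.path.join(video_dir, f"img_{idx:05d}{ext}"))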
Fixing the annotation files
Some of the annotation files have problems with their field types, so they need to be corrected.
dense_proposals_train
cd /home/MPCLST/Dataset
python change_dense_proposals.py --DatasetXX_dir Dataset01 --dense_proposals_dir dense_proposals_train.pkl
dense_proposals_val
cd /home/MPCLST/Dataset
python change_dense_proposals.py --DatasetXX_dir Dataset01 --dense_proposals_dir dense_proposals_val.pkl
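The exact field-type problem is not spelled out here, so this is a hedged sketch of one common fix, casting every proposal array to float64; change_dense_proposals.py defines the actual correction:

import pickle
import numpy as np

path = "dense_proposals_train.pkl"
with open(path, "rb") as f:
    proposals = pickle.load(f)
# Cast every proposal array to float64 (the assumed type fix).
proposals = {k: np.asarray(v, dtype=np.float64) for k, v in proposals.items()}
with open(path, "wb") as f:
    pickle.dump(proposals, f)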
Installing mmaction2
cd /home
git clone https://gitee.com/YFwinston/mmaction2_YF.git
pip install mmcv-full==1.3.17 -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.8.0/index.html
pip install opencv-python-headless==4.1.2.30
pip install moviepy
cd mmaction2_YF
pip install -r requirements/build.txt
pip install -v -e .
mkdir -p ./data/ava
cd ..
git clone https://gitee.com/YFwinston/mmdetection.git
cd mmdetection
pip install -r requirements/build.txt
pip install -v -e .
cd ../mmaction2_YF
wget https://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_2x_coco/faster_rcnn_r50_fpn_2x_coco_bbox_mAP-0.384_20200504_210434-a5d8aa15.pth -P ./Checkpionts/mmdetection/
wget https://download.openmmlab.com/mmaction/recognition/slowfast/slowfast_r50_8x8x1_256e_kinetics400_rgb/slowfast_r50_8x8x1_256e_kinetics400_rgb_20200716-73547d2b.pth -P ./Checkpionts/mmaction/
Training and testing
Configuration file
cd /home/mmaction2_YF/configs/detection/ava/
touch my_slowfast_kinetics_pretrained_r50_4x16x1_20e_ava_rgb.py
# model setting
model = dict(
    type='FastRCNN',
    backbone=dict(
        type='ResNet3dSlowFast',
        pretrained=None,
        resample_rate=8,
        speed_ratio=8,
        channel_ratio=8,
        slow_pathway=dict(
            type='resnet3d',
            depth=50,
            pretrained=None,
            lateral=True,
            conv1_kernel=(1, 7, 7),
            dilations=(1, 1, 1, 1),
            conv1_stride_t=1,
            pool1_stride_t=1,
            inflate=(0, 0, 1, 1),
            spatial_strides=(1, 2, 2, 1)),
        fast_pathway=dict(
            type='resnet3d',
            depth=50,
            pretrained=None,
            lateral=False,
            base_channels=8,
            conv1_kernel=(5, 7, 7),
            conv1_stride_t=1,
            pool1_stride_t=1,
            spatial_strides=(1, 2, 2, 1))),
    roi_head=dict(
        type='AVARoIHead',
        bbox_roi_extractor=dict(
            type='SingleRoIExtractor3D',
            roi_layer_type='RoIAlign',
            output_size=8,
            with_temporal_pool=True),
        bbox_head=dict(
            type='BBoxHeadAVA',
            in_channels=2304,
            num_classes=81,
            multilabel=True,
            dropout_ratio=0.5)),
    train_cfg=dict(
        rcnn=dict(
            assigner=dict(
                type='MaxIoUAssignerAVA',
                pos_iou_thr=0.9,
                neg_iou_thr=0.9,
                min_pos_iou=0.9),
            sampler=dict(
                type='RandomSampler',
                num=32,
                pos_fraction=1,
                neg_pos_ub=-1,
                add_gt_as_proposals=True),
            pos_weight=1.0,
            debug=False)),
    test_cfg=dict(rcnn=dict(action_thr=0.002)))

dataset_type = 'AVADataset'
data_root = '/home/MPCLST/Dataset/Dataset01/rawframes'
anno_root = '/home/MPCLST/Dataset/Dataset01/annotations'

#ann_file_train = f'{anno_root}/ava_train_v2.1.csv'
ann_file_train = f'{anno_root}/train.csv'
#ann_file_val = f'{anno_root}/ava_val_v2.1.csv'
ann_file_val = f'{anno_root}/val.csv'

#exclude_file_train = f'{anno_root}/ava_train_excluded_timestamps_v2.1.csv'
#exclude_file_val = f'{anno_root}/ava_val_excluded_timestamps_v2.1.csv'
exclude_file_train = f'{anno_root}/train_excluded_timestamps.csv'
exclude_file_val = f'{anno_root}/val_excluded_timestamps.csv'

#label_file = f'{anno_root}/ava_action_list_v2.1_for_activitynet_2018.pbtxt'
label_file = f'{anno_root}/action_list.pbtxt'

proposal_file_train = f'{anno_root}/dense_proposals_train.pkl'
proposal_file_val = f'{anno_root}/dense_proposals_val.pkl'

img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_bgr=False)

train_pipeline = [
    dict(type='SampleAVAFrames', clip_len=32, frame_interval=2),
    dict(type='RawFrameDecode'),
    dict(type='RandomRescale', scale_range=(256, 320)),
    dict(type='RandomCrop', size=256),
    dict(type='Flip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='FormatShape', input_format='NCTHW', collapse=True),
    # Rename is needed to use mmdet detectors
    dict(type='Rename', mapping=dict(imgs='img')),
    dict(type='ToTensor', keys=['img', 'proposals', 'gt_bboxes', 'gt_labels']),
    dict(
        type='ToDataContainer',
        fields=[
            dict(key=['proposals', 'gt_bboxes', 'gt_labels'], stack=False)
        ]),
    dict(
        type='Collect',
        keys=['img', 'proposals', 'gt_bboxes', 'gt_labels'],
        meta_keys=['scores', 'entity_ids'])
]

# The testing is w/o. any cropping / flipping
val_pipeline = [
    dict(type='SampleAVAFrames', clip_len=32, frame_interval=2),
    dict(type='RawFrameDecode'),
    dict(type='Resize', scale=(-1, 256)),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='FormatShape', input_format='NCTHW', collapse=True),
    # Rename is needed to use mmdet detectors
    dict(type='Rename', mapping=dict(imgs='img')),
    dict(type='ToTensor', keys=['img', 'proposals']),
    dict(type='ToDataContainer', fields=[dict(key='proposals', stack=False)]),
    dict(
        type='Collect',
        keys=['img', 'proposals'],
        meta_keys=['scores', 'img_shape'],
        nested=True)
]

data = dict(
    #videos_per_gpu=9,
    #workers_per_gpu=2,
    videos_per_gpu=5,
    workers_per_gpu=2,
    val_dataloader=dict(videos_per_gpu=1),
    test_dataloader=dict(videos_per_gpu=1),
    train=dict(
        type=dataset_type,
        ann_file=ann_file_train,
        exclude_file=exclude_file_train,
        pipeline=train_pipeline,
        label_file=label_file,
        proposal_file=proposal_file_train,
        person_det_score_thr=0.9,
        data_prefix=data_root,
        start_index=1),
    val=dict(
        type=dataset_type,
        ann_file=ann_file_val,
        exclude_file=exclude_file_val,
        pipeline=val_pipeline,
        label_file=label_file,
        proposal_file=proposal_file_val,
        person_det_score_thr=0.9,
        data_prefix=data_root,
        start_index=1))
data['test'] = data['val']

#optimizer = dict(type='SGD', lr=0.1125, momentum=0.9, weight_decay=0.00001)
optimizer = dict(type='SGD', lr=0.0125, momentum=0.9, weight_decay=0.00001)
# this lr is used for 8 gpus
optimizer_config = dict(grad_clip=dict(max_norm=40, norm_type=2))
# learning policy
lr_config = dict(
    policy='step',
    step=[10, 15],
    warmup='linear',
    warmup_by_epoch=True,
    warmup_iters=5,
    warmup_ratio=0.1)
#total_epochs = 20
total_epochs = 100
checkpoint_config = dict(interval=1)
workflow = [('train', 1)]
evaluation = dict(interval=1, save_best='mAP@0.5IOU')
log_config = dict(
    interval=20, hooks=[
        dict(type='TextLoggerHook'),
    ])
dist_params = dict(backend='nccl')
log_level = 'INFO'
work_dir = ('./work_dirs/ava/'
            'slowfast_kinetics_pretrained_r50_4x16x1_20e_ava_rgb')
load_from = ('https://download.openmmlab.com/mmaction/recognition/slowfast/'
             'slowfast_r50_4x16x1_256e_kinetics400_rgb/'
             'slowfast_r50_4x16x1_256e_kinetics400_rgb_20200704-bcde7ed7.pth')
resume_from = None
find_unused_parameters = False
Training
cd /home/mmaction2_YF
python tools/train.py configs/detection/ava/my_slowfast_kinetics_pretrained_r50_4x16x1_20e_ava_rgb.py --validate
Testing
First, create a new label_map.
cd /home/mmaction2_YF/tools/data/ava
touch label_map2.txt
The content of label_map2.txt is as follows:
1: eye Invisible
2: eye other
3: open eyes
4: close eyes
5: lip Invisible
6: lip other
7: open mouth
8: close mouth
9: body Invisible
10: body other
11: body sit
12: body side Sit
13: body stand
14: body lying down
15: body bend over
16: body squat
17: body rely
18: body lie flat
19: body lateral
20: left hand invisible
21: left hand other
22: left hand palm grip
23: left hand palm spread
24: left hand palm point
25: left hand applause
26: left hand write
27: left arm invisible
28: left arm other
29: left arm flat
30: left arm droop
31: left arm forward
32: left arm flexion
33: left arm raised
34: left handed behavior object invisible
35: left handed behavior object other
36: left handed behavior object book
37: left handed behavior object exercise book
38: left handed behavior object spare head
39: left handed behavior object electronic equipment
40: left handed behavior object electronic pointing at others
41: left handed behavior object chalk
42: left handed behavior object no interaction
43: right hand invisible
44: right hand other
45: right hand palm grip
46: right hand palm spread
47: right hand palm Point
48: right hand applause
49: right hand write
50: right arm invisible
51: right arm other
52: right arm flat
53: right arm droop
54: right arm forward
55: right arm flexion
56: right arm raised
57: right handed behavior object invisible
58: right handed behavior object other
59: right handed behavior object book
60: right handed behavior object exercise book
61: right handed behavior object spare head
62: right handed behavior object electronic equipment
63: right handed behavior object electronic pointing at others
64: right handed behavior object chalk
65: leg invisible
66: leg other
67: leg stand
68: leg run
69: leg walk
70: leg jump
71: leg Kick
Running multiple sh scripts from a single sh script: https://www.cnblogs.com/Pan-xi-yi/p/12053276.html