Training Your Own Instance Segmentation Model

This post shows how to do real-time instance segmentation with the YOLACT model. It covers the model's speed, the training workflow (dataset preparation, conversion to COCO format, installation, and evaluation), and also walks through building a custom dataset annotated with the labelme tool, finishing with the training commands.


Note: the paper was released on April 5, 2019.

Abstract: We present a simple, fully convolutional model for real-time instance segmentation that achieves 29.8 mAP on MS COCO at 33 fps on a single Titan Xp, which is significantly faster than any previous approach. Moreover, this result is obtained after training on only one GPU. We accomplish this by breaking instance segmentation into two parallel subtasks: (1) generating a set of prototype masks and (2) predicting per-instance mask coefficients. Instance masks are then produced by linearly combining the prototypes with the mask coefficients. Because this process does not depend on repooling, the approach produces very high-quality masks. We also analyze the emergent behavior of our prototypes and show that they learn to localize instances in a translation-variant manner, despite being fully convolutional. Finally, we propose Fast NMS, which is 12 ms faster than standard NMS with only a marginal performance penalty.
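To make the prototype/coefficient idea concrete, here is a minimal NumPy sketch of the mask assembly step described in the abstract: k prototype masks are linearly combined with each detected instance's k mask coefficients and squashed with a sigmoid. This is only an illustration, not the authors' code; the shapes and names are mine.

```python
import numpy as np

def assemble_masks(prototypes, coefficients):
    """Linearly combine prototype masks with per-instance coefficients.

    prototypes:   (H, W, k) -- k prototype masks from the prototype branch
    coefficients: (n, k)    -- k mask coefficients for each of n instances
    returns:      (n, H, W) -- one soft mask per instance
    """
    # Linear combination: for each instance, sum_k coeff[k] * prototype[..., k]
    combined = np.einsum('hwk,nk->nhw', prototypes, coefficients)
    # Sigmoid turns the combination into a soft mask in [0, 1]
    return 1.0 / (1.0 + np.exp(-combined))

# Toy example: 4 prototypes of size 138x138, 2 detected instances
protos = np.random.rand(138, 138, 4)
coeffs = np.random.randn(2, 4)
masks = assemble_masks(protos, coeffs)
print(masks.shape)  # (2, 138, 138)
```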

Paper: YOLACT

GitHub: https://github.com/dbolya/yolact

The results look decent.


The implementation steps are as follows:

Installation

  • Set up a Python3 environment.
  • Install Pytorch 1.0.1 (or higher) and TorchVision.
  • Install some other packages:
    # Cython needs to be installed before pycocotools
    pip install cython
    pip install opencv-python pillow pycocotools matplotlib 
  • Clone this repository and enter it:
    git clone https://github.com/dbolya/yolact.git
    cd yolact
  • If you'd like to train YOLACT, download the COCO dataset and the 2014/2017 annotations. Note that this script will take a while and dump 21gb of files into ./data/coco.
    sh data/scripts/COCO.sh
  • If you'd like to evaluate YOLACT on test-dev, download test-dev with this script.
    sh data/scripts/COCO_test.sh
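After completing the steps above, a quick sanity check (my own snippet, not part of the repo) confirms the key packages are importable and CUDA is visible before you start downloading data:

```python
# Quick environment check after installation (not part of the YOLACT repo).
import torch
import torchvision
import pycocotools  # fails here if cython/pycocotools did not install cleanly
import cv2

print("torch:", torch.__version__)
print("torchvision:", torchvision.__version__)
print("CUDA available:", torch.cuda.is_available())
print("OpenCV:", cv2.__version__)
```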

Installation following the steps given in the source repo works without issues. For the dataset, download it as coco.sh does; I extracted the download URLs from the script myself and fetched them with wget.

A Baidu Netdisk download link is also provided here for convenience; extraction code: 2tw3
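If the shell script or wget is inconvenient, the same files can be fetched directly from the standard cocodataset.org URLs. A minimal Python sketch of my own (verify the URLs against data/scripts/COCO.sh before relying on it; paths are placeholders):

```python
# Manual download of the COCO 2017 files used by YOLACT
# (alternative to data/scripts/COCO.sh). URLs are the standard
# cocodataset.org links; check them against the script before use.
import urllib.request
from pathlib import Path

COCO_DIR = Path("data/coco")
FILES = [
    "http://images.cocodataset.org/zips/train2017.zip",
    "http://images.cocodataset.org/zips/val2017.zip",
    "http://images.cocodataset.org/annotations/annotations_trainval2017.zip",
]

COCO_DIR.mkdir(parents=True, exist_ok=True)
for url in FILES:
    target = COCO_DIR / url.rsplit("/", 1)[-1]
    if not target.exists():
        print("downloading", url)
        urllib.request.urlretrieve(url, str(target))  # large files; wget/aria2 also work
print("done; unzip the archives into", COCO_DIR)
```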

Evaluation

As of April 5th, 2019 here are our latest models along with their FPS on a Titan Xp and mAP on test-dev:

| Image Size | Backbone      | FPS  | mAP  | Weights                        |
|------------|---------------|------|------|--------------------------------|
| 550        | Resnet50-FPN  | 42.5 | 28.2 | yolact_resnet50_54_800000.pth  |
| 550        | Darknet53-FPN | 40.0 | 28.7 | yolact_darknet53_54_800000.pth |
| 550        | Resnet101-FPN | 33.0 | 29.8 | yolact_base_54_800000.pth      |
| 700        | Resnet101-FPN | 23.6 | 31.2 | yolact_im700_54_800000.pth     |

 

To evaluate the model, put the corresponding weights file in the ./weights directory and run one of the following commands.

Evaluate on the entire dataset:

python3 eval.py --trained_model=weights/yolact_base_54_800000.pth  --dataset=coco2017_dataset

Test a single image:

python3 eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.3 --top_k=100 --image=my_image.png

Other usages are summarized below:

Quantitative Results on COCO
# Quantitatively evaluate a trained model on the entire validation set. Make sure you have COCO downloaded as above.
# This should get 29.92 validation mask mAP last time I checked.
python3 eval.py --trained_model=weights/yolact_base_54_800000.pth  --dataset=coco2017_dataset

# Output a COCOEval json to submit to the website or to use the run_coco_eval.py script.
# This command will create './results/bbox_detections.json' and './results/mask_detections.json' for detection and instance segmentation respectively.
python3 eval.py --trained_model=weights/yolact_base_54_800000.pth --output_coco_json

# You can run COCOEval on the files created in the previous command. The performance should match my implementation in eval.py.
python run_coco_eval.py

# To output a coco json file for test-dev, make sure you have test-dev downloaded from above and go
python3 eval.py --trained_model=weights/yolact_base_54_800000.pth --output_coco_json --dataset=coco2017_testdev_dataset


# Display qualitative results on COCO. From here on I'll use a confidence threshold of 0.3.
python3 eval.py --trained_model=weights/yolact_base_54_800000.pth --dataset=coco2017_dataset --score_threshold=0.3 --top_k=100 --display
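For reference, you can also run COCOeval yourself on the JSON files produced by --output_coco_json above. This is a minimal pycocotools sketch in the spirit of run_coco_eval.py; the annotation path assumes the standard ./data/coco layout created by COCO.sh, so adjust it to your setup.

```python
# Minimal COCOeval over the mask detections produced by --output_coco_json.
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

gt = COCO("data/coco/annotations/instances_val2017.json")
dt = gt.loadRes("results/mask_detections.json")

evaluator = COCOeval(gt, dt, iouType="segm")  # use iouType="bbox" for box mAP
evaluator.evaluate()
evaluator.accumulate()
evaluator.summarize()  # prints the standard AP / AP50 / AP75 table
```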



Benchmarking on COCO
# Run just the raw model on the first 1k images of the validation set
python eval.py --trained_model=weights/yolact_base_54_800000.pth --benchmark --max_images=1000



Images
# Display qualitative results on the specified image.
python3 eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.3 --top_k=100 --image=my_image.png

# Process an image and save it to another file.
python3 eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.3 --top_k=100 --image=input_image.png:output_image.png

# Process a whole folder of images.
python3 eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.3 --top_k=100 --images=path/to/input/folder:path/to/output/folder

e.g.: python3 eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.3 --top_k=100 --images=data/test:data/results


# Display a video in real-time. "--video_multiframe" will process that many frames at once for improved performance.
python eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.3 --top_k=100 --video_multiframe=2 --video=my_video.mp4

# Display a webcam feed in real-time. If you have multiple webcams pass the index of the webcam you want instead of 0.
python eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.3 --top_k=100 --video_multiframe=2 --video=0

# Process a video and save it to another file. This is unoptimized.
python eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.3 --top_k=100 --video=input_video.mp4:output_video.mp4



Training
# Trains using the base config with a batch size of 8 (the default).
python train.py --config=yolact_base_config

# Trains yolact_base_config with a batch_size of 5. For the 550px models, 1 batch takes up around 1.5 gigs of VRAM, so specify accordingly.
python3 train.py --config=yolact_base_config --batch_size=5

# Resume training yolact_base with a specific weight file and start from the iteration specified in the weight file's name.
python train.py --config=yolact_base_config --resume=weights/yolact_base_10_32100.pth --start_iter=-1
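To train on your own data (annotated with labelme and converted to COCO format, as mentioned in the summary), YOLACT reads a dataset entry and a config entry from data/config.py. The sketch below shows the general shape of such entries; the names, paths, and classes are placeholders for your own data, and the exact keys should be checked against the config.py in the repo you cloned. It is meant to be appended to data/config.py, where dataset_base and yolact_base_config are already defined.

```python
# Sketch of a custom dataset entry for data/config.py (keys follow the upstream
# dataset_base; all names and paths below are placeholders for your own data).
my_custom_dataset = dataset_base.copy({
    'name': 'My Dataset',
    'train_images': 'path/to/train/images',
    'train_info':   'path/to/train/annotations.json',   # COCO-format json converted from labelme
    'valid_images': 'path/to/valid/images',
    'valid_info':   'path/to/valid/annotations.json',
    'has_gt': True,
    'class_names': ('class1', 'class2', 'class3'),
})

# A config that reuses yolact_base but points at the custom dataset.
my_custom_config = yolact_base_config.copy({
    'name': 'my_custom',
    'dataset': my_custom_dataset,
    'num_classes': len(my_custom_dataset.class_names) + 1,  # +1 for background
})
```

Training then uses the same command as before, pointing at the new config: python train.py --config=my_custom_config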