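Environment setup
1. Install Python, the required dependencies, and TensorFlow.
2. To accelerate training on a GPU, CUDA and cuDNN are also needed.
3. Download labelme:
Alternatively, once Python and pip are installed, run pip install labelme.
If you use Anaconda, conda install labelme works the same way.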
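Before moving on, it can help to confirm that the packages from the steps above are importable. The following is a minimal sketch; the helper name missing_packages and the list of package names are illustrative assumptions, not part of labelme or TensorFlow.

```python
import importlib.util
import sys


def missing_packages(names):
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed package list, based on the tools this tutorial relies on.
    required = ["tensorflow", "labelme", "natsort", "numpy"]
    print("Python", sys.version.split()[0])
    print("Missing packages:", missing_packages(required) or "none")
```

This only checks that the packages can be located; it does not verify that CUDA and cuDNN are set up correctly for GPU training.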
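Data preparation
1. Annotate the data
2. Parse the JSON files produced by annotation
Locate the JSON files generated by labelme. Each one needs to be converted into the form shown in the figure below, which contains five different files in total. I converted them in batch with the following code. Change the paths to your own JSON path and output path.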
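#!/usr/bin/env python
# -*- coding: UTF-8 -*-
"""Parse the JSON files in the given directory and generate the corresponding data."""
import os
import subprocess
import natsort
# Path to the labelme_json_to_dataset.exe program
labelme_json = r"G:\anaconda\lb\Scripts\labelme_json_to_dataset.exe"
# Directory containing the JSON files to process
file_path = r"G:\Mask_RCNN-master\train_data\json"
dir_info = os.listdir(file_path)
dir_info = natsort.natsorted(dir_info)  # natural order, e.g. 2.json before 10.json
# Process each '.json' file in turn
for file_name in dir_info:
    if not file_name.endswith(".json"):
        continue
    json_path = os.path.join(file_path, file_name)
    # With no output directory given, labelme writes the result folder
    # next to the JSON file, so no change of working directory is needed.
    subprocess.run([labelme_json, json_path])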
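After the batch conversion, a quick check that every output folder is complete can save time later. The sketch below assumes the five files are named img.png, label.png, label_viz.png, label_names.txt and info.yaml, and that each output folder ends in _json; both are assumptions based on older labelme releases and may differ in your version. The helper incomplete_outputs is hypothetical.

```python
import os

# Assumed output names for older labelme releases; adjust to match your version.
EXPECTED_FILES = ("img.png", "label.png", "label_viz.png", "label_names.txt", "info.yaml")


def incomplete_outputs(json_dir, expected=EXPECTED_FILES):
    """Map each '<name>_json' folder in json_dir to the expected files it lacks."""
    report = {}
    for entry in sorted(os.listdir(json_dir)):
        folder = os.path.join(json_dir, entry)
        if not (os.path.isdir(folder) and entry.endswith("_json")):
            continue
        missing = [f for f in expected if not os.path.isfile(os.path.join(folder, f))]
        if missing:
            report[entry] = missing
    return report
```

Calling incomplete_outputs on the JSON directory returns an empty dict when every folder has all the expected files.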
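Note: in some cases the generated label image looks completely black. This is normal, because the pixel values are small class indices that are nearly indistinguishable from black when viewed directly.
The label.png mask generated by labelme is stored as 16-bit, while OpenCV reads images as 8-bit by default, so the 16-bit image has to be converted to 8-bit. This can be done with a C++ program; for the code, see this blog post: http://blog.youkuaiyun.com/l297969586/article/details/79154150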
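For readers who prefer to stay in Python, the core of the 16-bit to 8-bit conversion can be sketched with NumPy. This is not the C++ program from the linked post; the function name label_to_uint8 is hypothetical, and loading and saving the PNG is left to whichever image library you use.

```python
import numpy as np


def label_to_uint8(label16):
    """Convert a 16-bit label mask to 8-bit while keeping the class indices unchanged."""
    label16 = np.asarray(label16)
    # Class indices must fit in one byte, otherwise the cast would wrap around.
    if label16.max() > 255:
        raise ValueError("label values exceed the 8-bit range")
    return label16.astype(np.uint8)
```

With OpenCV, for example, the mask would need to be read with cv2.imread(path, cv2.IMREAD_UNCHANGED) so that the original bit depth is preserved before the array is passed to this function.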
3. The individual .py files
Modify config.py
"""
Mask R-CNN
Base Configurations class.
Copyright (c) 2017 Matterport, Inc.
Licensed under the MIT License (see LICENSE for details)
Written by Waleed Abdulla
"""
import numpy as np
class Config(object):
    """Base configuration class. For custom configurations, create a
    sub-class that inherits from this one and override properties
    that need to be changed.
    """
    # Name of the configuration. Useful if your code needs to behave
    # differently depending on which experiment is running.
    NAME = None  # Override in sub-classes
    GPU_COUNT = 1  # Number of GPUs to use
    IMAGES_PER_GPU = 1  # Number of images to train with on each GPU
    # Number of training steps per epoch. Validation stats are computed at
    # the end of each epoch, so don't set this too small to avoid spending
    # a lot of time on validation stats.
    STEPS_PER_EPOCH = 1000
    # Number of validation steps to run at the end of every training epoch.
    # A bigger number improves the accuracy of validation stats, but slows
    # down the training.
    VALIDATION_STEPS = 50
    # resnet101 is the default backbone; if GPU memory runs out,
    # switch to resnet50 as done here.
    BACKBONE = "resnet50"
    # Only used when BACKBONE is a callable: computes the shape of each
    # layer of the FPN Pyramid.
    COMPUTE_BACKBONE_SHAPE = None
    # The strides of each layer of the FPN Pyramid. These values
    # are based on a Resnet101 backbone.
    BACKBONE_STRIDES = [4, 8, 16, 32, 64]
    # Size of the fully-connected layers in the classification graph
    FPN_CLASSIF_FC_LAYERS_SIZE = 1024
    # Size of the top-down layers used to build the feature pyramid
    TOP_DOWN_PYRAMID_SIZE = 256
    # Number of classification classes (including background)
    NUM_CLASSES = 1  # Override in sub-classes
    # Length of square anchor side in pixels
    RPN_ANCHOR_SCALES = (32, 64, 128, 256, 512)
    # Ratios of anchors at each cell (width/height)
    # A value of 1 represents a square anchor, and 0.5 is a wide anchor
    RPN_ANCHOR_RATIOS = [0.5, 1, 2]
    # Anchor stride. If 1, anchors are created for each cell in the backbone feature map.
    # If 2, then anchors are created for every other cell, and so on.
    RPN_ANCHOR_STRIDE = 1
    # Non-max suppression threshold to filter RPN proposals.
    # You can increase this during training to generate more proposals.
    RPN_NMS_THRESHOLD = 0.7
    # How many anchors per image to use for RPN training
    RPN_TRAIN_ANCHORS_PER_IMAGE = 256
    # ROIs kept after tf.nn.top_k and before non-maximum suppression
    PRE_NMS_LIMIT = 6000
    # ROIs kept after non-maximum suppression (training and inference)
    POST_NMS_ROIS_TRAINING = 2000
    POST_NMS_ROIS_INFERENCE = 1000
    # If enabled, resizes instance masks to a smaller size to reduce
    # memory load. Recommended when using high-resolution images.
    USE_MINI_MASK = True
    MINI_MASK_SHAPE = (56, 56)  # (height, width) of the mini-mask
    # Input image resizing
    IMAGE_RESIZE_MODE = "square"
    # Minimum and maximum image dimensions; set them according to the
    # image sizes in your own training set.
    IMAGE_MIN_DIM = 800
    IMAGE_MAX_DIM = 1024
    # Minimum scaling ratio. Checked after MIN_IMAGE_DIM and can force further
    # up scaling. A value of 0 leaves the scale unchanged.
    IMAGE_MIN_SCALE = 0
    # Number of color channels per image. RGB = 3, grayscale = 1, RGB-D = 4
    IMAGE_CHANNEL_COUNT = 3
    # Image mean (RGB)
    MEAN_PIXEL = np.array([123.7, 116.8, 103.9])
    # Number of ROIs per image to feed to the classifier/mask heads.
    # You can increase the number of proposals by adjusting
    # the RPN NMS threshold.
    TRAIN_ROIS_PER_IMAGE = 200
    # Percent of positive ROIs used to train classifier/mask heads
    ROI_POSITIVE_RATIO = 0.33
    # Pooled ROIs
    POOL_SIZE = 7
    MASK_POOL_SIZE = 14
    # Shape of output mask
    # To change this you also need to change the neural network mask branch
    MASK_SHAPE = [28, 28]
    # Maximum number of ground truth instances to use in one image
    MAX_GT_INSTANCES = 100
    # Bounding box refinement standard deviation for RPN and final detections.
    RPN_BBOX_STD_DEV = np.array([0.1, 0.1, 0.2, 0.2])
    BBOX_STD_DEV = np.array([0.1, 0.1, 0.2, 0.2])
    # Max number of final detections
    DETECTION_MAX_INSTANCES = 100
    # Minimum probability value to accept a detected instance.
    # ROIs below this threshold are skipped
    DETECTION_MIN_CONFIDENCE = 0.7
    # Non-maximum suppression threshold for detection
    DETECTION_NMS_THRESHOLD = 0.3
    # Learning rate. The Mask R-CNN paper uses lr=0.02, but on TensorFlow
    # that causes weights to explode, likely due to differences in optimizer
    # implementation.
    LEARNING_RATE =