(14) Object Detection: SSD Training Code Built with PyTorch

SSD Object Detection: A Walkthrough of the PyTorch Training Code

Original article · last modified 2022-10-17 09:00:58

License: CC 4.0 BY-SA

Tags: #object-detection #pytorch #deep-learning

First published 2022-10-16 09:41:37

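This article walks through the training process of an SSD object detection model built with PyTorch. It covers the main parameter settings, such as the use of pretrained weights and the training strategy, and explains the implementation of several key functions: getting the class names, computing the anchors, transferring pretrained weights, and the loss function.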

1. Main parameters

(1) Two anchor size settings are provided. For small objects:

anchors_size = [21, 45, 99, 153, 207, 261, 315]

For general objects:

anchors_size    = [30, 60, 111, 162, 213, 264, 315]

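(2) Training starts from a pretrained model and uses the Adam optimizer.

The backbone is frozen for the first 50 epochs and unfrozen from epoch 50 onward.

(ii) Starting from the backbone's pretrained weights:
    #       Adam:
    #           Init_Epoch = 0, Freeze_Epoch = 50, UnFreeze_Epoch = 100, Freeze_Train = True, optimizer_type = 'adam', Init_lr = 6e-4, weight_decay = 0. (frozen)
    #           Init_Epoch = 0, UnFreeze_Epoch = 100, Freeze_Train = False, optimizer_type = 'adam', Init_lr = 6e-4, weight_decay = 0. (not frozen)

Here Freeze_Epoch is the number of epochs trained with the backbone frozen, and UnFreeze_Epoch is the total number of training epochs.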
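
The two-stage schedule described above can be sketched as a small helper. This is only an illustration, not the training script itself: the function name backbone_is_frozen is hypothetical, and the sketch assumes epochs are counted from 0 and that the backbone is frozen for every epoch before Freeze_Epoch.

```python
def backbone_is_frozen(epoch, freeze_epoch=50, freeze_train=True):
    """Return True if the backbone should be frozen during `epoch` (0-based).

    Hypothetical helper: assumes freezing covers epochs [0, freeze_epoch).
    """
    return freeze_train and epoch < freeze_epoch


Init_Epoch, Freeze_Epoch, UnFreeze_Epoch = 0, 50, 100

# One flag per epoch of the whole run.
schedule = [backbone_is_frozen(e, Freeze_Epoch) for e in range(Init_Epoch, UnFreeze_Epoch)]
frozen_epochs = sum(schedule)
unfrozen_epochs = len(schedule) - frozen_epochs
print(frozen_epochs, unfrozen_epochs)
```

In a PyTorch training loop, a flag like this would typically be applied by setting requires_grad on the backbone parameters to False while frozen and back to True afterwards.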

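(3) If pretrained = True, the backbone's pretrained weights are downloaded automatically and then loaded.

When using a pretrained model, it is recommended to set model_path directly to the path of the weight file: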

model_path      = 'model_data/ssd_weights.pth'

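    #----------------------------------------------------------------------------------------------------------------------------#
    #   pretrained      Whether to use the backbone's pretrained weights. These are backbone weights only, so they are loaded when the model is built.
    #                   If model_path is set, the backbone weights do not need to be loaded and the value of pretrained is ignored.
    #                   If model_path is not set and pretrained = True, only the backbone weights are loaded before training starts.
    #                   If model_path is not set, pretrained = False and Freeze_Train = False, training starts from scratch with no frozen-backbone stage.
    #----------------------------------------------------------------------------------------------------------------------------#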
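
The precedence between model_path and pretrained described in the comment above can be summarized as a small decision function. This is a sketch for illustration only: weight_source is a hypothetical name, and an empty string is assumed to mean "model_path not set".

```python
def weight_source(model_path, pretrained):
    """Describe which weights training starts from (hypothetical helper)."""
    if model_path:
        # Full-model weights are loaded, so the value of `pretrained` is ignored.
        return "full model weights from model_path"
    if pretrained:
        # Only the backbone is initialized from pretrained weights.
        return "backbone pretrained weights only"
    # Neither is set: every layer starts from random initialization.
    return "training from scratch"


print(weight_source('model_data/ssd_weights.pth', pretrained=False))
print(weight_source('', pretrained=True))
print(weight_source('', pretrained=False))
```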

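2. Several functions

2.1 Getting the class names

(1) The classes used in training include the background, so the number of classes is incremented by 1.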

    class_names, num_classes = get_classes(classes_path)
    num_classes += 1

(2) The function that reads the classes is as follows:

#---------------------------------------------------#
#   Get the class names
#---------------------------------------------------#
def get_classes(classes_path):
    with open(classes_path, encoding='utf-8') as f:
        class_names = f.readlines()
    class_names = [c.strip() for c in class_names]
    return class_names, len(class_names)

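Here, strip() removes the specified characters from both ends of a string. Called without arguments, it removes leading and trailing whitespace, including spaces and newlines.

2.2 Getting the anchors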
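
As a quick usage check, the function can be run against a small classes file. The file contents below (three class names written to a temporary file) are made up for illustration; a real classes_path would point to the dataset's own class list.

```python
import os
import tempfile


def get_classes(classes_path):
    # Read one class name per line and strip the trailing newline.
    with open(classes_path, encoding='utf-8') as f:
        class_names = [c.strip() for c in f.readlines()]
    return class_names, len(class_names)


# Hypothetical classes file with three entries.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False, encoding='utf-8') as tmp:
    tmp.write("aeroplane\nbicycle\nbird\n")
    classes_path = tmp.name

class_names, num_classes = get_classes(classes_path)
num_classes += 1  # one extra class for the background
os.remove(classes_path)

print(class_names, num_classes)
```

Note that the function does not filter out blank lines, so an empty line in the file would be counted as a class with an empty name.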

anchors = get_anchors(input_shape, anchors_size, backbone)

(1) The arguments used here are:

input_shape = [300, 300]

anchors_size = [30, 60, 111, 162, 213, 264, 315]

backbone = "vgg"

(2) The get_anchors function is implemented as follows:

def get_anchors(input_shape = [300,300], anchors_size = [30, 60, 111, 162, 213, 264, 315], backbone = 'vgg'):
    if backbone == 'vgg':
        feature_heights, feature_widths = get_vgg_output_length(input_shape[0], input_shape[1])
        aspect_ratios = [[1, 2], [1, 2, 3], [1, 2, 3], [1, 2, 3], [1, 2], [1, 2]]
    else:
        feature_heights, feature_widths = get_mobilenet_output_length(input_shape[0], input_shape[1])
        aspect_ratios = [[1, 2, 3], [1, 2, 3], [1, 2, 3], [1, 2, 3], [1, 2, 3], [1, 2, 3]]
        
    anchors = []
    for i in range(len(feature_heights)):
        anchor_boxes = AnchorBox(input_shape, anchors_size[i], max_size = anchors_size[i+1], 
                    aspect_ratios = aspect_ratios[i]).call([feature_heights[i], feature_widths[i]])
        anchors.append(anchor_boxes)

    anchors = np.concatenate(anchors, axis=0)
    return anchors
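
The loop in get_anchors passes anchors_size[i] as the first size argument and anchors_size[i+1] as max_size, which is why seven sizes are needed for six feature layers. The short sketch below only prints that pairing for the VGG settings. The name min_size is used here for the first size argument as an assumption, since the AnchorBox class itself is not shown in this article.

```python
anchors_size = [30, 60, 111, 162, 213, 264, 315]
aspect_ratios = [[1, 2], [1, 2, 3], [1, 2, 3], [1, 2, 3], [1, 2], [1, 2]]

# Layer i uses anchors_size[i] and anchors_size[i + 1], so 7 sizes yield 6 pairs.
size_pairs = list(zip(anchors_size[:-1], anchors_size[1:]))

for i, ((min_size, max_size), ratios) in enumerate(zip(size_pairs, aspect_ratios)):
    print(f"layer {i}: min_size={min_size}, max_size={max_size}, aspect_ratios={ratios}")
```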

(3) The function get_vgg_output_length is defined as follows:

#---------------------------------------------------#
#   Compute the sizes of the feature layers used for prediction
#---------------------------------------------------#
def get_vgg_output_length(height, width):
    filter_sizes    = [3, 3, 3, 3, 3, 3, 3, 3]
    padding         = [1, 1, 1, 1, 1, 1, 0, 0]
    stride          = [2, 2, 2, 2, 2, 2, 1, 1]
    feature_heights = []
    feature_widths  = []

    for i in range(len(filter_sizes)):
        height  = (height + 2*padding[i] - filter_sizes[i]) // stride[i] + 1
        width   = (width + 2*padding[i] - filter_sizes[i]) // stride[i] + 1
        feature_heights.append(height)
        feature_widths.append(width)
    return np.array(feature_heights)[-6:], np.array(feature_widths)[-6:]

For a 300x300 input, feature_heights and feature_widths are both:

[38 19 10 5 3 1]

(4) The shape of the anchor array finally obtained in this section (2.2) is:

(8732, 4)
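
The figure of 8732 anchors can be checked by hand from the feature map sizes. The sketch below repeats the size recurrence in plain Python and counts anchors per layer. It relies on one assumption that this article does not show, because the AnchorBox code is not included: each cell gets 2 anchors for aspect ratio 1 (one per size) plus 2 for every additional ratio r (r and 1/r), as in the usual SSD setup.

```python
def vgg_feature_sizes(size):
    """Plain-Python version of the recurrence in get_vgg_output_length."""
    filter_sizes = [3, 3, 3, 3, 3, 3, 3, 3]
    padding      = [1, 1, 1, 1, 1, 1, 0, 0]
    stride       = [2, 2, 2, 2, 2, 2, 1, 1]
    sizes = []
    for k, p, s in zip(filter_sizes, padding, stride):
        size = (size + 2 * p - k) // s + 1
        sizes.append(size)
    return sizes[-6:]  # only the last six layers are used for prediction


aspect_ratios = [[1, 2], [1, 2, 3], [1, 2, 3], [1, 2, 3], [1, 2], [1, 2]]

# Assumption: 2 anchors for ratio 1, plus 2 (r and 1/r) for each other ratio.
anchors_per_cell = [2 + 2 * (len(r) - 1) for r in aspect_ratios]

feature_sizes = vgg_feature_sizes(300)
total = sum(s * s * n for s, n in zip(feature_sizes, anchors_per_cell))
print(feature_sizes, anchors_per_cell, total)
```

Under this assumption the per-layer counts are 38x38x4, 19x19x6, 10x10x6, 5x5x6, 3x3x4 and 1x1x4, which sum to 8732 and match the shape reported above.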

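2.3 Extracting weights from the pretrained model and updating the model

Here, model = SSD300(num_classes, backbone, pretrained) is the newly created model.

pretrained_dict = torch.load(model_path, map_location = device) holds the weights loaded from model_path.

        #------------------------------------------------------#
        #   Load weights by matching the keys of the pretrained weights against the keys of the model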
        #------------------------------------------------------#
        model_dict      = model.state_dict()
        pretrained_dict = torch.load(model_path, map_location = device)
        load_key, no_load_key, temp_dict = [], [], {}
        for k, v in pretrained_dict.items():
            if k in model_dict.keys() and np.shape(model_dict[k]) == np.shape(v):
                temp_dict[k] = v
                load_key.append(k)
            else:
                no_load_key.append(k)
        model_dict.update(temp_dict)
        model.load_state_dict(model_dict)
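
The key-and-shape check matters mainly when the number of classes differs from the one the weight file was trained with: layers whose shapes depend on num_classes no longer match and are skipped, while the rest are loaded. The sketch below reproduces only the filtering logic with plain dictionaries that map keys to shape tuples instead of real tensors. The layer names and shapes are hypothetical and chosen purely for illustration.

```python
def filter_matching_weights(model_shapes, pretrained_shapes):
    """Split pretrained keys into loadable and skipped ones by key and shape."""
    load_key, no_load_key = [], []
    for k, shape in pretrained_shapes.items():
        if k in model_shapes and model_shapes[k] == shape:
            load_key.append(k)
        else:
            no_load_key.append(k)
    return load_key, no_load_key


# Hypothetical keys and shapes standing in for state_dict entries.
model_shapes = {
    "vgg.0.weight": (64, 3, 3, 3),
    "conf.0.weight": (84, 512, 3, 3),
}
pretrained_shapes = {
    "vgg.0.weight": (64, 3, 3, 3),    # same key, same shape: loaded
    "conf.0.weight": (16, 512, 3, 3), # same key, different shape: skipped
    "extra.bias": (10,),              # key missing from the model: skipped
}

load_key, no_load_key = filter_matching_weights(model_shapes, pretrained_shapes)
print("loaded:", load_key)
print("skipped:", no_load_key)
```

Printing the skipped keys after loading is a simple way to confirm that only the expected layers, for example the classification heads, were left at their initial values.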

2.4 Loss function

As the constructor signature shows, the background class has id 0 (background_label_id=0):

class MultiboxLoss(nn.Module):
    def __init__(self, num_classes, alpha=1.0, neg_pos_ratio=3.0,
                 background_label_id=0, negatives_for_hard=100.0):