FCOS Reading Notes

1. Understanding FCOS

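Every pixel of the feature maps output by the backbone and FPN is treated as a training sample. If the pixel's location (mapped back to the input image according to the stride) falls inside a GT box, it is a positive sample and takes that GT's class label as its own.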
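As a rough illustration of the mapping described above, the sketch below computes image-space coordinates for each pixel of one feature map and tests whether a location falls inside a GT box. The function names compute_locations and is_positive are hypothetical, and the half-stride offset that places each location at the centre of its cell is an assumption made for this example; the note itself only says that the location is mapped back via the stride.

```python
def compute_locations(height, width, stride):
    """Map every pixel (row i, col j) of an H x W feature map to (x, y) in the input image."""
    locations = []
    for i in range(height):
        for j in range(width):
            # Assumed convention: centre of the stride x stride cell covered by the pixel.
            x = j * stride + stride // 2
            y = i * stride + stride // 2
            locations.append((x, y))
    return locations


def is_positive(location, gt_box):
    """True if the location lies strictly inside a GT box given as (x1, y1, x2, y2)."""
    x, y = location
    x1, y1, x2, y2 = gt_box
    return x1 < x < x2 and y1 < y < y2


# A tiny 2 x 3 feature map with stride 8 and one GT box.
locations = compute_locations(height=2, width=3, stride=8)
gt_box = (0, 0, 14, 10)
positives = [loc for loc in locations if is_positive(loc, gt_box)]
```

With these toy numbers only the two locations in the first row that fall inside the box are kept as positive samples.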

2. How the targets (labels) of the FCOS regression branch are generated

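Note: batch size = 4
    # Excerpt of a class method; assumes torch is imported and INF is a module-level constant (a very large number).
    def compute_targets_for_locations(self, locations, targets, object_sizes_of_interest):
        labels = []    # output 1: class label of each location (pixel sample)
        reg_targets = []    # output 2: targets for the FCOS regression branch
        xs, ys = locations[:, 0], locations[:, 1]    # locations: coordinates of every feature-map pixel mapped back to the input image, size=(18819, 2)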

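        for im_i in range(len(targets)):    # targets: the GT boxes of each image in the batch
            targets_per_im = targets[im_i]    # e.g. the first input image contains 19 GT boxes
            assert targets_per_im.mode == "xyxy"
            bboxes = targets_per_im.bbox    # top-left and bottom-right corner coordinates of each GT box
            labels_per_im = targets_per_im.get_field("labels")    # class label of each GT box
            area = targets_per_im.area()    # area of each GT box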
            
            # distances from every location (on all scales) to the four sides of every GT box: [l, t, r, b]
            l = xs[:, None] - bboxes[:, 0][None]    # size=(18819, 19)
            t = ys[:, None] - bboxes[:, 1][None]    # [None] adds a dimension so that the two operands broadcast
            r = bboxes[:, 2][None] - xs[:, None]
            b = bboxes[:, 3][None] - ys[:, None]
            reg_targets_per_im = torch.stack([l, t, r, b], dim=2)    # size=(18819, 19, 4)

            is_in_boxes = reg_targets_per_im.min(dim=2)[0] > 0    # True only where the location lies inside the GT box (all four distances positive), size=(18819, 19)
            
            # per-level regression ranges: [-1, 64], [64, 128], [128, 256], [256, 512], [512, INF]
            max_reg_targets_per_im = reg_targets_per_im.max(dim=2)[0]
            # limit the regression range for each location:
            # keep only (location, GT) pairs whose largest distance lies within the specified range, size=(18819, 19)
            is_cared_in_the_level = \
                (max_reg_targets_per_im >= object_sizes_of_interest[:, [0]]) & \
                (max_reg_targets_per_im <= object_sizes_of_interest[:, [1]])
                
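            # Broadcast each GT's area to every location, in preparation for resolving locations that match several GTs.
            # The 5 feature-map scales give 18819 locations in total and each one is compared with all 19 GTs,
            # so many (location, GT) pairs are invalid; they are filtered out by setting their area to INF.
            locations_to_gt_area = area[None].repeat(len(locations), 1)    # area.size=(19,), locations_to_gt_area.size=(18819, 19)
            locations_to_gt_area[is_in_boxes == 0] = INF    # mask pairs where the location is not inside the GT box
            locations_to_gt_area[is_cared_in_the_level == 0] = INF    # mask pairs outside the specified regression range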

            # if there are still more than one objects for a location,
            # we choose the one with minimal area
            locations_to_min_area, locations_to_gt_inds = locations_to_gt_area.min(dim=1)    # for a location matched to several GTs, pick the GT with the smallest area; its class becomes the label

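            reg_targets_per_im = reg_targets_per_im[range(len(locations)), locations_to_gt_inds]    # assign the [l, t, r, b] target of the selected GT
            labels_per_im = labels_per_im[locations_to_gt_inds]    # assign the class label of the selected GT
            labels_per_im[locations_to_min_area == INF] = 0    # locations with no valid GT get label 0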

            labels.append(labels_per_im)
            reg_targets.append(reg_targets_per_im)

        return labels, reg_targets
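
To make the filtering rules in the method above concrete, here is a minimal pure-Python sketch that applies them to a single location instead of whole tensors. The helper assign_target and its argument layout are hypothetical and chosen only for illustration, the value of INF is an assumed stand-in, and the area is computed as plain width times height, which is a simplification. The tensor code performs the same comparisons for all locations at once via broadcasting.

```python
INF = 100000000  # assumed stand-in for the large constant used in the method above


def assign_target(x, y, boxes, labels, size_range):
    """Return (label, (l, t, r, b)) for one location; label 0 means no GT matched.

    boxes: list of (x1, y1, x2, y2); labels: class ids (> 0), one per box;
    size_range: (low, high) regression range of the location's feature-map level.
    """
    best_area, best_label, best_reg = INF, 0, None
    for (x1, y1, x2, y2), label in zip(boxes, labels):
        reg = (x - x1, y - y1, x2 - x, y2 - y)  # distances l, t, r, b
        inside = min(reg) > 0  # all four distances positive: location is inside the box
        in_range = size_range[0] <= max(reg) <= size_range[1]
        area = (x2 - x1) * (y2 - y1)  # plain width * height (a simplification)
        # Among the valid boxes keep the one with the smallest area.
        if inside and in_range and area < best_area:
            best_area, best_label, best_reg = area, label, reg
    return best_label, best_reg


boxes = [(0, 0, 100, 100), (30, 30, 70, 70)]
labels = [3, 7]
# Inside both boxes: the smaller box (label 7) wins.
label_a, reg_a = assign_target(50, 50, boxes, labels, size_range=(-1, 64))
# Inside the large box only, and its largest distance (90) lies in [64, 128].
label_b, reg_b = assign_target(10, 50, boxes, labels, size_range=(64, 128))
# Same location, but 90 lies outside [-1, 64], so no GT is matched.
label_c, reg_c = assign_target(10, 50, boxes, labels, size_range=(-1, 64))
```

Unlike the tensor version, the sketch returns None as the regression target of a location that matches no GT.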