A Detailed Personal Walkthrough of the Faster R-CNN Code (Anchor Box Generation)

chanruoburuo

Last modified 2023-03-26 23:04:24

Tags: python, machine learning, object detection, artificial intelligence, deep learning

First published 2023-03-26 22:56:38

Article link: https://blog.youkuaiyun.com/chanruoburuo/article/details/129786294

import numpy as np
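# Generate the 9 anchor (prior) boxes placed at every grid point: 3 aspect ratios x 3 sizes,
# where the geometric mean of the two side lengths is base_size * scale.
# From h * w = (ratio * w) * w = ratio * w^2 = (base_size * scale)^2 we get
# w = base_size * scale * sqrt(1 / ratio) and h = ratio * w = base_size * scale * sqrt(ratio);
# the loop below computes exactly that.
def generate_anchor_base(base_size=16, ratios=[0.5, 1, 2], scales=[8, 16, 32]):
    # (9, 4) zero matrix holding the two diagonal corners (bottom-left + top-right) of each of the 9 boxes.
    # Note that the shape must be passed as a tuple: np.zeros(9, 4) raises a TypeError.
    anchor_base = np.zeros((len(ratios) * len(scales), 4))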
    for i in range(len(ratios)):
        for j in range(len(scales)):
            # Number the nine boxes 0..8: index = i * 3 + j with i, j in [0, 1, 2]
            index = i * len(scales) + j
            h = base_size * scales[j] * np.sqrt(ratios[i])
            w = base_size * scales[j] * np.sqrt(1. / ratios[i])
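            # Corners relative to the box centre. Columns 0 and 2 are later added to the x shifts
            # and columns 1 and 3 to the y shifts, so h is in effect the horizontal extent here.
            # Because the ratios 0.5 and 2 mirror each other, this yields the same set of nine
            # box shapes, only in a different order.
            anchor_base[index, 0] = - h / 2.
            anchor_base[index, 1] = - w / 2.
            anchor_base[index, 2] = h / 2.
            anchor_base[index, 3] = w / 2.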
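    # anchor_base now holds the corner coordinates relative to the current grid point. What we
    # finally need are the coordinates of every box at every grid point relative to the image
    # origin (the points being the feature-map cells mapped back onto the image), so one more
    # function is defined next.
    return anchor_base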
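The side-length derivation used in `generate_anchor_base` can be checked numerically. The following is a minimal sketch, not part of the original code: `anchor_extents` is a hypothetical helper introduced here only for illustration, and it assumes the same `base_size`, `ratios` and `scales` as above.

```python
import numpy as np

def anchor_extents(base_size, ratio, scale):
    """Hypothetical helper: side lengths (h, w) with h / w == ratio
    and h * w == (base_size * scale) ** 2."""
    side = base_size * scale            # geometric mean of the two side lengths
    h = side * np.sqrt(ratio)
    w = side * np.sqrt(1.0 / ratio)
    return h, w

for ratio in (0.5, 1, 2):
    for scale in (8, 16, 32):
        h, w = anchor_extents(16, ratio, scale)
        # The area is fixed by the scale, the shape by the ratio.
        assert np.isclose(h * w, (16 * scale) ** 2)
        assert np.isclose(h / w, ratio)
        print(f"ratio={ratio}, scale={scale}: h={h:.1f}, w={w:.1f}")
```

Each scale therefore fixes the box area, and the ratio only redistributes that area between the two sides.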

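def _enumerate_shifted_anchor(anchor_base, feat_stride, height, width):
    # feat_stride is the downsampling factor after 4 poolings, i.e. 2^4 = 16
    # Coordinates in the original image of each feature-map point
    shift_x = np.arange(0, width * feat_stride, feat_stride)
    shift_y = np.arange(0, height * feat_stride, feat_stride)
    # meshgrid builds the grid of points formed by pairing the given x and y coordinate vectors.
    # For x = [1, 2] and y = [3, 4], x is repeated down the rows to give [[1, 2], [1, 2]], while y
    # is first turned into the column vector [[3], [4]] and then repeated across the columns to
    # give [[3, 3], [4, 4]]. The matrices X and Y hold the x and y coordinates of the grid points;
    # pairing them element by element gives the full coordinates.
    shift_x, shift_y = np.meshgrid(shift_x, shift_y)
    # Two matrices are still not a list of coordinates. ravel flattens X and Y in reading
    # (row-major) order into vectors, which are then stacked as columns (axis=1). For the 38 x 38
    # case this gives [[0 0 0 0], [16 0 16 0], ..., [592 592 592 592]], i.e. Y is held fixed while
    # X varies. This is the set of grid coordinates in the original image; adding each point's
    # coordinates to the base box coordinates gives the boxes at every point, as done below.
    shift = np.stack((shift_x.ravel(), shift_y.ravel(), shift_x.ravel(), shift_y.ravel()), axis=1)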
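    A = anchor_base.shape[0]  # number of anchor boxes per point: 9
    K = shift.shape[0]        # number of feature-map points, i.e. sampled points: 38 * 38 = 1444
    # The question now is how to combine (A, 4) with (K, 4). A loop would work, but numpy
    # broadcasting does it directly. When arrays of different dimensionality are added, the
    # lower-dimensional one is promoted automatically:
    #   [[1, 1, 1], [2, 2, 2]] + 1 = [[2, 2, 2], [3, 3, 3]]
    # When the dimensionality matches but the sizes differ, an axis of size 1 is repeated to
    # match the larger size:
    #   [1, 2, 3] + [[1], [2], [3], [4], [5]] = [[2 3 4] [3 4 5] [4 5 6] [5 6 7] [6 7 8]]
    # So we insert size-1 axes in different positions, turning (A, 4) and (K, 4) into (1, A, 4)
    # and (K, 1, 4); their sum broadcasts to (K, A, 4), which is exactly the combination we want.
    anchor = anchor_base.reshape((1, A, 4)) + shift.reshape((K, 1, 4))
    # reshape back to an ordinary 2-D matrix
    anchor = anchor.reshape((K * A, 4)).astype(np.float32)
    # Coordinates, relative to the original image, of the nine anchor boxes at every point:
    # 1444 * 9 = 12996 boxes in total
    return anchor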
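To see the meshgrid, ravel, stack and broadcasting steps on numbers small enough to read, here is a minimal sketch on a toy 2 x 3 grid. The grid size and the two boxes in `toy_base` are made up for illustration and are not the real nine anchors; the explicit double loop is included only to confirm that broadcasting gives the same result.

```python
import numpy as np

feat_stride, height, width = 16, 2, 3     # toy values chosen for illustration

# Grid-point coordinates in the input image, flattened in row-major order.
xs = np.arange(0, width * feat_stride, feat_stride)     # [0, 16, 32]
ys = np.arange(0, height * feat_stride, feat_stride)    # [0, 16]
grid_x, grid_y = np.meshgrid(xs, ys)                    # both have shape (height, width)
shift = np.stack((grid_x.ravel(), grid_y.ravel(),
                  grid_x.ravel(), grid_y.ravel()), axis=1)   # shape (K, 4)

# Two made-up base boxes instead of the real nine, to keep the output small.
toy_base = np.array([[-8., -8., 8., 8.],
                     [-16., -4., 16., 4.]])
A, K = toy_base.shape[0], shift.shape[0]

# Broadcasting: (1, A, 4) + (K, 1, 4) -> (K, A, 4), then flatten to (K * A, 4).
vectorized = (toy_base.reshape((1, A, 4)) + shift.reshape((K, 1, 4))).reshape((K * A, 4))

# The same result with explicit loops, for comparison.
looped = np.zeros((K * A, 4))
for k in range(K):
    for a in range(A):
        looped[k * A + a] = toy_base[a] + shift[k]

assert np.array_equal(vectorized, looped)
print(shift)             # rows run through x first, then y
print(vectorized.shape)  # (12, 4)
```

Row `k * A + a` of the result is base box `a` shifted to grid point `k`, which is why the anchors of one point occupy consecutive rows.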
    
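if __name__ == "__main__":  # runs only when this file is executed directly, not when it is imported
    import matplotlib.pyplot as plt  # this section plots the anchor boxes at a single point
    # Feature-map size and stride, defined here because the lines below need them
    height, width, feat_stride = 38, 38, 16
    anchors_all = _enumerate_shifted_anchor(generate_anchor_base(), feat_stride, height, width)
    fig = plt.figure()  # create a new matplotlib figure object
    ax = fig.add_subplot(111)  # 111 = 1 x 1 grid, position 1; 234 would mean position 4 in a 2 x 3 grid
    # Set the displayed range
    plt.ylim(-300, 900)
    plt.xlim(-300, 900)
    shift_x = np.arange(0, width * feat_stride, feat_stride)
    shift_y = np.arange(0, height * feat_stride, feat_stride)
    shift_x, shift_y = np.meshgrid(shift_x, shift_y)
    # scatter pairs the same-position elements of the two matrices as x and y coordinates
    plt.scatter(shift_x, shift_y)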
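    # Width and height of every box
    box_widths = anchors_all[:, 2] - anchors_all[:, 0]
    box_heights = anchors_all[:, 3] - anchors_all[:, 1]
    # Rows 108..116 are the nine anchors of grid point 12 (108 = 12 * 9)
    for i in [108, 109, 110, 111, 112, 113, 114, 115, 116]:
        # The first argument is the (x_min, y_min) corner of the rectangle, which is the lower-left
        # corner in matplotlib's default y-up axes (it would be the top-left in image coordinates);
        # the next two arguments are the width and the height.
        rect = plt.Rectangle([anchors_all[i, 0], anchors_all[i, 1]], box_widths[i], box_heights[i],
                             color="r", fill=False)
        ax.add_patch(rect)  # add the rectangle to the subplot created by add_subplot(111)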
    plt.show()
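The hard-coded indices 108 to 116 follow from the row layout of the anchor array: grid points are flattened in row-major order and each contributes nine consecutive rows. As a small sketch, the indices for any grid point can be computed as follows; `anchor_rows_for_point` is a hypothetical helper that is not part of the original code.

```python
def anchor_rows_for_point(row, col, width, num_base_anchors=9):
    """Hypothetical helper: rows of the (K * A, 4) anchor array that belong
    to the grid point at (row, col) of a feature map with `width` columns."""
    k = row * width + col               # row-major index of the grid point
    start = k * num_base_anchors        # its anchors occupy consecutive rows
    return list(range(start, start + num_base_anchors))

print(anchor_rows_for_point(0, 12, width=38))
# [108, 109, 110, 111, 112, 113, 114, 115, 116]
```

With this, the plotting loop could iterate over `anchor_rows_for_point(row, col, width)` to draw the anchors of a different point.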