[Object Detection Algorithm Implementation Series] Implementing Faster R-CNN in Keras (Part 3)

Article link: https://blog.youkuaiyun.com/u010057965/article/details/100774292

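This is the third article in a series on implementing the Faster R-CNN object detection algorithm in Keras. It walks through the path from the RPN to the ROIPooling layer: applying the regression predictions to adjust anchor positions, selecting valid anchors with non-maximum suppression, and building the training data for the fine-grained classification and regression stage. Model training and prediction follow in later articles.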


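[Object Detection Algorithm Implementation Series] Implementing Faster R-CNN in Keras (Part 1)

[Object Detection Algorithm Implementation Series] Implementing Faster R-CNN in Keras (Part 2)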

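So far we have covered the preparatory work (parsing and preprocessing the data) and built the network modules that make up Faster R-CNN. In this article we implement the remaining parts.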

1. From the RPN to the ROIPooling Layer

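In the previous article we implemented a custom ROIPooling layer. This time we look at how to connect the RPN to that layer. Let's see how to implement this in code, using the outputs of the RPN to determine the input to the ROIPooling layer.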
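As a warm-up for the function that follows, here is a minimal sketch of its core idea: placing one anchor shape at every cell of the feature map with np.meshgrid and converting it to corner coordinates. The helper name anchor_grid and the numeric values (anchor size 128, ratio 1:2, stride 16) are illustrative assumptions, not part of the original code.

```python
import numpy as np

def anchor_grid(rows, cols, anchor_size, anchor_ratio, rpn_stride):
    """Hypothetical helper: corners of one anchor shape centered on every feature-map cell.

    Returns an array of shape (4, rows, cols) holding (x1, y1, x2, y2)
    in feature-map coordinates.
    """
    # Anchor width and height, converted from image pixels to feature-map cells
    anchor_w = anchor_size * anchor_ratio[0] / rpn_stride
    anchor_h = anchor_size * anchor_ratio[1] / rpn_stride

    # X[i, j] == j (column index), Y[i, j] == i (row index)
    X, Y = np.meshgrid(np.arange(cols), np.arange(rows))

    x1 = X - anchor_w / 2   # top-left x
    y1 = Y - anchor_h / 2   # top-left y
    x2 = x1 + anchor_w      # bottom-right x
    y2 = y1 + anchor_h      # bottom-right y
    return np.stack([x1, y1, x2, y2])

# Illustrative values only (assumed, not taken from the article's config)
boxes = anchor_grid(rows=2, cols=3, anchor_size=128, anchor_ratio=(1, 2), rpn_stride=16)
print(boxes.shape)  # (4, 2, 3)
```

Note that np.meshgrid takes the column range first, so both X and Y come back with shape (rows, cols), matching the layout of the feature map.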

import numpy as np

def rpn_to_roi(rpn_cls_layer, rpn_regr_layer, C, use_regr=True, max_boxes=300, overlap_thresh=0.9):
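    '''
    Connect the RPN to the ROI pooling layer:
    use the RPN outputs to find the corresponding ROIs.
    :param rpn_cls_layer:  classification output of the RPN
    :param rpn_regr_layer:  regression output of the RPN
    :param C:  config object (std_scaling, anchor_box_scales, anchor_box_ratios, rpn_stride)
    :param use_regr:  whether to adjust the anchors with the RPN regression predictions
    :param max_boxes:
    :param overlap_thresh:
    :return:
    '''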
    regr_layer = rpn_regr_layer / C.std_scaling

    anchor_sizes = C.anchor_box_scales
    anchor_ratios = C.anchor_box_ratios

    assert rpn_cls_layer.shape[0] == 1
    (rows, cols) = rpn_cls_layer.shape[1:3]

    curr_layer = 0
    # A.shape = (4 position values on the feature map (top-left and bottom-right corners), feature_map_height, feature_map_width, k (9))
    A = np.zeros((4, rpn_cls_layer.shape[1], rpn_cls_layer.shape[2], rpn_cls_layer.shape[3]))
    for anchor_size in anchor_sizes:
        for anchor_ratio in anchor_ratios:

            anchor_x = (anchor_size * anchor_ratio[0])/C.rpn_stride   # width of this anchor on the feature map
            anchor_y = (anchor_size * anchor_ratio[1])/C.rpn_stride   # height of this anchor on the feature map
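            # if dim_ordering == 'th':
            #     regr = regr_layer[0, 4 * curr_layer:4 * curr_layer + 4, :, :]
            # else:
            #     regr = regr_layer[0, :, :, 4 * curr_layer:4 * curr_layer + 4]  # regression values for the current anchor
            #     regr = np.transpose(regr, (2, 0, 1))
            # Note: regr below has shape (rows, cols, 4), while A[:, :, :, curr_layer] has shape (4, rows, cols).
            # If apply_regr_np expects both in (4, rows, cols) layout, transpose regr as in the commented-out branch above.
            regr = regr_layer[0, :, :, 4 * curr_layer:4 * curr_layer + 4]  # regression values for the current anchor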
            X, Y = np.meshgrid(np.arange(cols), np.arange(rows))

            A[0, :, :, curr_layer] = X - anchor_x/2   # top-left x coordinate
            A[1, :, :, curr_layer] = Y - anchor_y/2   # top-left y coordinate
            A[2, :, :, curr_layer] = anchor_x   # temporarily holds the anchor width
            A[3, :, :, curr_layer] = anchor_y   # temporarily holds the anchor height

            if use_regr:
                # adjust the anchor position using the predictions from the RPN regression layer
                A[:, :, :, curr_layer] = apply_regr_np(A[:, :, :, curr_layer], regr)

            # width and height must be at least 1
            A[2, :, :, curr_layer] = np.maximum(1, A[2, :, :, curr_layer])
            A[3, :, :, curr_layer] = np.maximum(1, A[3, :, :, curr_layer])
            A[2, :, :, curr_layer] += A[0, :, :, curr_layer]  # bottom-right x coordinate
            A[3, :, :, curr_layer] += A[1, :, :, curr_layer]  # bottom-right y coordinate

            # make sure the anchor does not extend beyond the feature map
            A[0, :, :, curr_layer] = np.maximum(0, A[0, :, :, curr_layer])