函数功能介绍
这个函数是用来将RPN的输出转变为object proposals的。作者新增了ProposalLayer类,这个类中,重新了set_up和forward函数,其中forward实现了:生成锚点box、对于每个锚点提供box的参数细节、将预测框切成图像、删除宽、高小于阈值的框、将所有的(proposal, score) 对排序、获取 pre_nms_topN proposals、获取NMS 、获取 after_nms_topN proposals。
源码注释
import caffe
import numpy as np
import yaml
from fast_rcnn.config import cfg
from generate_anchors import generate_anchors
from fast_rcnn.bbox_transform import bbox_transform_inv, clip_boxes
from fast_rcnn.nms_wrapper import nms
DEBUG = False
class ProposalLayer(caffe.Layer):
"""
Outputs object detection proposals by applying estimated bounding-box
transformations to a set of regular boxes (called "anchors").
"""
def setup(self, bottom, top):
# parse the layer parameter string, which must be valid YAML
layer_params = yaml.load(self.param_str_)
self._feat_stride = layer_params['feat_stride']
anchor_scales = layer_params.get('scales', (8, 16, 32))
self._anchors = generate_anchors(scales=np.array(anchor_scales))
self._num_anchors = self._anchors.shape[0]
if DEBUG:
print 'feat_stride: {}'.format(self._feat_stride)
print 'anchors:'
print self._anchors
# rois blob: holds R regions of interest, each is a 5-tuple
# (n, x1, y1, x2, y2) specifying an image batch index n and a
# rectangle (x1, y1, x2, y2)
top[0].reshape(1, 5)
# scores blob: holds scores for R regions of interest
if len(top) > 1:
top[1].reshape(1, 1, 1, 1)
def forward(self, bottom, top):
# Algorithm:
#
# for each (H, W) location i
# generate A anchor boxes centered on cell i
# apply predicted bbox deltas at cell i to each of the A anchors
# clip predicted boxes to image
# remove predicted boxes with either height or width
anchor_target_layer.py
生成每个锚点的训练目标和标签,将其分类为1 (object), 0 (not object) , -1 (ignore).当label>0,也就是有object时,将会进行box的回归。其中,forward函数功能:在每一个cell中,生成9个锚点,提供这9个锚点的细节信息,过滤掉超过图像的锚点,测量同GT的overlap。
proposal_target_layer.py
对于每一个object proposal 生成训练的目标和标签,分类标签从0-k,对于标签>0的box进行回归。(注意,同anchor_target_layer.py不同,两者一个是生成anchor,一个是生成proposal)
generate.py
使用一个rpn生成object proposals。
Faster RCNN 源码解读(4) – RoI(Region-of-Interest Pooling)
Faster RCNN 源码解读(5) –