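This paper tackles the problem of hand-designed anchors.
The idea itself is simple.
Several recent papers use an idea reminiscent of segmentation maps (dense, per-location prediction); FSAF is one interesting example.
This paper applies the same idea.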
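The overall framework is shown in the figure above.
It simply replaces the original RPN with Guided Anchoring; both modules output anchors together with the feature map.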
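Faster R-CNN builds its proposals on predefined anchors, whereas this paper directly predicts, for every location, whether it is the center of an anchor, as shown in the figure below:
For each location, the anchor's height and width are also predicted.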
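As a rough illustration of how the two predictions combine, the sketch below turns a per-location center score and a (dw, dh) shape prediction into anchors. The function name `decode_guided_anchors`, the threshold `loc_thr`, and the nested-list layout are hypothetical. The center mapping `(j + 0.5) * stride` and the decoding `w = sigma * stride * exp(dw)` with `sigma = 8` follow my reading of the Guided Anchoring paper and should be treated as assumptions, not as something the code in this post confirms.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def decode_guided_anchors(loc_logits, shape_pred, stride, sigma=8.0, loc_thr=0.5):
    """Return a list of (cx, cy, w, h) anchors in image coordinates.

    loc_logits: H x W nested list of raw center scores (before sigmoid).
    shape_pred: H x W nested list of (dw, dh) pairs.
    sigma and the half-pixel center offset are assumptions (see lead-in).
    """
    anchors = []
    for i, row in enumerate(loc_logits):        # i indexes rows (y direction)
        for j, logit in enumerate(row):         # j indexes columns (x direction)
            if sigmoid(logit) < loc_thr:
                continue                        # unlikely to be an anchor center
            dw, dh = shape_pred[i][j]
            cx = (j + 0.5) * stride
            cy = (i + 0.5) * stride
            w = sigma * stride * math.exp(dw)
            h = sigma * stride * math.exp(dh)
            anchors.append((cx, cy, w, h))
    return anchors


# Only the top-right location passes the threshold.
loc_logits = [[-5.0, 3.0], [-2.0, -4.0]]
shape_pred = [[(0.0, 0.0), (0.0, 0.0)], [(0.0, 0.0), (0.0, 0.0)]]
print(decode_guided_anchors(loc_logits, shape_pred, stride=16))
# → [(24.0, 8.0, 128.0, 128.0)]
```

Each location yields at most one anchor, so the number of anchors depends on the image content instead of on a fixed grid of scales and aspect ratios.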
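Of course, unlike Faster R-CNN, the feature map used by GA-RPN is no longer the original feature map but one that has gone through a further convolution, called feature adaption, which uses a DCN (deformable convolution) to capture more feature information.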
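To give a feel for what the deformable convolution in feature adaption does, here is a minimal 1-D sketch in plain Python: each kernel tap samples the input at its regular position plus an offset, using linear interpolation. The names `linear_sample` and `deform_conv1d` are made up for this illustration; the real layer works on 2-D feature maps, with offsets predicted from the anchor shape.

```python
import math


def linear_sample(signal, pos):
    """Linearly interpolate signal at a fractional position; outside is zero."""
    lo = math.floor(pos)
    frac = pos - lo

    def at(i):
        return signal[i] if 0 <= i < len(signal) else 0.0

    return (1.0 - frac) * at(lo) + frac * at(lo + 1)


def deform_conv1d(signal, weights, offsets):
    """1-D deformable convolution with 'same' zero padding.

    offsets[p][t] shifts the t-th kernel tap at output position p.
    With all offsets at zero this is an ordinary convolution.
    """
    k = len(weights)
    half = (k - 1) // 2
    out = []
    for p in range(len(signal)):
        acc = 0.0
        for t in range(k):
            pos = p + (t - half) + offsets[p][t]
            acc += weights[t] * linear_sample(signal, pos)
        out.append(acc)
    return out


signal = [0.0, 1.0, 2.0, 3.0, 4.0]
weights = [1.0, 1.0, 1.0]
zero_offsets = [[0.0] * 3 for _ in signal]
print(deform_conv1d(signal, weights, zero_offsets))
# → [1.0, 3.0, 6.0, 9.0, 7.0]
```

Shifting every tap at position 2 by 0.5 moves the sampling window to 1.5, 2.5 and 3.5, so the output there changes from 6.0 to 7.5. That is the sense in which the offsets let the receptive field follow the predicted anchor shape.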
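# Imports assumed from the mmdetection 1.x code base this snippet appears to come from.
import torch.nn as nn
from mmcv.cnn import normal_init
from mmdet.ops import DeformConv, MaskedConv2d


class FeatureAdaption(nn.Module):  # deformable convolution (DCN) step

    def __init__(self,
                 in_channels,
                 out_channels,
                 kernel_size=3,
                 deformable_groups=4):
        super(FeatureAdaption, self).__init__()
        # Two offset values per kernel tap.
        offset_channels = kernel_size * kernel_size * 2
        # The input is the 2-channel shape prediction (one anchor per location).
        self.conv_offset = nn.Conv2d(
            2, deformable_groups * offset_channels, 1, bias=False)
        self.conv_adaption = DeformConv(
            in_channels,
            out_channels,
            kernel_size=kernel_size,
            padding=(kernel_size - 1) // 2,
            deformable_groups=deformable_groups)
        self.relu = nn.ReLU(inplace=True)

    def init_weights(self):
        normal_init(self.conv_offset, std=0.1)
        normal_init(self.conv_adaption, std=0.01)

    def forward(self, x, shape):
        # detach() keeps gradients of this branch out of the shape prediction.
        offset = self.conv_offset(shape.detach())
        x = self.relu(self.conv_adaption(x, offset))
        return x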
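The channel arithmetic in `conv_offset` can be checked with a few lines of plain Python. The helper name `offset_conv_out_channels` is hypothetical; it only restates the expression used in the class above.

```python
def offset_conv_out_channels(kernel_size=3, deformable_groups=4):
    """Number of output channels of the 1x1 offset convolution."""
    # Every kernel tap needs two offset values (one per spatial axis).
    per_group = kernel_size * kernel_size * 2
    # Each deformable group gets its own set of offsets.
    return deformable_groups * per_group


print(offset_conv_out_channels())  # → 72
```

With the defaults shown above (a 3x3 kernel and 4 deformable groups), the 2-channel shape map is mapped to 72 offset channels at every location.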
# This method and forward_single below belong to the anchor head class,
# whose definition is omitted here.
def _init_layers(self):
    self.relu = nn.ReLU(inplace=True)
    # Predicts whether each location is an anchor center.
    self.conv_loc = nn.Conv2d(self.feat_channels, 1, 1)
    # Predicts the anchor width and height.
    self.conv_shape = nn.Conv2d(self.feat_channels, self.num_anchors * 2, 1)
    self.feature_adaption = FeatureAdaption(
        self.feat_channels,
        self.feat_channels,
        kernel_size=3,
        deformable_groups=self.deformable_groups)
    self.conv_cls = MaskedConv2d(
        self.feat_channels, self.num_anchors * self.cls_out_channels, 1)
    self.conv_reg = MaskedConv2d(
        self.feat_channels, self.num_anchors * 4, 1)
def forward_single(self, x):
    loc_pred = self.conv_loc(x)
    shape_pred = self.conv_shape(x)
    x = self.feature_adaption(x, shape_pred)
    # masked conv is only used during inference for speed-up
    if not self.training:
        mask = loc_pred.sigmoid()[0] >= self.loc_filter_thr
    else:
        mask = None
    cls_score = self.conv_cls(x, mask)
    bbox_pred = self.conv_reg(x, mask)
    return cls_score, bbox_pred, shape_pred, loc_pred
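The inference-time masking in `forward_single` can be illustrated with a small pure-Python sketch: a 1x1 convolution that is only evaluated where the center probability passes a threshold. The functions `location_mask` and `masked_conv1x1` and the nested-list layout are hypothetical stand-ins, and leaving skipped locations at zero is an assumption about how the masked layer behaves, not a description of `MaskedConv2d` itself.

```python
import math


def location_mask(loc_logits, thr):
    """H x W boolean mask: True where sigmoid(logit) >= thr."""
    return [[1.0 / (1.0 + math.exp(-v)) >= thr for v in row]
            for row in loc_logits]


def masked_conv1x1(feat, weight, bias, mask=None):
    """1x1 convolution over feat[C][H][W] with weight[O][C] and bias[O].

    If mask is given, locations where it is False are skipped and stay zero.
    """
    channels, height, width = len(feat), len(feat[0]), len(feat[0][0])
    out = [[[0.0] * width for _ in range(height)] for _ in weight]
    for y in range(height):
        for x in range(width):
            if mask is not None and not mask[y][x]:
                continue  # no computation for unlikely anchor centers
            for o, w_row in enumerate(weight):
                out[o][y][x] = bias[o] + sum(
                    w_row[c] * feat[c][y][x] for c in range(channels))
    return out


feat = [[[1.0, 2.0]], [[3.0, 4.0]]]       # C=2, H=1, W=2
weight, bias = [[1.0, 1.0]], [0.5]        # one output channel
mask = location_mask([[4.0, -6.0]], thr=0.5)
print(masked_conv1x1(feat, weight, bias, mask))  # → [[[4.5, 0.0]]]
print(masked_conv1x1(feat, weight, bias, None))  # → [[[4.5, 6.5]]]
```

Passing `mask=None`, as the training branch does, evaluates every location; at inference only the locations kept by the mask cost any computation.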