Flowchart: FPN-Pi → generate anchors (3*(H/2^i)*(W/2^i), 4) → sort (score top N1) → adjust bbox → NMS (post top N2) → Proposals_i
The output of this layer is an ndarray with shape=(2000, 5). The first element of the second dimension is batch_inx; since the batch size is 1, it is always 0.
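The per-level flow above (keep the top N1 scores, adjust the boxes, run NMS, keep the top N2) can be sketched in NumPy as follows. This is only a minimal sketch, not UPSNet's implementation: the function names, the (x1, y1, x2, y2) box layout, the (dx, dy, dw, dh) delta parameterization and the 0.7 NMS threshold are all assumptions made for illustration.

```python
import numpy as np


def decode_boxes(anchors, deltas):
    """Shift and rescale (x1, y1, x2, y2) anchors by (dx, dy, dw, dh) deltas (assumed layout)."""
    w = anchors[:, 2] - anchors[:, 0]
    h = anchors[:, 3] - anchors[:, 1]
    cx = anchors[:, 0] + 0.5 * w
    cy = anchors[:, 1] + 0.5 * h
    new_cx = cx + deltas[:, 0] * w
    new_cy = cy + deltas[:, 1] * h
    new_w = w * np.exp(deltas[:, 2])
    new_h = h * np.exp(deltas[:, 3])
    return np.stack([new_cx - 0.5 * new_w, new_cy - 0.5 * new_h,
                     new_cx + 0.5 * new_w, new_cy + 0.5 * new_h], axis=1)


def nms(boxes, scores, thresh):
    """Greedy NMS; returns the indices of the kept boxes, highest score first."""
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    order = np.argsort(-scores)
    keep = []
    while order.size > 0:
        i, rest = order[0], order[1:]
        keep.append(i)
        # Intersection of the current best box with all remaining boxes.
        iw = np.clip(np.minimum(boxes[i, 2], boxes[rest, 2])
                     - np.maximum(boxes[i, 0], boxes[rest, 0]), 0, None)
        ih = np.clip(np.minimum(boxes[i, 3], boxes[rest, 3])
                     - np.maximum(boxes[i, 1], boxes[rest, 1]), 0, None)
        inter = iw * ih
        iou = inter / (areas[i] + areas[rest] - inter)
        # Drop every remaining box that overlaps the kept one too much.
        order = rest[iou <= thresh]
    return np.array(keep, dtype=np.int64)


def proposals_for_level(anchors, scores, deltas,
                        pre_nms_top_n, post_nms_top_n, nms_thresh=0.7):
    """Hypothetical helper: top-N1 by score -> adjust bbox -> NMS -> top-N2."""
    top = np.argsort(-scores)[:pre_nms_top_n]
    boxes = decode_boxes(anchors[top], deltas[top])
    keep = nms(boxes, scores[top], nms_thresh)[:post_nms_top_n]
    # Prepend the batch index column (always 0 for batch size 1).
    batch_idx = np.zeros((keep.size, 1), dtype=boxes.dtype)
    return np.hstack([batch_idx, boxes[keep]])
```

Each returned row has the same (batch_inx, x1, y1, x2, y2) layout as the (2000, 5) output described above.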
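3.2 PyramidMaskTarget
This layer likewise has no trainable parameters. Its two main functions are add_proposals and sample_rois.
add_proposals: this function appends the 2000 proposals to the roidb annotation of a single image. The initial roidb is:
{
'boxes': ndarray, shape=(8, 4),
'segms': list, len=8,
'seg_areas': list, len=8,
'gt_classes': ndarray, len=8, [1 1 1 3 3 3 3 6],
'gt_overlaps': sparse matrix, shape=(num_box, num_class), holding the overlap ratio between each bbox and the gt boxes; since every bbox is currently a gt box, only the entry at its own class is 1,
'is_crowd': ndarray, len=8,
'box_to_gt_ind_map': ndarray, shape=(8,), the index of the gt box that each bbox corresponds to
}
The overlap ratios between the 2000 proposals produced by the RPN and the gt boxes are then computed, and the following fields are updated:
boxes: the 2000 proposals are appended directly to the list
seg_areas: 2000 zeros are appended
gt_classes: 2000 zeros are appended
gt_overlaps: 2000 values are appended; 0 if a proposal overlaps no gt box, and the maximum overlap if it overlaps one or more
box_to_gt_ind_map: 2000 values are appended; -1 if there is no overlap, otherwise the index (within gt_classes) of the closest gt box
sample_rois: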
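Compute the maximum overlap between each bbox and the gt boxes and split by a threshold: above the threshold counts as foreground, below it as background.
Sample the foreground and background rois so that they total rois-per-image, which is set to 512 in the network.
Compute the coordinate transform coefficients of the 512 bboxes relative to their gt boxes, i.e. bbox-targets.
Change the representation of bbox-targets from [class, dx, dy, dw, dh] to [0,0,0,0,…,dx,dy,dw,dh,…,0,0,0,0], len=num-class * 4. Every four numbers correspond to one class, similar to one-hot encoding: only the slot of the roi's own class holds non-zero values.
Build a 28*28 mask for each fg-roi.
{
'labels_int32': ndarray, (512,); the corresponding class if the roi is a gt box, otherwise 0,
'rois': (512, 5); the first element of the second dimension is batch-inx, always 0,
'bbox-targets': shape (512, 36),
'bbox-inside-weights': (512, 36); matches bbox-targets, with 1, 1, 1, 1 at the slot of the corresponding class,
'bbox-outside-weights': (512, 36); matches bbox-inside-weights, with 1 wherever that is non-zero,
'nongt_inds': indices of the rois that are not gt boxes, (504,) -> there are 8 gt boxes,
'mask_rois': the foreground rois, shape (number of foreground rois, batch-inx + four box coordinates) = (9, 5),
'rois_has_mask_int32': whether a roi has a mask, i.e. whether it is foreground; shape=(512,); 1 if so, 0 otherwise,
'mask_int32': the 28 * 28 mask, adjusted to the width/height ratio of the fg-roi and represented like bbox-targets; shape = (number of foreground rois, number of classes × 28 * 28); the 28 * 28 consecutive entries of the roi's own class hold the mask values (0 or 1), and the entries of all other classes are -1
}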
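Two of the steps above lend themselves to a short sketch: matching proposals to gt boxes by maximum overlap (the gt_overlaps and box_to_gt_ind_map bookkeeping of add_proposals), and expanding bbox-targets into the per-class layout of length num-class * 4. The code below is an illustrative NumPy sketch under stated assumptions, not the original source: the function names are hypothetical, boxes are assumed to be (x1, y1, x2, y2), and rows labelled 0 (background) are assumed to keep all-zero targets and weights.

```python
import numpy as np


def box_iou(boxes, gt_boxes):
    """Pairwise IoU between (N, 4) boxes and (M, 4) gt boxes, shape (N, M)."""
    x1 = np.maximum(boxes[:, None, 0], gt_boxes[None, :, 0])
    y1 = np.maximum(boxes[:, None, 1], gt_boxes[None, :, 1])
    x2 = np.minimum(boxes[:, None, 2], gt_boxes[None, :, 2])
    y2 = np.minimum(boxes[:, None, 3], gt_boxes[None, :, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    gt_area = (gt_boxes[:, 2] - gt_boxes[:, 0]) * (gt_boxes[:, 3] - gt_boxes[:, 1])
    return inter / (area[:, None] + gt_area[None, :] - inter)


def match_proposals(proposals, gt_boxes):
    """Return (max overlap, matched gt index) per proposal; index is -1 without overlap."""
    iou = box_iou(proposals, gt_boxes)
    max_overlap = iou.max(axis=1)
    gt_ind = iou.argmax(axis=1)
    gt_ind[max_overlap == 0] = -1
    return max_overlap, gt_ind


def expand_bbox_targets(labels, deltas, num_classes):
    """Spread (dx, dy, dw, dh) into the 4-wide slot of each roi's class."""
    n = labels.shape[0]
    targets = np.zeros((n, 4 * num_classes), dtype=np.float32)
    inside_w = np.zeros_like(targets)
    # Assumption: background rows (label 0) keep all-zero targets and weights.
    for i in np.where(labels > 0)[0]:
        start = 4 * labels[i]
        targets[i, start:start + 4] = deltas[i]
        inside_w[i, start:start + 4] = 1.0
    # Outside weights are 1 wherever the inside weights are non-zero.
    outside_w = (inside_w > 0).astype(np.float32)
    return targets, inside_w, outside_w
```

With num_classes=9, each row of the expanded targets has length 36, which matches the (512, 36) shapes listed above.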
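4. Semantic Segmentation Head
4.1 FCN-Head
FCN-Head is the branch that produces the semantic segmentation. It takes the four FPN branches as input: each branch goes through deformable convolution and is then rescaled to H/4, W/4; after concatenation, a further convolution yields the probability of each pixel over the 19 classes. The same head also computes the probability over the 19 classes for each pixel inside the gt-roi mask. The source code and structure are as follows: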
def forward(self, fpn_p2, fpn_p3, fpn_p4, fpn_p5, roi=None):
    # Run the same subnet on every FPN level.
    fpn_p2 = self.fcn_subnet(fpn_p2)
    fpn_p3 = self.fcn_subnet(fpn_p3)
    fpn_p4 = self.fcn_subnet(fpn_p4)
    fpn_p5 = self.fcn_subnet(fpn_p5)
    # Upsample P3-P5 by 2x, 4x and 8x so that they match the resolution of P2
    # (F is torch.nn.functional; the positional arguments are size=None, scale_factor).
    fpn_p3 = F.interpolate(fpn_p3, None, 2, mode='bilinear', align_corners=False)
    fpn_p4 = F.interpolate(fpn_p4, None, 4, mode='bilinear', align_corners=False)
    fpn_p5 = F.interpolate(fpn_p5, None, 8, mode='bilinear', align_corners=False)
    # Concatenate along the channel dimension and predict per-pixel class scores.
    feat = torch.cat([fpn_p2, fpn_p3, fpn_p4, fpn_p5], dim=1)
    score = self.score(feat)
    ret = {'fcn_score': score, 'fcn_feat': feat}
    # Optionally upsample the scores further.
    if self.upsample_rate != 1:
        output = F.interpolate(score, None, self.upsample_rate, mode='bilinear', align_corners=False)
        ret.update({'fcn_output': output})
    # When rois are given, also score the pooled roi features.
    if roi is not None:
        roi_feat = self.roipool(feat, roi)
        roi_score = self.score(roi_feat)
        ret.update({'fcn_roi_score': roi_score})
    return ret
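The shape arithmetic behind the upsampling factors 2, 4 and 8 in forward can be checked with a small pure-Python sketch. It assumes the usual FPN strides of 4, 8, 16 and 32 for P2 to P5 (consistent with the H/4, W/4 target stated above) and an input size divisible by 32; the function name and the channel count used in the usage example are hypothetical.

```python
def fcn_head_feat_shape(height, width, channels,
                        strides=(4, 8, 16, 32), scales=(1, 2, 4, 8)):
    """Return (C, H, W) of the concatenated feature for an input of height x width.

    strides: assumed FPN strides of P2..P5; scales: upsampling factor per level.
    """
    sizes = set()
    for stride, scale in zip(strides, scales):
        # Spatial size of the level, then the size after bilinear upsampling.
        sizes.add((height // stride * scale, width // stride * scale))
    if len(sizes) != 1:
        raise ValueError("levels do not align; input size should be divisible by 32")
    out_h, out_w = sizes.pop()
    # torch.cat along dim=1 stacks the channels of the four levels.
    return (channels * len(strides), out_h, out_w)
```

For example, fcn_head_feat_shape(1024, 2048, 128) gives (512, 256, 512): all four levels end up at H/4 × W/4, and concatenation multiplies the channel count by four.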