generate_anchors.py Code Walkthrough

License: CC 4.0 BY-SA


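This article explains the essence of the anchor mechanism in object detection, namely the SPP idea applied in reverse, and describes in detail how 9 base anchors are generated from different aspect ratios and scales.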


anchor
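First, what is an anchor, fundamentally? It is the SPP (spatial pyramid pooling) idea run in reverse. SPP takes inputs of different sizes and maps them to an output of one fixed size. Reversing it means starting from a fixed-size output and working back to inputs of different sizes. Next come the anchor window sizes, which are easy to follow: three areas (128^2, 256^2, 512^2) and, within each area, three aspect ratios (1:1, 1:2, 2:1). That gives 9 anchors in total, each with a different combination of area and shape.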

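Author: 马塔
Link: https://www.zhihu.com/question/42205480/answer/155759667
Source: Zhihu
Copyright belongs to the author. For commercial reproduction, contact the author for permission; for non-commercial reproduction, credit the source.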
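The size arithmetic in the quoted answer can be sketched as follows. This is an idealized illustration, not the repository's code: the helper name `anchor_shapes` is hypothetical, the ratio is assumed to mean height/width, and no rounding is applied, so the numbers differ slightly from what generate_anchors.py produces.

```python
import math


def anchor_shapes(sides=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Hypothetical helper: list (side, ratio, w, h) for every area/ratio pair.

    Each anchor keeps the area side**2 while its shape follows h / w = ratio
    (assumed convention), so w = side / sqrt(ratio) and h = side * sqrt(ratio).
    """
    shapes = []
    for side in sides:
        for ratio in ratios:
            w = side / math.sqrt(ratio)
            h = side * math.sqrt(ratio)
            shapes.append((side, ratio, w, h))
    return shapes


if __name__ == "__main__":
    for side, ratio, w, h in anchor_shapes():
        print(f"area={side}^2 ratio={ratio}: w={w:.1f} h={h:.1f}")
```

Three areas times three ratios gives the 9 shapes; every one of them has the same area as the square with the given side.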
/home/ubuntu/tf-faster-rcnn-master/lib/layer_utils/generate_anchors.py

# --------------------------------------------------------
# Faster R-CNN
# Copyright (c) 2015 Microsoft
# Licensed under The MIT License [see LICENSE for details]
# Written by Ross Girshick and Sean Bell
# --------------------------------------------------------
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import numpy as np


# Verify that we compute the same anchors as Shaoqing's matlab implementation:
#
#    >> load output/rpn_cachedir/faster_rcnn_VOC2007_ZF_stage1_rpn/anchors.mat
#    >> anchors
#
#    anchors =
#
#       -83   -39   100    56  w=184 h=96
#      -175   -87   192   104  w=368 h=192
#      -359  -183   376   200
#       -55   -55    72    72
#      -119  -119   136   136
#      -247  -247   264   264
#       -35   -79    52    96
#       -79  -167    96   184
#      -167  -343   184   360

# array([[ -83.,  -39.,  100.,   56.],
#       [-175.,  -87.,  192.,  104.],
#       [-359., -183.,  376.,  200.],
#       [ -55.,  -55.,   72.,   72.],
#       [-119., -119.,  136.,  136.],
#       [-247., -247.,  264.,  264.],
#       [ -35.,  -79.,   52.,   96.],
#       [ -79., -167.,   96.,  184.],
#       [-167., -343.,  184.,  360.]])

def generate_anchors(base_size=16, ratios=[0.5, 1, 2],
                     scales=2 ** np.arange(3, 6)):  # scales are 2^3, 2^4, 2^5, i.e. (8, 16, 32)
                     # with base_size=16 the nominal side lengths are 16*8, 16*16, 16*32
                     # (128, 256, 512), so the areas are 128^2, 256^2, 512^2
  """
  Generate anchor (reference) windows by enumerating aspect ratios X
  scales wrt a reference (0, 0, 15, 15) window.
  """

  base_anchor = np.array([1, 1, base_size, base_size]) - 1  # reference anchor (0, 0, 15, 15)
  ratio_anchors = _ratio_enum(base_anchor, ratios)  # (3, 4) array: one [xmin, ymin, xmax, ymax] row per aspect ratio
  anchors = np.vstack([_scale_enum(ratio_anchors[i, :], scales)
                       for i in range(ratio_anchors.shape[0])])  # rescale each ratio anchor by every scale, then stack the results row-wise into a (9, 4) array
  return anchors


def _whctrs(anchor):
  """
  Return width, height, x center, and y center for an anchor (window).
  """

  w = anchor[2] - anchor[0] + 1
  h = anchor[3] - anchor[1] + 1
  x_ctr = anchor[0] + 0.5 * (w - 1)
  y_ctr = anchor[1] + 0.5 * (h - 1)
  return w, h, x_ctr, y_ctr


def _mkanchors(ws, hs, x_ctr, y_ctr):
  """
  Given a vector of widths (ws) and heights (hs) around a center
  (x_ctr, y_ctr), output a set of anchors (windows), one
  [xmin, ymin, xmax, ymax] row per (w, h) pair.
  """

  ws = ws[:, np.newaxis]  # np.newaxis inserts a new axis: shape (n,) -> (n, 1)
  hs = hs[:, np.newaxis]
  anchors = np.hstack((x_ctr - 0.5 * (ws - 1),  # xmin
                       y_ctr - 0.5 * (hs - 1),  # ymin
                       x_ctr + 0.5 * (ws - 1),  # xmax
                       y_ctr + 0.5 * (hs - 1)))  # ymax; hstack joins the four columns side by side
  return anchors


def _ratio_enum(anchor, ratios):
  """
  Enumerate a set of anchors for each aspect ratio wrt an anchor.
  """

  w, h, x_ctr, y_ctr = _whctrs(anchor)  # width, height and center of the reference anchor
  size = w * h  # area
  size_ratios = size / ratios  # with ratios=[0.5, 1, 2] and h = w * ratio, this is the target w^2 for each ratio
  ws = np.round(np.sqrt(size_ratios))  # square root, then round to the nearest integer
  hs = np.round(ws * ratios)  # h = w * ratio, rounded to the nearest integer
  anchors = _mkanchors(ws, hs, x_ctr, y_ctr)  # build the [xmin, ymin, xmax, ymax] rows
  return anchors  # (3, 4) anchor array


def _scale_enum(anchor, scales):
  """
  Enumerate a set of anchors for each scale wrt an anchor.
  """

  w, h, x_ctr, y_ctr = _whctrs(anchor)
  ws = w * scales
  hs = h * scales
  anchors = _mkanchors(ws, hs, x_ctr, y_ctr)
  return anchors


if __name__ == '__main__':
  import time

  t = time.time()  # start timing
  a = generate_anchors()  # build the anchors
  print(time.time() - t)  # time taken to generate the anchors
  print(a)
  from IPython import embed
  # drop into an interactive IPython shell here for debugging
  embed()
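
To make the ratio and scale steps concrete, here is a condensed re-derivation of the same arithmetic as a standalone sketch. It is not the file's code: it folds `_ratio_enum`, `_scale_enum` and `_mkanchors` into a few NumPy expressions and relies on every anchor sharing the center (7.5, 7.5) of the (0, 0, 15, 15) reference window. With these formulas the corner coordinates come out one less than the MATLAB listing in the header comment (for example -84 rather than -83), which is consistent with the reference window being 0-based rather than (1, 1, 16, 16); the widths and heights agree.

```python
import numpy as np

base_size = 16
ratios = np.array([0.5, 1, 2])
scales = 2 ** np.arange(3, 6)            # [8, 16, 32]

# Ratio step: keep the area close to 16*16 while changing h / w.
size = base_size * base_size             # 256
ws = np.round(np.sqrt(size / ratios))    # [23., 16., 11.]
hs = np.round(ws * ratios)               # [12., 16., 22.]

# Scale step: multiply every ratio anchor by every scale (ratio-major order).
all_ws = np.outer(ws, scales).ravel()
all_hs = np.outer(hs, scales).ravel()

# All anchors share the center of the (0, 0, 15, 15) reference window.
ctr = 0.5 * (base_size - 1)              # 7.5
anchors = np.stack([ctr - 0.5 * (all_ws - 1),   # xmin
                    ctr - 0.5 * (all_hs - 1),   # ymin
                    ctr + 0.5 * (all_ws - 1),   # xmax
                    ctr + 0.5 * (all_hs - 1)],  # ymax
                   axis=1)

if __name__ == "__main__":
    print(anchors)
```

The first row works out to a 184 x 96 window, matching the `w=184 h=96` annotation in the header comment.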

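Differences between IPython and Python:
The developers of IPython took the basic ideas of the standard interpreter and improved on them substantially, producing a remarkably capable tool. Its home page describes it as "an enhanced interactive Python shell." It offers tab completion, object introspection, a powerful history mechanism, inline source editing, integration with the Python debugger, the %run mechanism, macros, the ability to create multiple environments, and access to the system shell.
1) The most visible difference between IPython and standard Python is that IPython numbers every line at the prompt (the prompts read In [1]:, In [2]:, and so on).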

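2) Tab completion

As an example, first import the sys module, then type sys. (note the trailing dot) and press Tab; IPython lists all the functions and attributes of the sys module.
Continuing the example, type sys? and press Enter to display the docstring and related information for the sys module. This is often a very handy feature too.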

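3) History

hist shows the history of what you have entered.
hist -n shows the history without the line numbers, so the code can be copied straight into a text editor. (That is the behavior of the older IPython releases this text describes; in current releases the numbers are omitted by default and -n turns them on.)
An even simpler approach is edit with Python slice syntax:
edit 4:7  # open input lines 4, 5 and 6 in the editor (as in a Python slice, the end index is excluded; use 4-7 to include line 7)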

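4) Breakpoint-style debugging: if your program is started from the command line, that is, by running python foo.py (as most Python programs are), you can also use IPython to stop and debug at any point in the program.
Add the following statements anywhere in your program:
from IPython.Shell import IPShellEmbed
IPShellEmbed([])()
Note: those two lines apply to IPython releases before 0.11. Version 0.11 changed a great deal and redesigned the API; if you are on 0.11 or later, the equivalent is:
from IPython import embed
embed()
Then run your program as usual. When execution reaches the inserted statement, you are dropped into an IPython shell. Try a few commands and you will see that the shell's environment is exactly the program's state at that point. You can inspect each variable, call functions, and print whatever values help you debug. When you exit IPython the program carries on, and naturally anything you executed in the IPython shell can affect the rest of the run.
This technique comes from (http://lukeplant.me.uk/blog/posts/exploratory-programming-with-ipython/). Think of it as pausing a program running at full speed, inspecting and modifying it while it is live, and then letting it continue. For example, when writing a web bot, after fetching each page you need to look at its content and work out how to extract the address of the next page. With this technique you can pause the program right after the page is fetched, try out different ways of processing it there, write the approach that works back into your code, and move on to the next step. This kind of workflow is what a dynamic language like Python makes possible.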

The above is excerpted from http://blog.sina.com.cn/s/blog_6fb8aa0d0101r5o1.html
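
The web-bot workflow described above might look like the following minimal sketch. The function `next_page_url` and its `debug` flag are hypothetical names made up for illustration, and the link extraction is deliberately naive; the only IPython call used is `from IPython import embed`, which requires IPython 0.11 or later to be installed.

```python
def next_page_url(page_html, debug=False):
    """Hypothetical bot step: return the first href found in a page, or None."""
    if debug:
        # Pause here and open an IPython shell with page_html in scope.
        # Experiment with parsing approaches, then exit the shell to resume.
        from IPython import embed
        embed()
    marker = 'href="'
    start = page_html.find(marker)
    if start == -1:
        return None
    start += len(marker)
    end = page_html.find('"', start)
    if end == -1:
        return None
    return page_html[start:end]


if __name__ == "__main__":
    html = '<a href="/page/2">next</a>'
    print(next_page_url(html))
```

Calling `next_page_url(html, debug=True)` from a script started with python foo.py opens the shell at that point; with the default `debug=False` the function runs straight through.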