[Code] Using CenterNet (Detection) (demo.py)


License: CC 4.0 BY-SA

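This post walks through running CenterNet, with the goal of getting it to work on the VOC dataset: environment setup, troubleshooting, and a step-by-step reading of the code.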


I. Running demo.py

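Following the README, I created a new environment and installed the requirements. Several problems came up along the way. I first installed torch 0.4.1, but torchvision would not install against it, so I upgraded to torch 1.0, installed torchvision, and then downgraded back to torch 0.4.1. DCNv2 also failed to compile at one point. I deleted the failed DCNv2 build, installed a few required components (I believe it was the ffi module), recompiled, and for reasons I cannot fully explain the build then succeeded. After that, external installed without problems.

In addition, because the demo displays images at the end, it kept reporting that the X server could not be reached, so I went to the trouble of installing Xmanager.

Once it is installed, entering export DISPLAY=<local IP>:0.0 in the server shell should make the images show up in Xmanager. I also ran Xstart in Xmanager and filled in the server address, username, and password; I believe Xstart has to be running for the display to work. The previous step may not be strictly necessary, but the firewall must be turned off.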
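Before launching the demo over SSH, it can save time to check whether DISPLAY is set at all. The helper below is a hypothetical convenience, not part of CenterNet. It only inspects the environment and does not verify that the X server is actually reachable.

```python
import os


def has_display(env=None):
    """Return True if a non-empty DISPLAY variable is present in env."""
    env = os.environ if env is None else env
    return bool(env.get("DISPLAY", "").strip())


if __name__ == "__main__":
    if has_display():
        print("DISPLAY =", os.environ["DISPLAY"])
    else:
        print("DISPLAY is not set; cv2.imshow will likely fail")
```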

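You can also run a remote shell from PyCharm via Tools -> Start SSH session.

That covers the preparation. Next I go through the code step by step, with the goal of getting it to run on the VOC dataset.

The command to run the demo is: python demo.py ctdet --demo ../images --load_model ../models/ctdet_coco_dla_2x.pth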

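Notes:

If you hit the ffi problem (torch.utils.ffi is deprecated.), follow the approach described here:

https://github.com/xingyizhou/CenterNet/issues/7

If the cv2 version is too new and images cannot be displayed, with an error like: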

cv2.error: OpenCV(4.1.1) /io/opencv/modules/highgui/src/window.cpp:627: error: (-2:Unspecified error) The function is not implemented. Rebuild the library with Windows, GTK+ 2.x or Cocoa support. If you are on Ubuntu or Debian, install libgtk2.0-dev and pkg-config, then re-run cmake or configure script in function 'cvShowImage'
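you can hard-code the self.ipynb check in show_all_imgs (around line 215 of src/lib/utils/debugger.py) to False, so that the cv2.imshow branch is skipped and the images are displayed with plt instead.

If displaying the image hangs:

you can try changing the waitKey call in show_all_imgs in src/lib/utils/debugger.py. That may help.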

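II. Preparing to step through the code

When I tried to debug in PyCharm, I got warning: Debugger speedups using cython not found. together with a Connected to pydev debugger error.

Step 1: switch PyCharm's interpreter to the one in the CenterNet environment, located under anaconda3/envs/CenterNet/bin.

Step 2: in pycharm_helpers/pydev/, run python setup_cython.py build_ext --inplace to build the debugger speedups. After that, debugging works.

A Python Console error appeared afterwards as well. The fix is to change the Python interpreter used by the console to the one in the CenterNet environment.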

III. Workflow

1. _init_paths: adds the absolute path .../CenterNet/src/lib to sys.path, at the front so that it takes priority. After that, imports resolve relative to the lib directory.
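The path setup described in step 1 can be sketched roughly as follows. The add_path helper and the assumption that the working directory is CenterNet/src are illustrative; the real _init_paths locates lib relative to its own file rather than the working directory.

```python
import os.path as osp
import sys


def add_path(path):
    """Put path at the front of sys.path unless it is already listed."""
    if path not in sys.path:
        sys.path.insert(0, path)


# Assumption: the current working directory is CenterNet/src.
lib_path = osp.join(osp.abspath("."), "lib")
add_path(lib_path)

# Modules under lib can now be imported by their short names,
# for example "from detectors.detector_factory import detector_factory".
print(lib_path in sys.path)
```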

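2. Import the opts class. opts wraps an argument parser, stored in self.parser. The parse method parses and preprocesses the arguments, and the init method takes the input arguments and finishes initialization.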

if __name__ == '__main__':
  opt = opts().init()
  demo(opt)

The parsed arguments are stored in opt and passed to the demo function.
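To make the structure of opts concrete, here is a much reduced stand-in built on argparse. The class name Opts and the choice of only three arguments are assumptions for illustration; the real opts class defines many more options and derives extra settings in init.

```python
import argparse


class Opts:
    """Simplified stand-in for CenterNet's opts class (three arguments only)."""

    def __init__(self):
        self.parser = argparse.ArgumentParser()
        self.parser.add_argument("task", nargs="?", default="ctdet")
        self.parser.add_argument("--demo", default="")
        self.parser.add_argument("--load_model", default="")

    def parse(self, args=None):
        # With args=None, argparse reads sys.argv[1:], i.e. the command line.
        return self.parser.parse_args(args)

    def init(self, args=None):
        opt = self.parse(args)
        # The real init derives further settings here; omitted in this sketch.
        return opt


opt = Opts().init(["ctdet", "--demo", "../images",
                   "--load_model", "../models/ctdet_coco_dla_2x.pth"])
print(opt.task, opt.demo, opt.load_model)
```

Passing an explicit list instead of relying on the command line is what makes it convenient to run the script from an IDE debugger.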

3. Two statements (after the import) build the detector:

from detectors.detector_factory import detector_factory
Detector = detector_factory[opt.task]
detector = Detector(opt)

detector_factory is a dict like the one below: each key is a task name and each value is the corresponding Detector class.

detector_factory = {
  'exdet': ExdetDetector, 
  'ddd': DddDetector,
  'ctdet': CtdetDetector,
  'multi_pose': MultiPoseDetector, 
}
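The dict-as-factory pattern can be tried out in isolation. In the sketch below the detector classes are empty placeholders and build_detector is a hypothetical helper that adds a clearer error for unknown task names; neither is CenterNet code.

```python
class BaseDetector:
    def __init__(self, opt):
        self.opt = opt


class CtdetDetector(BaseDetector):
    pass


class DddDetector(BaseDetector):
    pass


# Task name -> class; the class is only instantiated after the lookup.
detector_factory = {
    "ctdet": CtdetDetector,
    "ddd": DddDetector,
}


def build_detector(task, opt):
    """Look up the class for task and instantiate it with opt."""
    try:
        detector_cls = detector_factory[task]
    except KeyError:
        raise ValueError("unknown task: {!r}".format(task))
    return detector_cls(opt)


detector = build_detector("ctdet", opt={"debug": 0})
print(type(detector).__name__)
```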

detector is a detector instance initialized with opt. Since we are not using a webcam, the following code runs next:

  else:
    if os.path.isdir(opt.demo):
      image_names = []
      ls = os.listdir(opt.demo)
      for file_name in sorted(ls):
          ext = file_name[file_name.rfind('.') + 1:].lower()
          if ext in image_ext:
              image_names.append(os.path.join(opt.demo, file_name))
    else:
      image_names = [opt.demo]
    
    for (image_name) in image_names:
      ret = detector.run(image_name)
      time_str = ''
      for stat in time_stats:
        time_str = time_str + '{} {:.3f}s |'.format(stat, ret[stat])
      print(time_str)

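The first part builds image_names, the list of image files in the images folder.

The second part processes the images one by one with detector.run(image_name).

The third part prints the timings, time_str.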
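The first and third parts are easy to reproduce without a detector. The sketch below is a standalone approximation: IMAGE_EXT is an assumed extension list, and list_images and format_times are hypothetical helper names that do not exist in demo.py.

```python
import os

# Assumed set of accepted extensions; check image_ext in demo.py for the real one.
IMAGE_EXT = ("jpg", "jpeg", "png", "webp")


def list_images(path):
    """Return sorted image paths inside a folder, or [path] for a single file."""
    if os.path.isdir(path):
        return [os.path.join(path, name)
                for name in sorted(os.listdir(path))
                if os.path.splitext(name)[1][1:].lower() in IMAGE_EXT]
    return [path]


def format_times(ret, stats):
    """Build the same 'name 0.123s |' string that the demo prints."""
    return "".join("{} {:.3f}s |".format(stat, ret[stat]) for stat in stats)


print(format_times({"tot": 0.1234, "net": 0.05}, ["tot", "net"]))
# prints: tot 0.123s |net 0.050s |
```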

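The output is one line per image, listing each stat in time_stats followed by its time in seconds.

The part to focus on is detector.run(image_name).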

4. detector.run: the detector class is in detectors/ctdet.py, and its run method is inherited from base_detector.

IV. Analysis

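First you need some background on argument parsers. This blog post explains it well: https://www.cnblogs.com/lovemyspring/p/3214598.html

To supply the arguments from code instead of the command line, change the main block to: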

  args = ['ctdet',
          '--demo', '../images',
          '--load_model', '../models/ctdet_coco_dla_2x.pth']
  opt = opts().init(args=args)

Changing line 19 of demo.py as follows suppresses the image display:

  opt.debug = 0  # max(opt.debug, 1)

Next, the analysis of detector.run.

The first part creates the debugger:

  def run(self, image_or_path_or_tensor, meta=None):
    load_time, pre_time, net_time, dec_time, post_time = 0, 0, 0, 0, 0
    merge_time, tot_time = 0, 0
    debugger = Debugger(dataset=self.opt.dataset, ipynb=(self.opt.debug==3),
                        theme=self.opt.debugger_theme)
    start_time = time.time()
    pre_processed = False

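The Debugger code is in utils/debugger. The three arguments passed in are dataset='coco', ipynb=False, and theme='white'.

The debugger is then initialized with these parameters.

The second part reads the image and stores it in image: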

    if isinstance(image_or_path_or_tensor, np.ndarray):
      image = image_or_path_or_tensor
    elif type(image_or_path_or_tensor) == type (''): 
      image = cv2.imread(image_or_path_or_tensor)
    else:
      image = image_or_path_or_tensor['image'][0].numpy()
      pre_processed_images = image_or_path_or_tensor
      pre_processed = True

Preprocessing comes next. Here self.scales=[1.0] and pre_processed=False, so self.pre_process is called for each scale and produces images and meta.

The third part is preprocessing, in the pre_process method:

  def pre_process(self, image, scale, meta=None):
    height, width = image.shape[0:2]
    new_height = int(height * scale)
    new_width  = int(width * scale)
    if self.opt.fix_res:
      inp_height, inp_width = self.opt.input_h, self.opt.input_w
      c = np.array([new_width / 2., new_height / 2.], dtype=np.float32)
      s = max(height, width) * 1.0
    else:
      inp_height = (new_height | self.opt.pad) + 1
      inp_width = (new_width | self.opt.pad) + 1
      c = np.array([new_width // 2, new_height // 2], dtype=np.float32)
      s = np.array([inp_width, inp_height], dtype=np.float32)

    trans_input = get_affine_transform(c, s, 0, [inp_width, inp_height])
    resized_image = cv2.resize(image, (new_width, new_height))
    inp_image = cv2.warpAffine(
      resized_image, trans_input, (inp_width, inp_height),
      flags=cv2.INTER_LINEAR)
    inp_image = ((inp_image / 255. - self.mean) / self.std).astype(np.float32)

    images = inp_image.transpose(2, 0, 1).reshape(1, 3, inp_height, inp_width)
    if self.opt.flip_test:
      images = np.concatenate((images, images[:, :, :, ::-1]), axis=0)
    images = torch.from_numpy(images)
    meta = {'c': c, 's': s, 
            'out_height': inp_height // self.opt.down_ratio, 
            'out_width': inp_width // self.opt.down_ratio}
    return images, meta

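The image is first rescaled. Since the only scale is 1.0, the size is unchanged, and this yields resized_image. Then, because self.opt.fix_res = True, the image is warped to the fixed 512*512 network input size, producing inp_image. Finally the pixel values are normalized (scaled to [0, 1], then the mean is subtracted and the result divided by std) and the array is reshaped, giving images (a Tensor) and meta (a dict) containing:

{'c': the center of resized_image,

's': the longer side of the original image, max(height, width),

'out_height': inp_height // down_ratio, the height of the output heatmap,

'out_width': inp_width // down_ratio, the width of the output heatmap}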
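The size arithmetic in pre_process can be checked without any image. The sketch below mirrors the two branches in plain Python. The default values pad=31, down_ratio=4, and a 512*512 input are assumptions chosen to match the shapes quoted in this post, and make_meta is a hypothetical name.

```python
def padded_size(n, pad=31):
    """(n | pad) + 1: the next multiple of pad + 1 above n, for pad = 2**k - 1."""
    return (n | pad) + 1


def make_meta(height, width, scale=1.0, fix_res=True,
              input_h=512, input_w=512, pad=31, down_ratio=4):
    """Reproduce the c, s and output-size bookkeeping of pre_process."""
    new_h, new_w = int(height * scale), int(width * scale)
    if fix_res:
        # Fixed network input: keep the image centre and its longer side.
        inp_h, inp_w = input_h, input_w
        c = (new_w / 2.0, new_h / 2.0)
        s = float(max(height, width))
    else:
        # Keep the resolution, padded up so the network strides divide it.
        inp_h, inp_w = padded_size(new_h, pad), padded_size(new_w, pad)
        c = (float(new_w // 2), float(new_h // 2))
        s = (float(inp_w), float(inp_h))
    return {"c": c, "s": s,
            "out_height": inp_h // down_ratio,
            "out_width": inp_w // down_ratio}


print(padded_size(500))     # 512
print(make_meta(375, 500))  # heatmap of 128 x 128 for a 512 x 512 input
```

Note that the bitwise trick always moves to a strictly larger size, so an input that is already a multiple of 32 (for example 512) is padded up to the next one (544).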

The fourth part finishes preparing the data and moves it to the GPU:

      images = images.to(self.opt.device)
      torch.cuda.synchronize()
      pre_process_time = time.time()
      pre_time += pre_process_time - scale_start_time

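torch.cuda.synchronize() waits until all queued GPU work has finished. CUDA calls are asynchronous, so without it the timestamps would not reflect the real elapsed time.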
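The same timing pattern can be wrapped in a small helper. This is a sketch only: timed and sync are hypothetical names, and the import guard is there so the snippet also runs on a machine without PyTorch or without a GPU, in which case it degrades to plain wall-clock timing.

```python
import time

try:
    import torch
except ImportError:  # keeps the sketch runnable without PyTorch
    torch = None


def sync():
    """Wait for pending GPU work, if there is a GPU to wait for."""
    if torch is not None and torch.cuda.is_available():
        torch.cuda.synchronize()


def timed(fn, *args, **kwargs):
    """Run fn and return (result, seconds), synchronizing before and after."""
    sync()
    start = time.time()
    result = fn(*args, **kwargs)
    sync()
    return result, time.time() - start


result, seconds = timed(sum, [1, 2, 3])
print(result, "{:.6f}s".format(seconds))
```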

The fifth part runs the network and produces the output:

      output, dets, forward_time = self.process(images, return_time=True)

process is defined in ctdet.py:

  def process(self, images, return_time=False):
    with torch.no_grad():
      output = self.model(images)[-1]
      hm = output['hm'].sigmoid_()
      wh = output['wh']
      reg = output['reg'] if self.opt.reg_offset else None
      if self.opt.flip_test:
        hm = (hm[0:1] + flip_tensor(hm[1:2])) / 2
        wh = (wh[0:1] + flip_tensor(wh[1:2])) / 2
        reg = reg[0:1] if reg is not None else None
      torch.cuda.synchronize()
      forward_time = time.time()
      dets = ctdet_decode(hm, wh, reg=reg, K=self.opt.K)
      
    if return_time:
      return output, dets, forward_time
    else:
      return output, dets

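images is first passed through the model, which gives output. output has three parts:

{'hm': 1*80*128*128,

'reg': 1*2*128*128,

'wh': 1*2*128*128}

As the shapes show, only hm (the heatmap) is class-specific; reg (the center offset) and wh (width and height) are class-agnostic.

ctdet_decode then decodes these into dets, a 1*100*6 tensor: up to K=100 detections, each with 6 values (the four box coordinates, the score, and the class).

Finally, process returns output, dets, and forward_time.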
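The core idea of the decoding step, picking the K strongest local maxima of the heatmap, can be illustrated on a tiny single-class heatmap. The sketch below is a simplified pure-Python stand-in, not ctdet_decode itself: it handles one class, uses a 3*3 neighborhood, skips the reg and wh lookups that turn a peak into a box, and adds a min_score parameter (an assumption of this sketch) so that flat zero regions are not reported as peaks.

```python
def local_peaks(hm, k, min_score=0.0):
    """Return up to k (score, y, x) tuples that are 3x3 local maxima of hm."""
    h, w = len(hm), len(hm[0])
    peaks = []
    for y in range(h):
        for x in range(w):
            v = hm[y][x]
            if v <= min_score:
                continue
            # Largest value in the 3x3 window around (y, x), clipped at borders.
            neighborhood = [hm[j][i]
                            for j in range(max(0, y - 1), min(h, y + 2))
                            for i in range(max(0, x - 1), min(w, x + 2))]
            if v == max(neighborhood):
                peaks.append((v, y, x))
    peaks.sort(reverse=True)  # strongest first
    return peaks[:k]


heatmap = [
    [0.1, 0.2, 0.1, 0.0, 0.0],
    [0.2, 0.9, 0.2, 0.0, 0.0],
    [0.1, 0.2, 0.1, 0.3, 0.1],
    [0.0, 0.0, 0.3, 0.7, 0.3],
    [0.0, 0.0, 0.1, 0.3, 0.1],
]
print(local_peaks(heatmap, k=2))  # [(0.9, 1, 1), (0.7, 3, 3)]
```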

The sixth part post-processes the resulting dets:

      dets = self.post_process(dets, meta, scale)
      torch.cuda.synchronize()
      post_process_time = time.time()
      post_time += post_process_time - decode_time

      detections.append(dets)

post_process is defined in ctdet.py:

  def post_process(self, dets, meta, scale=1):
    dets = dets.detach().cpu().numpy()
    dets = dets.reshape(1, -1, dets.shape[2])
    dets = ctdet_post_process(
        dets.copy(), [meta['c']], [meta['s']],
        meta['out_height'], meta['out_width'], self.opt.num_classes)
    for j in range(1, self.num_classes + 1):
      dets[0][j] = np.array(dets[0][j], dtype=np.float32).reshape(-1, 5)
      dets[0][j][:, :4] /= scale
    return dets[0]

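ctdet_post_process does the post-processing.

The resulting dets has 80 entries, one per class. Judging by the indexing dets[0][j] with j running from 1 to num_classes, it is a dict keyed by class id rather than a tensor.

Each entry is an N * 5 ndarray (four box coordinates plus a score) holding the detections for that class. detections is a list of these dets, one per scale.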
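The regrouping from one flat list of detections into a per-class dict can be sketched like this. The function group_by_class is hypothetical, and the sketch deliberately leaves out the other job of ctdet_post_process, which is mapping the box coordinates from heatmap space back to the original image using c and s.

```python
def group_by_class(dets, num_classes):
    """Regroup rows of [x1, y1, x2, y2, score, cls] into {class_id: rows}.

    cls is assumed to be 0-based in the input, while the output keys run
    from 1 to num_classes, matching the loops in post_process.
    """
    grouped = {j: [] for j in range(1, num_classes + 1)}
    for x1, y1, x2, y2, score, cls in dets:
        grouped[int(cls) + 1].append([x1, y1, x2, y2, score])
    return grouped


dets = [
    [0, 0, 10, 10, 0.9, 0],
    [5, 5, 20, 20, 0.4, 2],
    [1, 1, 4, 4, 0.7, 0],
]
grouped = group_by_class(dets, num_classes=3)
print(grouped[1])  # the two class-0 detections, class column dropped
print(grouped[2])  # [] because nothing was detected for this class
```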

The seventh part is the final post-processing step:

    results = self.merge_outputs(detections)
    torch.cuda.synchronize()
    end_time = time.time()
    merge_time += end_time - post_process_time
    tot_time += end_time - start_time

merge_outputs is defined in ctdet.py:

  def merge_outputs(self, detections):
    results = {}
    for j in range(1, self.num_classes + 1):
      results[j] = np.concatenate(
        [detection[j] for detection in detections], axis=0).astype(np.float32)
      if len(self.scales) > 1 or self.opt.nms:
         soft_nms(results[j], Nt=0.5, method=2)
    scores = np.hstack(
      [results[j][:, 4] for j in range(1, self.num_classes + 1)])
    if len(scores) > self.max_per_image:
      kth = len(scores) - self.max_per_image
      thresh = np.partition(scores, kth)[kth]
      for j in range(1, self.num_classes + 1):
        keep_inds = (results[j][:, 4] >= thresh)
        results[j] = results[j][keep_inds]
    return results
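The last few lines of merge_outputs cap the total number of detections per image: the score of the max_per_image-th best detection across all classes becomes a threshold, and every class keeps only rows at or above it. A pure-Python sketch of that step follows; keep_top is a hypothetical name, plain lists and sorted stand in for the ndarrays and np.partition, and the soft_nms call is omitted.

```python
def keep_top(results, max_per_image):
    """Keep roughly the max_per_image highest-scoring rows across all classes.

    results maps class id -> list of [x1, y1, x2, y2, score] rows.
    """
    scores = [row[4] for rows in results.values() for row in rows]
    if len(scores) <= max_per_image:
        return results
    kth = len(scores) - max_per_image
    thresh = sorted(scores)[kth]  # score of the max_per_image-th best row
    return {j: [row for row in rows if row[4] >= thresh]
            for j, rows in results.items()}


results = {
    1: [[0, 0, 1, 1, 0.9], [0, 0, 1, 1, 0.2]],
    2: [[0, 0, 1, 1, 0.5]],
}
print(keep_top(results, max_per_image=2))
# {1: [[0, 0, 1, 1, 0.9]], 2: [[0, 0, 1, 1, 0.5]]}
```

Because the comparison is >=, ties at the threshold score can leave slightly more than max_per_image rows.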
