Understanding and Using crop_layer in Caffe

Sunshine_in_Moon

License: CC 4.0 BY-SA

Category: caffe

Article link: https://blog.youkuaiyun.com/sunshine_in_moon/article/details/52900338

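This post introduces what Crop_layer does in Caffe and how to use it, and reads through the source code to understand how the layer works. Crop_layer crops image data, and it is especially useful in fully convolutional networks for removing the extra border left over from padding.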

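I have been busy job hunting for a while and have not written a post in a long time. I saw many comments that I never replied to, and I apologize to everyone for that. I had also not used Caffe for a long time. The day before yesterday I noticed that Caffe had been updated with some new layers, so I picked out the ones that are used in papers and studied them.

This post explains how to understand and use crop_layer (I call it the crop layer). Some of the content is quoted from other blogs; I give the links so you can read the original posts in detail.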

1. What does Crop_layer do?

The main job of Crop_layer is cropping. Data in Caffe is stored as blobs. A blob is typically four-dimensional: (batch size, number of channels, height, width) = (N, C, H, W), with the axes indexed (0, 1, 2, 3).
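
To make the (N, C, H, W) layout concrete, here is a minimal sketch of how a four-index position maps to a position in flat, row-major memory. The helper name `nchw_offset` is hypothetical and is not a Caffe API; it only assumes the row-major ordering described above, where the width axis varies fastest.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Caffe): flat index of element (n, c, h, w)
// in a row-major blob whose shape is (N, C, H, W).
std::size_t nchw_offset(const std::vector<int>& shape,
                        int n, int c, int h, int w) {
  assert(shape.size() == 4);
  const int C = shape[1], H = shape[2], W = shape[3];
  // Width varies fastest, then height, then channel, then batch.
  return ((static_cast<std::size_t>(n) * C + c) * H + h) * W + w;
}
```

Neighbouring elements along the width axis sit next to each other in memory, which is why a crop can copy one row of the last axis at a time.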

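The Crop layer takes two inputs (bottom blobs), call them A and B, and produces one output (top), C.

(1) A is the bottom to be cropped; its size is (20, 50, 512, 512).

(2) B is the reference input for the crop; its size is (20, 10, 256, 256).

(3) C is the output, cropped from A, so its size is (20, 10, 256, 256).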

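Crop_layer has two important parameters, axis and offset. axis decides the first axis to crop: that axis and every axis after it are cropped. offset gives the position in A where the crop starts on each cropped axis, and the length of the crop is the length of the corresponding axis in B. For example:

(1) axis = 1, offset = (25, 128, 128)

(2) crop operation: C = A[:, 25:25+B.shape[1], 128:128+B.shape[2], 128:128+B.shape[3]].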
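
The rule in this example can be written out as a small shape calculation. This is only a sketch: `CropPlan` and `plan_crop` are hypothetical names, not Caffe code, and the sketch assumes one offset value is given for every cropped axis, as in the example above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical types and helper, not part of Caffe.
struct CropPlan {
  std::vector<int> out_shape;  // shape of the output C
  std::vector<int> start;      // start index in A on every axis
};

// Axes before `axis` are kept whole; axes from `axis` onward start at the
// given offset and take their length from B.
CropPlan plan_crop(const std::vector<int>& a_shape,
                   const std::vector<int>& b_shape,
                   int axis,
                   const std::vector<int>& offset) {
  assert(a_shape.size() == b_shape.size());
  assert(axis >= 0 && static_cast<std::size_t>(axis) <= a_shape.size());
  assert(offset.size() == a_shape.size() - static_cast<std::size_t>(axis));
  CropPlan plan;
  for (std::size_t i = 0; i < a_shape.size(); ++i) {
    if (static_cast<int>(i) < axis) {
      plan.out_shape.push_back(a_shape[i]);
      plan.start.push_back(0);
    } else {
      const int off = offset[i - static_cast<std::size_t>(axis)];
      // The cropped window must fit inside A.
      assert(off + b_shape[i] <= a_shape[i]);
      plan.out_shape.push_back(b_shape[i]);
      plan.start.push_back(off);
    }
  }
  return plan;
}
```

With A = (20, 50, 512, 512), B = (20, 10, 256, 256), axis = 1 and offset = (25, 128, 128), the plan keeps the batch axis whole and starts at 25, 128 and 128 on the remaining three axes, which matches the slice written above.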

The content above is taken from http://www.cnblogs.com/kunyuanjushi/p/5937083.html

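2. In which paper is Crop_layer actually used?

Anyone who has read Fully Convolutional Networks for Semantic Segmentation will probably remember it. I also ran into Crop_layer while studying this paper, in the network configuration files the authors provide.

So what does it do in this paper? See: https://www.zhihu.com/question/48260036

In short, the fully convolutional network pads the original image, so the result is somewhat larger than the original image, and the padding has to be cropped off at the end.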

3. The main event: reading the source code.

First, here is the key part of the source, the crop_copy function:

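template <typename Dtype>
void CropLayer<Dtype>::crop_copy(const vector<Blob<Dtype>*>& bottom,  // bottom[0] is the blob being cropped
             const vector<Blob<Dtype>*>& top,
             const vector<int>& offsets,
             vector<int> indices,  // all zeros on the initial call
             int cur_dim,  // starts at 0 on the initial call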
             const Dtype* src_data,
             Dtype* dest_data,
             bool is_forward) {
  if (cur_dim + 1 < top[0]->num_axes()) {
    // We are not yet at the final dimension, call copy recursively
    for (int i = 0; i < top[0]->shape(cur_dim); ++i) {
      indices[cur_dim] = i;
      crop_copy(bottom, top, offsets, indices, cur_dim+1,
                src_data, dest_data, is_forward);
    }
  } else {
    // We are at the last dimension, which is stored contiguously in memory
    for (int i = 0; i < top[0]->shape(cur_dim); ++i) {
      // prepare index vector reduced(red) and with offsets(off)
      std::vector<int> ind_red(cur_dim, 0);  // index vector into the top blob
      std::vector<int> ind_off(cur_dim+1, 0);  // index vector into the bottom blob, offsets included
      for (int j = 0; j < cur_dim; ++j) {  // for a 4-D blob cur_dim is 3 here, so j goes up to 2; ind_red starts out as all zeros
          ind_red[j] = indices[j];
          ind_off[j] = indices[j] + offsets[j];
      }
      ind_off[cur_dim] = offsets[cur_dim];  // last axis of ind_off
      // do the copy
      if (is_forward) {
        caffe_copy(top[0]->shape(cur_dim),
            src_data + bottom[0]->offset(ind_off),
            dest_data + top[0]->offset(ind_red));
      } else {
        // in the backwards pass the src_data is top_diff
        // and the dest_data is bottom_diff
        caffe_copy(top[0]->shape(cur_dim),
            src_data + top[0]->offset(ind_red),
            dest_data + bottom[0]->offset(ind_off));
      }
    }
  }
}
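
To make the recursion easier to follow, here is a minimal standalone sketch of the forward branch that runs on plain std::vector<float> buffers instead of Caffe blobs. Everything in it is an assumption made for illustration: `partial_offset`, `crop_forward` and `crop` are hypothetical names, `offsets` is assumed to hold one entry per axis (0 on axes that are not cropped), and each row of the last axis is copied exactly once.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: row-major flat offset of an index vector. Missing
// trailing indices count as 0, so a 3-entry index into a 4-D shape points at
// the start of a row of the last axis.
std::size_t partial_offset(const std::vector<int>& shape,
                           const std::vector<int>& idx) {
  std::size_t off = 0;
  for (std::size_t i = 0; i < shape.size(); ++i) {
    off *= static_cast<std::size_t>(shape[i]);
    if (i < idx.size()) off += static_cast<std::size_t>(idx[i]);
  }
  return off;
}

// Simplified stand-in for the forward branch of crop_copy.
void crop_forward(const std::vector<float>& src, const std::vector<int>& src_shape,
                  std::vector<float>& dst, const std::vector<int>& dst_shape,
                  const std::vector<int>& offsets,
                  std::vector<int> indices, int cur_dim) {
  const int last = static_cast<int>(dst_shape.size()) - 1;
  if (cur_dim < last) {
    // Not at the last axis yet: fix this axis' index and recurse.
    for (int i = 0; i < dst_shape[cur_dim]; ++i) {
      indices[cur_dim] = i;
      crop_forward(src, src_shape, dst, dst_shape, offsets, indices, cur_dim + 1);
    }
    return;
  }
  // Last axis: build the destination index (ind_red) and the shifted source
  // index (ind_off), then copy one contiguous row.
  std::vector<int> ind_red(indices.begin(), indices.begin() + cur_dim);
  std::vector<int> ind_off(cur_dim + 1, 0);
  for (int j = 0; j < cur_dim; ++j) ind_off[j] = indices[j] + offsets[j];
  ind_off[cur_dim] = offsets[cur_dim];
  const std::ptrdiff_t from = static_cast<std::ptrdiff_t>(partial_offset(src_shape, ind_off));
  const std::ptrdiff_t to = static_cast<std::ptrdiff_t>(partial_offset(dst_shape, ind_red));
  std::copy(src.begin() + from, src.begin() + from + dst_shape[cur_dim],
            dst.begin() + to);
}

// Convenience wrapper: allocates the output and starts the recursion at axis 0.
std::vector<float> crop(const std::vector<float>& src, const std::vector<int>& src_shape,
                        const std::vector<int>& dst_shape, const std::vector<int>& offsets) {
  assert(src_shape.size() == dst_shape.size() && offsets.size() == dst_shape.size());
  std::size_t count = 1;
  for (int d : dst_shape) count *= static_cast<std::size_t>(d);
  std::vector<float> dst(count, 0.0f);
  crop_forward(src, src_shape, dst, dst_shape, offsets,
               std::vector<int>(dst_shape.size(), 0), 0);
  return dst;
}
```

For a (1, 1, 3, 4) source holding the values 0 to 11, a (1, 1, 2, 2) output and offsets (0, 0, 1, 1), the sketch copies two rows of two elements each, starting at source positions 5 and 9.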

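Because this is fairly complicated to explain, I am posting my notes here. If you spot any mistakes, please leave a comment and let me know.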