[Code Reading] PVCNN

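This post explains how Point-Voxel CNN (PVCNN) works, covering both the voxelization and devoxelization steps. It reads through the PVCNN source code from the Python layer, through the C++ bindings, down to the CUDA kernels, and examines how the forward and backward passes are implemented, in particular the coordinate transform in voxelization and the trilinear interpolation in devoxelization.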


Point-Voxel CNN for Efficient 3D Deep Learning, NeurIPS 2019
code: https://github.com/mit-han-lab/pvcnn

For a walkthrough of the paper itself, see my other blog post.

PVConv is implemented in pvcnn-master/modules/pvconv.py. The core of its forward pass is:

voxel_features, voxel_coords = self.voxelization(features, coords)
voxel_features = self.voxel_layers(voxel_features)
voxel_features = F.trilinear_devoxelize(voxel_features, voxel_coords, self.resolution, self.training)
fused_features = voxel_features + self.point_features(features)

Voxelization

python

First, as the code above shows, the point-wise features and coords are passed in, so let's follow the call into pvcnn-master/modules/voxelization.py:

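import torch
import torch.nn as nn

import modules.functional as F  # the repo's own functional package (avg_voxelize, trilinear_devoxelize)


class Voxelization(nn.Module):
    def __init__(self, resolution, normalize=True, eps=0):
        super().__init__()
        self.r = int(resolution)
        self.normalize = normalize
        self.eps = eps

    def forward(self, features, coords):
        # coords: [B, 3, N]; no gradient flows through the coordinates
        coords = coords.detach()
        # Move coords into a local frame: subtract the per-cloud mean first
        norm_coords = coords - coords.mean(2, keepdim=True)
        if self.normalize:
            # Use the point farthest from the centre as the radius, divide every point by
            # 2 * radius so the coordinates fall in [-0.5, 0.5], then add 0.5 to get [0, 1]
            norm_coords = norm_coords / (norm_coords.norm(dim=1, keepdim=True).max(dim=2, keepdim=True).values * 2.0 + self.eps) + 0.5
        else:
            # Otherwise the centred coords are assumed to lie in [-1, 1] and are mapped to [0, 1]
            norm_coords = (norm_coords + 1) / 2.0
        # resolution is a positive integer: scale [0, 1] by r, then clamp to [0, r - 1]
        norm_coords = torch.clamp(norm_coords * self.r, 0, self.r - 1)
        # Rounding yields vox_coords: integers in [0, r - 1], i.e. r possible values per axis
        vox_coords = torch.round(norm_coords).to(torch.int32)
        # Forward pass: average the features of the points that fall into each voxel
        return F.avg_voxelize(features, vox_coords, self.r), norm_coords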
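To make the coordinate transform concrete, here is a minimal pure-Python sketch of the same arithmetic for a single point cloud with normalize=True. The function name voxel_coords_reference and the two-point cloud are made up for illustration, and only the standard library is used, so treat it as a reference for the math rather than as the repository's code.

```python
import math


def voxel_coords_reference(points, r, eps=0.0):
    """Mirror the normalize=True branch for one cloud.

    points: list of (x, y, z) floats. Returns (norm_coords, vox_coords).
    Hypothetical helper, not part of the PVCNN repository.
    """
    n = len(points)
    # Centre the cloud on its mean
    mean = [sum(p[d] for p in points) / n for d in range(3)]
    centred = [[p[d] - mean[d] for d in range(3)] for p in points]
    # Radius = distance of the farthest point from the centre
    radius = max(math.sqrt(sum(v * v for v in p)) for p in centred)
    scale = radius * 2.0 + eps
    norm_coords, vox_coords = [], []
    for p in centred:
        # [-0.5, 0.5] -> [0, 1] -> [0, r], then clamp to [0, r - 1]
        q = [min(max((v / scale + 0.5) * r, 0.0), float(r - 1)) for v in p]
        norm_coords.append(q)
        # Python's round, like torch.round, rounds halves to the nearest even integer
        vox_coords.append([round(v) for v in q])
    return norm_coords, vox_coords


norm, vox = voxel_coords_reference([(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)], r=4)
print(vox)  # [[0, 2, 2], [3, 2, 2]]
```

In this example the point at x = 1 maps to exactly r = 4 before clamping, which is why the clamp to r - 1 is needed to keep every voxel coordinate inside the grid.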

The features and vox_coords are passed on, so we follow the call again, into pvcnn-master/modules/functional/voxelization.py:

class AvgVoxelization(Function):
    @staticmethod
    def forward(ctx, features, coords, resolution):
        """
        :param ctx:
        :param features: Features of the point cloud, FloatTensor[B, C, N]
        :param coords: Voxelized Coordinates of each point, IntTensor[B, 3, N]
        :param resolution: Voxel resolution
        :return:
            Voxelized Features, FloatTensor[B, C, R, R, R]
        """
        features = features.contiguous()
        coords = coords.int().contiguous()
        b, c, _ = features.shape
        # Forward pass
        out, indices, counts = _backend.avg_voxelize_forward(features, coords, resolution)
        ctx.save_for_backward(indices, counts)
        return out.view(b, c, resolution, resolution, resolution)

    @staticmethod
    def backward(ctx, grad_output):
        """
        :param ctx:
        :param grad_output: gradient of output, FloatTensor[B, C, R, R, R]
        :return:
            gradient of inputs, FloatTensor[B, C, N]
        """
        b, c = grad_output.shape[:2]
        indices, counts = ctx.saved_tensors
        # Backward pass
        grad_features = _backend.avg_voxelize_backward(grad_output.contiguous().view(b, c, -1), indices, counts)
        return grad_features, None, None
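The backend calls hide what is actually computed, so here is a small pure-Python sketch of average voxelization for one cloud, written from the shapes and saved tensors described in the docstrings above. The function names are hypothetical, and the flattening order x * r * r + y * r + z is an assumption (the CUDA kernel is not shown here). The backward rule is simply the gradient of a mean: each point receives its voxel's gradient divided by the number of points in that voxel.

```python
def avg_voxelize_reference(features, vox_coords, r):
    """features: C lists of N floats; vox_coords: N integer (x, y, z) triples.

    Returns (out, ind, cnt) with out as C lists of r**3 floats.
    Hypothetical reference, not the repository's backend.
    """
    r2, r3 = r * r, r * r * r
    # Assumed flattening of (x, y, z) into a single voxel index
    ind = [x * r2 + y * r + z for x, y, z in vox_coords]
    # Count how many points fall into each voxel
    cnt = [0] * r3
    for i in ind:
        cnt[i] += 1
    # Accumulate value / count, which sums to the per-voxel mean
    out = [[0.0] * r3 for _ in features]
    for c, channel in enumerate(features):
        for i, value in zip(ind, channel):
            out[c][i] += value / cnt[i]
    return out, ind, cnt


def avg_voxelize_backward_reference(grad_out, ind, cnt):
    """grad_out: C lists of r**3 floats. Returns C lists of N floats."""
    return [[g[i] / cnt[i] for i in ind] for g in grad_out]


out, ind, cnt = avg_voxelize_reference(
    [[1.0, 3.0, 10.0]], [(0, 0, 0), (0, 0, 0), (1, 1, 1)], r=2)
print(out[0][0], out[0][7])  # 2.0 10.0
print(avg_voxelize_backward_reference([[1.0] * 8], ind, cnt))  # [[0.5, 0.5, 1.0]]
```

This also shows why forward saves indices and counts with ctx.save_for_backward: they are all the backward pass needs, and the features themselves are not required.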

cpp

The coords in the forward pass are actually vox_coords. Following the call once more into the C++ code, pvcnn-master/modules/functional/src/voxelization/vox.cpp:

/*
  Function: average pool voxelization (forward)
  Args:
    features: features, FloatTensor[b, c, n]
    coords  : coords of each point, IntTensor[b, 3, n]
    resolution : voxel resolution
  Return:
    out : outputs, FloatTensor[b, c, s], s = r ** 3
    ind : voxel index of each point, IntTensor[b, n]
    cnt : #points in each voxel index, IntTensor[b, s]
*/
std::vector<at::Tensor> avg_voxelize_forward(const at::Tensor features,
                                             const at::Tensor coords,
                                             const int resolution) {
   
  CHECK_CUDA(features);
  CHECK_CUDA(coords);
  CHECK_CONTIGUOUS(features);
  CHECK_CONTIGUOUS(coords);
  CHECK_IS_FLOAT(features);
  CHECK_IS_INT(coords);

  int b = features.size(0);
  int c = features.size(1);
  int n = features.size(2);
  int r = resolution;
  int r2 = r * r;
  int r3 = r2 * r;
  // Allocate the output tensors on the GPU
  at::Tensor ind = torch::zeros(
      {b, n}, at::device(features.device()).dtype(at::ScalarType::Int));
  at::Tensor out = torch::zeros(
      {b, c, r3}, at::device(features.device()).dtype(at::ScalarType::Float));
  at::Tensor cnt = torch::zeros(
      {b, r3}, at::device(features.device()).dtype(at::ScalarType::Int));
  // Call the function implemented in CUDA
  avg_voxelize(b, c, n, r, r2, r3, coords.data_ptr<int>(),
               features.data_ptr<float>(), ind.data_ptr<int>(),
               cnt.data_ptr<int>(), out.data_ptr<float>());
  // Return the tensors in the order listed under "Return" above
  return {out, ind, cnt};
}
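The wrapper allocates out as [b, c, r3], and the Python side later views it as [B, C, R, R, R]. For a contiguous tensor that view is row-major, so a flat slot s and a grid position (i, j, k) are related as in the sketch below. The helper names are hypothetical. Which spatial axis ends up as i, j or k is decided by the CUDA kernel, which is not shown here.

```python
def flatten_index(i, j, k, r):
    """Row-major position of grid cell (i, j, k) in a flat buffer of r**3 slots."""
    return (i * r + j) * r + k


def unflatten_index(s, r):
    """Inverse of flatten_index: recover (i, j, k) from the flat slot s."""
    return s // (r * r), (s // r) % r, s % r


r = 3
# Every flat slot maps to exactly one grid cell and back
assert all(flatten_index(*unflatten_index(s, r), r) == s for s in range(r ** 3))
print(flatten_index(1, 2, 0, r))  # 15
print(unflatten_index(15, r))     # (1, 2, 0)
```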
  