PVCNN: A Walkthrough of a Deep Learning Implementation for Point Cloud Processing

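This post explains how Point-Voxel CNN (PVCNN) works, covering both the voxelization and devoxelization steps. It reads through the PVCNN source code from Python to C++ to CUDA and examines the forward and backward passes in detail, in particular the coordinate transform in voxelization and the trilinear interpolation in devoxelization.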


Contents

Voxelization
- python
- cpp
- cu
  - Forward pass
  - Backward pass
devoxelization
- python
- cpp
- cu
  - Forward pass
  - Backward pass

Point-Voxel CNN for Efficient 3D Deep Learning, NeurIPS 2019
Code: https://github.com/mit-han-lab/pvcnn

For a reading of the paper itself, see my other blog post.

PVConv itself is implemented in pvcnn-master/modules/pvconv.py. The core of its forward pass is:

voxel_features, voxel_coords = self.voxelization(features, coords)
voxel_features = self.voxel_layers(voxel_features)
voxel_features = F.trilinear_devoxelize(voxel_features, voxel_coords, self.resolution, self.training)
fused_features = voxel_features + self.point_features(features)
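To make the shapes in these four lines concrete, here is a minimal sketch of the same two-branch idea in plain PyTorch. It is not the repository's module: `ToyPVConv` is a hypothetical name, the coordinates are assumed to be already scaled to [0, 1], the flat voxel index layout is an assumption, and devoxelization is simplified to a nearest-voxel lookup where the original uses trilinear interpolation.

```python
import torch
import torch.nn as nn


class ToyPVConv(nn.Module):
    """Hypothetical stand-in for PVConv; not the repository's module."""

    def __init__(self, in_channels, out_channels, resolution):
        super().__init__()
        self.r = resolution
        self.voxel_layers = nn.Conv3d(in_channels, out_channels, 3, padding=1)
        self.point_features = nn.Conv1d(in_channels, out_channels, 1)

    def forward(self, features, coords):
        # features: [B, C, N]; coords: [B, 3, N], assumed already scaled to [0, 1]
        b, c, n = features.shape
        r = self.r
        vox = (coords * r).long().clamp(0, r - 1)            # [B, 3, N]
        ind = vox[:, 0] * r * r + vox[:, 1] * r + vox[:, 2]  # [B, N], assumed layout
        # Voxelize: average the features of the points that share a voxel
        summed = features.new_zeros(b, c, r ** 3).scatter_add(
            2, ind.unsqueeze(1).expand(b, c, n), features)
        counts = features.new_zeros(b, r ** 3).scatter_add(
            1, ind, features.new_ones(b, n))
        grid = (summed / counts.clamp(min=1).unsqueeze(1)).view(b, c, r, r, r)
        # Voxel branch: 3D convolution on the dense grid
        grid = self.voxel_layers(grid)                       # [B, C_out, R, R, R]
        # Devoxelize: nearest-voxel lookup (simplification of trilinear interpolation)
        c_out = grid.shape[1]
        voxel_features = grid.view(b, c_out, -1).gather(
            2, ind.unsqueeze(1).expand(b, c_out, n))         # [B, C_out, N]
        # Fuse with the point branch, as in the last line of the excerpt
        return voxel_features + self.point_features(features)


layer = ToyPVConv(in_channels=4, out_channels=8, resolution=4)
fused = layer(torch.randn(2, 4, 100), torch.rand(2, 3, 100))
print(fused.shape)  # torch.Size([2, 8, 100])
```

The point of the sketch is only that both branches end up as [B, C_out, N] tensors, which is why they can be added element-wise.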

Voxelization

python

As the code above shows, the point-wise features and coords are passed into the voxelization module, so let's follow the call into pvcnn-master/modules/voxelization.py:

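import torch
import torch.nn as nn

import modules.functional as F  # the repo's own functional package, not torch.nn.functional


class Voxelization(nn.Module):
    def __init__(self, resolution, normalize=True, eps=0):
        super().__init__()
        self.r = int(resolution)
        self.normalize = normalize
        self.eps = eps

    def forward(self, features, coords):
        coords = coords.detach()
        # Move coords into a local frame: subtract the per-cloud mean first
        norm_coords = coords - coords.mean(2, keepdim=True)
        if self.normalize:
            # Take the distance of the farthest point as the radius and divide every point
            # by 2 * radius, which maps coordinates into [-0.5, 0.5]; then add 0.5 to get [0, 1]
            norm_coords = norm_coords / (norm_coords.norm(dim=1, keepdim=True).max(dim=2, keepdim=True).values * 2.0 + self.eps) + 0.5
        else:
            # (x + 1) / 2 maps [-1, 1] to [0, 1]
            norm_coords = (norm_coords + 1) / 2.0
        # resolution is a positive integer: scale norm_coords from [0, 1] to [0, r],
        # then clamp to [0, r - 1]
        norm_coords = torch.clamp(norm_coords * self.r, 0, self.r - 1)
        # round yields vox_coords: integers in [0, r - 1], r distinct values in total
        vox_coords = torch.round(norm_coords).to(torch.int32)
        # Forward pass: voxelize. The float norm_coords (not vox_coords) are returned alongside
        return F.avg_voxelize(features, vox_coords, self.r), norm_coords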
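A tiny worked example helps to see what the normalize, scale, clamp and round steps do to actual numbers. The sketch below repeats only the coordinate arithmetic of the `normalize=True` branch in plain PyTorch; `normalize_and_quantize` is a hypothetical helper name introduced for illustration, and the two-point cloud is made up.

```python
import torch


def normalize_and_quantize(coords, r, eps=0.0):
    """coords: FloatTensor[B, 3, N]. Hypothetical helper mirroring the normalize=True branch."""
    centered = coords - coords.mean(2, keepdim=True)
    # Largest distance from the centroid, one value per cloud: [B, 1, 1]
    radius = centered.norm(dim=1, keepdim=True).max(dim=2, keepdim=True).values
    unit = centered / (radius * 2.0 + eps) + 0.5           # in [0, 1]
    norm_coords = torch.clamp(unit * r, 0, r - 1)          # in [0, r - 1], still float
    vox_coords = torch.round(norm_coords).to(torch.int32)  # integer voxel indices
    return norm_coords, vox_coords


# Two points on the x axis at -1 and +1 (shape [B=1, 3, N=2])
coords = torch.tensor([[[-1.0, 1.0], [0.0, 0.0], [0.0, 0.0]]])
norm_coords, vox_coords = normalize_and_quantize(coords, r=4)
print(vox_coords[0].tolist())  # [[0, 3], [2, 2], [2, 2]]
```

With r = 4, the point at x = +1 first maps to 4.0 and is then clamped to 3, the last valid index; y and z sit at the center of the unit cube (0.5), which scales to 2.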

The features and vox_coords are passed on to avg_voxelize, so we follow them into pvcnn-master/modules/functional/voxelization.py:

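from torch.autograd import Function

from modules.functional.backend import _backend  # the repo's compiled C++/CUDA extension


class AvgVoxelization(Function):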
    @staticmethod
    def forward(ctx, features, coords, resolution):
        """
        :param ctx:
        :param features: Features of the point cloud, FloatTensor[B, C, N]
        :param coords: Voxelized Coordinates of each point, IntTensor[B, 3, N]
        :param resolution: Voxel resolution
        :return:
            Voxelized Features, FloatTensor[B, C, R, R, R]
        """
        features = features.contiguous()
        coords = coords.int().contiguous()
        b, c, _ = features.shape
        # Forward pass
        out, indices, counts = _backend.avg_voxelize_forward(features, coords, resolution)
        ctx.save_for_backward(indices, counts)
        return out.view(b, c, resolution, resolution, resolution)

    @staticmethod
    def backward(ctx, grad_output):
        """
        :param ctx:
        :param grad_output: gradient of output, FloatTensor[B, C, R, R, R]
        :return:
            gradient of inputs, FloatTensor[B, C, N]
        """
        b, c = grad_output.shape[:2]
        indices, counts = ctx.saved_tensors
        # Backward pass
        grad_features = _backend.avg_voxelize_backward(grad_output.contiguous().view(b, c, -1), indices, counts)
        return grad_features, None, None
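Since the forward pass saves only `indices` and `counts` for the backward pass, a plain-PyTorch reference can make the expected behavior easier to picture: each voxel holds the mean of its points' features, so each point should receive its voxel's gradient divided by the number of points in that voxel. The sketch below is a reference under assumptions, not the CUDA kernel: `avg_voxelize_reference` is a hypothetical name and the x-major flat index layout is assumed.

```python
import torch


def avg_voxelize_reference(features, vox_coords, r):
    """features: FloatTensor[B, C, N]; vox_coords: IntTensor[B, 3, N] with values in [0, r - 1]."""
    b, c, n = features.shape
    vc = vox_coords.long()
    # One flat voxel index per point (assumed x-major layout): [B, N]
    ind = vc[:, 0] * r * r + vc[:, 1] * r + vc[:, 2]
    # Number of points that fall into each voxel: [B, r^3]
    cnt = features.new_zeros(b, r ** 3).scatter_add(1, ind, features.new_ones(b, n))
    # Sum of features per voxel, then divide by the count (empty voxels stay 0)
    summed = features.new_zeros(b, c, r ** 3).scatter_add(
        2, ind.unsqueeze(1).expand(b, c, n), features)
    out = summed / cnt.clamp(min=1).unsqueeze(1)
    return out.view(b, c, r, r, r), ind, cnt


# Three points, one channel: the first two share voxel (0, 0, 0), the third is in (1, 1, 1)
features = torch.tensor([[[1.0, 3.0, 10.0]]], requires_grad=True)
vox_coords = torch.tensor([[[0, 0, 1], [0, 0, 1], [0, 0, 1]]], dtype=torch.int32)
out, ind, cnt = avg_voxelize_reference(features, vox_coords, r=2)
out.sum().backward()
print(out[0, 0, 0, 0, 0].item(), out[0, 0, 1, 1, 1].item())  # 2.0 10.0
print(features.grad[0, 0].tolist())  # [0.5, 0.5, 1.0]
```

Here autograd derives the gradient automatically; the custom `Function` above instead hands `indices` and `counts` to a hand-written backward kernel, which avoids storing anything else from the forward pass.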

cpp

The coords in the forward pass are actually vox_coords. Following the call once more, this time into the C++ code in pvcnn-master/modules/functional/src/voxelization/vox.cpp:

/*
  Function: average pool voxelization (forward)
  Args:
    features: features, FloatTensor[b, c, n]
    coords  : coords of each point, IntTensor[b, 3, n]
    resolution : voxel resolution
  Return:
    out : outputs, FloatTensor[b, c, s], s = r ** 3
    ind : voxel index of each point, IntTensor[b, n]
    cnt : #points in each voxel index, IntTensor[b, s]
*/
std::vector<at::Tensor> avg_voxelize_forward(const at::Tensor features,
                                             const at::Tensor coords,
                                             const int resolution) {
   
  CHECK_CUDA(features);
  CHECK_CUDA(coords);
  CHECK_CONTIGUOUS(features);
  CHECK_CONTIGUOUS(coords);
  CHECK_IS_FLOAT(features);
  CHECK_IS_INT(coords);

  int b = features.size(0);
  int c = features.size(1);
  int n = features.size(2);
  int r = resolution;
  int r2 = r * r;
  int r3 = r2 * r;
  // Allocate the output tensors in GPU memory
  at::Tensor ind = torch::zeros(
      {b, n}, at::device(features.device()).dtype(at::ScalarType::Int));
  at::Tensor out = torch::zeros(
      {b, c, r3}, at::device(features.device()).dtype(at::ScalarType::Float));
  at::Tensor cnt = torch::zeros(
      {b, r3}, at::device(features.device()).dtype(at::ScalarType::Int));
  // Call the function implemented in CUDA
  avg_voxelize(b, c, n, r, r2, r3, coords.data_ptr<int>(),
               features.data_ptr<float>(), ind.data_ptr<int>(),
               cnt.data_ptr<int>(), out.data_ptr<float>());