Point-Voxel CNN for Efficient 3D Deep Learning, NeurIPS 2019
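Code: https://github.com/mit-han-lab/pvcnn
For a reading of the paper itself, see my other blog post.
The concrete implementation of PVConv is in pvcnn-master/modules/pvconv.py; the core of its forward pass is: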
voxel_features, voxel_coords = self.voxelization(features, coords)
voxel_features = self.voxel_layers(voxel_features)
voxel_features = F.trilinear_devoxelize(voxel_features, voxel_coords, self.resolution, self.training)
fused_features = voxel_features + self.point_features(features)
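The third line above maps the voxel features back onto the points, using each point's continuous (unrounded) grid coordinate. Here is a minimal pure-Python sketch of what that step amounts to, assuming `F.trilinear_devoxelize` performs standard trilinear interpolation as its name suggests. The helper `trilinear_sample` is hypothetical: it handles one channel of a dense r x r x r grid stored as nested lists and assumes r >= 2, whereas the real kernel is written in CUDA and may treat batching and boundaries differently.

```python
import math


def trilinear_sample(grid, x, y, z):
    """Interpolate grid[x][y][z] at float coordinates in [0, r - 1] (hypothetical helper)."""
    r = len(grid)
    # Lower corner of the enclosing cell, capped at r - 2 so the upper corner stays in range.
    lo = [min(int(math.floor(v)), r - 2) for v in (x, y, z)]
    frac = [v - l for v, l in zip((x, y, z), lo)]
    value = 0.0
    # Blend the 8 cell corners, each weighted by the product of its per-axis weights.
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                w = ((frac[0] if dx else 1.0 - frac[0])
                     * (frac[1] if dy else 1.0 - frac[1])
                     * (frac[2] if dz else 1.0 - frac[2]))
                value += w * grid[lo[0] + dx][lo[1] + dy][lo[2] + dz]
    return value


# A grid holding a linear function; trilinear interpolation reproduces it between voxels.
r = 3
grid = [[[x + 2 * y + 3 * z for z in range(r)] for y in range(r)] for x in range(r)]
sampled = trilinear_sample(grid, 0.5, 1.25, 2.0)
```

Because the interpolation needs fractional positions, the float grid coordinates have to be passed along to this step, not only the rounded integer voxel coordinates.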
Voxelization
First, as the code above shows, the point-wise features and coords are passed in, so let's follow the call into pvcnn-master/modules/voxelization.py:
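import torch
import torch.nn as nn

import modules.functional as F  # the repo's own functional package, not torch.nn.functional


class Voxelization(nn.Module):
    def __init__(self, resolution, normalize=True, eps=0):
        super().__init__()
        self.r = int(resolution)
        self.normalize = normalize
        self.eps = eps

    def forward(self, features, coords):
        coords = coords.detach()
        # Move coords into a local coordinate frame: subtract the mean first
        norm_coords = coords - coords.mean(2, keepdim=True)
        if self.normalize:
            # Use the distance to the farthest point as the radius, divide every point by 2 * radius
            # so the coordinates fall in [-0.5, 0.5], then add 0.5 to shift them to [0, 1]
            norm_coords = norm_coords / (norm_coords.norm(dim=1, keepdim=True).max(dim=2, keepdim=True).values * 2.0 + self.eps) + 0.5
        else:
            norm_coords = (norm_coords + 1) / 2.0
        # resolution is a positive integer: scale norm_coords from [0, 1] by r, then clamp to [0, r - 1]
        norm_coords = torch.clamp(norm_coords * self.r, 0, self.r - 1)
        # Rounding gives vox_coords, integers in [0, r - 1], i.e. r possible values per axis
        vox_coords = torch.round(norm_coords).to(torch.int32)
        # Forward pass: voxelize the features; the float norm_coords are returned as well
        return F.avg_voxelize(features, vox_coords, self.r), norm_coords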
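To make the normalization concrete, here is a small worked example in pure Python that mirrors the steps of `forward` for a single point cloud. The function name `voxelize_coords` and the list-of-tuples layout are my own choices for illustration; the module itself works on batched `[B, 3, N]` tensors.

```python
import math


def voxelize_coords(points, r, normalize=True, eps=0.0):
    """Map (x, y, z) points to float grid coords and integer voxel coords (illustrative)."""
    n = len(points)
    # Center the cloud by subtracting the per-axis mean.
    mean = [sum(p[a] for p in points) / n for a in range(3)]
    centered = [[p[a] - mean[a] for a in range(3)] for p in points]
    if normalize:
        # Radius = distance from the centroid to the farthest point.
        radius = max(math.sqrt(sum(v * v for v in p)) for p in centered)
        unit = [[v / (2.0 * radius + eps) + 0.5 for v in p] for p in centered]
    else:
        # This branch only lands in [0, 1] if the centered input already lies in [-1, 1].
        unit = [[(v + 1.0) / 2.0 for v in p] for p in centered]
    # Scale by r, then clamp to [0, r - 1].
    norm = [[min(max(v * r, 0.0), r - 1.0) for v in p] for p in unit]
    # Python's round() resolves ties to the nearest even integer, as torch.round does.
    vox = [[int(round(v)) for v in p] for p in norm]
    return norm, vox


# Four corners of a 2 x 2 square in the z = 0 plane, voxelized at resolution 4.
points = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0), (2.0, 2.0, 0.0)]
norm, vox = voxelize_coords(points, r=4)
# vox == [[1, 1, 2], [3, 1, 2], [1, 3, 2], [3, 3, 2]]
```

Note that multiplying by r can produce values up to r, which is why the clamp to r - 1 is needed: in the example the upper corners land at about 3.41 before clamping and at 3.0 after.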
The features and vox_coords are then passed to F.avg_voxelize, so we follow that call as well, into pvcnn-master/modules/functional/voxelization.py:
from torch.autograd import Function

from modules.functional.backend import _backend


class AvgVoxelization(Function):
    @staticmethod
    def forward(ctx, features, coords, resolution):
        """
        :param ctx:
        :param features: Features of the point cloud, FloatTensor[B, C, N]
        :param coords: Voxelized Coordinates of each point, IntTensor[B, 3, N]
        :param resolution: Voxel resolution
        :return:
            Voxelized Features, FloatTensor[B, C, R, R, R]
        """
        features = features.contiguous()
        coords = coords.int().contiguous()
        b, c, _ = features.shape
        # Forward pass, implemented in the compiled backend
        out, indices, counts = _backend.avg_voxelize_forward(features, coords, resolution)
        # Keep each point's voxel index and each voxel's point count for the backward pass
        ctx.save_for_backward(indices, counts)
        return out.view(b, c, resolution, resolution, resolution)

    @staticmethod
    def backward(ctx, grad_output):
        """
        :param ctx:
        :param grad_output: gradient of output, FloatTensor[B, C, R, R, R]
        :return:
            gradient of inputs, FloatTensor[B, C, N]
        """
        b, c = grad_output.shape[:2]
        indices, counts = ctx.saved_tensors
        # Backward pass, implemented in the compiled backend
        grad_features = _backend.avg_voxelize_backward(grad_output.contiguous().view(b, c, -1), indices, counts)
        # coords and resolution receive no gradient
        return grad_features, None, None
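Before descending into the compiled backend, it may help to see what average voxelization computes in plain Python. This is only a reference sketch for a single sample under two assumptions of mine: the voxel index is flattened row-major as `x * r * r + y * r + z` (which is consistent with the `view(b, c, r, r, r)` reshape above), and the backward pass hands each point its voxel's gradient divided by that voxel's point count (which is the gradient of a mean, and explains why `indices` and `counts` are saved). The function names are hypothetical.

```python
def avg_voxelize_ref(features, vox_coords, r):
    """features: C lists of N floats; vox_coords: N integer (x, y, z) triples."""
    # Assumed row-major flattening of the 3D voxel coordinate.
    indices = [x * r * r + y * r + z for x, y, z in vox_coords]
    # Count how many points fall into each voxel.
    counts = [0] * (r ** 3)
    for idx in indices:
        counts[idx] += 1
    # Each voxel feature is the mean of the features of the points inside it.
    out = [[0.0] * (r ** 3) for _ in features]
    for ch, channel in enumerate(features):
        for i, idx in enumerate(indices):
            out[ch][idx] += channel[i] / counts[idx]
    return out, indices, counts


def avg_voxelize_backward_ref(grad_out, indices, counts):
    """Each point receives its voxel's gradient divided by the voxel's point count."""
    return [[g[idx] / counts[idx] for idx in indices] for g in grad_out]


# Three points, one channel, resolution 2: the first two points share voxel (0, 0, 0).
features = [[1.0, 3.0, 10.0]]
vox_coords = [(0, 0, 0), (0, 0, 0), (1, 1, 1)]
out, indices, counts = avg_voxelize_ref(features, vox_coords, r=2)
grad = avg_voxelize_backward_ref([[1.0] * 8], indices, counts)
# out[0][0] == 2.0, out[0][7] == 10.0, grad == [[0.5, 0.5, 1.0]]
```

Empty voxels simply stay at zero, which matches the zero-initialized output tensor allocated on the C++ side.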
The coords in the forward pass are actually vox_coords. We follow the call once more, this time into the C++ code, pvcnn-master/modules/functional/src/voxelization/vox.cpp:
/*
  Function: average pool voxelization (forward)
  Args:
    features: features, FloatTensor[b, c, n]
    coords  : coords of each point, IntTensor[b, 3, n]
    resolution : voxel resolution
  Return:
    out : outputs, FloatTensor[b, c, s], s = r ** 3
    ind : voxel index of each point, IntTensor[b, n]
    cnt : #points in each voxel index, IntTensor[b, s]
*/
std::vector<at::Tensor> avg_voxelize_forward(const at::Tensor features,
                                             const at::Tensor coords,
                                             const int resolution) {
  CHECK_CUDA(features);
  CHECK_CUDA(coords);
  CHECK_CONTIGUOUS(features);
  CHECK_CONTIGUOUS(coords);
  CHECK_IS_FLOAT(features);
  CHECK_IS_INT(coords);

  int b = features.size(0);
  int c = features.size(1);
  int n = features.size(2);
  int r = resolution;
  int r2 = r * r;
  int r3 = r2 * r;
  // Allocate the output tensors in GPU memory, on the same device as features
  at::Tensor ind = torch::zeros(
      {b, n}, at::device(features.device()).dtype(at::ScalarType::Int));
  at::Tensor out = torch::zeros(
      {b, c, r3}, at::device(features.device()).dtype(at::ScalarType::Float));
  at::Tensor cnt = torch::zeros(
      {b, r3}, at::device(features.device()).dtype(at::ScalarType::Int));
  // Call the function implemented in CUDA
  avg_voxelize(b, c, n, r, r2, r3, coords.data_ptr<int>(),
               features.data_ptr<float>(), ind.data_ptr<int>(),
               cnt.data_ptr<int>(), out.data_ptr<float>());