PU-Net: A Deep Learning Network for 3D Point Cloud Upsampling


Brief

About 3D point cloud up-sampling from Xianzhi Li’s work:

Detail

Background

Due to the sparseness and irregularity of point clouds, learning a deep network on them remains challenging. Recent works have tried to accomplish upsampling based on prior knowledge and assumptions, or on external inputs such as normal vectors. Moreover, works that try to extract features directly from the point cloud often run into the problem of missing semantic information, and end up with a point cloud whose shape differs from the original. Since semantic information can be captured by a deep network, it occurred to the authors that they might bring about a breakthrough in point upsampling by using a deep network to extract features from the target point cloud.

PU-Net

Challenges in learning features from point clouds with a deep network:

  • How to prepare enough training data
  • How to expand the number of points
  • How to design the loss function

Dataset Preparation

Since only mesh data were available, they decided to create their training data from those meshes:

  • Since upsampling can be treated as an operation on local regions, much like working on local regions of an image, they first split each mesh into several separate parts, regarding each of them as a patch
  • Then they sample each mesh surface into a dense point cloud via Poisson disk sampling; these dense point sets serve as the ground truth
  • Last, they produce the inputs from the ground truth. Since more than one valid input corresponds to each ground truth, inputs are generated on the fly by randomly sampling from the ground truth point sets at a fixed downsampling rate r (a concrete sketch follows at the end of this subsection)

Figure 1. From mesh to dense point cloud as ground truth

With 40 meshes split into 1000 patches, a 4k-size training dataset is now available, where each sample consists of an input and its corresponding ground truth.
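To make the on-the-fly input generation concrete, here is a minimal sketch, assuming each ground-truth patch is stored as an (r·N, 3) NumPy array; the function name and array layout are illustrative assumptions, not taken from the paper's code:

```python
import numpy as np

def make_input(gt_patch: np.ndarray, r: int) -> np.ndarray:
    """Randomly downsample a ground-truth patch by rate r to form one
    training input; called anew each time so the inputs vary on the fly.
    gt_patch: (r*N, 3) Poisson-disk-sampled points -> returns (N, 3)."""
    n_in = gt_patch.shape[0] // r
    idx = np.random.choice(gt_patch.shape[0], size=n_in, replace=False)
    return gt_patch[idx]
```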

Expansion of Points in Point Cloud

First of all, features need to be extracted from local regions of the point cloud, and the extraction has to be performed on every point, since local features are what the upsampling problem calls for. The authors construct a network similar to PointNet++. As shown in the following figure, features are extracted from different resolution levels of the point cloud, generated by exponentially downsampling the original one. At each level, the features at the green points are interpolated from the nearest red points. In PointNet++, only the output of the last layer is used; however, as mentioned above, local features are required for upsampling, so in PU-Net the feature maps obtained from all levels are concatenated to produce the final output of this hierarchical feature learning network.

Figure 2. PU-Net hierarchical feature learning, which is more helpful for the upsampling task
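To make the per-level interpolation concrete, here is a minimal sketch of inverse-distance-weighted feature propagation in the style of PointNet++; the helper name, the choice of k = 3 neighbors, and the NumPy/SciPy setting are assumptions for illustration, not the authors' implementation:

```python
import numpy as np
from scipy.spatial import cKDTree

def interpolate_features(coarse_xyz, coarse_feats, dense_xyz, k=3, eps=1e-8):
    """Propagate features from a coarse level back onto the dense points.
    coarse_xyz: (M, 3), coarse_feats: (M, C), dense_xyz: (N, 3) -> (N, C)."""
    dists, idx = cKDTree(coarse_xyz).query(dense_xyz, k=k)  # (N, k) each
    w = 1.0 / (dists + eps)                                 # inverse-distance weights
    w /= w.sum(axis=1, keepdims=True)                       # normalize per point
    return (coarse_feats[idx] * w[..., None]).sum(axis=1)   # weighted average

# PU-Net concatenates the interpolated (N, C_l) maps from every level
# along the channel axis before the expansion step.
```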

The expansion is carried out in feature space. That is, they do not directly expand the points in the point cloud from the extracted features; instead, they expand the feature map using different convolutional kernels in feature space, reshape the result, and finally regress the 3D coordinates through fully connected layers. The expansion is shown in the following picture:

Figure 3. Expansion in feature space from N×C to N×rC

The expansion operation can be represented as the following function:
$$f' = \mathcal{RS}\left(\left[\,C_1^2(C_1^1(f)),\ \dots,\ C_r^2(C_r^1(f))\,\right]\right)$$
in which:

  • $\mathcal{RS}(\cdot)$ is the reshape operation
  • $C_i^1$ and $C_i^2$ denote the first and second convolutions in the $i$-th of the $r$ expansion branches, each branch using its own kernels

Note that the convolution is performed twice in order to break the correlation between points: points generated from the same feature map tend to gather together even when different convolutional kernels are applied. Using a second, separate set of convolutional kernels in each branch ensures a much more uniform generation.
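Below is a minimal PyTorch sketch of this feature-space expansion; the paper's reference implementation is in TensorFlow, and the module name, channel sizes, and ReLU activations here are assumptions. Each of the r branches applies two separate 1×1 convolutions, the branch outputs are concatenated, and the reshape regroups the result into r·N point features:

```python
import torch
import torch.nn as nn

class FeatureExpansion(nn.Module):
    """Expand per-point features (B, C, N) into (B, C_out, r*N)
    with r branches of two 1x1 convolutions each, then a reshape."""
    def __init__(self, in_ch: int, out_ch: int, r: int):
        super().__init__()
        self.r = r
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv1d(in_ch, out_ch, kernel_size=1),   # C_i^1
                nn.ReLU(),
                nn.Conv1d(out_ch, out_ch, kernel_size=1),  # C_i^2, breaks correlation
                nn.ReLU(),
            )
            for _ in range(r)
        ])

    def forward(self, f: torch.Tensor) -> torch.Tensor:
        out = torch.cat([b(f) for b in self.branches], dim=1)  # (B, r*C_out, N)
        B, rc, N = out.shape
        c = rc // self.r
        # RS: regroup the r branch outputs as r*N points with C_out channels
        return out.view(B, self.r, c, N).permute(0, 2, 1, 3).reshape(B, c, self.r * N)
```

The 3D coordinates of the r·N upsampled points are then regressed from these expanded features through fully connected layers, as described above.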

Construction of Loss Function

Two basic requirements:

  • the generated points should be distributed uniformly
  • the generated points should be informative and should not cluster

Two loss functions are designed to ensure the satisfactory point distribution described above.

The first one is called the reconstruction loss, based on the Earth Mover's Distance (EMD), which is well known for evaluating the least cost of transforming one distribution into another. Under this measure, the generated points are encouraged to lie on the underlying surface and outliers are penalized, so the points gradually move toward the surface over the iterations. The loss function can be represented as follows:
$$L_{rec} = d_{EMD}(S_p, S_{gt}) = \min_{\phi:\, S_p \rightarrow S_{gt}} \sum_{x_i \in S_p} \left\| x_i - \phi(x_i) \right\|_2$$
with:

  • $S_p$ is the predicted point set and $S_{gt}$ is the ground truth point set
  • $\phi: S_p \rightarrow S_{gt}$ indicates the bijection mapping
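For intuition, the following sketch computes this loss exactly for small, equal-sized point sets via the Hungarian algorithm; note that a practical training pipeline relies on an efficient EMD approximation, so this brute-force version is only an illustration:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

def emd_reconstruction_loss(pred: np.ndarray, gt: np.ndarray) -> float:
    """Exact minimum-cost bijection between two (N, 3) point sets:
    sums ||x_i - phi(x_i)||_2 over the optimal matching phi."""
    cost = cdist(pred, gt)                    # (N, N) pairwise distances
    rows, cols = linear_sum_assignment(cost)  # optimal bijection phi
    return float(cost[rows, cols].sum())
```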

The second one is called the repulsion loss, which penalizes clustered points to ensure a much more uniform distribution. The loss function can be represented as follows:
$$L_{rep} = \sum_{i=0}^{\hat{N}} \sum_{i' \in K(i)} \eta\left(\|x_{i'} - x_i\|\right)\, w\left(\|x_{i'} - x_i\|\right)$$
with:

  • $\hat{N}$ is the number of output points
  • $K(i)$ is the index set of the k nearest neighbors of $x_i$
  • repulsion term: $\eta(r) = -r$
  • fast-decaying weight function: $w(r) = e^{-r^2/h^2}$, which falls off rapidly as the distance $r$ grows
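A minimal NumPy sketch of the repulsion loss follows; the neighbor count k and bandwidth h are hyperparameters whose values here are illustrative assumptions, not taken from the paper:

```python
import numpy as np
from scipy.spatial import cKDTree

def repulsion_loss(points: np.ndarray, k: int = 5, h: float = 0.03) -> float:
    """points: (N, 3). Penalizes points that crowd their k nearest
    neighbors, with eta(r) = -r and w(r) = exp(-r^2 / h^2)."""
    dists, _ = cKDTree(points).query(points, k=k + 1)  # first hit is the point itself
    r = dists[:, 1:]                                   # (N, k) neighbor distances
    return float(np.sum(-r * np.exp(-(r ** 2) / h ** 2)))
```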

Contribution

Their work is the first to apply a deep network to point cloud upsampling. It can capture many more features of the point cloud and gives a better solution than the traditional methods that extract features directly from the point cloud.

Consideration

As Xianzhi Li explained in GAMES Webinar 120, although PU-Net performs well in upsampled point cloud generation, it lacks the capability of edge awareness, which results in rough surfaces on regular objects such as the legs of a chair. That is why they came up with a follow-up work in the same year called EC-Net, namely the Edge-aware Point set Consolidation Network, which was accepted at ECCV 2018.
