PU-Net: A Deep Learning Network for 3D Point Cloud Upsampling


Brief

About 3D point cloud up-sampling from Xianzhi Li’s work:

Detail

Background

Due to the sparseness and irregularity of point clouds, learning a deep network on them remains challenging. Recent works have tried to accomplish upsampling based on prior knowledge and assumptions, or on external inputs such as normal vectors. Moreover, works that try to extract features directly from the point cloud often run into the problem of missing semantic information, and end up with a point cloud whose shape differs from the original. Since semantic information can be captured by a deep network, it occurred to the authors that they might bring about a breakthrough in point upsampling by using a deep network to extract features from the target point cloud.

PU-Net

Challenges in learning features from point clouds with a deep network:

  • How to prepare enough training data
  • How to expand the number of points
  • How to design the loss function

Dataset Preparation

Since only mesh data were available, they decided to create their training data from those meshes:

  • Since upsampling can be treated as an operation on local regions, much like working on local regions of an image, they first split each mesh into several separate parts, regarding each of them as a patch
  • Then they sample each mesh surface into a dense point cloud via Poisson disk sampling; these dense point sets serve as the ground truth
  • Last, they produce the inputs from the ground truth. Since more than one valid input corresponds to each ground truth, inputs are generated on the fly by randomly sampling from the ground truth point sets at a fixed downsampling rate r (a concrete sketch follows at the end of this subsection)

Figure 1. From mesh to dense point cloud as ground truth

With 40 meshes split into 1000 patches, a 4k-size training dataset is now available, where each sample consists of an input and its corresponding ground truth.
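To make the on-the-fly input generation concrete, here is a minimal sketch, assuming each ground-truth patch is stored as an (r·N, 3) NumPy array; the function name and array layout are illustrative assumptions, not taken from the paper's code:

```python
import numpy as np

def make_input(gt_patch: np.ndarray, r: int) -> np.ndarray:
    """Randomly downsample a ground-truth patch by rate r to form one
    training input; called anew each time so the inputs vary on the fly.
    gt_patch: (r*N, 3) Poisson-disk-sampled points -> returns (N, 3)."""
    n_in = gt_patch.shape[0] // r
    idx = np.random.choice(gt_patch.shape[0], size=n_in, replace=False)
    return gt_patch[idx]
```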

Expansion of Points in Point Cloud

First of all, features need to be extracted from local regions of the point cloud, and the extraction has to be performed on every point, since local features are what the upsampling problem calls for. The authors construct a network similar to PointNet++. As shown in the following figure, features are extracted from different resolution levels of the point cloud, generated by exponentially downsampling the original one. At each level, the features at the green points are interpolated from the nearest red points. In PointNet++, only the output of the last layer is used; however, as mentioned above, local features are required for upsampling, so in PU-Net the feature maps obtained from all levels are concatenated to produce the final output of this hierarchical feature learning network.

Figure 2. PU-Net hierarchical feature learning, which is more helpful for the upsampling task
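To make the per-level interpolation concrete, here is a minimal sketch of inverse-distance-weighted feature propagation in the style of PointNet++; the helper name, the choice of k = 3 neighbors, and the NumPy/SciPy setting are assumptions for illustration, not the authors' implementation:

```python
import numpy as np
from scipy.spatial import cKDTree

def interpolate_features(coarse_xyz, coarse_feats, dense_xyz, k=3, eps=1e-8):
    """Propagate features from a coarse level back onto the dense points.
    coarse_xyz: (M, 3), coarse_feats: (M, C), dense_xyz: (N, 3) -> (N, C)."""
    dists, idx = cKDTree(coarse_xyz).query(dense_xyz, k=k)  # (N, k) each
    w = 1.0 / (dists + eps)                                 # inverse-distance weights
    w /= w.sum(axis=1, keepdims=True)                       # normalize per point
    return (coarse_feats[idx] * w[..., None]).sum(axis=1)   # weighted average

# PU-Net concatenates the interpolated (N, C_l) maps from every level
# along the channel axis before the expansion step.
```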

The expansion is carried out in feature space. That is, they do not directly expand the points in the point cloud from the extracted features; instead, they expand the feature map using different convolutional kernels in feature space, reshape the result, and finally regress the 3D coordinates through fully connected layers. The expansion is shown in the following picture:

Figure 3. Expansion in feature space from N×C to N×rC

The expansion operation can be represented as the following function:
$$f' = \mathcal{RS}\left(\left[\,C_1^2(C_1^1(f)),\ \dots,\ C_r^2(C_r^1(f))\,\right]\right)$$
in which:

  • $\mathcal{RS}(\cdot)$ is the reshape operation
  • $C_i^1$ and $C_i^2$ denote the first and second convolutions in the $i$-th of the $r$ expansion branches, each branch using its own kernels

Note that the convolution is performed twice in order to break the correlation between points: points generated from the same feature map tend to gather together even when different convolutional kernels are applied. Using a second, separate set of convolutional kernels in each branch ensures a much more uniform generation.
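Below is a minimal PyTorch sketch of this feature-space expansion; the paper's reference implementation is in TensorFlow, and the module name, channel sizes, and ReLU activations here are assumptions. Each of the r branches applies two separate 1×1 convolutions, the branch outputs are concatenated, and the reshape regroups the result into r·N point features:

```python
import torch
import torch.nn as nn

class FeatureExpansion(nn.Module):
    """Expand per-point features (B, C, N) into (B, C_out, r*N)
    with r branches of two 1x1 convolutions each, then a reshape."""
    def __init__(self, in_ch: int, out_ch: int, r: int):
        super().__init__()
        self.r = r
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv1d(in_ch, out_ch, kernel_size=1),   # C_i^1
                nn.ReLU(),
                nn.Conv1d(out_ch, out_ch, kernel_size=1),  # C_i^2, breaks correlation
                nn.ReLU(),
            )
            for _ in range(r)
        ])

    def forward(self, f: torch.Tensor) -> torch.Tensor:
        out = torch.cat([b(f) for b in self.branches], dim=1)  # (B, r*C_out, N)
        B, rc, N = out.shape
        c = rc // self.r
        # RS: regroup the r branch outputs as r*N points with C_out channels
        return out.view(B, self.r, c, N).permute(0, 2, 1, 3).reshape(B, c, self.r * N)
```

The 3D coordinates of the r·N upsampled points are then regressed from these expanded features through fully connected layers, as described above.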

Construction of Loss Function

Two basic requirements:

  • the generated points should be distributed uniformly
  • the generated points should be informative and should not cluster

Two loss functions are designed to ensure the satisfactory point distribution described above.

The first one is called the reconstruction loss, based on the Earth Mover's Distance (EMD), which is well known for evaluating the least cost of transforming one distribution into another. Under this measure, the generated points are encouraged to lie on the underlying surface and outliers are penalized, so the points gradually move toward the surface over the iterations. The loss function can be represented as follows:
$$L_{rec} = d_{EMD}(S_p, S_{gt}) = \min_{\phi:\, S_p \rightarrow S_{gt}} \sum_{x_i \in S_p} \left\| x_i - \phi(x_i) \right\|_2$$
with:

  • $S_p$ is the predicted point set and $S_{gt}$ is the ground truth point set
  • $\phi: S_p \rightarrow S_{gt}$ indicates the bijection mapping
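For intuition, the following sketch computes this loss exactly for small, equal-sized point sets via the Hungarian algorithm; note that a practical training pipeline relies on an efficient EMD approximation, so this brute-force version is only an illustration:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

def emd_reconstruction_loss(pred: np.ndarray, gt: np.ndarray) -> float:
    """Exact minimum-cost bijection between two (N, 3) point sets:
    sums ||x_i - phi(x_i)||_2 over the optimal matching phi."""
    cost = cdist(pred, gt)                    # (N, N) pairwise distances
    rows, cols = linear_sum_assignment(cost)  # optimal bijection phi
    return float(cost[rows, cols].sum())
```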

The second one is called the repulsion loss, which penalizes clustered points to ensure a much more uniform distribution. The loss function can be represented as follows:
$$L_{rep} = \sum_{i=0}^{\hat{N}} \sum_{i' \in K(i)} \eta\left(\|x_{i'} - x_i\|\right)\, w\left(\|x_{i'} - x_i\|\right)$$
with:

  • $\hat{N}$ is the number of output points
  • $K(i)$ is the index set of the k nearest neighbors of $x_i$
  • repulsion term: $\eta(r) = -r$
  • fast-decaying weight function: $w(r) = e^{-r^2/h^2}$, which falls off rapidly as the distance $r$ grows
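A minimal NumPy sketch of the repulsion loss follows; the neighbor count k and bandwidth h are hyperparameters whose values here are illustrative assumptions, not taken from the paper:

```python
import numpy as np
from scipy.spatial import cKDTree

def repulsion_loss(points: np.ndarray, k: int = 5, h: float = 0.03) -> float:
    """points: (N, 3). Penalizes points that crowd their k nearest
    neighbors, with eta(r) = -r and w(r) = exp(-r^2 / h^2)."""
    dists, _ = cKDTree(points).query(points, k=k + 1)  # first hit is the point itself
    r = dists[:, 1:]                                   # (N, k) neighbor distances
    return float(np.sum(-r * np.exp(-(r ** 2) / h ** 2)))
```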

Contribution

Their work is the first to apply a deep network to point cloud upsampling. It can capture many more features of the point cloud and gives a better solution than the traditional methods that extract features directly from the point cloud.

Consideration

As Xianzhi Li explained in GAMES Webinar 120, although PU-Net performs well in upsampled point cloud generation, it lacks the capability of edge awareness, which results in rough surfaces on regular objects such as the legs of a chair. That is why they came up with a follow-up work in the same year called EC-Net, namely the Edge-aware Point set Consolidation Network, which was accepted at ECCV 2018.
