1. What
What does this article do? (From the abstract and conclusion, summarized in one sentence.)
It represents the characteristics of the various objects and elements in dynamic urban scenes with periodic vibration-based temporal dynamics, and introduces a novel temporal smoothing mechanism and a position-aware adaptive control strategy to achieve temporally coherent reconstruction of dynamic scenes.
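As a minimal sketch of the vibration idea (my paraphrase, assuming a sinusoidal displacement around a per-Gaussian life peak and a Gaussian-in-time opacity envelope; the paper's exact parameterization may differ):

```python
import numpy as np

def vibrating_mean(mu, amplitude, tau, cycle, t):
    """Time-dependent Gaussian center: the static mean `mu` plus a
    periodic vibration with per-point amplitude and life peak `tau`.
    Shapes: mu, amplitude -> (N, 3); tau -> (N,); cycle, t -> scalars."""
    phase = 2.0 * np.pi * (t - tau) / cycle          # (N,)
    return mu + amplitude * np.sin(phase)[:, None]   # (N, 3)

def time_decayed_opacity(opacity, tau, beta, t):
    """Opacity modulated by a Gaussian envelope in time, so each point
    contributes mostly near its own life peak `tau`."""
    return opacity * np.exp(-0.5 * ((t - tau) / beta) ** 2)

# Per frame: evaluate every Gaussian at timestamp t, then rasterize as usual.
# means_t  = vibrating_mean(mu, A, tau, l, t)
# alphas_t = time_decayed_opacity(o, tau, beta, t)
```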
2. Why
Under what conditions or needs was this research proposed (Intro)? What core problems or deficiencies does it aim to solve, what have others done, and what are the innovations? (From the Introduction and Related Work.)
Covers background, problem, prior work, and innovation:
Introduction:
- NSG: Decomposes dynamic scenes into scene graphs
- PNF: Decomposes scenes into objects and backgrounds, incorporating a panoptic segmentation auxiliary task.
- SUDS: Uses optical flow, which helps identify which parts of the scene are static and which are dynamic (a toy masking sketch follows this list).
- EmerNeRF: Uses a self-supervised method to reduce dependence on optical flow.
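To make the optical-flow point above concrete, here is a toy sketch, not SUDS's actual pipeline, of separating static from dynamic pixels by flow magnitude; the `threshold` value and the assumption that ego-motion has already been compensated are mine:

```python
import numpy as np

def dynamic_mask(flow, threshold=1.0):
    """Mark pixels as dynamic where the residual optical flow (after
    ego-motion compensation) exceeds `threshold` pixels per frame.
    flow: (H, W, 2) array of per-pixel displacement vectors."""
    magnitude = np.linalg.norm(flow, axis=-1)   # (H, W)
    return magnitude > threshold                # (H, W) boolean mask
```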
Related work:
- Dynamic scene models
- In one research direction, certain studies [2, 7, 11, 18, 38] introduce time as an additional input to the radiance field, treating the scene as a 6D plenoptic function (a toy version is sketched after this list). However, this approach couples positional variations induced by temporal dynamics with the radiance field, lacking geometric priors about how time influences the scene.
- An alternative approach [1, 20, 24–26, 33, 40] focuses on modeling the movement or deformation of specific static structures, assuming that the dynamics arise from these static elements within the scene.
- Gaussian-based
- Urban scene reconstruction
- One research avenue (NeRF-based) has focused on enhancing the modeling of static street scenes by utilizing scalable representations [19, 28, 32, 34], achieving high-fidelity surface reconstruction [14, 28, 39], and incorporating multi-object composition [43]. However, these methods face difficulties in handling the dynamic elements commonly encountered in autonomous driving contexts.
- Another research direction seeks to address these dynamic elements, but the techniques require additional input: panoptic segmentation is leveraged to refine the reconstruction of dynamics [PNF], while [Street Gaussians, Driving Gaussian] decompose the scene into different sets of Gaussian points using bounding boxes. However, they all need manually annotated or predicted bounding boxes and have difficulty reconstructing non-rigid objects.
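As a worked illustration of the first direction under "Dynamic scene models" above, a time-conditioned radiance field simply concatenates $t$ to the spatial input; the architecture below is illustrative and not taken from any cited paper:

```python
import torch
import torch.nn as nn

class TimeConditionedField(nn.Module):
    """Toy 6D plenoptic function: (x, y, z, t, view direction) -> (rgb, sigma)."""
    def __init__(self, hidden=128):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(4, hidden), nn.ReLU(),     # input is (x, y, z, t)
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.sigma = nn.Linear(hidden, 1)        # volume density
        self.rgb = nn.Sequential(                # view-dependent color
            nn.Linear(hidden + 3, hidden), nn.ReLU(),
            nn.Linear(hidden, 3), nn.Sigmoid(),
        )

    def forward(self, xyz, t, view_dir):
        h = self.trunk(torch.cat([xyz, t], dim=-1))
        return self.rgb(torch.cat([h, view_dir], dim=-1)), self.sigma(h)
```

Because $t$ enters as a raw extra coordinate, the network has no geometric prior on how time displaces the scene, which is exactly the deficiency noted in that bullet.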
3. How
The input data contain images, represented as $\{\mathcal{I}_i, t_i, \mathbf{E}_i, \mathbf{I}_i \mid i = 1, 2, \ldots, N_c\}$ (with camera extrinsics $\mathbf{E}_i$ and intrinsics $\mathbf{I}_i$), and LiDAR point clouds, represented as $\{(x_i, y_i, z_i, t_i) \mid i = 1, 2, \ldots, N_l\}$. The rendering process can be represented as $\hat{\mathcal{I}} = \mathcal{F}_{\theta}(\mathbf{E}_o, \mathbf{I}_o, t)$, which produces the image at any timestamp $t$ and camera pose $(\mathbf{E}_o, \mathbf{I}_o)$.
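Written as code, the inputs and the rendering interface look roughly like this; the class and function names are my own placeholders, not identifiers from the paper:

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Frame:
    """One camera observation (I_i, t_i, E_i, I_i)."""
    image: np.ndarray        # I_i: (H, W, 3) RGB image
    timestamp: float         # t_i
    extrinsics: np.ndarray   # E_i: (4, 4) camera pose
    intrinsics: np.ndarray   # I_i: (3, 3) calibration matrix

# The LiDAR input is simply an (N_l, 4) array whose rows are (x_i, y_i, z_i, t_i).

def render(theta, E_o, I_o, t):
    """I_hat = F_theta(E_o, I_o, t): render an image from an arbitrary
    camera pose (E_o, I_o) at an arbitrary timestamp t.
    Placeholder signature only; the model itself is not sketched here."""
    ...
```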