1. What
For scene reconstruction and novel view synthesis, this paper imposes geometric constraints on the Gaussians representing the road and sky regions, leverages 3D templates to initialize the foreground points, and introduces a reflected-Gaussian consistency loss to supervise the unseen sides of foreground objects. Moreover, it uses residual spherical harmonics for foreground objects. Finally, it achieves state-of-the-art results on the PandaSet and KITTI datasets, both in reconstruction and in novel view synthesis with lateral ego-vehicle trajectory shifts.
2. Why
Limitation of PVG: that method does not tackle the simulation of novel scenarios, such as ego-vehicle lane changes or adjusted object trajectories (i.e., it cannot edit scenes).
3. How
3.1 Input
A series of $N$ images $I_i$ taken by a camera, with the corresponding intrinsic ($K_i$) and extrinsic ($E_i$) matrices, along with the 3D LiDAR point clouds $L_i$ and the corresponding dynamic object trajectories $T_i$.
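For concreteness, below is a minimal sketch of one per-frame input record; the container and field names (`FrameInput`, `tracks`, the pose encoding) are illustrative assumptions, not the paper's data format.

```python
from dataclasses import dataclass
from typing import Dict, Tuple
import numpy as np

@dataclass
class FrameInput:
    """One time step i of the input sequence (illustrative layout)."""
    image: np.ndarray                     # I_i: RGB image, (H, W, 3)
    K: np.ndarray                         # intrinsic matrix, (3, 3)
    E: np.ndarray                         # extrinsic world-to-camera matrix, (4, 4)
    lidar: np.ndarray                     # L_i: LiDAR points in world frame, (P, 3)
    tracks: Dict[int, Tuple[float, ...]]  # T_i: per-object pose, e.g. (x, y, z, yaw)
```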
3.2 Background Reconstruction
- The road and sky regions are decomposed from the rest of the background using semantic masks.
- By projecting the LiDAR points to the image plane at each time step $i$, each Gaussian is assigned to one of the road, sky, or other classes (a sketch of this assignment follows the equation below).
- When splatting road and sky Gaussians, these Gaussians are constrained to be flat by minimizing their roll and pitch angles as well as their vertical scale (see the loss sketch at the end of this section).
- Finally, the loss is defined as:
$$
\mathcal{L}_{BG}=(1-\lambda)\,\mathcal{L}_{1}(I_{g},\hat{I}_{g})+\lambda\,\mathcal{L}_{DSSIM}(I_{g},\hat{I}_{g})+\beta\,\mathcal{C}_{g},\quad g\in\{road,\,sky,\,other\}
$$

$$
\mathcal{C}_{g}=\begin{cases}\frac{1}{N_g}\sum_{i=1}^{N_g}\left(|\phi_i|+|\theta_i|+|s_{z_i}|\right)&\text{if } g\in\{road,\,sky\}\\0&\text{otherwise}\end{cases}
$$
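A hedged sketch of the class-assignment step, assuming world-frame LiDAR points, a pinhole projection with the $K_i$/$E_i$ matrices from Section 3.1, and integer-labeled semantic masks; the `ROAD_IDS`/`SKY_IDS` sets are hypothetical label ids, not the paper's convention.

```python
import numpy as np

ROAD_IDS, SKY_IDS = {7}, {10}  # hypothetical semantic label ids

def assign_classes(points_w, K, E, sem_mask):
    """Label each LiDAR point road/sky/other via the semantic mask.

    points_w: (P, 3) world-frame points; K: (3, 3); E: (4, 4) world-to-camera;
    sem_mask: (H, W) integer semantic labels for the same frame.
    """
    P = points_w.shape[0]
    pts_h = np.hstack([points_w, np.ones((P, 1))])   # homogeneous, (P, 4)
    pts_c = (E @ pts_h.T).T[:, :3]                   # camera frame, (P, 3)
    z = pts_c[:, 2]
    in_front = z > 1e-3                              # keep points ahead of camera
    uv = (K @ pts_c.T).T
    u, v = np.full(P, -1), np.full(P, -1)
    u[in_front] = np.round(uv[in_front, 0] / z[in_front]).astype(int)
    v[in_front] = np.round(uv[in_front, 1] / z[in_front]).astype(int)
    H, W = sem_mask.shape
    valid = in_front & (u >= 0) & (u < W) & (v >= 0) & (v < H)

    labels = np.full(P, "other", dtype=object)       # default class
    sem = sem_mask[v[valid], u[valid]]               # sampled mask labels
    labels[valid] = np.where(np.isin(sem, list(ROAD_IDS)), "road",
                    np.where(np.isin(sem, list(SKY_IDS)), "sky", "other"))
    return labels
```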
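And a minimal PyTorch sketch of $\mathcal{L}_{BG}$ above, assuming the roll/pitch angles $\phi_i, \theta_i$ and vertical scales $s_{z_i}$ have already been extracted from each Gaussian's rotation and scale parameters; the `pytorch_msssim` SSIM and the $\lambda$, $\beta$ values are illustrative stand-ins, not the paper's exact choices.

```python
import torch
from pytorch_msssim import ssim  # pip install pytorch-msssim

def flatness_reg(phi, theta, s_z):
    """C_g: mean of |roll| + |pitch| + |vertical scale| over the N_g Gaussians."""
    return (phi.abs() + theta.abs() + s_z.abs()).mean()

def background_loss(I_g, I_hat_g, phi, theta, s_z, is_flat_class,
                    lam=0.2, beta=0.01):
    """L_BG for one class g; images are (1, 3, H, W) tensors in [0, 1]."""
    l1 = (I_g - I_hat_g).abs().mean()
    d_ssim = 1.0 - ssim(I_g, I_hat_g, data_range=1.0)  # DSSIM-style term
    loss = (1 - lam) * l1 + lam * d_ssim
    if is_flat_class:                                  # g in {road, sky}
        loss = loss + beta * flatness_reg(phi, theta, s_z)
    return loss
```

The flatness term only fires for the road and sky classes, matching the case split in $\mathcal{C}_g$; "other" background Gaussians are supervised photometrically alone.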