[Paper Reading] SC-GS: Sparse-Controlled Gaussian Splatting for Editable Dynamic Scenes

1. What

What does this paper do? (From the abstract and conclusion, summarized in one sentence.)

Given a monocular dynamic video as input, this paper drives 3D Gaussians with a set of sparse control points, each carrying a time-varying 6-DoF transformation predicted by an MLP; the method enables dynamic novel view synthesis and motion editing, though it still has limitations under inaccurate camera poses or intense motions.

2. Why

Under what conditions or needs was this research proposed? What core problems or deficiencies does it address, what have others done, and what are the innovations? (From the Introduction and Related Work.)

Covering background, problem, prior work, and innovation:

NeRF-based methods struggle with low rendering quality, slow speed, and high memory usage, while the original 3D-GS only applies to static scenes. An intuitive approach [47] learns a flow vector for each 3D Gaussian, but it incurs significant training and inference cost (an author of [47] is a co-author of this paper).

Related work:

  • Dynamic NeRF

  • Dynamic Gaussian Splatting

  • 3D Deformation and Editing

    This part is relatively unfamiliar to me. It covers the traditional editing methods in graphics, which focus on preserving the geometric details of 3D objects during deformation, using tools such as Laplacian coordinates, the Poisson equation, and cage-based approaches.

    More recently, other approaches aim to edit scene geometry learned from 2D images; this paper belongs to that class.

3. How

3.1 Sparse Control Points

We will first introduce the definition of control points, which is the core concept used in this article.

The scene motion is governed by a set of sparse control points $\mathcal{P}=\{(p_i\in\mathbb{R}^3,\,o_i\in\mathbb{R}^+)\},\ i\in\{1,2,\cdots,N_p\}$, where $o_i$ is a learnable radius parameter that controls the impact of a control point on a Gaussian.

Meanwhile, for each control point $i$, we learn a time-varying 6-DoF transformation $[R_i^t \mid T_i^t]\in\mathbf{SE}(3)$, consisting of a local frame rotation matrix $R_i^t\in\mathbf{SO}(3)$ and a translation vector $T_i^t\in\mathbb{R}^3$. Instead of directly optimizing the transformation parameters, an MLP $\Psi$ is employed to learn a time-varying transformation field:

$$\Psi:(p_i, t)\rightarrow(R_i^t, T_i^t).$$
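To make the transformation field concrete, here is a minimal PyTorch sketch. The hidden width, the absence of positional encoding, and the unit-quaternion parameterization of $R_i^t$ are my own illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class TransformField(nn.Module):
    """Sketch of Psi: (p_i, t) -> (R_i^t, T_i^t).
    Width and quaternion output are assumptions, not the paper's design."""
    def __init__(self, hidden=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4 + 3),  # 4 quaternion + 3 translation params
        )

    def forward(self, p, t):
        # p: (N, 3) control-point positions, t: (N, 1) timestamps
        out = self.mlp(torch.cat([p, t], dim=-1))
        q = nn.functional.normalize(out[..., :4], dim=-1)  # unit quaternion, a rotation in SO(3)
        T = out[..., 4:]                                    # translation vector in R^3
        return q, T
```

A likely motivation for predicting transformations through a shared MLP, rather than optimizing per-point parameters directly, is that nearby control points share motion structure, which implicitly regularizes the motion field.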

3.2 Dynamic Scene Rendering

Having obtained the control points, we need to establish the connection between them and the Gaussians.

For each Gaussian $G_j$, we use k-nearest-neighbor (KNN) search to obtain its $K\,(=4)$ neighboring control points, denoted as $\{p_k \mid k\in\mathcal{N}_j\}$. Its interpolation weights are then defined as:

$$w_{jk}=\frac{\hat w_{jk}}{\sum_{k\in\mathcal{N}_j}\hat w_{jk}},\quad \text{where}\ \hat w_{jk}=\exp\!\left(-\frac{d_{jk}^2}{2o_k^2}\right),$$

where $d_{jk}$ is the distance between the center of Gaussian $G_j$ and the control point $p_k$.
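Below is a small sketch of this weighting in PyTorch, using `torch.cdist` plus `topk` as the KNN search; the function name and tensor shapes are hypothetical.

```python
import torch

def control_point_weights(gauss_centers, ctrl_pts, ctrl_radii, K=4):
    """Interpolation weights w_jk from the formula above.
    gauss_centers: (Ng, 3), ctrl_pts: (Np, 3), ctrl_radii: (Np,)."""
    d = torch.cdist(gauss_centers, ctrl_pts)         # (Ng, Np) pairwise distances d_jk
    d_k, idx = d.topk(K, dim=-1, largest=False)      # K nearest control points per Gaussian
    o_k = ctrl_radii[idx]                            # (Ng, K) learned radii of the neighbors
    w_hat = torch.exp(-d_k**2 / (2 * o_k**2))        # Gaussian falloff \hat{w}_jk
    w = w_hat / w_hat.sum(dim=-1, keepdim=True)      # normalize over the K neighbors
    return w, idx
```

These weights then blend the control points' 6-DoF transformations onto each Gaussian in a linear-blend-skinning fashion, which is how the deformed Gaussians are obtained for rendering.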
