PSAvatar: A Point-based Morphable Shape Model for Real-Time Head Avatar Creation with 3D Gaussian Splatting
Zhongyuan Zhao 1,2, Zhenyu Bao 1,2, Qing Li 1, Guoping Qiu 3,4, Kanglin Liu 1
1 Pengcheng Laboratory 2 Peking University 3 University of Nottingham 4 Shenzhen University
Abstract
Despite much progress, achieving real-time, high-fidelity head avatar animation is still difficult, and existing methods have to trade off speed against quality. 3DMM-based methods often fail to model non-facial structures such as eyeglasses and hairstyles, while neural implicit models suffer from deformation inflexibility and rendering inefficiency. Although 3D Gaussians have been shown to be promising for geometry representation and radiance field reconstruction, applying them to head avatar creation remains a major challenge, because it is difficult for 3D Gaussians to model the head shape variations caused by changing poses and expressions. In this paper, we introduce PSAvatar, a novel framework for animatable head avatar creation that uses discrete geometric primitives to create a parametric morphable shape model and employs 3D Gaussians for fine detail representation and high-fidelity rendering. The parametric morphable shape model is a Point-based Morphable Shape Model (PMSM), which uses points instead of meshes for 3D representation to achieve enhanced representation flexibility. The PMSM first converts the FLAME mesh to points by sampling both on the surface and off the mesh, which enables the reconstruction of not only surface-like structures but also complex geometries such as eyeglasses and hairstyles. By aligning these points with the head shape in an analysis-by-synthesis manner, the PMSM makes it possible to use 3D Gaussians for fine detail representation and appearance modeling, thus enabling the creation of high-fidelity avatars. We show that PSAvatar can reconstruct high-fidelity head avatars of a variety of subjects and that the avatars can be animated in real time (≥ 25 fps at a resolution of 512 × 512).
![[Uncaptioned image]](https://i-blog.csdnimg.cn/blog_migrate/38a3fa25273bf103b7c1a5095e2fbc62.png)
Figure 1: PSAvatar learns the shape with pose and expression variations based on a point-based morphable shape model, and employs 3D Gaussians for fine detail representation and efficient rendering. Given monocular portrait videos, PSAvatar can create head avatars that enable real-time (≥ 25 fps at 512 × 512 resolution) and high-fidelity rendering.
1 Introduction
Creating animatable head avatars has wide applications and has attracted extensive interest in academia and industry. Many methods have been developed in recent years, based either on explicit representations, e.g., 3D morphable models (3DMMs) [1, 21], points [41, 35] and, more recently, 3D Gaussians [17, 25, 3], or on neural implicit representations, e.g., Neural Radiance Fields (NeRF) [22, 10, 42] and signed distance functions (SDF) [37, 40]. Whilst these methods have achieved very impressive results, there are still many unsolved problems.
3DMM-based methods allow efficient rasterization and inherently generalize to unseen deformations, but are limited by an a priori fixed topology and surface-like geometries, making them less suitable for modeling individuals with eyeglasses or complex hairstyles [3, 25]. Whilst neural implicit representations outperform 3DMM-based methods in capturing hair strands and eyeglasses [40, 6], they are computationally extremely demanding [15]. Furthermore, neural implicit representations need a deformer network or similar techniques to bridge the gap between the canonical and deformed spaces, making it challenging to achieve high deformation accuracy.
In contrast to neural implicit representations, both point and 3D Gaussian representations can be rendered efficiently with splatting-based rasterization [41, 3, 25], and both are considerably more flexible than 3DMMs in representing complex volumetric structures, e.g., eyeglasses, hair strands, etc. PointAvatar [41] initializes with a sparse point cloud randomly sampled on a sphere and periodically upsamples the point cloud by adding noise. The positions of the points are updated to match the target geometry via back-propagated gradients. Points are rotation-invariant and isotropically scaled, making them easy to control. In comparison, 3D Gaussians can be rotated and scaled anisotropically, making them more flexible than points for 3D representation. To achieve consistent 3D representations, 3D Gaussians rely on carefully designed control strategies. In GaussianAvatar [25], each triangle of the mesh is initialized with a 3D Gaussian, and the positional gradient is used to move and periodically densify the Gaussian splats. A major difficulty in applying 3D Gaussians to head avatar creation is modeling the head shape variations caused by changing poses and expressions.
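To make the contrast between isotropic points and anisotropic 3D Gaussians concrete, the following is a minimal sketch, not code from PSAvatar. It assumes the parameterization commonly used in 3D Gaussian Splatting, where the covariance is built from a rotation R and a per-axis scale S as Σ = R S Sᵀ Rᵀ; the text above does not state this formula, and the function names are hypothetical.

```python
import numpy as np

def quat_to_rotmat(q):
    """Convert a quaternion (w, x, y, z) to a 3x3 rotation matrix."""
    w, x, y, z = q / np.linalg.norm(q)
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ])

def gaussian_covariance(quat, scale):
    """Anisotropic covariance Sigma = R S S^T R^T (assumed 3DGS-style form)."""
    R = quat_to_rotmat(np.asarray(quat, dtype=float))
    S = np.diag(np.asarray(scale, dtype=float))
    return R @ S @ S.T @ R.T

def point_covariance(radius):
    """Isotropic 'point': one radius, so rotating it changes nothing."""
    return (radius ** 2) * np.eye(3)
```

A point has a single size parameter and is unaffected by rotation, whereas a Gaussian with scale (1, 2, 3) changes its footprint when rotated. That extra freedom is why the Gaussians need some control strategy to stay consistent with the underlying shape.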
In this paper, we introduce PSAvatar, a novel framework for animatable head avatar creation that uses discrete geometric primitives to create a parametric morphable shape model, which in turn makes it possible to employ 3D Gaussians for fine detail representation and high-fidelity rendering. This parametric morphable shape model, referred to as the Point-based Morphable Shape Model (PMSM), relies on points instead of meshes for 3D representation to achieve enhanced representation flexibility. PMSM is built on FLAME to inherit its morphable capability. Specifically, PMSM converts the FLAME mesh to points by uniformly sampling points on the surface of the mesh. However, FLAME is incapable of representing individuals with eyeglasses or complex hairstyles. To address this, PMSM also samples points off the FLAME mesh to enhance the representation flexibility. PMSM splats the points onto the screen and minimizes the difference between the rendered and ground-truth images. After the invisible points are removed, the remaining points are aligned with the head shape. PSAvatar models the appearance by employing 3D Gaussians in combination with the PMSM to reconstruct the underlying radiance field and achieve high-fidelity rendering. Our contributions are as follows:
- We present PSAvatar, a method for creating animatable head avatars using a point-based morphable shape model for shape modeling and employing 3D Gaussian for fine detail representation and appearance modeling.
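The on-mesh and off-mesh sampling described in the introduction can be illustrated with a small sketch. This is not the authors' implementation: it assumes area-weighted uniform sampling with barycentric coordinates for the on-surface points, and a random offset along the face normal for the off-mesh points (the text does not say how off-mesh points are placed). The function names and the `max_offset` parameter are hypothetical, and any triangle mesh can stand in for FLAME.

```python
import numpy as np

def sample_on_mesh(vertices, faces, n_points, rng):
    """Uniformly sample points on a triangle mesh surface (area-weighted)."""
    tri = vertices[faces]                                  # (F, 3, 3)
    cross = np.cross(tri[:, 1] - tri[:, 0], tri[:, 2] - tri[:, 0])
    norm = np.linalg.norm(cross, axis=1)
    areas = 0.5 * norm
    normals = cross / norm[:, None]                        # unit face normals
    # Pick faces with probability proportional to their area.
    face_idx = rng.choice(len(faces), size=n_points, p=areas / areas.sum())
    # Square-root trick gives uniform barycentric coordinates in a triangle.
    su = np.sqrt(rng.random(n_points))
    v = rng.random(n_points)
    bary = np.stack([1.0 - su, su * (1.0 - v), su * v], axis=1)
    points = np.einsum("nk,nkd->nd", bary, tri[face_idx])
    return points, face_idx, bary, normals[face_idx]

def sample_off_mesh(points, normals, max_offset, rng):
    """Assumed scheme: push each point outward along its face normal."""
    offsets = rng.uniform(0.0, max_offset, size=(len(points), 1))
    return points + offsets * normals

# Toy mesh: a unit square in the z = 0 plane made of two triangles.
vertices = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]], dtype=float)
faces = np.array([[0, 1, 2], [0, 2, 3]])
rng = np.random.default_rng(0)
on_pts, face_idx, bary, nrm = sample_on_mesh(vertices, faces, 1000, rng)
off_pts = sample_off_mesh(on_pts, nrm, max_offset=0.1, rng=rng)
```

Keeping `face_idx` and `bary` for each sample is one plausible way for the points to follow the mesh when it deforms: recomputing the points from the deformed vertices with the same indices and weights moves them with the surface. Whether PMSM binds points this way is an assumption here.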




