VR Series: Oculus Audio SDK Documentation, Part 1: Introduction to Virtual Reality Audio (5): Environmental Modeling

Sound Design in Virtual Reality

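This article covers key techniques for sound design in virtual reality, including reverberation and reflections, environmental modeling, and artificial reverberation, all aimed at making the user's experience feel more real.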

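The interplay between echoes as they propagate and decay creates a strong sense of space. Building an environmental model that simulates the acoustics around the user produces higher-quality audio and helps users feel that they are inside the virtual world.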

HRTFs in conjunction with attenuation provide an anechoic model of three-dimensional sound, which exhibits strong directional cues but tends to sound dry and artificial because it lacks room ambiance. To compensate for this, we can add environmental modeling to mimic the acoustic effects of nearby geometry.

Reverberation and Reflections

As sounds travel through space, they reflect off surfaces, creating a series of echoes. The initial distinct echoes (early reflections) help us determine the direction of and distance to a sound. As these echoes propagate, diminish, and interact, they create a late reverberation tail, which contributes to our sense of space.

We can model reverberation and reflection using several different methods.

Shoebox Model

Some 3D positional implementations layer simple "shoebox room" modeling on top of their HRTF implementation. These consist of specifying the distance and reflectivity of six parallel walls (i.e., the "shoebox") and sometimes the listener's position and orientation within that room as well. With that basic model, you can simulate early reflections from walls and late reverberation characteristics.

While far from perfect, it's much better than artificial or no reverberation.
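As a rough illustration of the early-reflection part of a shoebox model, the sketch below uses the image-source method: each wall mirrors the source, and the distance from the mirrored source to the listener gives the delay and level of that reflection. The function names, the room-coordinate layout (one corner at the origin), the single shared reflectivity value, the 1/distance gain law, and the 343 m/s speed of sound are all assumptions made for this example, not details of the Oculus Audio SDK.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at room temperature (assumed)


def _distance(a, b):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def shoebox_paths(source, listener, room, reflectivity):
    """Return sorted (delay_seconds, gain) pairs for the direct path
    and the six first-order wall reflections.

    source, listener: (x, y, z) positions in meters, inside the room.
    room: (width, depth, height); walls lie at 0 and at each dimension.
    reflectivity: amplitude reflection coefficient in [0, 1] (hypothetical,
                  shared by all six walls).
    """
    paths = []
    direct = max(_distance(source, listener), 1e-6)
    paths.append((direct / SPEED_OF_SOUND, 1.0 / direct))
    for axis in range(3):
        for wall in (0.0, room[axis]):
            # Mirror the source across this wall to get its image source.
            image = list(source)
            image[axis] = 2.0 * wall - source[axis]
            d = max(_distance(image, listener), 1e-6)
            paths.append((d / SPEED_OF_SOUND, reflectivity / d))
    return sorted(paths)


if __name__ == "__main__":
    for delay, gain in shoebox_paths((1, 1, 1), (3, 1, 1), (4, 3, 2.5), 0.8):
        print(f"{delay * 1000:6.2f} ms  gain {gain:.3f}")
```

Each returned pair could drive one tap of a delay line. A fuller model would also spatialize every reflection from the direction of its image source and add a late reverberation tail.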

Artificial Reverberations

Since modeling physical walls and late reverberations can quickly become computationally expensive, reverberation is often introduced via artificial, ad hoc methods such as those used in digital reverb units of the 80s and 90s. While less computationally intensive than physical models, they may also sound unrealistic, depending on the algorithm and implementation, especially since they are unable to take the listener's orientation into account.
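To give a feel for what such an ad hoc method can look like, here is a minimal sketch of a reverb built from parallel feedback comb filters, a structure commonly associated with classic digital reverbs. The delay lengths, feedback amount, and wet/dry mix are arbitrary values chosen for illustration, and the function names are hypothetical. The output is mono and ignores the listener entirely, which mirrors the limitation described above.

```python
def feedback_comb(signal, delay_samples, feedback):
    """Feedback comb filter: y[n] = x[n] + feedback * y[n - delay_samples]."""
    out = []
    for n, x in enumerate(signal):
        delayed = out[n - delay_samples] if n >= delay_samples else 0.0
        out.append(x + feedback * delayed)
    return out


def comb_reverb(signal, delays=(29, 37, 41), feedback=0.7, wet=0.3):
    """Mix the dry signal with the average of several parallel comb filters.

    delays are in samples; all parameter values here are illustrative only.
    """
    combs = [feedback_comb(signal, d, feedback) for d in delays]
    return [
        (1.0 - wet) * x + wet * sum(c[n] for c in combs) / len(combs)
        for n, x in enumerate(signal)
    ]


if __name__ == "__main__":
    impulse = [1.0] + [0.0] * 99  # pad with silence so the tail can ring out
    tail = comb_reverb(impulse)
    print([round(v, 3) for v in tail[:45]])
```

The output is only as long as the input, so callers need to append silence if they want to hear the decay after the dry sound ends.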

Sampled Impulse Response Reverberation

Convolution reverbs sample the impulse response from a specific real-world location such as a recording studio, stadium, or lecture hall. The sampled response can then be applied to a signal later, resulting in a signal that sounds as if it were played back in that location. This can produce some phenomenally lifelike sounds, but there are some drawbacks. Sampled impulse responses rarely match in-game synthetic environments; they represent a fixed listener position and orientation; they are monophonic; and it is difficult to transition between them when moving between areas.

Even with these limitations, they still provide high-quality results in many situations.
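The core operation of a convolution reverb is ordinary discrete convolution of the dry signal with the sampled impulse response. The sketch below shows it in its direct form using plain Python lists; the function name is hypothetical and the three-tap impulse response is a made-up stand-in for a real recording. Practical implementations typically use FFT-based convolution because the direct form costs one multiply per pair of samples.

```python
def convolution_reverb(dry, impulse_response):
    """Convolve a dry signal with an impulse response (direct form).

    Both arguments are sequences of samples; the result has
    len(dry) + len(impulse_response) - 1 samples.
    """
    if not dry or not impulse_response:
        return []
    out = [0.0] * (len(dry) + len(impulse_response) - 1)
    for i, x in enumerate(dry):
        for j, h in enumerate(impulse_response):
            # Every input sample triggers a scaled copy of the response.
            out[i + j] += x * h
    return out


if __name__ == "__main__":
    # Toy impulse response: the direct sound plus one echo two samples later.
    print(convolution_reverb([1.0, 2.0, 3.0], [1.0, 0.0, 0.5]))
```

Because the impulse response is a fixed recording, nothing in this computation depends on where the listener stands or faces, which is the fixed-position drawback noted above.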

World Geometry and Acoustics

The "shoebox model" attempts to provide a simplified representation of an environment's geometry. It assumes no occlusion, equal frequency absorption on all surfaces, and six parallel walls at a fixed distance from the listener's head. Needless to say, this is a heavy simplification for the sake of performance, and as VR environments become more complex and dynamic, it may not scale properly.

Some solutions exist today to simulate diffraction and complex environmental geometry, but support is not widespread and performance implications are still significant.

Environmental Transitions

Modeling a specific area is complex, but still relatively straightforward. Irrespective of the choice of model, however, there is a problem of audible discontinuities or artifacts when transitioning between areas. Some systems require flushing and restarting the entire reverberator, and other systems introduce artifacts as parameters are changed in real time.
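One common way to soften such a transition, shown here only as a sketch, is to run the old and new reverberators side by side for a short time and crossfade between their outputs rather than flushing one and starting the other. The equal-power (sine/cosine) gain curve and the function name are assumptions for this example; the source does not say how any particular system handles transitions.

```python
import math


def equal_power_crossfade(old_wet, new_wet):
    """Fade from one reverb output to another over the shorter buffer's length.

    old_wet, new_wet: sample sequences produced by the outgoing and incoming
    reverberators for the same stretch of time.
    """
    n = min(len(old_wet), len(new_wet))
    out = []
    for i in range(n):
        t = i / (n - 1) if n > 1 else 1.0  # fade position, 0.0 to 1.0
        g_old = math.cos(t * math.pi / 2)  # gains satisfy g_old^2 + g_new^2 = 1
        g_new = math.sin(t * math.pi / 2)
        out.append(g_old * old_wet[i] + g_new * new_wet[i])
    return out


if __name__ == "__main__":
    print([round(v, 3) for v in equal_power_crossfade([1.0] * 5, [2.0] * 5)])
```

The cost of this approach is that two reverberators run at once for the duration of the fade, which matters when each one is already expensive.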

Presence and Immersion

By creating audio that is on par with high-quality VR visuals, developers immerse the user in a true virtual world, giving them a sense of presence.

Audio immersion is maximized when the listener is located inside the scene, as opposed to viewing it from afar. For example, a 3D chess game in which the player looks down at a virtual board offers less compelling spatialization opportunities than a game in which the player stands on the play field. By the same token, an audioscape in which moving elements whiz past the listener's head with auditory verisimilitude is far more compelling than one in which audio cues cut the listener off from the action by communicating that they are outside the field of activity.

Note: While the pursuit of realism is laudable, it is also optional, as we want developers and sound designers to maintain creative control over the output.