
StegaNeRF: Embedding Invisible Information within Neural Radiance Fields

Chenxin Li1, Brandon Y. Feng2∗, Zhiwen Fan3∗, Panwang Pan4, Zhangyang Wang3
1Hong Kong Polytechnic University,  2University of Maryland,
3University of Texas at Austin,  4ByteDance
∗Equal contribution
Abstract

Recent advances in neural rendering imply a future of widespread visual data distribution through sharing NeRF model weights. However, while common visual data (images and videos) have standard approaches to embed ownership or copyright information explicitly or subtly, the problem remains unexplored for the emerging NeRF format. We present StegaNeRF, a method for steganographic information embedding in NeRF renderings. We design an optimization framework allowing accurate hidden information extraction from images rendered by NeRF, while preserving its original visual quality. We perform experimental evaluations of our method under several potential deployment scenarios, and we further discuss the insights discovered through our analysis. StegaNeRF signifies an initial exploration into the novel problem of instilling customizable, imperceptible, and recoverable information into NeRF renderings, with minimal impact on the rendered images. Project page: https://xggnet.github.io/StegaNeRF/.

1 Introduction

Implicit neural representation (INR) is an emerging concept where the network describes the data through its weights [35, 40, 51, 57, 37]. After training, the INR weights can then be used for content distribution, streaming, and even downstream inference tasks, all without sending or storing the original data. Arguably the most prominent INR is Neural Radiance Fields (NeRF) [37], where a network learns a continuous function mapping spatial coordinates to density and color. Due to its lightweight size and superb quality, NeRF has immense potential for 3D content representation in future vision and graphics applications.

While there is a plethora of work dedicated towards better quality [4, 55, 62, 79], faster rendering [38, 53, 61, 50, 71], and sparse view reconstruction [39, 76, 11, 69, 73, 10], in this paper, we look beyond the horizon and explore a new question: Can we achieve steganography with NeRF?


Figure 1: We introduce the new problem of NeRF steganography: hiding information in NeRF renderings. Our proposed framework, StegaNeRF, can embed and recover customized hidden information while preserving the original NeRF rendering quality.

Established digital steganography methods [12] focus on embedding hidden messages in 2D images. The recent growth of deep learning and social media platforms further gives rise to many practical use cases of image steganography. As countless images and videos are shared online and even used to train deep learning models, 2D steganography methods [23] allow users and data providers to protect copyright, embed ownership, and prevent content abuse.

Now, with the ongoing advances in 3D representations powered by NeRF, we envision a future where people frequently share their captured 3D content online just as they are currently sharing 2D images and videos online. Moreover, we are curious to explore the following research questions: ➊ Injecting information into 2D images for copyright or ownership identification is common, but can we preserve such information when people share and render 3D scenes through NeRF? ➋ NeRF can represent large-scale real-world scenes with training images taken by different people, but can we preserve these multiple source identities in the NeRF renderings to reflect the collaborative efforts required to reconstruct these 3D scenes? ➌ Common image steganography methods embed either a hidden image or a message string into a given image, but can we allow different modalities of the hidden signal in NeRF steganography?

Driven by these questions, we formulate a framework to embed customizable, imperceptible, and recoverable information in NeRF renderings without sacrificing the visual quality. Fig. 1 presents an overview of our proposed framework, dubbed StegaNeRF. Unlike traditional image steganography that embeds hidden signals only into a specific source image, we wish to recover the same intended hidden signal from NeRF renderings at arbitrary viewpoints.

Despite many established works on 2D steganography and hidden watermarks for images and videos [9, 2, 3, 23, 31, 56], naively applying 2D steganography to the NeRF training images is not practical, since the embedded information easily gets lost in the actual NeRF renderings. In contrast, our framework enables reliable extraction of the hidden signal from NeRF renderings. During NeRF training, we jointly optimize a detector network to extract hidden information from the NeRF renderings. To minimize the negative impact on the visual quality of NeRF, we identify weights with low impact on rendering and introduce a gradient masking strategy to steer the hidden steganographic information towards those low-importance weights. Extensive experimental results validate that StegaNeRF balances between the rendering quality of novel views and the high-fidelity transmission of the concealed information.

StegaNeRF presents the first exploration of hiding information in NeRF models for ownership identification, and our contributions can be summarized as follows:

  • We introduce the new problem of NeRF steganography and present the first effort to embed customizable, imperceptible, and recoverable information in NeRF.

  • We propose an adaptive gradient masking strategy to steer the injected hidden information towards the less important NeRF weights, balancing the objectives of steganography and rendering quality.

  • We empirically validate the proposed framework on a diverse set of 3D scenes with different camera layouts and scene characteristics, obtaining high recovery accuracy without sacrificing rendering quality.

  • We explore various scenarios applying StegaNeRF for ownership identification, with the additional support of multiple identities and multi-modal signals.

2 Related Work

Neural Radiance Fields


Figure 2: StegaNeRF training overview. At the first stage, we optimize 𝜽𝟎 with standard NeRF training. At the second stage, we initialize 𝜽 with 𝜽𝟎 and optimize for the steganography objectives. We train the detector 𝑭𝝍 to recover hidden information from StegaNeRF renderings and no hidden information from original NeRF renderings. We introduce Classifier Guided Recovery to improve the accuracy of recovered information, and Adaptive Gradient Masking to balance between steganography ability and rendering visual quality.

The success of NeRF [37] sparks a promising trend of improving photo-realistic view synthesis in terms of rendering quality. A wide range of techniques have been incorporated into NeRF, including ray re-parameterizations [4, 5], explicit spatial data structures [72, 15, 29, 22, 53, 38], caching and distillation [59, 18, 21, 48], ray-based representations [1, 52, 14], geometric primitives [28, 30], large-scale scenes [77, 34, 68, 55], and dynamic settings [41, 45, 17, 27, 67]. Unlike prior works that make NeRF more efficient or effective, this paper explores the uncharted problem of embedding information in NeRF renderings, with critical implications for copyright protection and ownership preservation. As early-stage NeRF-based products already become available [32, 43], we believe more activities based on 3D NeRFs will quickly emerge, and now is the right time to open up the problem of NeRF steganography.

Image Steganography

Steganography hides intended signals as invisible watermarks (e.g., hyperlinks, images) within the cover media called carriers (e.g., images, video, and audio) [12, 24]. Classical methods focus on seeking and altering the least significant bits (LSBs) [16, 54, 65, 42] and transform-domain techniques [8, 60, 7, 46]. Prior research also uses deep neural networks for steganography [20, 58, 2, 3]. Among them, DeepStega [2] conceals a hidden image within an equal-sized carrier image. Subsequent works [23, 31] use invertible networks to improve the performance of deep image steganography. Another line of work conceals information in other carrier media like audio [13, 19, 70] and video [64, 33, 49]. The above advances all play a critical part in the era when traditional media formats like images and videos are dominant. However, as MLP-based neural representations of 3D scenes are gaining momentum to become a major format of visual data, extending steganography to NeRF is bound to become an important problem in the upcoming future.

Lifting Steganography to 3D

Prior to NeRF, meshes are commonly used to represent 3D shapes. Early pioneers apply steganography to 3D meshes [6, 44] for copyright protection when meshes are exchanged and edited. More recent work [66] has also explored embedding multi-plane images within a JPEG-compressed image. Their problem can be regarded as a special case of 2D steganography, hiding multiple 2D images inside a single 2D image. In contrast, we try to hide a natural image into a 3D scene representation (NeRF), fundamentally differing from these prior methods where 2D images act as the carrier of hidden information.

3 Method

The goal of StegaNeRF is to inject customized (steganographic) information into the NeRF weights with imperceptible visual changes when rendering. Given a NeRF with weights 𝜽𝟎 and the information I to be injected, when we render with 𝜽 from any viewpoint, we hope the injected I can be recovered by a detector Fψ with learnable weights ψ.

A seemingly obvious solution is to use prior image steganography methods to 1) inject I into the training images, 2) train NeRF on those images with embedded information, and 3) apply their provided Fψ to extract I from NeRF-rendered images. This approach has been successfully applied for GAN fingerprinting [74]. However, it fails in the NeRF context (see Fig. 3), where the subtle changes induced by prior steganography methods easily get smoothed out, inhibiting the detector from identifying the subtle patterns necessary for information recovery.

Such shortcomings of off-the-shelf steganographic methods are not surprising since they are developed for the traditional setting where 2D images are the ultimate form of visual information. In contrast, our problem setting involves the new concept of INR as the underlying representation, and 2D images are just the final output rendered by NeRF.

3.1 Two-Stage Optimization

Recognizing the formulation difference due to the emergence of implicit representation with NeRF, we move away from the traditional 2D image-based pipeline that trains an encoder network to inject subtle changes to the given 2D images. Instead, we incorporate the steganography objective into the gradient-based learning process of NeRF.

We re-design the training as a two-stage optimization. The first stage is the original NeRF training procedure, involving the standard photometric loss between the rendered and ground truth pixels to guarantee the visual quality. After obtaining the weights 𝜽𝟎 from the first stage, we dedicate the second stage to obtaining the final NeRF weights 𝜽 containing steganographic information. We introduce several techniques to achieve robust information recovery with imperceptible impact on the rendered images. The training workflow is depicted in Fig. 2 and Alg. 1.

3.2 Information Recovery

Let P denote the camera pose at which we render an image from a NeRF network. We want to recover I from the rendered image 𝜽​(P). Importantly, we also want to avoid false positives on images 𝜽0​(P) rendered by the original network without steganography ability, even if the images rendered by 𝜽0 and 𝜽 look visually identical. Therefore, we optimize 𝜽 to minimize the following contrastive loss terms:

$$\mathcal{L}_{dec}^{+}=\left|F_{\psi}\big(\boldsymbol{\theta}(P)\big)-I\right|,\qquad\mathcal{L}_{dec}^{-}=\left|F_{\psi}\big(\boldsymbol{\theta}_{0}(P)\big)-\varnothing\right|,\tag{1}$$

where ∅ is an empty image with all zeros.


Figure 3: Results of applying prior 2D steganography methods, LSB [9] (top) and DeepStega [2] (bottom). From left to right, we show (a) training image after applying 2D steganography, (b) residual error of (a) over ground truth, (c) hidden image recovered from (a), (d) residual error of NeRF rendering at the pose of (a), and (e) hidden image recovered from (d). Prior 2D steganography methods fail in the NeRF context since the hidden information injected in training images mostly disappear in NeRF renderings.

Classifier-Guided Recovery

The detector Fψ is easily implemented as a U-Net to decode I in the form of a 2D image, but accurately recovering all the details in I might be challenging. Therefore, we additionally train a classifier network Fψc to solve the easier task of classifying whether the given NeRF rendering contains hidden information. Fψc is optimized by the following cross-entropy loss:

$$\mathcal{L}_{dec}^{c}=-\log\big(F_{\psi}^{c}(\boldsymbol{\theta}(P))\big)-\log\big(1-F_{\psi}^{c}(\boldsymbol{\theta}_{0}(P))\big).\tag{2}$$

We then use its prediction to guide the process of decoding pixel-wise information by adding the classification output as input to Fψ, such that Fψ(x) = Fψ(x, Fψc(x)).
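One concrete way to realize this guidance is to broadcast the classifier's scalar prediction into an extra input channel that is concatenated to the rendering before it enters the detector. The sketch below is a minimal stand-in under that assumption; `f_psi_c` and `detector_input` are hypothetical stubs, not the paper's actual U-Net.

```python
import numpy as np

# Minimal sketch of classifier-guided recovery, F_psi(x) = F_psi(x, F_psi_c(x)).
# f_psi_c and detector_input are hypothetical stand-ins for the paper's networks.
def f_psi_c(x):
    # Stub classifier: pooled feature -> probability that x carries hidden info.
    return 1.0 / (1.0 + np.exp(-x.mean()))

def detector_input(x):
    # Broadcast the classifier prediction to an H x W x 1 guidance channel and
    # concatenate it to the rendering before it enters the detector.
    p = f_psi_c(x)
    guidance = np.full(x.shape[:2] + (1,), p)
    return np.concatenate([x, guidance], axis=-1)  # H x W x 4 detector input

render = np.random.rand(8, 8, 3)
guided = detector_input(render)  # the detector now sees RGB + guidance
```

In this form the detector can learn to suppress its output when the guidance channel is near zero, which matches the false-positive behavior targeted by Eq. (1).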

Although the above discussion focuses on hiding images, our framework can be easily extended to embed other modalities like strings, text, or even audio, all of which may be represented as 1D vectors. We can simply modify the architecture of Fψ to have a 1D prediction branch.

Algorithm 1 Train StegaNeRF on a single scene

Data: Training images {Yi} with poses {Pi}, hidden information I, learning rates η = [η0, η1]
Output: Steganographic NeRF 𝜽 and detector Fψ
Initialize NeRF 𝜽0 and detector network Fψ
Optimize 𝜽0 on {Yi, Pi} with standard NeRF training
Compute mask 𝒎 for 𝜽𝟎 as in Eq. (3)
for each training iteration t do
  Randomly sample a training pose Pi and render 𝜽(Pi)
  Compute steganography losses ℒ_dec^c, ℒ_dec^+, ℒ_dec^- as in Eqs. (1), (2)
  Compute standard loss ℒ_rgb in Eq. (4)
  Combine total loss ℒ as in Eq. (5)
  Update 𝜽 with η0·(∂ℒ/∂𝜽 ⊙ 𝒎) and Fψ with η1·∂ℒ/∂ψ
end for

3.3 Preserving Perceptual Identity

Since we want to hide information without affecting the visual perception of the rendered output, an intuitive regularization is to penalize how much 𝜽 deviates from 𝜽𝟎. However, we find that naively summing the deviation penalty across all weights makes it difficult for the NeRF network to adjust its weights for the steganographic objective. Instead, motivated by the fact that INR weights are not equally important and exhibit strong sparsity [26, 75], we propose an adaptive gradient masking strategy to encode the hidden information on specific weight groups.

Formally, given the initial weights 𝜽𝟎∈ℝN, we obtain the weight importance 𝒘 and a mask 𝒎∈ℝN as

$$\boldsymbol{w}=\left|\boldsymbol{\theta}_{0}\right|^{\alpha},\qquad\boldsymbol{m}=\frac{\boldsymbol{w}^{-1}}{\sum_{i}^{N}\boldsymbol{w}_{i}^{-1}},\tag{3}$$

where α > 0 controls the relative distribution of importance across the weights. We mask the gradient as ∂ℒ/∂𝜽 ⊙ 𝒎 when optimizing 𝜽 based on the total loss ℒ in the second stage, where ⊙ denotes the Hadamard product. Effectively, more significant weights are “masked out” to minimize the impact of steganographic learning on the rendered visual quality.
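Eq. (3) can be sketched in a few lines. Note the small eps term is our addition to guard against exactly-zero weights and is not part of the paper's formula.

```python
import numpy as np

# Sketch of the adaptive gradient mask in Eq. (3): importance w = |theta_0|^alpha,
# mask m = normalized inverse importance, so high-magnitude (important) weights
# receive a near-zero mask and are effectively "masked out" of the update.
def gradient_mask(theta0, alpha=3.0, eps=1e-12):
    w = np.abs(theta0) ** alpha + eps  # eps (our addition) avoids division by zero
    inv = 1.0 / w
    return inv / inv.sum()

theta0 = np.array([0.01, 0.1, 1.0, 2.0])
m = gradient_mask(theta0)
# The masked second-stage update is then theta -= lr * (grad * m), cf. Alg. 1.
```

The mask entries sum to one, and the largest-magnitude weight gets the smallest mask value, concentrating the steganographic update in low-importance weights.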

We retain the photometric error of the vanilla NeRF [37] formulation in the steganography learning to prevent NeRF from deviating from its rendered visual signal fidelity:

$$\mathcal{L}_{rgb}=\left|\boldsymbol{\theta}(P)-\boldsymbol{\theta}_{0}(P)\right|.\tag{4}$$

The overall training loss at the second stage can be formulated as follows:

$$\mathcal{L}=\lambda_{0}\mathcal{L}_{dec}^{c}+\lambda_{1}\mathcal{L}_{dec}^{+}+\lambda_{2}\mathcal{L}_{dec}^{-}+\lambda_{3}\mathcal{L}_{rgb}.\tag{5}$$
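Putting Eqs. (1), (2), (4), and (5) together, one second-stage loss evaluation can be sketched as follows. The rendering, detector, and classifier are toy stand-ins (not the paper's networks), and the λ values follow the LLFF setting reported in Sec. 4.1.

```python
import numpy as np

# Toy sketch of the second-stage loss (Eqs. 1, 2, 4, 5). render(), f_psi, and
# f_psi_c are illustrative stubs, not the paper's actual NeRF or detector.
rng = np.random.default_rng(0)
theta0 = rng.normal(size=12)                 # frozen first-stage weights
theta = theta0 + 0.01 * rng.normal(size=12)  # steganographic weights
I_hidden = rng.random((2, 2, 3))             # hidden image to embed
lam = dict(cls=0.01, pos=0.5, neg=0.5, rgb=1.0)  # LLFF weights from Sec. 4.1

def render(w):                               # stub rendering at a fixed pose P
    return (np.tanh(w).reshape(2, 2, 3) + 1.0) / 2.0

def f_psi(img):                              # stub detector
    return img

def f_psi_c(img):                            # stub classifier, output in (0, 1)
    return 1.0 / (1.0 + np.exp(-(img.mean() - 0.5)))

x, x0 = render(theta), render(theta0)
l_pos = np.abs(f_psi(x) - I_hidden).mean()             # Eq. (1), L_dec^+
l_neg = np.abs(f_psi(x0)).mean()                       # Eq. (1), L_dec^- (empty target)
l_cls = -np.log(f_psi_c(x)) - np.log(1 - f_psi_c(x0))  # Eq. (2)
l_rgb = np.abs(x - x0).mean()                          # Eq. (4)
loss = (lam["cls"] * l_cls + lam["pos"] * l_pos
        + lam["neg"] * l_neg + lam["rgb"] * l_rgb)     # Eq. (5)
```

In the actual framework, the gradient of this loss with respect to 𝜽 is then multiplied elementwise by the mask of Eq. (3) before the weight update.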

Algorithm 2 Typical usage scenario of StegaNeRF

1: Alice captures some images of a 3D scene
2: Alice trains a StegaNeRF to hide a personalized image
3: Alice shares the model 𝜽 online for other people to enjoy and explore the 3D scene themselves
4: Bob grabs the model 𝜽 and reposts it with his own account without crediting Alice or asking for permission
5: Alice sees Bob’s post, deploys the detector Fψ, and verifies the owner of 𝜽 is Alice, not Bob
6: Bob takes down the post or gets banned for copyright infringement

4 Experiments

In this section, we present experimental evaluations under several use case scenarios. We further provide additional analysis on the impact of each proposed technique and the robustness analysis of the overall framework.

4.1 Implementation Details.

Dataset. We use the common datasets LLFF [36] and NeRF-Synthetic [37], with forward-facing scenes {flower, fern, fortress, room} from LLFF and 360° scenes {lego, drums, chair} from NeRF-Synthetic. We further experiment on the Brandenburg Gate scene from the NeRF-W dataset [34], with over 800 in-the-wild views collected online.

Training. On LLFF and NeRF-Synthetic, we adopt Plenoxels [71] as the NeRF backbone architecture for efficiency. On NeRF-W, we use the available PyTorch implementation [47]. The first stage of training is performed according to the standard recipes of those implementations. We then perform the second stage of steganography training for 55 epochs unless otherwise noted. On LLFF and NeRF-W, we downsample the training images by a factor of 4 following common practice, and we use the original size on NeRF-Synthetic. For the hyper-parameters in Eq. (5), we set the weight of the NeRF reconstruction error λ3 = 1 for all experiments. We set λ0 = 0.01, λ1 = 0.5, λ2 = 0.5 for all scenes on the LLFF dataset, λ0 = 0.1, λ1 = 1, λ2 = 1 for the scenes in NeRF-Synthetic, and λ0 = 0.05, λ1 = 1, λ2 = 1 for NeRF-W. For Eq. (3), we set α = 3 for all scenes. We run experiments on one NVIDIA A100 GPU.

Table 1: Quantitative results of StegaNeRF rendering and hidden information recovery. Standard NeRF is the initial NeRF 𝜽𝟎 with standard training, serving as an upper-bound performance for NeRF rendering. Prior 2D steganography fails after NeRF training while StegaNeRF successfully embeds and recovers hidden information with minimal impact on the rendering quality. Results are averaged over the selected LLFF and NeRF-Synthetic scenes.

Method             | NeRF Rendering              | Hidden Recovery
                   | PSNR↑   SSIM↑   LPIPS↓      | Acc. (%)↑   SSIM↑
LLFF:
Standard NeRF      | 27.74   0.8353  0.1408      | 50.00       N/A
LSB [9]            | 27.72   0.8346  0.1420      | N/A         0.0132
DeepStega [2]      | 26.55   0.8213  0.1605      | N/A         0.2098
StegaNeRF (Ours)   | 27.72   0.8340  0.1428      | 100.0       0.9730
NeRF-Synthetic:
Standard NeRF      | 31.13   0.9606  0.0310      | 50.00       N/A
LSB [9]            | 31.12   0.9604  0.0310      | N/A         0.0830
DeepStega [2]      | 31.13   0.9606  0.0313      | N/A         0.2440
StegaNeRF (Ours)   | 30.96   0.9583  0.0290      | 99.72       0.9677


Figure 4: Results on the NeRF-Synthetic dataset. Within each column, we show the StegaNeRF rendering, the residual error from the initial NeRF rendering, and the recovered hidden image. We show the SSIM for the StegaNeRF renderings and the recovered hidden images.

Evaluation. We evaluate our system based on a typical authentication scenario shown in Alg. 2. We consider the recovery quality of the embedded information, including the metrics of classification accuracy (Acc.) of the classifier, and the structural similarity (SSIM)  [63] of the hidden image recovered by the detector. We evaluate the final NeRF renderings with PSNR, SSIM and LPIPS [78]. All the metrics are computed on the test set and averaged over all the scenes and embedded images. Per-scene details are provided in the supplement.
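For reference, the PSNR used in the rendering evaluation can be computed as below for images scaled to [0, 1]; SSIM and LPIPS require more machinery (windowed statistics, a pretrained network) and are omitted from this sketch.

```python
import numpy as np

# PSNR for images in [0, 1], one of the rendering metrics reported above.
def psnr(pred, gt, data_range=1.0):
    mse = np.mean((pred - gt) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(data_range ** 2 / mse)

gt = np.zeros((4, 4, 3))
pred = np.full((4, 4, 3), 0.1)  # uniform error of 0.1 -> MSE = 0.01
# psnr(pred, gt) = 10 * log10(1 / 0.01) = 20 dB
```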

4.2 Case I: Embedding in a Single Scene.

We first explore the preliminary case of ownership identification on a specific NeRF scene. We select random images from ImageNet [25] as the hidden information to be injected into NeRF renderings.

Failure of 2D Baseline

Due to the lack of prior study on NeRF steganography, we consider a baseline from 2D image steganography by training NeRF from scratch on the watermarked images. We implement two off-the-shelf steganography methods: a traditional approach called Least Significant Bit (LSB) [9], and a deep learning pipeline, DeepStega [2]. An ideal outcome would be that the embedded information can be recovered from the synthesized novel views, indicating successful NeRF steganography. However, as can be seen in Fig. 3, the embedded information containing the hidden images cannot be recovered from the NeRF renderings. By analyzing the residual maps of training views (between GT training views and the watermarked ones) and novel views (between GT novel views and the actual renderings), we observe that the subtle residuals needed to recover the hidden information are smoothed out in NeRF renderings, and similar failures occur on other datasets as well. Therefore, for the rest of our study, we mainly focus on analyzing our new framework that performs direct steganography on 3D NeRF.
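To make the failure mode concrete, here is a minimal sketch of the LSB scheme [9]: each hidden bit replaces the least significant bit of an 8-bit pixel value, so the carrier changes by at most 1/255. This is precisely the kind of low-amplitude signal that NeRF's smooth, view-consistent reconstruction averages away.

```python
# Minimal sketch of LSB embedding: each hidden bit replaces the least
# significant bit of an 8-bit pixel value, changing the carrier by at most 1.
def lsb_embed(pixels, bits):
    return [(p & ~1) | b for p, b in zip(pixels, bits)]

def lsb_extract(pixels):
    return [p & 1 for p in pixels]

carrier = [200, 13, 255, 0]
bits = [1, 0, 1, 1]
stego = lsb_embed(carrier, bits)  # -> [201, 12, 255, 1]
recovered = lsb_extract(stego)    # -> [1, 0, 1, 1]
```

Because recovery depends on pixel values being reproduced exactly, even a sub-1/255 rendering error destroys the hidden bits, matching the residual-map analysis above.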

Table 2: Quantitative results of NeRF rendering and hidden information recovery. We consider two conditions, embedding each scene with a common hidden image (One-for-All) or a scene-specific hidden image (One-for-One). We report the quality difference compared to the single-scene setting as Δ_SSIM (×10⁻²). Results are averaged over the selected LLFF scenes.

Setting       | NeRF Rendering                     | Hidden Recovery
              | PSNR↑   SSIM↑   LPIPS↓   Δ_SSIM↑   | Acc. (%)↑   SSIM↑    Δ_SSIM↑
One-for-All   | 24.99   0.8013  0.1786   -0.10     | 100.00      0.9860   +0.19
One-for-One   | 24.99   0.8016  0.1779   -0.07     | 100.00      0.9122   -7.19
One-for-All   | 27.90   0.8513  0.1236   +0.01     | 100.00      0.9844   -0.76
One-for-One   | 27.90   0.8515  0.1195   +0.03     | 100.00      0.9448   -4.45
One-for-All   | 30.27   0.8498  0.1289   +0.02     | 100.00      0.9430   -5.42
One-for-One   | 30.12   0.8480  0.1302   -0.16     | 100.00      0.9102   -8.67


Figure 5: Results on three multi-scene settings. Within each column, we show the StegaNeRF rendering, residual error from the initial NeRF rendering, and recovered hidden image. We show the SSIM for the StegaNeRF renderings and the recovered hidden images.

Results

Tab. 1 contains quantitative results on LLFF and NeRF-Synthetic scenes. While NeRF trained by 2D steganography methods hardly recovers the embedded information, StegaNeRF accurately recovers the hidden image with minimal impact on the rendering quality measured by PSNR. Fig. 4 provides qualitative results of StegaNeRF on three embedded images on the NeRF-Synthetic scenes. An interesting observation is the common regions where hidden information (high residual error) emerges in the renderings, e.g., the right rear side of lego and left handrail of chair. We also notice that, within each scene, these regions are persistent across multiple viewpoints.

4.3 Case II: Embedding in Multiple Scenes.

Settings

We extend our steganography scheme to embed information within multiple scenes at once. Specifically, we use three LLFF scenes {flower, fern, fortress} to test the two sub-settings of multi-scene embedding: (1) One-for-All, with a common hidden image shared across scenes, and (2) One-for-One, with scene-specific hidden images. All scenes share the same detector Fψ and classifier Fψc. The difference between the two sub-settings is the number of hidden images that Fψ and Fψc need to identify and recover. We sample one scene for every training epoch, and due to the increased data amount, we increase the number of training epochs until convergence.

Results

Tab. 2 provides quantitative results on the multi-scene setting. The performance drop compared to single-scene training is sometimes noticeable, but it is not surprising due to the inherent requirement of per-scene fitting for our NeRF framework. Fig. 5 presents visual comparisons of multi-scene steganography against single-scene scheme.


Figure 6: Qualitative results of NeRF steganography on NeRF-W dataset. Within each block we show the StegaNeRF rendering, the initial NeRF rendering, residual error between StegaNeRF and initial, and the recovered hidden image. At bottom right, we provide the SSIM computed with their respective ground truth. “Unmarked Identity” denotes the remaining training views not related to any of the considered users, and the detector does not recover meaningful information from renderings at those poses.


Figure 7: Multi-modal steganographic hidden information. Each block contains the StegaNeRF rendering, recovered hidden images, audio (shown in waveform and spectrum), and text. The recovery metrics for hidden information in each modality are labeled respectively. For the StegaNeRF rendering, we report both SSIM and the relative change from initial NeRF renderings.

4.4 Case III: Embedding Multiple Identities.

Settings

Constructing NeRFs of large-scale cultural landmarks is a promising application, and community contributions are crucial for forming the training images. Since every NeRF prediction is indebted to particular training images, it would be meaningful to identify the contributing users in the rendered output. Specifically, we present a proof of concept based on the following scenario: given a collection of user-uploaded training images, our task is to apply StegaNeRF to ensure that the final NeRF renderings hide subtle information about the relevant identities whose uploaded images helped generate the current rendering.

To simulate this multi-user scenario on the public NeRF-W dataset, we randomly select M anchor views with different viewing positions, and then find the K nearest neighbour views of each anchor to form their respective clusters. We set M=3, K=20 in our experiments. We assume a common contributor identity for each cluster, and we want the NeRF rendering to contain customized hidden information about those M contributors when rendering within the spatial regions spanned by their respective clusters. Thus, our classifier network Fψc is modified to output M-class cluster predictions, plus an additional class for identities outside the M clusters. Since the detector should extract no information for views outside those M clusters to prevent false positives, we also compute ℒdec− (1) and ℒdecc (2) for those poses.
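The anchor-plus-nearest-neighbour clustering and the (M+1)-class labeling can be sketched as below. The helper names are hypothetical, and random 3D points stand in for the real NeRF-W camera positions.

```python
import numpy as np

# Build M clusters: pick M anchor views at random, then take the K nearest
# views (by camera position) for each anchor. A view outside every cluster
# gets label M, the "unmarked identity" class of the classifier.
def build_clusters(cam_positions, M=3, K=20, seed=0):
    rng = np.random.default_rng(seed)
    anchors = rng.choice(len(cam_positions), size=M, replace=False)
    clusters = {}
    for m, a in enumerate(anchors):
        d = np.linalg.norm(cam_positions - cam_positions[a], axis=1)
        clusters[m] = np.argsort(d)[:K]  # K nearest neighbours (incl. anchor)
    return clusters

def cluster_labels(cam_positions, clusters, M=3):
    labels = np.full(len(cam_positions), M)  # default: outside all clusters
    for m, idx in clusters.items():
        labels[idx] = m
    return labels

positions = np.random.default_rng(1).normal(size=(200, 3))  # dummy camera positions
clusters = build_clusters(positions, M=3, K=20)
labels = cluster_labels(positions, clusters, M=3)
```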

Results

We employ the same network backbone as NeRF-W [58] to handle in-the-wild images with different time and lighting effects. Fig. 6 presents qualitative results of embedding multiple identities in a collaborative large-scale NeRF.


Figure 8: Impact on visual quality when changing different components of the proposed StegaNeRF framework as in Tab. 3. We show the SSIM for the renderings and the recovered hidden images.

4.5 Case IV: Embedding Multi-Modal Information.

Settings

We further show the potential of StegaNeRF for embedding multi-modal hidden information, such as images, audio, and text. We modify the detector network to build a modality-specific detector for each modality.
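A minimal sketch of modality-specific recovery heads: a shared feature from the detector backbone is routed to one head per modality. The feature dimension, payload sizes, and linear form of the heads are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

# One linear recovery head per modality, all consuming the same shared
# feature. Output sizes are hypothetical payload shapes: a 32x32 RGB
# hidden image, 1024 waveform samples, and 128 characters as 8-bit codes.
rng = np.random.default_rng(0)
feat_dim = 64
heads = {
    "image": rng.normal(size=(feat_dim, 32 * 32 * 3)),
    "audio": rng.normal(size=(feat_dim, 1024)),
    "text":  rng.normal(size=(feat_dim, 128 * 8)),
}

def recover(feature, modality):
    # Route the shared feature through the head for the requested modality.
    return feature @ heads[modality]

feature = rng.normal(size=(feat_dim,))
out = {m: recover(feature, m) for m in heads}
```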

Results

Fig. 7 shows recovered multi-modal embedded signals in the trex and horns scenes from LLFF. Evidently, the StegaNeRF framework easily extends to embedding multi-modal information with high recovery performance without sacrificing rendering quality.

Table 3: Ablation study of different components of StegaNeRF. Results are averaged over the selected LLFF scenes.

Method                 |  NeRF Rendering            |  Hidden Recovery
                       |  PSNR↑   SSIM↑   LPIPS↓    |  Acc. (%)↑   SSIM↑
StegaNeRF              |  28.21   0.8580  0.1450    |  100.0       0.9224
No Classifier          |  26.85   0.8077  0.2417    |  N/A         0.4417
No Classifier Guided   |  27.12   0.8239  0.2073    |  100.0       0.5461
No Gradient Masking    |  27.86   0.8375  0.1710    |  100.0       0.8822
No Soft Masking        |  28.05   0.8558  0.1526    |  94.44       0.8751
Standard NeRF          |  28.23   0.8593  0.1440    |  50.00       N/A

4.6 Further Empirical Analysis

Ablation Studies

The effect of removing each component of the StegaNeRF framework is presented in Tab. 3. Standard NeRF uses the initial results of the standard NeRF, serving as an upper bound on rendering performance. No Classifier completely removes the classifier Fψc, while No Classifier Guided retains the classification task (hence the impact on NeRF rendering) but does not condition the detector on the classifier output. No Gradient Masking removes the proposed masking strategy (Sec. 3.3), and No Soft Masking uses a binary mask with a threshold of 0.5. Our StegaNeRF strikes a good balance between rendering quality and decoding accuracy. Fig. 8 further presents the visual impact of removing each component.
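To make the soft vs. binary masking ablation concrete, a plausible sketch is below. Since Eq. (4) from Sec. 3.3 is not reproduced here, the functional form of the soft mask (down-weighting updates to high-magnitude weights via an exponent α) is an illustrative assumption, not the paper's exact equation; the 0.5-threshold binarization matches the No Soft Masking variant.

```python
import numpy as np

# Soft gradient mask: weights with large magnitude (presumed important for
# rendering) receive attenuated steganographic updates; alpha controls the
# sharpness of the attenuation. This form is an assumption, not Eq. (4).
def soft_mask(weights, alpha=2.0):
    mag = np.abs(weights) / np.abs(weights).max()
    return (1.0 - mag) ** alpha        # small-magnitude weights -> mask near 1

# "No Soft Masking" ablation: binarize the soft mask at 0.5.
def binary_mask(weights, threshold=0.5):
    return (soft_mask(weights) > threshold).astype(float)

w = np.linspace(-1.0, 1.0, 9)          # dummy NeRF weights
g = np.ones_like(w)                    # dummy steganography gradient
soft_update = g * soft_mask(w)         # graded update
hard_update = g * binary_mask(w)       # all-or-nothing update
```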


Figure 9: Analysis of gradient masking. (a) Varying α from Eq. (4). (b) Masking different ratios of weights from the gradient updates during steganography learning. We provide the SSIM of rendered views (blue) and recovered hidden images (green).


Figure 10: Analysis of robustness over (a) JPEG compression and (b) Gaussian blur. We provide the SSIM of rendered views (blue) and recovered hidden images (green).

Sensitivity Analysis

Fig. 9 shows the impact of varying the gradient masking strategy. Fig. 10 reports the performance of StegaNeRF against common perturbations, including JPEG compression and Gaussian blur. The lines show the average accuracy across the selected scenes, and the shaded regions indicate a range of ±0.5 standard deviations. The hidden recovery of StegaNeRF is robust to various JPEG compression ratios and degrees of Gaussian blur degradation.
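The robustness protocol for Fig. 10(b) can be sketched as: degrade the rendering with Gaussian blur of increasing sigma before running hidden recovery. The separable-kernel blur below and the random stand-in image are illustrative; the actual evaluation also covers JPEG compression.

```python
import numpy as np

# Truncated 1D Gaussian kernel, normalized to sum to 1.
def gaussian_kernel(sigma, radius=3):
    x = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()

# Separable 2D Gaussian blur: filter each row, then each column.
def blur(img, sigma):
    k = gaussian_kernel(sigma)
    tmp = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, tmp)

rng = np.random.default_rng(0)
img = rng.random((32, 32))                       # stand-in for a rendering
degraded = [blur(img, s) for s in (0.5, 1.0, 2.0)]
# Mean absolute deviation from the clean image grows with blur strength.
errors = [float(np.abs(d - img).mean()) for d in degraded]
```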

5 Discussion

In this section, we provide further discussion on the significance of the proposed framework and the presented results.

Why does 2D steganography fail in NeRF?

The changes induced in the NeRF training images by 2D steganographic methods are hardly retained in NeRF renderings. This is not surprising, as NeRF tends to smooth out subtle details in the training images. In contrast, our method directly optimizes the NeRF weights so that its renderings contain subtle details that the detector network can identify. See the supplement for a more detailed analysis.

How useful is steganography for NeRF?

Although NeRF-based 3D content has yet to become mainstream, we believe it will play a major role in the future, not only on social platforms but also in 3D vision research and applications. On the one hand, when people upload their personal NeRF models online for viewing, NeRF steganography for ownership identification is clearly an important feature. On the other hand, future 3D vision research will likely demand large-scale datasets of NeRF models trained on real-world images, and in this context, NeRF steganography can be a crucial tool for responsible and ethical use of training data and deployed models.
