Intriguing properties of neural networks

This post examines two phenomena from the paper: in the higher layers of a neural network, semantic information resides in the activation space rather than in individual units, and adversarial examples are surprisingly robust. The study finds that adding tiny perturbations to correctly classified inputs produces adversarial examples that remain misclassified across different network architectures, revealing non-intuitive properties and intrinsic blind spots of deep learning models. Moreover, the existence of adversarial examples challenges the models' local generalization ability, showing that the smoothness assumption for deep neural networks does not always hold.

Author: lz (Class of 2019)

Paper: "Intriguing properties of neural networks"



Properties:

Using various unit-level analysis methods, we find no distinction between individual high-level units and random linear combinations of high-level units. This suggests that it is the space of activations, rather than any individual unit, that contains the semantic information in the higher layers of a neural network.

Adding small perturbations to an input can cause the neural network to misclassify it; such perturbed inputs are called adversarial examples. A sketch of how such a perturbation can be found is given below.
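The paper finds these perturbations with a box-constrained L-BFGS optimization; the sketch below illustrates the same idea with plain gradient descent. It assumes a PyTorch classifier `model`, a correctly classified input batch `x` with values in [0, 1], and a chosen target label `target`; all names are illustrative, not from the paper.

```python
import torch
import torch.nn.functional as F

def find_adversarial_perturbation(model, x, target, c=0.1, steps=100, lr=0.01):
    """Search for a small perturbation r such that model(x + r) is classified as `target`."""
    r = torch.zeros_like(x, requires_grad=True)
    optimizer = torch.optim.Adam([r], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        x_adv = torch.clamp(x + r, 0.0, 1.0)   # keep the perturbed image inside the valid box [0, 1]
        loss = c * r.pow(2).sum() + F.cross_entropy(model(x_adv), target)
        loss.backward()
        optimizer.step()
    return torch.clamp(x + r, 0.0, 1.0).detach()
```

The coefficient c trades off the size of the perturbation against the classification loss toward the target label, mirroring the paper's objective of minimizing c|r| + loss_f(x + r, l) subject to x + r staying in the valid pixel range.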



Related work

We find that adversarial examples are quite robust and are shared by neural networks with different numbers of layers, different activations, or trained on different subsets of the training data. In other words, if we use one neural network to generate a set of adversarial examples, these examples are still misclassified by another network, even one trained with different hyperparameters (see the sketch below). These results suggest that deep neural networks learned by backpropagation have non-intuitive properties and intrinsic blind spots, whose structure is connected to the data distribution in a non-obvious way.
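A minimal sketch of this cross-model check, assuming adversarial inputs `adv_images` were generated against one network and a second, independently trained PyTorch classifier `model_b` is available (illustrative names, not from the paper):

```python
import torch

@torch.no_grad()
def transfer_error_rate(model_b, adv_images, true_labels):
    """Fraction of adversarial examples (crafted against a different model) that model_b misclassifies."""
    preds = model_b(adv_images).argmax(dim=1)
    return (preds != true_labels).float().mean().item()
```

A high error rate on `model_b` indicates that the adversarial examples transfer, i.e. they are not artifacts of one particular set of weights.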

Units of: φ(x)

Traditional computer vision systems rely on feature extraction, where a single feature is often easy to interpret, e.g. a histogram of colors or quantized local derivatives. This allows one to inspect individual coordinates of the feature space and link them back to meaningful variations in the input domain. Similar reasoning was used in previous work that analyzed neural networks applied to computer vision problems. These works interpret the activation of a hidden unit as a meaningful feature and look for input images that maximize the activation value of that single feature.

This technique can be stated formally as visual inspection of the images x' that satisfy (or are close to the maximum attainable value of):

$$x' = \arg\max_{x \in \mathcal{I}} \langle \phi(x), e_i \rangle$$
where $\mathcal{I}$ is a held-out set of images from the data distribution that the network was not trained on, and $e_i$ is the natural basis vector associated with the i-th hidden unit. Our experiments show that any random direction $v$ gives rise to similarly interpretable semantic properties: the images $x'$ that maximize $\langle \phi(x), v \rangle$ are also semantically related to each other.
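A minimal sketch of this inspection, assuming a feature extractor `phi` that maps a batch of held-out images `I` to their hidden-layer activations (illustrative names, not from the paper):

```python
import torch

def top_activating_images(phi, I, direction, k=8):
    """Return the k held-out images whose activations project most strongly onto `direction`."""
    with torch.no_grad():
        activations = phi(I)              # shape: (num_images, num_units)
        scores = activations @ direction  # <phi(x), direction> for every image x in I
        top_idx = scores.topk(k).indices
    return I[top_idx]
```

Passing the natural basis vector e_i (zeros with a 1 at position i) inspects the i-th unit directly, while passing a random vector v inspects a random direction in activation space; the paper's observation is that both yield equally interpretable groups of images.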
