Privacy-preserving Fine-tuning of Large Language Models through Flatness

This post is part of the LLM series: a translation of the paper *Privacy-preserving Fine-tuning of Large Language Models through Flatness*.

Abstract

Recently, with the development of large language models (LLMs) such as ChatGPT, privacy concerns around their use have grown. Differential privacy (DP) techniques have been explored in existing work to mitigate these privacy risks, at the cost of degraded generalization. Our paper reveals that the flatness of the loss landscape of DP-trained models plays a crucial role in the trade-off between their privacy and generalization. We further propose a holistic framework that enforces appropriate weight flatness, substantially improving model generalization while maintaining competitive privacy protection. The framework innovates at three coarse-to-fine levels: perturbation-aware min-max optimization over model weights within a layer, flatness-guided sparse prefix-tuning across layers, and weight knowledge distillation between DP and non-DP weight copies. Comprehensive experiments in both black-box and white-box settings demonstrate the effectiveness of our proposals in enhancing generalization and retaining DP properties. For instance, on the text-classification dataset QNLI, DP-Flat achieves performance similar to non-private full fine-tuning while providing a DP guarantee under privacy budget ε=3, and performs even better at higher privacy budgets. Code is provided in the appendix.
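The abstract's first ingredient, perturbation-aware min-max optimization within a layer, is in the spirit of sharpness-aware minimization combined with DP-SGD-style gradient clipping and noising. Below is a minimal toy sketch of one such training step; all function names, hyperparameters, and the toy objective are illustrative assumptions, not the paper's actual DP-Flat implementation:

```python
import numpy as np

def flat_dp_step(w, grad_fn, rng, rho=0.05, lr=0.1, clip=1.0, sigma=0.5):
    """One illustrative flatness-aware, DP-style update (toy sketch)."""
    g = grad_fn(w)
    # Inner max: ascend to the worst-case weights within an L2 ball of radius rho,
    # then take the gradient at that perturbed point (SAM-style).
    eps = rho * g / (np.linalg.norm(g) + 1e-12)
    g_sharp = grad_fn(w + eps)
    # DP-SGD-style step: clip the gradient norm, then add Gaussian noise.
    g_clipped = g_sharp / max(1.0, np.linalg.norm(g_sharp) / clip)
    noise = rng.normal(0.0, sigma * clip, size=w.shape)
    return w - lr * (g_clipped + noise)

# Toy usage: minimize f(w) = ||w||^2, whose gradient is 2w.
rng = np.random.default_rng(0)
w = np.ones(3)
for _ in range(200):
    w = flat_dp_step(w, lambda v: 2 * v, rng)
```

Despite the injected noise, the clipped updates still contract the toy objective toward its minimum, which is the generalization/privacy balance the paper's framework aims to improve.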

1 Introduction

2 Related Work

3 Method

4 Experiments

5 Conclusion

In this paper, we addressed the challenge of balancing data privacy and performance in large language models (LLMs) under differential privacy (DP). We introduced a new framework designed to enhance the flatness of the loss landscape of DP-trained models, proposing strategies at three levels: within-layer flattening, cross-layer flattening, and cross-model flattening. Our approach effectively narrows the performance gap between DP-trained LLMs and their standard counterparts, offering a pioneering solution for privacy-preserving algorithms in closed-source settings. Our comprehensive experiments show that in both black-box and white-box settings…

Privacy-preserving machine learning is becoming increasingly important as data privacy grows into a major concern. Federated learning and secure aggregation are two techniques that can be combined to achieve it.

In federated learning, the model is trained on data distributed across multiple devices or servers. The model is sent to each participant, which trains it locally on its own data; the resulting model updates are then sent back to a central server, where they are aggregated into a new version of the model. The key advantage is that the raw data never leaves the participants, which helps protect its privacy.

Secure aggregation protects the privacy of the model updates themselves. Updates are encrypted or masked before being sent to the central server, which performs the aggregation directly on the protected updates, so that only the aggregate result is ever revealed; no individual update is visible to the server in the clear.

By combining federated learning with secure aggregation, machine learning models can be trained on sensitive data while protecting both the data and the individual model updates.
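The masking idea behind secure aggregation can be sketched in a few lines: each pair of clients shares a random mask that one adds and the other subtracts, so the masks cancel exactly when the server sums the masked updates. This is a toy illustration only; real protocols derive the masks cryptographically and handle client dropout, both of which this sketch (with hypothetical helper names) omits:

```python
import numpy as np

def local_update(w, data, lr=0.1):
    # One gradient-descent step on the client's own least-squares data (X, y).
    X, y = data
    grad = X.T @ (X @ w - y) / len(y)
    return w - lr * grad

def masked_updates(updates, rng):
    # Pairwise masks: client i adds m[(i, j)] for j > i and subtracts
    # m[(j, i)] for j < i, so every mask cancels in the server-side sum.
    n = len(updates)
    masks = {(i, j): rng.normal(size=updates[0].shape)
             for i in range(n) for j in range(i + 1, n)}
    out = []
    for i, u in enumerate(updates):
        m = (sum(masks[(i, j)] for j in range(i + 1, n))
             - sum(masks[(j, i)] for j in range(i)))
        out.append(u + m)
    return out

rng = np.random.default_rng(1)
w = np.zeros(2)
clients = [(rng.normal(size=(20, 2)), rng.normal(size=20)) for _ in range(3)]
updates = [local_update(w, d) for d in clients]
masked = masked_updates(updates, rng)
# The server only sees masked updates; averaging them recovers the true mean
# because the pairwise masks cancel.
w_new = sum(masked) / len(masked)
```

Each individual `masked[i]` looks like noise to the server, yet `w_new` equals the plain average of the unmasked updates.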