TensorFlow Federated: Local Model Training Without Uploading Training Data

Google has just released the TFF framework, full name TensorFlow Federated. What does it do? Roughly summarized:

Edge devices (e.g., phones) use their local data to train a model locally, then upload only the locally trained model parameters to a server, and the server aggregates the model parameters uploaded by the individual edge devices.

Why is this needed?

The mainstream approach today is to collect all kinds of data on a server and then train a model on the pooled data. With phone data, however, some sensitive information either cannot be collected or is hard to collect, and that can hurt the performance of the final model.

TFF addresses exactly this problem of having to upload sensitive information.

Below is the introduction to TFF.

Original article: https://medium.com/tensorflow/introducing-tensorflow-federated-a4147aa20041?linkId=64497175

Introducing TensorFlow Federated

Posted by Alex Ingerman (Product Manager) and Krzys Ostrowski (Research Scientist)

There are an estimated 3 billion smartphones in the world, and 7 billion connected devices. These phones and devices are constantly generating new data. Traditional analytics and machine learning need that data to be centrally collected before it is processed to yield insights, ML models and ultimately better products. This centralized approach can be problematic if the data is sensitive or expensive to centralize. Wouldn’t it be better if we could run the data analysis and machine learning right on the devices where that data is generated, and still be able to aggregate together what’s been learned?

TensorFlow Federated (TFF) is an open source framework for experimenting with machine learning and other computations on decentralized data. It implements an approach called Federated Learning (FL), which enables many participating clients to train shared ML models, while keeping their data locally. We have designed TFF based on our experiences with developing the federated learning technology at Google, where it powers ML models for mobile keyboard predictions and on-device search. With TFF, we are excited to put a flexible, open framework for locally simulating decentralized computations into the hands of all TensorFlow users.

To illustrate the use of FL and TFF, let’s start with one of the most famous image datasets: MNIST. The original NIST dataset, from which MNIST was created, contains images of 810,000 handwritten digits, collected from 3,600 volunteers — and our task is to build an ML model that will recognize the digits. The traditional way we’d go about it is to apply an ML algorithm to the entire dataset at once. But what if we couldn’t combine all that data together — for example, because the volunteers did not agree to uploading their raw data to a central server?

With TFF, we can express an ML model architecture of our choice, and then train it across data provided by all writers, while keeping each writer’s data separate and local. We show how to do that below with TFF’s Federated Learning (FL) API, using a version of the NIST dataset that has been processed by the Leaf project to separate the digits written by each volunteer.
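A minimal sketch of that flow with the FL API is shown below. It assumes the pre-1.0 TFF simulation APIs (tff.simulation.datasets.emnist.load_data, tff.learning.from_keras_model, tff.learning.build_federated_averaging_process); these names have shifted across releases, so treat this as illustrative rather than canonical:

```python
import collections
import tensorflow as tf
import tensorflow_federated as tff

# Federated EMNIST: the NIST digits, keyed by the volunteer who wrote them.
emnist_train, _ = tff.simulation.datasets.emnist.load_data()

def preprocess(dataset):
  # Flatten each 28x28 image into a 784-vector and batch the examples.
  def to_example(element):
    return collections.OrderedDict(
        x=tf.reshape(element['pixels'], [-1]),
        y=element['label'])
  return dataset.map(to_example).batch(20)

# One client's data, used only to derive the input spec for the model.
example_dataset = preprocess(
    emnist_train.create_tf_dataset_for_client(emnist_train.client_ids[0]))

def model_fn():
  # A plain Keras model, wrapped so TFF can train it on decentralized data.
  keras_model = tf.keras.Sequential([
      tf.keras.layers.Dense(10, activation='softmax', input_shape=(784,)),
  ])
  return tff.learning.from_keras_model(
      keras_model,
      input_spec=example_dataset.element_spec,
      loss=tf.keras.losses.SparseCategoricalCrossentropy(),
      metrics=[tf.keras.metrics.SparseCategoricalAccuracy()])

# Federated Averaging: each round, clients train locally and the server
# aggregates their model updates.
trainer = tff.learning.build_federated_averaging_process(
    model_fn,
    client_optimizer_fn=lambda: tf.keras.optimizers.SGD(learning_rate=0.02))

state = trainer.initialize()
# Simulate one round over a handful of clients' local datasets.
client_data = [
    preprocess(emnist_train.create_tf_dataset_for_client(cid))
    for cid in emnist_train.client_ids[:3]
]
state, metrics = trainer.next(state, client_data)
print(metrics)
```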

You can see the rest in the federated MNIST classification tutorial.

In addition to the FL API, TFF comes with a set of lower-level primitives, which we call the Federated Core (FC) API. This API enables the expression of a broad range of computations over a decentralized dataset. Training an ML model with federated learning is one example of a federated computation; evaluating it over decentralized data is another.

Let’s take a look at the FC API with a simple example. Suppose we have an array of sensors capturing temperature readings, and want to compute the average temperature across these sensors, without uploading their data to a central location. With FC API, we can express a new data type, specifying its underlying data (tf.float32) and where that data lives (on distributed clients).
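The gist embedded in the original post is not reproduced in this copy; in code, that type declaration might look like the following, where tff.FederatedType and the tff.CLIENTS placement literal are the FC API names from the initial release:

```python
import tensorflow as tf
import tensorflow_federated as tff

# A float32 temperature reading whose values live on the distributed clients.
READINGS_TYPE = tff.FederatedType(tf.float32, tff.CLIENTS)
```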


And then specify a federated average function over that type.
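Continuing the sketch, the federated average over that type is a decorated federated computation; tff.federated_mean is the aggregation operator the FC API provides for this:

```python
@tff.federated_computation(READINGS_TYPE)
def get_average_temperature(sensor_readings):
  # Each client contributes its local reading; only the aggregate
  # mean ever reaches the central coordinator.
  return tff.federated_mean(sensor_readings)

# In the local simulation runtime, the computation is invoked like an
# ordinary function over a list of per-client values.
print(get_average_temperature([68.5, 70.3, 69.8]))
```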

After the federated computation is defined, TFF represents it in a form that could be run in a decentralized setting. TFF’s initial release includes a local-machine runtime that simulates the computation being executed across a set of clients holding the data, with each client computing their local contribution, and the centralized coordinator aggregating all the contributions. From the developer’s perspective, though, the federated computation can be seen as an ordinary function that happens to have inputs and outputs residing in different places (on individual clients and in the coordinating service, respectively).

Expressing a simple variant of the Federated Averaging algorithm is also straightforward using TFF’s declarative model:
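The accompanying gist is likewise not reproduced here; below is a hedged reconstruction in that declarative style. MODEL_TYPE, LOCAL_DATA_TYPE, and local_train are placeholders for definitions the full example would carry (a model-weights type, a client-dataset type, and a @tff.tf_computation that runs a few steps of SGD on one client):

```python
import tensorflow as tf
import tensorflow_federated as tff

# Placeholders (assumed defined elsewhere): MODEL_TYPE, LOCAL_DATA_TYPE,
# and local_train, a @tff.tf_computation returning locally updated weights.
@tff.federated_computation(
    tff.FederatedType(MODEL_TYPE, tff.SERVER),
    tff.FederatedType(tf.float32, tff.SERVER),
    tff.FederatedType(LOCAL_DATA_TYPE, tff.CLIENTS))
def federated_train(model, learning_rate, data):
  # Broadcast the server model and step size to every client, train
  # locally on each client's data, then average the results on the server.
  return tff.federated_mean(
      tff.federated_map(local_train, [
          tff.federated_broadcast(model),
          tff.federated_broadcast(learning_rate),
          data,
      ]))
```

The shape of the computation mirrors the algorithm itself: broadcast down, train locally, average up.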

With TensorFlow Federated, we are taking a step towards making the technology accessible to a wider audience, and inviting community participation in developing federated learning research on top of an open, flexible platform. You can try out TFF in your browser, with just a few clicks, by walking through the tutorials. There are many ways to get involved: you can experiment with existing FL algorithms on your models, contribute new federated datasets and models to the TFF repository, add implementations of new FL algorithms, or extend existing ones with new features.

Over time, we’d like TFF runtimes to become available for the major device platforms, and to integrate other technologies that help protect sensitive user data, including differential privacy for federated learning (integrating with TensorFlow Privacy) and secure aggregation. We look forward to developing TFF together with the community, and enabling every developer to use federated technologies.

Ready to get started? Please visit https://www.tensorflow.org/federated/ and try out TFF today!

Acknowledgments

Creating TensorFlow Federated was a team effort. Special thanks to Brendan McMahan, Keith Rush, Michael Reneer, and Zachary Garrett, who all made significant contributions.

You can also follow me on Zhihu and on my WeChat official account.

Zhihu: https://zhuanlan.zhihu.com/albertwang

WeChat official account: AI-Research-Studio


### TensorFlow Federated Versions and Their Dependencies

TensorFlow Federated (TFF) is a framework for federated learning, and its releases come with differing dependency and compatibility requirements. Some key points about TFF versions and their dependencies:

#### Installing a specific version of TensorFlow Federated

To install a particular TFF release, you usually need to pin that version together with compatible dependencies. For example, a bare `pip install tensorflow_federated` may fail because the default latest release does not match your environment[^2]. It is therefore recommended to install a pinned version:

```bash
pip install tensorflow_federated==<version>
```

#### Dependency analysis

The main dependencies of a few common TFF releases (based on the official documentation and community feedback):

1. **TFF v0.8.0**
   - `tensorflow`: must match a stable TensorFlow release, typically `tensorflow>=1.13,<2.0` or later.
   - `numpy`: numerical computing support, recommended `numpy>=1.16`.
   - `six`: Python 2/3 compatibility helpers, requires `six>=1.10`.

2. **TFF v0.15.0**
   - `tensorflow`: recommended `tensorflow>=2.2,<2.3`.
   - `absl-py`: logging and flag parsing, requires `absl-py>=0.7.0`.
   - `grpcio`: gRPC support, requires `grpcio>=1.24.3`.

3. **TFF v0.22.0**
   - `tensorflow`: recommended `tensorflow>=2.4,<2.5`.
   - `protobuf`: protocol buffer support, requires `protobuf>=3.13.0`.
   - `pandas`: data handling, requires `pandas>=1.0.0`.

#### Managing dependencies with setup.py

For more complex development scenarios, you can configure and install all of TFF's required dependencies by hand through a `setup.py` file. This approach lets you adjust dependency versions flexibly to fit your local environment[^3]; a minimal sketch follows the verification script below.

#### Example: verifying the installed dependencies

To confirm that the required dependencies are correctly installed in the current environment, you can run the following script:

```python
import pkg_resources

required_packages = [
    'tensorflow', 'numpy', 'six', 'absl-py',
    'grpcio', 'protobuf', 'pandas',
]

missing_packages = []
for package in required_packages:
    try:
        # Raises DistributionNotFound if the package is absent.
        dist = pkg_resources.get_distribution(package)
        print(f"{package} ({dist.version}) is installed.")
    except pkg_resources.DistributionNotFound:
        missing_packages.append(package)

if missing_packages:
    print("The following packages are not installed:", ", ".join(missing_packages))
else:
    print("All required packages are installed successfully!")
```
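As promised in the setup.py note above, here is a minimal, hypothetical `setup.py` that pins the TFF v0.15.0 dependency set from the list; the project name and version pins are illustrative, not an official manifest:

```python
from setuptools import find_packages, setup

setup(
    name='my-federated-project',  # hypothetical project name
    version='0.1.0',
    packages=find_packages(),
    # Pins matching the TFF v0.15.0 row in the dependency list above.
    install_requires=[
        'tensorflow_federated==0.15.0',
        'tensorflow>=2.2,<2.3',
        'absl-py>=0.7.0',
        'grpcio>=1.24.3',
    ],
)
```

Installing with `pip install -e .` then resolves the pinned dependency set together with your project.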