Paper Reading Notes — D3: Differential Testing of Distributed Deep Learning With Model Generation

The D3 Paper
Prior work focuses on whether an already-trained model exhibits erroneous behavior; this paper instead asks whether the code that builds and trains the model is itself buggy. D3 establishes distributed equivalence rules: 1) distributed vs. non-distributed equivalence (a model trained on a single GPU should be equivalent to the same model trained on multiple GPUs); 2) equivalence across different distributed settings (a model trained with column-wise sharding and one trained with row-wise sharding should produce equivalent outputs for the same inputs).
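The two equivalence rules can be illustrated with a minimal NumPy sketch of a single linear layer. This is an illustrative toy, not the paper's implementation: D3 targets real distributed libraries, and the split/concatenate and split/sum steps below stand in for actual multi-device sharding and all-reduce.

```python
# Minimal sketch of the sharding-equivalence oracle on one linear layer.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))      # batch of inputs
W = rng.normal(size=(8, 6))      # full weight matrix of a linear layer

# Non-distributed reference output (single "GPU").
y_ref = x @ W

# Column-wise sharding: each "device" holds a slice of output columns;
# the per-device results are concatenated.
W_cols = np.hsplit(W, 2)
y_col = np.concatenate([x @ w for w in W_cols], axis=1)

# Row-wise sharding: each "device" holds a slice of W's input rows;
# partial products are summed (an all-reduce in a real system).
W_rows = np.vsplit(W, 2)
x_parts = np.hsplit(x, 2)
y_row = sum(xp @ wp for xp, wp in zip(x_parts, W_rows))

# The equivalence rules require all three to agree up to numerical tolerance.
assert np.allclose(y_ref, y_col) and np.allclose(y_ref, y_row)
```

Any divergence beyond floating-point tolerance between these three outputs signals a potential bug in the distributed implementation rather than in the model itself.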

Distributed parameters:

  • World size: the number of processors in the distributed cluster; for GPU training, this is the number of GPUs.
  • Sharding type: how the model is sharded, e.g., column-wise or row-wise sharding.
  • Device: the hardware in use, e.g., GPU or CPU.
  • Weight quantization: how model weights are quantized, e.g., int8.
  • Activation quantization: how activations are quantized, e.g., float16.
  • Sharder type: the concrete tool or method used for sharding, e.g., EmbeddingBagSharder.

By systematically generating and combining candidate values for these distributed parameters, D3 produces a large number of diverse distributed settings. These settings are used to test distributed deep-learning libraries, checking that model outputs remain equivalent across settings.
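The combination step can be sketched as a Cartesian product over candidate values. The parameter names and example values below follow the list above; the exact candidate sets are illustrative assumptions, not the paper's configuration.

```python
# Hypothetical sketch: enumerate distributed settings by combining
# candidate values of each distributed parameter.
from itertools import product

candidates = {
    "world_size": [1, 2, 4],
    "sharding_type": ["column_wise", "row_wise"],
    "device": ["cpu", "gpu"],
    "weight_quantization": [None, "int8"],
    "activation_quantization": [None, "float16"],
    "sharder_type": ["EmbeddingBagSharder"],
}

# Each setting is one assignment of a candidate value to every parameter.
settings = [dict(zip(candidates, combo)) for combo in product(*candidates.values())]
print(len(settings))  # 3 * 2 * 2 * 2 * 2 * 1 = 48 distinct settings
```

In practice, not every combination is valid (e.g., some quantization modes may only apply on GPU), so a real generator would also filter out incompatible settings.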
| Type | Method |
| --- | --- |
| DLRM-like structures | Mutate the sparse components (Embedding), dense components (linear layers), and interaction components (mean, sigmoid) |
| Other structures | Use Muffin to generate chain-structured and cell-based models |
| Layer-freezing models | Randomly freeze one layer of the model (`trainable=False` or `requires_grad=False`) and test the frozen model to detect inconsistencies introduced by conversion |
| Model inputs | Generate inputs according to the parameters of the model's input layer |
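The layer-freezing mutation in the table can be sketched in a few lines of framework-agnostic Python. The dict-based layer representation and the helper name `freeze_random_layer` are illustrative assumptions; in PyTorch the flag would be set via `requires_grad=False` on the layer's parameters.

```python
# Illustrative sketch of the layer-freezing mutation: pick one layer at
# random and mark it non-trainable.
import random

def freeze_random_layer(layers, seed=None):
    """layers: list of dicts like {"name": ..., "requires_grad": True}."""
    rng = random.Random(seed)
    frozen = rng.randrange(len(layers))
    for i, layer in enumerate(layers):
        layer["requires_grad"] = (i != frozen)
    return frozen

model = [{"name": n, "requires_grad": True} for n in ("embedding", "linear", "interaction")]
idx = freeze_random_layer(model, seed=0)

# Exactly one layer is now frozen; training the mutated model under
# different distributed settings should still yield equivalent outputs.
assert sum(not layer["requires_grad"] for layer in model) == 1
```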
Physics-informed DeepONets represent a novel approach in the intersection of machine learning and computational physics, particularly for solving parametric partial differential equations (PDEs) by learning their solution operators. These networks integrate physical laws and constraints directly into the training process, thereby enhancing the model's ability to generalize and maintain accuracy across different parameter configurations. DeepONets, or Deep Operator Networks, are designed to approximate nonlinear operators, which are mappings between infinite-dimensional spaces. In the context of PDEs, these operators map input parameters (such as coefficients, boundary conditions, or source terms) to the corresponding solutions of the PDEs. By embedding the physics of the problem into the loss function, physics-informed DeepONets ensure that the learned solutions adhere to the underlying physical principles, even when limited data is available for training.

The architecture of a physics-informed DeepONet typically consists of two main components: a branch net and a trunk net. The branch net processes the input parameters, while the trunk net handles the spatial or temporal coordinates where the solution is to be predicted. The outputs of these two networks are combined through an inner product to produce the final output, which represents the solution of the PDE at specific points in the domain.

Training a physics-informed DeepONet involves minimizing a loss function that includes both data fidelity terms and terms that enforce the satisfaction of the PDE and its boundary/initial conditions. This is achieved by computing the residuals of the PDE at collocation points using automatic differentiation, a process that does not require explicit knowledge of the solution form. Consequently, the network can be trained without the need for large datasets of precomputed solutions, making it a powerful tool for scenarios where data collection is expensive or time-consuming.

One of the key advantages of this method is its ability to handle high-dimensional problems efficiently, as the complexity of the operator approximation scales more favorably with dimensionality compared to traditional discretization-based methods. Moreover, once trained, the DeepONet can rapidly evaluate the solution operator for new parameter values, enabling real-time simulation and uncertainty quantification.

```python
# Example of defining a simple DeepONet structure using PyTorch
import torch
import torch.nn as nn

class DeepONet(nn.Module):
    def __init__(self, branch_layers, trunk_layers):
        super(DeepONet, self).__init__()
        self.branch = self._build_network(branch_layers)
        self.trunk = self._build_network(trunk_layers)

    def _build_network(self, layers):
        network = []
        for i in range(len(layers) - 1):
            network.append(nn.Linear(layers[i], layers[i + 1]))
            if i < len(layers) - 2:
                network.append(nn.ReLU())
        return nn.Sequential(*network)

    def forward(self, branch_input, trunk_input):
        b = self.branch(branch_input)  # (batch, p): basis coefficients per input function
        t = self.trunk(trunk_input)    # (points, p): basis functions per query coordinate
        # Inner product over the latent dimension p.
        return torch.einsum('bi,ti->bt', b, t)

# Instantiation of a DeepONet with specified layer sizes
model = DeepONet(branch_layers=[10, 40, 40, 40], trunk_layers=[2, 40, 40, 40])
```

The application of physics-informed DeepONets extends beyond academic interest; they have practical implications in fields such as fluid dynamics, solid mechanics, and financial modeling, where parametric PDEs are prevalent. By leveraging the strengths of deep learning and incorporating physical knowledge, these networks offer a promising avenue for addressing complex, real-world problems that were previously computationally prohibitive.
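To make the branch/trunk inner-product readout concrete without requiring PyTorch, here is a NumPy sketch of the same combination step. The shapes (5 input functions, 7 query points, latent dimension 40) are illustrative assumptions.

```python
# NumPy sketch of the DeepONet readout: combine branch and trunk
# outputs via an inner product over the latent (basis) dimension.
import numpy as np

rng = np.random.default_rng(0)
p = 40                                # latent (basis) dimension
branch_out = rng.normal(size=(5, p))  # 5 input functions -> basis coefficients
trunk_out = rng.normal(size=(7, p))   # 7 query coordinates -> basis functions

# Solution of each input function evaluated at each query point;
# equivalent to einsum('bi,ti->bt', branch_out, trunk_out).
u = branch_out @ trunk_out.T
assert u.shape == (5, 7)
```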