Data parallelism for deep-learning PDE solvers with torch.nn.DataParallel

This post solves the Poisson equation with the PFNN algorithm, discusses strategies for coping with high-dimensional problems, and shows how to use torch.nn.DataParallel for data-parallel training. It also walks through a data-parallel example on the MNIST dataset, the problems that come up along the way, and their solutions, such as using deepspeed for more efficient training.


The Poisson equation

$$
\left\{\begin{aligned}
-\Delta u &= f, && \text{in } \Omega,\\
u &= g, && \text{on } \partial\Omega,
\end{aligned}\right.
$$
Here we take $\Omega = [-1,1]^d$, where $d$ is the dimension of the domain, and solve this problem with the PFNN algorithm.
Here we use the deep Ritz approach; the resulting variational problem is
$$
\left\{\begin{aligned}
&\min_{u}\ \frac{1}{2}\int_{\Omega} |\nabla u|^2 - 2fu \,d\Omega,\\
&\text{s.t. }\ u = g, \text{ on } \partial\Omega.
\end{aligned}\right.
$$
For details, see the paper
PFNN: A penalty-free neural network method for solving a class of second-order boundary-value problems on complex geometries
The idea of PFNN is to combine two networks, netg and netf, with a length factor $\ell$ to form the approximate solution. The length factor must vanish on the boundary and be strictly positive in the interior:
$$
\left\{\begin{array}{ll}
\ell(\mathbf{x}) > 0, & \mathbf{x} \in \Omega,\\
\ell(\mathbf{x}) = 0, & \mathbf{x} \in \partial\Omega.
\end{array}\right.
$$
The approximate solution is $u = netg + \ell \times netf$. First, netg is trained on the boundary data with a mean-squared-error loss, so that afterwards $netg(\mathbf{x}) = g$ for $\mathbf{x} \in \partial\Omega$; then the second network, netf, is trained.
Note that once netg is trained, $u = netg + \ell \times netf$ automatically satisfies the boundary condition, so training netf only requires minimizing $\frac{1}{2}\int_{\Omega} |\nabla u|^2 - 2fu \,d\Omega$. The idea is very simple.
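A minimal sketch of this two-stage scheme, treating netg, netf and the length factor lenth as callables (the names mirror the listings below; the Monte-Carlo mean stands in for the integral up to a constant volume factor):

import torch

def pfnn_u(netg, netf, lenth, x):
    # PFNN ansatz: u = netg + l*netf; since l = 0 on the boundary,
    # u = g there as soon as netg fits the boundary data
    return netg(x) + lenth(x)*netf(x)

def stage1_loss(netg, x_bd, g_bd):
    # stage 1: mean-squared boundary error, trains netg only
    return ((netg(x_bd) - g_bd)**2).mean()

def stage2_loss(netg, netf, lenth, x_in, f_in):
    # stage 2: Ritz energy 0.5*mean(|grad u|^2 - 2*f*u), trains netf
    x_in.requires_grad_(True)
    u = pfnn_u(netg, netf, lenth, x_in)
    ux, = torch.autograd.grad(u, x_in, grad_outputs=torch.ones_like(u),
                              create_graph=True)
    return 0.5*((ux**2).sum(1) - 2*f_in.squeeze(-1)*u.squeeze(-1)).mean()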

PFNN code

An important caveat: PFNN works readily on low-dimensional problems, but high-dimensional ones require enlarging the network and retuning the learning rate and other hyperparameters. Most importantly, high-dimensional problems need repeated resampling during training to avoid overfitting.
Take $d = 10$ as an example. If we draw 1000 sample points in the interior, spread across the dimensions that is at most 2 points per dimension, since $2^{10} = 1024$; the sample is far too sparse. If instead we started from 20 points per dimension, we would need $20^{10}$ samples, which is far too many and would make training extremely slow. The practical strategy is therefore to redraw the points every so many training steps, and likewise for the boundary points; this is exactly the role of the reset (here, quasi_samples) methods in INSET and BDSET, sketched below.
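A minimal sketch of this periodic resampling, assuming a Sobol sampler as in the listings below:

import torch
from scipy.stats import qmc

def resample_interior(dim, n, bound, dtype=torch.float32):
    # draw a fresh quasi-random point set, rescaled to the box `bound`
    sample = qmc.Sobol(d=dim).random(n=n)
    for i in range(dim):
        sample[:, i] = sample[:, i]*(bound[i, 1] - bound[i, 0]) + bound[i, 0]
    return torch.tensor(sample, dtype=dtype)

# in the training loop, refresh the sample set every `reset_every` epochs:
# if epoch % reset_every == 0:
#     inset.X = resample_interior(dim, size_tr, bound)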

Note that by the time the loss loss_in is formed, netg has already been trained, so in theory netg.forward(bdset.x) = g for $bdset.x \in \partial\Omega$. To speed up the training of netf we therefore differentiate term by term with the product rule. In loss_in, inset.L is the length factor $\ell(inset.X)$, inset.Lx is its derivative, inset.G is the value of netg at the interior points inset.X, and inset.Gx is its derivative. By the product rule,
$$
u_x = netg_x + \ell_x \times netf + \ell \times netf_x .
$$
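In code this means that, since netg and $\ell$ stay fixed while netf trains, their values and first derivatives at the sample points can be computed once per point set and detached from the graph; a minimal sketch of that precomputation (mirroring the Trainf routines below):

import torch

def precompute_fixed_terms(netg, lenth, x):
    # netg and the length factor are frozen while netf trains, so their
    # values and gradients at the sample points are computed once and detached
    x.requires_grad_(True)
    L = lenth.forward(x)
    Lx, = torch.autograd.grad(L, x, grad_outputs=torch.ones_like(L),
                              create_graph=True)
    G = netg.forward(x)
    Gx, = torch.autograd.grad(G, x, grad_outputs=torch.ones_like(G),
                              create_graph=True)
    return L.detach(), Lx.detach(), G.detach(), Gx.detach()

# per training step only netf needs autograd:
#   F = netf(x); Fx = grad(F, x)
#   ux = Fx*L + F*Lx + Gx   (product rule, with netg_x folded into Gx)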

Data parallelism

This is done mainly with torch.nn.DataParallel. It only pays off in practice with multiple GPUs, although the code also runs without any GPU, in which case it simply executes serially. The core step is that, after defining the network, you call model = nn.DataParallel(model) and the rest is handled automatically.
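The canonical pattern is only a few lines; a minimal sketch (nn.DataParallel replicates the module on each visible GPU, splits the input batch along dimension 0, and gathers the outputs back on the first device):

import torch
import torch.nn as nn

model = nn.Linear(784, 10)  # any nn.Module
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)  # replicate across all visible GPUs
out = model(torch.randn(64, 784, device=device))  # batch is split across GPUs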
Let's start with a data-parallel example on the MNIST dataset.
Before running, it is best to pin the number of visible GPUs explicitly; some code does not cope well with too many GPUs:
export CUDA_VISIBLE_DEVICES="0,1"
python mnist.py

mnist.py

import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
import time
# Define the neural network model
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc = nn.Linear(784, 10)
    
    def forward(self, x):
        x = x.view(-1, 784)
        x = self.fc(x)
        return x

# Training function
def train(device):
    # Load the training data
    train_dataset = torchvision.datasets.MNIST(root='./data', train=True, download=True,
                                               transform=transforms.ToTensor())
    train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=64, shuffle=True)

    # Create the model and optimizer
    model = Net().to(device)
    model = nn.DataParallel(model)  # wrap the model for data parallelism
    optimizer = optim.SGD(model.parameters(), lr=0.01)

    # Train the model
    for epoch in range(5):
        for i, (inputs, labels) in enumerate(train_loader):
            inputs = inputs.to(device)
            labels = labels.to(device)

            optimizer.zero_grad()
            outputs = model(inputs)
            loss = nn.CrossEntropyLoss()(outputs, labels)
            loss.backward()
            optimizer.step()
            if i % 100 == 0:
                print(f"Epoch: {epoch}, Batch: {i}, Loss: {loss.item()}")

if __name__ == "__main__":
    st = time.time()
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    train(device)
    ela = time.time() - st
    print('GPUs:%d,time:%.2f\n'%(torch.cuda.device_count(),ela))
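Note that DataParallel splits each batch along dimension 0 across the visible GPUs and gathers the outputs on the first device, so for a model as small as this the scatter/gather overhead can outweigh the speedup; comparing the printed times under different CUDA_VISIBLE_DEVICES settings makes this easy to check.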

PFNN with data slicing, without data-parallel training

In this version the training data is merely sliced into batches and fed to the optimizer piece by piece; everything stays on a single GPU, no additional GPUs are used, and no data-parallel training is done.

This code produces correct results.

import torch
from torch.utils.data import Dataset,DataLoader
import time
import numpy as np
import torch.nn as nn
from scipy.stats import qmc

def UU(X, order,prob):
    if prob==1:
        temp = 10*(X[:,0]+X[:,1])**2 + (X[:,0]-X[:,1])**2 + 0.5
        if order[0]==0 and order[1]==0:
            return torch.log(temp)
        if order[0]==1 and order[1]==0:
            return temp**(-1) * (20*(X[:,0]+X[:,1]) + 2*(X[:,0]-X[:,1]))
        if order[0]==0 and order[1]==1:
            return temp**(-1) * (20*(X[:,0]+X[:,1]) - 2*(X[:,0]-X[:,1]))
        if order[0]==2 and order[1]==0:
            return - temp**(-2) * (20*(X[:,0]+X[:,1])+2*(X[:,0]-X[:,1])) ** 2 \
                   + temp**(-1) * (22)
        if order[0]==1 and order[1]==1:
            return - temp**(-2) * (20*(X[:,0]+X[:,1])+2*(X[:,0]-X[:,1])) \
                   * (20*(X[:,0]+X[:,1])-2*(X[:,0]-X[:,1])) \
                   + temp**(-1) * (18)
        if order[0]==0 and order[1]==2:
            return - temp**(-2) * (20*(X[:,0]+X[:,1])-2*(X[:,0]-X[:,1])) ** 2 \
                   + temp**(-1) * (22)
    if prob==2:
        temp1 = X[:,0]*X[:,0] - X[:,1]*X[:,1]
        temp2 = X[:,0]*X[:,0] + X[:,1]*X[:,1] + 0.1
        if order[0]==0 and order[1]==0:
            return temp1 * temp2**(-1)
        if order[0]==1 and order[1]==0:
            return (2*X[:,0]) * temp2**(-1) + \
                   temp1 * (-1)*temp2**(-2) * (2*X[:,0])
        if order[0]==0 and order[1]==1:
            return (-2*X[:,1]) * temp2**(-1) + \
                   temp1 * (-1)*temp2**(-2) * (2*X[:,1])
        if order[0]==2 and order[1]==0:
            return (2) * temp2**(-1) + \
                   2 * (2*X[:,0]) * (-1)*temp2**(-2) * (2*X[:,0]) + \
                   temp1 * (2)*temp2**(-3) * (2*X[:,0])**2 + \
                   temp1 * (-1)*temp2**(-2) * (2)
        if order[0]==1 and order[1]==1:
            return (2*X[:,0]) * (-1)*temp2**(-2) * (2*X[:,1]) + \
                   (-2*X[:,1]) * (-1)*temp2**(-2) * (2*X[:,0]) + \
                   temp1 * (2)*temp2**(-3) * (2*X[:,0]) * (2*X[:,1])
        if order[0]==0 and order[1]==2:
            return (-2) * temp2**(-1) + \
                   2 * (-2*X[:,1]) * (-1)*temp2**(-2) * (2*X[:,1]) + \
                   temp1 * (2)*temp2**(-3) * (2*X[:,1])**2 + \
                   temp1 * (-1)*temp2**(-2) * (2)
    if prob==3:
        temp = torch.exp(-4*X[:,1]*X[:,1])
        if order[0]==0 and order[1]==0:
            ind = (X[:,0]<=0).float()
            return ind * ((X[:,0]+1)**4-1) * temp + \
                   (1-ind) * (-(-X[:,0]+1)**4+1) * temp
        if order[0]==1 and order[1]==0:
            ind = (X[:,0]<=0).float()
            return ind * (4*(X[:,0]+1)**3) * temp + \
                   (1-ind) * (4*(-X[:,0]+1)**3) * temp
        if order[0]==0 and order[1]==1:
            ind = (X[:,0]<=0).float()
            return ind * ((X[:,0]+1)**4-1) * (temp*(-8*X[:,1])) + \
                   (1-ind) * (-(-X[:,0]+1)**4+1) * (temp*(-8*X[:,1]))
        if order[0]==2 and order[1]==0:
            ind = (X[:,0]<=0).float()
            return ind * (12*(X[:,0]+1)**2) * temp + \
                   (1-ind) * (-12*(-X[:,0]+1)**2) * temp
        if order[0]==1 and order[1]==1:
            ind = (X[:,0]<=0).float()
            return ind * (4*(X[:,0]+1)**3) * (temp*(-8*X[:,1])) + \
                   (1-ind) * (4*(-X[:,0]+1)**3) * (temp*(-8*X[:,1]))
        if order[0]==0 and order[1]==2:
            ind = (X[:,0]<=0).float()
            return ind * ((X[:,0]+1)**4-1) * (temp*(64*X[:,1]*X[:,1]-8)) + \
                   (1-ind) * (-(-X[:,0]+1)**4+1) * (temp*(64*X[:,1]*X[:,1]-8))

def FF(X,prob):
    return -UU(X,[2,0],prob) - UU(X,[0,2],prob)

class INSETdata(Dataset):
    def __init__(self,size_tr,bound,dtype,prob):
        self.dim = bound.shape[0]
        self.size_tr = size_tr
        self.bound = bound
        self.dtype = dtype
        self.prob = prob
        
        self.quasi_samples()
        
    def quasi_samples(self):
        sampler = qmc.Sobol(d=self.dim)
        sample = sampler.random(n=self.size_tr)
        
        for i in range(self.dim):
            sample[:,i] = sample[:,i]*(self.bound[i,1] - self.bound[i,0]) + self.bound[i,0]
        
        self.X = torch.tensor(sample,dtype=self.dtype)
        
        self.u_acc = UU(self.X,[0,0],self.prob).data.reshape(-1,1)
        self.ff = FF(self.X,self.prob).data.reshape(-1,1)
        
    def __len__(self):
        return self.size_tr
    def __getitem__(self,index):
        xx = self.X[index]
        ff = self.ff[index]
        acc = self.u_acc[index]
        return xx,ff,acc
class BDSETdata(Dataset):
    def __init__(self,bound,nx,dtype,prob):
        self.dim = bound.shape[0]
        self.hx = [(bound[0,1] - bound[0,0])/nx[0],(bound[1,1] - bound[1,0])/nx[1]]
        self.size = 2*(nx[0] + nx[1])
        self.X = torch.zeros(self.size,self.dim)  # stores the boundary points
        m = 0
        for i in range(nx[0]):
            self.X[m,0] = bound[0,0] + (i + 0.5)*self.hx[0]
            self.X[m,1] = bound[1,0] 
            m = m + 1
        for j in range(nx[1]):
            self.X[m,0] = bound[0,1]
            self.X[m,1] = bound[1,0] + (j + 0.5)*self.hx[1]
            m = m + 1
        for i in range(nx[0]):
            self.X[m,0] = bound[0,0] + (i + 0.5)*self.hx[0]
            self.X[m,1] = bound[1,1] 
            m = m + 1
        for j in range(nx[1]):
            self.X[m,0] = bound[0,0]
            self.X[m,1] = bound[1,0] + (j + 0.5)*self.hx[1]
            m = m + 1
        self.X = self.X.type(dtype)
        self.u_acc = UU(self.X,[0,0],prob).reshape(-1,1)
    def __len__(self):
        return self.size
    def __getitem__(self,index):
        xx = self.X[index]
        acc = self.u_acc[index]
        return xx,acc
class LEN():
    def __init__(self,bound):
        self.bound = bound
        self.dim = bound.shape[0]
        self.hx = bound[:,1] - bound[:,0]
        self.mu = self.dim
    def forward(self,X):
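        # l(x) = prod_i (1-(1-t1)^mu)*(1-(1-t2)^mu): a hand-built length factor
        # that vanishes on every face of the box and is positive in the interior
        # (t1, t2 are the normalized distances to the two faces in dimension i)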
        L = 1.0
        for i in range(self.dim):
            tmp1 = (X[:,i] - self.bound[i,0])/self.hx[i]
            tmp2 = (self.bound[i,1] - X[:,i])/self.hx[i]
            L = L*(1 - (1 - tmp1)**self.mu)*(1 - (1 - tmp2)**self.mu)
        return L.reshape(-1,1)
def L1_error(u_pred,u_acc):
    # despite the name, this returns the maximum pointwise (L-infinity) error
    return max(abs(u_pred - u_acc))
np.random.seed(1234)
torch.manual_seed(1234)
class Net(torch.nn.Module):
    def __init__(self, layers, dtype):
        super(Net, self).__init__()
        self.dtype = dtype
        self.layers = layers
        self.layers_hid_num = len(self.layers)-2
        fc = []
        for i in range(self.layers_hid_num+1):
            fc.append(torch.nn.Linear(self.layers[i],self.layers[i+1]))
        self.fc = torch.nn.Sequential(*fc)
        for i in range(self.layers_hid_num+1):
            self.fc[i].weight.data = self.fc[i].weight.data.type(dtype)
            self.fc[i].bias.data = self.fc[i].bias.data.type(dtype)
    def forward(self, x):
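        # residual-style blocks: sin activation plus a skip connection,
        # with torch.eye padding the input width up to the layer width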
        dev = x.device
        
        for i in range(self.layers_hid_num):
            h = torch.sin(self.fc[i](x))
            temp = torch.eye(x.shape[-1],self.layers[i+1],dtype = self.dtype,device = dev)
            x = h + x@temp
        return self.fc[-1](x) 
    def total_para(self):  # count the number of parameters
        return sum([x.numel() for x in self.parameters()])
def pred_u(netg,netf,lenth,X):
    return netf.forward(X)*lenth.forward(X) + netg.forward(X)
def Loss_bd(netg,x,acc):
    res_b = (netg.forward(x) - acc)**2
    return torch.sqrt(res_b + 1e-9).mean()
def Loss_in(netf,x,ff,L,Lx,G,Gx):
    insetF = netf.forward(x)
    insetFx, = torch.autograd.grad(insetF,x,create_graph = True,retain_graph = True,
                                  grad_outputs = torch.ones_like(x[:,0:1]).to(x.device))
    u = insetF*L + G
    ux = insetFx*L + insetF*Lx + Gx
    res_i = 0.5*((ux**2).sum(1).reshape(-1,1) - 2*ff*u).mean() 
    return res_i
def Traing(device,netg,bdset,batch,optimg,epochg,optimtype):
    model = netg.to(device)
    
    
    for epoch in range(epochg):
        if optimtype == 'LBFGS' or optimtype == 'BFGS':
            for i,(xx,acc) in enumerate(bdset):
                xx = xx.to(device)
                acc = acc.to(device)
            
                def closure():
                    optimg.zero_grad()
                    loss = Loss_bd(model,xx,acc)
                    loss.backward()
                    return loss
                optimg.step(closure) 
        else:
            for i,(xx,acc) in enumerate(bdset):
                xx = xx.to(device)
                acc = acc.to(device)
                
                optimg.zero_grad()
                loss = Loss_bd(model,xx,acc)
                loss.backward()
                optimg.step()
        if epoch%100 == 0:
            print('GPUs:%d,epoch:%d,loss at bound:%.3e\n'%(torch.cuda.device_count(),epoch,loss.item()))
def Trainf(device,netg,netf,lenth,inset,optimf,epochf,optimtype):
    model = netf.to(device)
    
    
    for epoch in range(epochf):
        if optimtype == 'LBFGS' or optimtype == 'BFGS':
            for i,(x,ff,acc) in enumerate(inset):
                x = x.to(device)
                ff = ff.to(device)
                acc = acc.to(device)
                if x.requires_grad == False:
                    x.requires_grad = True
                L = lenth.forward(x)
                G = netg.forward(x)
                Lx, = torch.autograd.grad(L,x,create_graph = True,retain_graph = True,
                                  grad_outputs = torch.ones_like(x[:,0:1]).to(x.device))
                Gx, = torch.autograd.grad(G,x,create_graph = True,retain_graph = True,
                                  grad_outputs = torch.ones_like(x[:,0:1]).to(x.device))
                L = L.data;Lx = Lx.data;G = G.data;Gx = Gx.data                  
                def closure():
                    optimf.zero_grad()
                    loss = Loss_in(model,x,ff,L,Lx,G,Gx)
                    loss.backward()
                    return loss
                optimf.step(closure) 
        else:
            for i,(x,ff,acc) in enumerate(inset):
                x = x.to(device)
                ff = ff.to(device)
                acc = acc.to(device)
                if x.requires_grad == False:
                    x.requires_grad = True
                L = lenth.forward(x)
                G = netg.forward(x)
                Lx, = torch.autograd.grad(L,x,create_graph = True,retain_graph = True,
                                  grad_outputs = torch.ones_like(x[:,0:1]).to(x.device))
                Gx, = torch.autograd.grad(G,x,create_graph = True,retain_graph = True,
                                  grad_outputs = torch.ones_like(x[:,0:1]).to(x.device))
                L = L.data;Lx = Lx.data;G = G.data;Gx = Gx.data
                
                optimf.zero_grad()
                loss = Loss_in(model,x,ff,L,Lx,G,Gx)
                loss.backward()
                optimf.step()
        if epoch%100 == 0:
            u_pre = pred_u(netg,netf,lenth,x)
            err = L1_error(acc,u_pre)
            print('GPUs:%d,epoch:%d,loss inside:%.3e,err:%.3E\n'%(torch.cuda.device_count(),epoch,loss.item(),err))
dtype = torch.float32
size_tr = 1000
prob = 1
nx = [50,50]
lay_wid = 15
bound = np.array([-1,1,-1,1]).reshape(2,2)
batch_in = 100
batch_bd = nx[0]
train_inset = INSETdata(size_tr,bound,dtype,prob)
train_bdset = BDSETdata(bound,nx,dtype,prob)
epochg = 3000
epochf = 2000
lr = 1e-3
inset = DataLoader(INSETdata(size_tr,bound,dtype,prob),batch_size = batch_in,shuffle = True)
bdset = DataLoader(BDSETdata(bound,nx,dtype,prob),batch_size = nx[0],shuffle = True)

lenth = LEN(bound)

layerg = [2,lay_wid,lay_wid,lay_wid,1];netg = Net(layerg,dtype)
layerf = [2,lay_wid,lay_wid,lay_wid,1];netf = Net(layerf,dtype)
#optimtype = 'LBFGS'
optimtype = 'adam'

if optimtype == 'LBFGS':
    optimg = torch.optim.LBFGS(netg.parameters(),lr=lr, max_iter=100,
                      tolerance_grad=1e-16, tolerance_change=1e-16,
                      line_search_fn='strong_wolfe')
    optimf = torch.optim.LBFGS(netf.parameters(),lr=lr, max_iter=100,
                      tolerance_grad=1e-16, tolerance_change=1e-16,
                      line_search_fn='strong_wolfe')                  
else:
    optimg = torch.optim.Adam(netg.parameters(),lr=lr,betas=(0.9, 0.999), eps=1e-08, weight_decay=0)
    optimf = torch.optim.Adam(netf.parameters(),lr=lr,betas=(0.9, 0.999), eps=1e-08, weight_decay=0) 

st = time.time()
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

Traing(device,netg,bdset,batch_bd,optimg,epochg,optimtype)

Trainf(device,netg,netf,lenth,inset,optimf,epochf,optimtype)
netg = netg.to('cpu')
netf = netf.to('cpu')
u_acc = train_inset.u_acc
u_pre = pred_u(netg,netf,lenth,train_inset.X)
err = L1_error(u_acc,u_pre)
print('error:%.3e\n'%(err))
ela = time.time() - st
print('finish, use time:%.2f\n'%(ela))


PFNN + DataParallel

Two versions are shown here. The first trains directly on bdset without slicing the data; the second trains through a DataLoader, slicing the data into batches.

code:bdset

import torch
import time
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm
import torch.nn as nn
from scipy.stats import qmc
import bfgs
import argparse
import os
device = 'cpu'
def UU(X,prob):
    tmp = 0
    if prob == 1:
        for i in range(X.shape[1]):
            tmp += X[:,i]**2
        return tmp.reshape(-1,1)
    if prob == 2:
        for i in range(X.shape[1]):
            tmp += torch.cos(np.pi*X[:,i])
        return tmp.reshape(-1,1)
def FF(X,prob):
    tmp = 0
    if prob == 1:
        return -2*X.shape[1]*torch.ones(X.shape[0],1)
    if prob == 2:
        for i in range(X.shape[1]):
            tmp += np.pi*np.pi*torch.cos(np.pi*X[:,i])
            #tmp += np.pi**torch.sin(np.pi*X[:,i])
        return -tmp.reshape(-1,1)
class INSET():
    def __init__(self,size_tr,bound,dtype,prob,device):
        self.dim = bound.shape[0]
        self.size_tr = size_tr
        self.bound = bound
        self.dtype = dtype
        self.prob = prob
        self.device = device
        self.quasi_samples()
        
    def quasi_samples(self):
        sampler = qmc.Sobol(d=self.dim)
        sample = sampler.random(n=self.size_tr)
        
        for i in range(self.dim):
            sample[:,i] = sample[:,i]*(self.bound[i,1] - self.bound[i,0]) + self.bound[i,0]
        
        self.X = torch.tensor(sample,dtype=self.dtype).to(self.device)
        self.X.requires_grad = True
        self.u_acc = UU(self.X,self.prob)
        self.ff = FF(self.X,self.prob)
        self.u_acc = self.u_acc.to(self.device).data
        self.ff = self.ff.to(self.device).data
        self.weight_grad = torch.ones_like(self.X[:,0:1]).to(self.device)
    
class BDSET():
    def __init__(self,bound,dtype,prob,device):
        self.dim = bound.shape[0]
        
        self.bound = bound
        self.dtype = dtype
        self.prob = prob
        self.device = device
        self.rock = 600
        self.ro = 200
        self.size = self.rock*self.dim
        self.quasi_samples()
    def quasi_samples(self):
        sampler = qmc.Sobol(d=self.dim)
        tmp = sampler.random(n=self.size)
        
        for i in range(self.dim):
            tmp[:,i] = tmp[:,i]*(self.bound[i,1] - self.bound[i,0]) + self.bound[i,0]
        for i in range(self.dim):
            tmp[i*self.rock:i*self.rock + self.ro,i] = self.bound[i,0]
            tmp[i*self.rock + self.ro:(i + 1)*self.rock,i] = self.bound[i,1]
        self.X = torch.tensor(tmp).type(self.dtype).to(self.device)
        self.u_acc = UU(self.X,self.prob)
        self.u_acc = self.u_acc.to(self.device)
np.random.seed(1234)
torch.manual_seed(1234)
class Net(torch.nn.Module):
    def __init__(self, layers, dtype):
        super(Net, self).__init__()
        self.dtype = dtype
        self.layers = layers
        self.layers_hid_num = len(self.layers)-2
        fc = []
        for i in range(self.layers_hid_num+1):
            fc.append(torch.nn.Linear(self.layers[i],self.layers[i+1]))
        self.fc = torch.nn.Sequential(*fc)
        for i in range(self.layers_hid_num+1):
            self.fc[i].weight.data = self.fc[i].weight.data.type(dtype)
            self.fc[i].bias.data = self.fc[i].bias.data.type(dtype)
    def forward(self, x):
        dev = x.device
        
        for i in range(self.layers_hid_num):
            h = torch.sin(self.fc[i](x))
            temp = torch.eye(x.shape[-1],self.layers[i+1],dtype = self.dtype,device = dev)
            x = h + x@temp
        return self.fc[-1](x) 
    def total_para(self):  # count the number of parameters
        return sum([x.numel() for x in self.parameters()])

class LEN():
    def __init__(self,bound):
        self.bound = bound
        self.dim = bound.shape[0]
        self.hx = bound[:,1] - bound[:,0]
        self.mu = self.dim
    def forward(self,X):
        L = 1.0
        for i in range(self.dim):
            tmp1 = (X[:,i] - self.bound[i,0])/self.hx[i]
            tmp2 = (self.bound[i,1] - X[:,i])/self.hx[i]
            L = L*(1 - (1 - tmp1)**self.mu)*(1 - (1 - tmp2)**self.mu)
        return L.reshape(-1,1)
def L1_error(u_pred,u_acc):
    u_pred = u_pred.to(device)
    u_acc = u_acc.to(device)
    return max(abs(u_pred - u_acc))
def pred_u(netg,netf,lenth,X):
    return netf.forward(X)*lenth.forward(X) + netg.forward(X)
def Loss_bd(netg,bdset):
    bdset_cuda(bdset)
    bdset.res_u = (netg.forward(bdset.X) - bdset.u_acc)**2
    return torch.sqrt(bdset.res_u.mean())
def Loss_in(netf,inset):
    
    if inset.X.requires_grad == False:
        inset.X.requires_grad = True
    insetF = netf.forward(inset.X)
    insetFx, = torch.autograd.grad(insetF,inset.X,create_graph = True,retain_graph = True,
                                  grad_outputs = inset.weight_grad)
    #print(insetF.device,inset.L.device,inset.G.device)                          
    inset.u = insetF*inset.L + inset.G
    inset.ux = insetFx*inset.L + insetF*inset.Lx + inset.Gx                            
   
    inset.res_u = 0.5*((inset.ux**2).sum(1).reshape(-1,1) - 2*inset.ff*inset.u).mean() 
                      
    return inset.res_u
def Traing(netg,bdset,optimtype,optimg,epochg):
    print('train neural network g')
    t0 = time.time()
    lossoptimal = Loss_bd(netg,bdset)
    for it in range(epochg):
        if it%20 == 0:
            bdset.quasi_samples()
            bdset_cuda(bdset)
        st = time.time()
        if optimtype == 'LBFGS' or optimtype == 'BFGS':
            def closure():
                optimg.zero_grad()
                loss = Loss_bd(netg,bdset)
                loss.backward()
                return loss
            optimg.step(closure) 
            
        else:
            for j in range(100):
                optimg.zero_grad()
                loss = Loss_bd(netg,bdset)
                loss.backward()
                optimg.step()
        loss = Loss_bd(netg,bdset)
        if loss < lossoptimal:
            fnameg = 'netg.pt'
            torch.save(netg,fnameg) 
        ela = time.time() - st
        
        print('epoch:%d,lossg:%.3e,time:%.2f'%(it,loss.item(),ela))
def Trainf(netg,netf,inset,optimtype,optimf,epochf):
    print('train neural network f')
    t0 = time.time()
    lossoptimal = Loss_in(netf,inset)
    for it in range(epochf):
        inset.quasi_samples()
        inset_cuda(inset)
        inset.L = lenth.forward(inset.X)
        inset.Lx, = torch.autograd.grad(inset.L,inset.X,create_graph = True,retain_graph = True,
                                  grad_outputs = inset.weight_grad)
        inset.L = inset.L.data; inset.Lx = inset.Lx.data
        
        
        inset.G = netg.forward(inset.X)
        inset.Gx, = torch.autograd.grad(inset.G,inset.X,create_graph = True,retain_graph = True,
                                  grad_outputs = inset.weight_grad)
        inset.G = inset.G.data; inset.Gx = inset.Gx.data
        st = time.time()
        if optimtype == 'LBFGS' or optimtype == 'BFGS':
            def closure():
                optimf.zero_grad()
                loss = Loss_in(netf,inset)
                loss.backward()
                return loss
            optimf.step(closure) 
            
        else:
            for j in range(100):
                optimf.zero_grad()
                loss = Loss_in(netf,inset)
                loss.backward()
                optimf.step()
        loss = Loss_in(netf,inset)
        if loss < lossoptimal:
            fnamef = 'netf.pt'
            torch.save(netf,fnamef)
        ela = time.time() - st
        err = L1_error(inset.u,inset.u_acc)
        print('GPUs:%d,epoch:%d,lossf:%.2e,error:%.3e,time:%.2f'%
            (torch.cuda.device_count(),it,loss.item(),err,ela))
def Train(netg,netf,lenth,inset,bdset,optimtype,optimg,optimf,epochg,epochf):
    if inset.X.requires_grad == False:
        inset.X.requires_grad = True
    inset_cuda(inset)
    inset.L = lenth.forward(inset.X)
    inset.Lx, = torch.autograd.grad(inset.L,inset.X,create_graph = True,retain_graph = True,
                                  grad_outputs = inset.weight_grad)
    inset.L = inset.L.data; inset.Lx = inset.Lx.data
    Traing(netg,bdset,optimtype,optimg,epochg)
    netg = torch.load('netg.pt')
    
    inset.G = netg.forward(inset.X)
    inset.Gx, = torch.autograd.grad(inset.G,inset.X,create_graph = True,retain_graph = True,
                                  grad_outputs = inset.weight_grad)
    inset.G = inset.G.data; inset.Gx = inset.Gx.data             
    Trainf(netg,netf,inset,optimtype,optimf,epochf)  
    u = pred_u(netg,netf,lenth,inset.X)
    inset.Ux, = torch.autograd.grad(u,inset.X,create_graph = True,retain_graph = True,
                                  grad_outputs = inset.weight_grad)
    er = (inset.Ux - inset.ux).reshape(-1,1)
    print(max(abs(er)),max(abs(bdset.u_acc - pred_u(netg,netf,lenth,bdset.X))))
def bdset_cuda(bdset):
    bdset.X = bdset.X.cuda()
    bdset.u_acc = bdset.u_acc.cuda()
    
def inset_cuda(inset):
    
    inset.X = inset.X.cuda()
    inset.u_acc = inset.u_acc.cuda()
    inset.ff = inset.ff.cuda()
    inset.weight_grad = inset.weight_grad.cuda()
    
parser = argparse.ArgumentParser(description='PFNN Neural Network Method')
parser.add_argument('--tr', type=int, default=5000,
                    help='train size')  
parser.add_argument('--prob', type=int, default=1,
                    help='problem idx')   
parser.add_argument('--wid', type=int, default=25,
                    help='layers width') 
parser.add_argument('--iter', type=int, default=1,
                    help='max_iter') 
parser.add_argument('--epochg', type=int, default=50,
                    help='netg epoch')  
parser.add_argument('--epochf', type=int, default=10,
                    help='netf epoch')                                                            
parser.add_argument('--lr', type=float, default=1e-3,
                    help='learning rate') 
parser.add_argument('--dim', type=int, default=2,
                    help='dimension')                    
                                     
dtype = torch.float64
args = parser.parse_args()
dim = args.dim
bound = np.zeros([dim,2])
for i in range(dim):
    bound[i,0] = -1
    bound[i,1] = 1
size_tr = args.tr
prob = args.prob
lay_wid = args.wid
max_iters = args.iter
epochg = args.epochg
epochf = args.epochf
lr = args.lr

bdset = BDSET(bound,dtype,prob,device)
inset = INSET(size_tr,bound,dtype,prob,device)
lenth = LEN(bound)

layerg = [dim,lay_wid,lay_wid,lay_wid,lay_wid,1];netg = Net(layerg,dtype).to(device)
layerf = [dim,lay_wid,lay_wid,lay_wid,lay_wid,1];netf = Net(layerf,dtype).to(device)
print(next(netg.parameters()).device)
os.environ['CUDA_VISIBLE_DEVICES'] = '0,1,2,3,4,5,6,7,8'  # must be set before the first CUDA call to take effect
netg = nn.DataParallel(netg)
netg.cuda()
netf = nn.DataParallel(netf)
netf.cuda()

print(bdset.X.device,bdset.u_acc.device,next(netg.parameters()).device)
#optimtype = 'BFGS'
#optimtype = 'LBFGS'
optimtype = 'adam'
if optimtype == 'BFGS':
    optimg = bfgs.BFGS(netg.parameters(),lr=lr, max_iter=100,
                      tolerance_grad=1e-16, tolerance_change=1e-16,
                      line_search_fn='strong_wolfe')
    optimf = bfgs.BFGS(netf.parameters(),lr=lr, max_iter=100,
                      tolerance_grad=1e-16, tolerance_change=1e-16,
                      line_search_fn='strong_wolfe')                  
elif optimtype == 'LBFGS':
    optimg = torch.optim.LBFGS(netg.parameters(),lr=lr, max_iter=100,
                      tolerance_grad=1e-16, tolerance_change=1e-16,
                      line_search_fn='strong_wolfe')
    optimf = torch.optim.LBFGS(netf.parameters(),lr=lr, max_iter=100,
                      tolerance_grad=1e-16, tolerance_change=1e-16,
                      line_search_fn='strong_wolfe')                  
else:
    optimg = torch.optim.Adam(netg.parameters(),lr=lr,betas=(0.9, 0.999), eps=1e-08, weight_decay=0)
    optimf = torch.optim.Adam(netf.parameters(),lr=lr,betas=(0.9, 0.999), eps=1e-08, weight_decay=0) 
st = time.time()
for i in range(max_iters):
    Train(netg,netf,lenth,inset,bdset,optimtype,optimg,optimf,epochg,epochf)
ela = time.time() - st
print('finish,time:%.2f'%(ela))
#print(pred_u(netg,netf,lenth,bdset.X) - bdset.u_acc)

print(inset.X.shape,inset.L.shape,inset.Lx.shape,inset.G.shape,inset.Gx.shape,inset.ux.shape)



code:trainloader

import torch
from torch.utils.data import Dataset,DataLoader
import time
import numpy as np
import torch.nn as nn
from scipy.stats import qmc

def UU(X, order,prob):
    if prob==1:
        temp = 10*(X[:,0]+X[:,1])**2 + (X[:,0]-X[:,1])**2 + 0.5
        if order[0]==0 and order[1]==0:
            return torch.log(temp)
        if order[0]==1 and order[1]==0:
            return temp**(-1) * (20*(X[:,0]+X[:,1]) + 2*(X[:,0]-X[:,1]))
        if order[0]==0 and order[1]==1:
            return temp**(-1) * (20*(X[:,0]+X[:,1]) - 2*(X[:,0]-X[:,1]))
        if order[0]==2 and order[1]==0:
            return - temp**(-2) * (20*(X[:,0]+X[:,1])+2*(X[:,0]-X[:,1])) ** 2 \
                   + temp**(-1) * (22)
        if order[0]==1 and order[1]==1:
            return - temp**(-2) * (20*(X[:,0]+X[:,1])+2*(X[:,0]-X[:,1])) \
                   * (20*(X[:,0]+X[:,1])-2*(X[:,0]-X[:,1])) \
                   + temp**(-1) * (18)
        if order[0]==0 and order[1]==2:
            return - temp**(-2) * (20*(X[:,0]+X[:,1])-2*(X[:,0]-X[:,1])) ** 2 \
                   + temp**(-1) * (22)
    if prob==2:
        temp1 = X[:,0]*X[:,0] - X[:,1]*X[:,1]
        temp2 = X[:,0]*X[:,0] + X[:,1]*X[:,1] + 0.1
        if order[0]==0 and order[1]==0:
            return temp1 * temp2**(-1)
        if order[0]==1 and order[1]==0:
            return (2*X[:,0]) * temp2**(-1) + \
                   temp1 * (-1)*temp2**(-2) * (2*X[:,0])
        if order[0]==0 and order[1]==1:
            return (-2*X[:,1]) * temp2**(-1) + \
                   temp1 * (-1)*temp2**(-2) * (2*X[:,1])
        if order[0]==2 and order[1]==0:
            return (2) * temp2**(-1) + \
                   2 * (2*X[:,0]) * (-1)*temp2**(-2) * (2*X[:,0]) + \
                   temp1 * (2)*temp2**(-3) * (2*X[:,0])**2 + \
                   temp1 * (-1)*temp2**(-2) * (2)
        if order[0]==1 and order[1]==1:
            return (2*X[:,0]) * (-1)*temp2**(-2) * (2*X[:,1]) + \
                   (-2*X[:,1]) * (-1)*temp2**(-2) * (2*X[:,0]) + \
                   temp1 * (2)*temp2**(-3) * (2*X[:,0]) * (2*X[:,1])
        if order[0]==0 and order[1]==2:
            return (-2) * temp2**(-1) + \
                   2 * (-2*X[:,1]) * (-1)*temp2**(-2) * (2*X[:,1]) + \
                   temp1 * (2)*temp2**(-3) * (2*X[:,1])**2 + \
                   temp1 * (-1)*temp2**(-2) * (2)
    if prob==3:
        temp = torch.exp(-4*X[:,1]*X[:,1])
        if order[0]==0 and order[1]==0:
            ind = (X[:,0]<=0).float()
            return ind * ((X[:,0]+1)**4-1) * temp + \
                   (1-ind) * (-(-X[:,0]+1)**4+1) * temp
        if order[0]==1 and order[1]==0:
            ind = (X[:,0]<=0).float()
            return ind * (4*(X[:,0]+1)**3) * temp + \
                   (1-ind) * (4*(-X[:,0]+1)**3) * temp
        if order[0]==0 and order[1]==1:
            ind = (X[:,0]<=0).float()
            return ind * ((X[:,0]+1)**4-1) * (temp*(-8*X[:,1])) + \
                   (1-ind) * (-(-X[:,0]+1)**4+1) * (temp*(-8*X[:,1]))
        if order[0]==2 and order[1]==0:
            ind = (X[:,0]<=0).float()
            return ind * (12*(X[:,0]+1)**2) * temp + \
                   (1-ind) * (-12*(-X[:,0]+1)**2) * temp
        if order[0]==1 and order[1]==1:
            ind = (X[:,0]<=0).float()
            return ind * (4*(X[:,0]+1)**3) * (temp*(-8*X[:,1])) + \
                   (1-ind) * (4*(-X[:,0]+1)**3) * (temp*(-8*X[:,1]))
        if order[0]==0 and order[1]==2:
            ind = (X[:,0]<=0).float()
            return ind * ((X[:,0]+1)**4-1) * (temp*(64*X[:,1]*X[:,1]-8)) + \
                   (1-ind) * (-(-X[:,0]+1)**4+1) * (temp*(64*X[:,1]*X[:,1]-8))

def FF(X,prob):
    return -UU(X,[2,0],prob) - UU(X,[0,2],prob)

class INSETdata(Dataset):
    def __init__(self,size_tr,bound,dtype,prob):
        self.dim = bound.shape[0]
        self.size_tr = size_tr
        self.bound = bound
        self.dtype = dtype
        self.prob = prob
        
        self.quasi_samples()
        
    def quasi_samples(self):
        sampler = qmc.Sobol(d=self.dim)
        sample = sampler.random(n=self.size_tr)
        
        for i in range(self.dim):
            sample[:,i] = sample[:,i]*(self.bound[i,1] - self.bound[i,0]) + self.bound[i,0]
        
        self.X = torch.tensor(sample,dtype=self.dtype)
        
        self.u_acc = UU(self.X,[0,0],self.prob).data.reshape(-1,1)
        self.ff = FF(self.X,self.prob).data.reshape(-1,1)
        
    def __len__(self):
        return self.size_tr
    def __getitem__(self,index):
        xx = self.X[index]
        ff = self.ff[index]
        acc = self.u_acc[index]
        return xx,ff,acc
class BDSETdata(Dataset):
    def __init__(self,bound,nx,dtype,prob):
        self.dim = bound.shape[0]
        self.hx = [(bound[0,1] - bound[0,0])/nx[0],(bound[1,1] - bound[1,0])/nx[1]]
        self.size = 2*(nx[0] + nx[1])
        self.X = torch.zeros(self.size,self.dim)  # stores the boundary points
        m = 0
        for i in range(nx[0]):
            self.X[m,0] = bound[0,0] + (i + 0.5)*self.hx[0]
            self.X[m,1] = bound[1,0] 
            m = m + 1
        for j in range(nx[1]):
            self.X[m,0] = bound[0,1]
            self.X[m,1] = bound[1,0] + (j + 0.5)*self.hx[1]
            m = m + 1
        for i in range(nx[0]):
            self.X[m,0] = bound[0,0] + (i + 0.5)*self.hx[0]
            self.X[m,1] = bound[1,1] 
            m = m + 1
        for j in range(nx[1]):
            self.X[m,0] = bound[0,0]
            self.X[m,1] = bound[1,0] + (j + 0.5)*self.hx[1]
            m = m + 1
        self.X = self.X.type(dtype)
        self.u_acc = UU(self.X,[0,0],prob).reshape(-1,1)
    def __len__(self):
        return self.size
    def __getitem__(self,index):
        xx = self.X[index]
        acc = self.u_acc[index]
        return xx,acc
class LEN():
    def __init__(self,bound):
        self.bound = bound
        self.dim = bound.shape[0]
        self.hx = bound[:,1] - bound[:,0]
        self.mu = self.dim
    def forward(self,X):
        L = 1.0
        for i in range(self.dim):
            tmp1 = (X[:,i] - self.bound[i,0])/self.hx[i]
            tmp2 = (self.bound[i,1] - X[:,i])/self.hx[i]
            L = L*(1 - (1 - tmp1)**self.mu)*(1 - (1 - tmp2)**self.mu)
        return L.reshape(-1,1)
def L1_error(u_pred,u_acc):
    u_pred = u_pred
    u_acc = u_acc
    return max(abs(u_pred - u_acc))
np.random.seed(1234)
torch.manual_seed(1234)
class Net(torch.nn.Module):
    def __init__(self, layers, dtype):
        super(Net, self).__init__()
        self.dtype = dtype
        self.layers = layers
        self.layers_hid_num = len(self.layers)-2
        fc = []
        for i in range(self.layers_hid_num+1):
            fc.append(torch.nn.Linear(self.layers[i],self.layers[i+1]))
        self.fc = torch.nn.Sequential(*fc)
        for i in range(self.layers_hid_num+1):
            self.fc[i].weight.data = self.fc[i].weight.data.type(dtype)
            self.fc[i].bias.data = self.fc[i].bias.data.type(dtype)
    def forward(self, x):
        dev = x.device
        
        for i in range(self.layers_hid_num):
            h = torch.sin(self.fc[i](x))
            temp = torch.eye(x.shape[-1],self.layers[i+1],dtype = self.dtype,device = dev)
            x = h + x@temp
        return self.fc[-1](x) 
    def total_para(self):  # count the number of parameters
        return sum([x.numel() for x in self.parameters()])
def pred_u(netg,netf,lenth,X):
    return netf.forward(X)*lenth.forward(X) + netg.forward(X)
def Loss_bd(netg,x,acc):
    res_b = (netg.forward(x) - acc)**2
    return torch.sqrt(res_b + 1e-9).mean()
def Loss_in(netf,x,ff,L,Lx,G,Gx):
    insetF = netf.forward(x)
    insetFx, = torch.autograd.grad(insetF,x,create_graph = True,retain_graph = True,
                                  grad_outputs = torch.ones_like(x[:,0:1]).to(x.device))
    u = insetF*L + G
    ux = insetFx*L + insetF*Lx + Gx
    res_i = 0.5*((ux**2).sum(1).reshape(-1,1) - 2*ff*u).mean() 
    return res_i
def Traing(device,netg,bdset,batch,optimg,epochg,optimtype):
    netg = netg.to(device)
    model = nn.DataParallel(netg)
    
    for epoch in range(epochg):
        if optimtype == 'LBFGS' or optimtype == 'BFGS':
            for i,(xx,acc) in enumerate(bdset):
                xx = xx.to(device)
                acc = acc.to(device)
            
                def closure():
                    optimg.zero_grad()
                    loss = Loss_bd(model,xx,acc)
                    loss.backward()
                    return loss
                optimg.step(closure) 
        else:
            for i,(xx,acc) in enumerate(bdset):
                xx = xx.to(device)
                acc = acc.to(device)
                for _ in range(100):
                    optimg.zero_grad()
                    loss = Loss_bd(model,xx,acc)
                    loss.backward()
                    optimg.step()
            
        print('GPUs:%d,epoch:%d,loss at bound:%.3e\n'%(torch.cuda.device_count(),epoch,loss.item()))
def Trainf(device,netg,netf,lenth,inset,optimf,epochf,optimtype):
    netf = netf.to(device)
    model = nn.DataParallel(netf)
    
    for epoch in range(epochf):
        if optimtype == 'LBFGS' or optimtype == 'BFGS':
            for i,(x,ff,acc) in enumerate(inset):
                x = x.to(device)
                ff = ff.to(device)
                acc = acc.to(device)
                if x.requires_grad == False:
                    x.requires_grad = True
                L = lenth.forward(x)
                G = netg.forward(x)
                Lx, = torch.autograd.grad(L,x,create_graph = True,retain_graph = True,
                                  grad_outputs = torch.ones_like(x[:,0:1]).to(x.device))
                Gx, = torch.autograd.grad(G,x,create_graph = True,retain_graph = True,
                                  grad_outputs = torch.ones_like(x[:,0:1]).to(x.device))
                L = L.data;Lx = Lx.data;G = G.data;Gx = Gx.data                  
                def closure():
                    optimf.zero_grad()
                    loss = Loss_in(netf,x,ff,L,Lx,G,Gx)
                    loss.backward()
                    return loss
                optimf.step(closure) 
        else:
            for i,(x,ff,acc) in enumerate(inset):
                x = x.to(device)
                ff = ff.to(device)
                acc = acc.to(device)
                if x.requires_grad == False:
                    x.requires_grad = True
                L = lenth.forward(x)
                G = netg.forward(x)
                Lx, = torch.autograd.grad(L,x,create_graph = True,retain_graph = True,
                                  grad_outputs = torch.ones_like(x[:,0:1]).to(x.device))
                Gx, = torch.autograd.grad(G,x,create_graph = True,retain_graph = True,
                                  grad_outputs = torch.ones_like(x[:,0:1]).to(x.device))
                L = L.data;Lx = Lx.data;G = G.data;Gx = Gx.data
                for _ in range(100):
                    optimf.zero_grad()
                    loss = Loss_in(netf,x,ff,L,Lx,G,Gx)
                    loss.backward()
                    optimf.step()
        
        print('GPUs:%d,epoch:%d,loss inside:%.3e\n'%(torch.cuda.device_count(),epoch,loss.item()))
dtype = torch.float32
size_tr = 1000
prob = 1
nx = [50,50]
lay_wid = 15
bound = np.array([-1,1,-1,1]).reshape(2,2)
batch_in = 100
batch_bd = nx[0]
train_inset = INSETdata(size_tr,bound,dtype,prob)
train_bdset = BDSETdata(bound,nx,dtype,prob)
epochg = 20
epochf = 20
lr = 1e-3
inset = DataLoader(INSETdata(size_tr,bound,dtype,prob),batch_size = batch_in,shuffle = True)
bdset = DataLoader(BDSETdata(bound,nx,dtype,prob),batch_size = 25,shuffle = True)

lenth = LEN(bound)

layerg = [2,lay_wid,lay_wid,lay_wid,1];netg = Net(layerg,dtype)
layerf = [2,lay_wid,lay_wid,lay_wid,1];netf = Net(layerf,dtype)
#optimtype = 'LBFGS'
optimtype = 'adam'

if optimtype == 'LBFGS':
    optimg = torch.optim.LBFGS(netg.parameters(),lr=lr, max_iter=100,
                      tolerance_grad=1e-16, tolerance_change=1e-16,
                      line_search_fn='strong_wolfe')
    optimf = torch.optim.LBFGS(netf.parameters(),lr=lr, max_iter=100,
                      tolerance_grad=1e-16, tolerance_change=1e-16,
                      line_search_fn='strong_wolfe')                  
else:
    optimg = torch.optim.Adam(netg.parameters(),lr=lr,betas=(0.9, 0.999), eps=1e-08, weight_decay=0)
    optimf = torch.optim.Adam(netf.parameters(),lr=lr,betas=(0.9, 0.999), eps=1e-08, weight_decay=0) 

st = time.time()
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

Traing(device,netg,bdset,batch_bd,optimg,epochg,optimtype)

Trainf(device,netg,netf,lenth,inset,optimf,epochf,optimtype)
netg = netg.to('cpu')
netf = netf.to('cpu')
u_acc = train_inset.u_acc
u_pre = pred_u(netg,netf,lenth,train_inset.X)
err = L1_error(u_acc,u_pre)
print('error:%.3e\n'%(err))
ela = time.time() - st
print('finish, use time:%.2f\n'%(ela))

The results of this code are problematic. One visible inconsistency: in Trainf the DataParallel wrapper is created as model, but the interior loss is built from netf directly, so that part never actually runs data-parallel.

deepspeed code

This version additionally needs a config.json file; the launch command is given after the script (xx.py) below.
config.json

 {
   "train_micro_batch_size_per_gpu":30,
   "gradient_accumulation_steps":4,
   "steps_per_print": 2000,
   "optimizer": {
     "type": "Adam",
     "params": {
       "lr": 0.001,
       "betas": [
         0.9,
         0.999
       ],
       "eps": 1e-8,
       "weight_decay": 0
     }
   },
   "scheduler": {
     "type": "WarmupLR",
     "params": {
       "warmup_min_lr": 0,
       "warmup_max_lr": 0.001,
       "warmup_num_steps": 1000
     }
   },
   
   "wall_clock_breakdown": false
 }
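With this config, each optimizer step consumes train_micro_batch_size_per_gpu × gradient_accumulation_steps × (number of GPUs) samples, i.e. an effective batch of 30 × 4 × N on N GPUs; if a train_batch_size key is also supplied, deepspeed checks it against this product.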

xx.py

import torch
from torch.utils.data import Dataset,DataLoader
import deepspeed
import time
import numpy as np
import matplotlib.pyplot as plt
import torch.nn as nn
from scipy.stats import qmc

import argparse

def UU(X, order,prob):
    if prob==1:
        temp = 10*(X[:,0]+X[:,1])**2 + (X[:,0]-X[:,1])**2 + 0.5
        if order[0]==0 and order[1]==0:
            return torch.log(temp)
        if order[0]==1 and order[1]==0:
            return temp**(-1) * (20*(X[:,0]+X[:,1]) + 2*(X[:,0]-X[:,1]))
        if order[0]==0 and order[1]==1:
            return temp**(-1) * (20*(X[:,0]+X[:,1]) - 2*(X[:,0]-X[:,1]))
        if order[0]==2 and order[1]==0:
            return - temp**(-2) * (20*(X[:,0]+X[:,1])+2*(X[:,0]-X[:,1])) ** 2 \
                   + temp**(-1) * (22)
        if order[0]==1 and order[1]==1:
            return - temp**(-2) * (20*(X[:,0]+X[:,1])+2*(X[:,0]-X[:,1])) \
                   * (20*(X[:,0]+X[:,1])-2*(X[:,0]-X[:,1])) \
                   + temp**(-1) * (18)
        if order[0]==0 and order[1]==2:
            return - temp**(-2) * (20*(X[:,0]+X[:,1])-2*(X[:,0]-X[:,1])) ** 2 \
                   + temp**(-1) * (22)
    if prob==2:
        temp1 = X[:,0]*X[:,0] - X[:,1]*X[:,1]
        temp2 = X[:,0]*X[:,0] + X[:,1]*X[:,1] + 0.1
        if order[0]==0 and order[1]==0:
            return temp1 * temp2**(-1)
        if order[0]==1 and order[1]==0:
            return (2*X[:,0]) * temp2**(-1) + \
                   temp1 * (-1)*temp2**(-2) * (2*X[:,0])
        if order[0]==0 and order[1]==1:
            return (-2*X[:,1]) * temp2**(-1) + \
                   temp1 * (-1)*temp2**(-2) * (2*X[:,1])
        if order[0]==2 and order[1]==0:
            return (2) * temp2**(-1) + \
                   2 * (2*X[:,0]) * (-1)*temp2**(-2) * (2*X[:,0]) + \
                   temp1 * (2)*temp2**(-3) * (2*X[:,0])**2 + \
                   temp1 * (-1)*temp2**(-2) * (2)
        if order[0]==1 and order[1]==1:
            return (2*X[:,0]) * (-1)*temp2**(-2) * (2*X[:,1]) + \
                   (-2*X[:,1]) * (-1)*temp2**(-2) * (2*X[:,0]) + \
                   temp1 * (2)*temp2**(-3) * (2*X[:,0]) * (2*X[:,1])
        if order[0]==0 and order[1]==2:
            return (-2) * temp2**(-1) + \
                   2 * (-2*X[:,1]) * (-1)*temp2**(-2) * (2*X[:,1]) + \
                   temp1 * (2)*temp2**(-3) * (2*X[:,1])**2 + \
                   temp1 * (-1)*temp2**(-2) * (2)
    if prob==3:
        temp = torch.exp(-4*X[:,1]*X[:,1])
        if order[0]==0 and order[1]==0:
            ind = (X[:,0]<=0).float()
            return ind * ((X[:,0]+1)**4-1) * temp + \
                   (1-ind) * (-(-X[:,0]+1)**4+1) * temp
        if order[0]==1 and order[1]==0:
            ind = (X[:,0]<=0).float()
            return ind * (4*(X[:,0]+1)**3) * temp + \
                   (1-ind) * (4*(-X[:,0]+1)**3) * temp
        if order[0]==0 and order[1]==1:
            ind = (X[:,0]<=0).float()
            return ind * ((X[:,0]+1)**4-1) * (temp*(-8*X[:,1])) + \
                   (1-ind) * (-(-X[:,0]+1)**4+1) * (temp*(-8*X[:,1]))
        if order[0]==2 and order[1]==0:
            ind = (X[:,0]<=0).float()
            return ind * (12*(X[:,0]+1)**2) * temp + \
                   (1-ind) * (-12*(-X[:,0]+1)**2) * temp
        if order[0]==1 and order[1]==1:
            ind = (X[:,0]<=0).float()
            return ind * (4*(X[:,0]+1)**3) * (temp*(-8*X[:,1])) + \
                   (1-ind) * (4*(-X[:,0]+1)**3) * (temp*(-8*X[:,1]))
        if order[0]==0 and order[1]==2:
            ind = (X[:,0]<=0).float()
            return ind * ((X[:,0]+1)**4-1) * (temp*(64*X[:,1]*X[:,1]-8)) + \
                   (1-ind) * (-(-X[:,0]+1)**4+1) * (temp*(64*X[:,1]*X[:,1]-8))

def FF(X,prob):
    return -UU(X,[2,0],prob) - UU(X,[0,2],prob)

class INSETdata(Dataset):
    def __init__(self,size_tr,bound,dtype,prob):
        self.dim = bound.shape[0]
        self.size_tr = size_tr
        self.bound = bound
        self.dtype = dtype
        self.prob = prob
        
        self.quasi_samples()
        
    def quasi_samples(self):
        sampler = qmc.Sobol(d=self.dim)
        sample = sampler.random(n=self.size_tr)
        
        for i in range(self.dim):
            sample[:,i] = sample[:,i]*(self.bound[i,1] - self.bound[i,0]) + self.bound[i,0]
        
        self.X = torch.tensor(sample,dtype=self.dtype)
        
        self.u_acc = UU(self.X,[0,0],self.prob).data.reshape(-1,1)
        self.ff = FF(self.X,self.prob).data.reshape(-1,1)
        
    def __len__(self):
        return self.size_tr
    def __getitem__(self,index):
        xx = self.X[index]
        ff = self.ff[index]
        acc = self.u_acc[index]
        return xx,ff,acc
class BDSETdata(Dataset):
    def __init__(self,bound,nx,dtype,prob):
        self.dim = bound.shape[0]
        self.hx = [(bound[0,1] - bound[0,0])/nx[0],(bound[1,1] - bound[1,0])/nx[1]]
        self.size = 2*(nx[0] + nx[1])
        self.X = torch.zeros(self.size,self.dim)  # stores the boundary points
        m = 0
        for i in range(nx[0]):
            self.X[m,0] = bound[0,0] + (i + 0.5)*self.hx[0]
            self.X[m,1] = bound[1,0] 
            m = m + 1
        for j in range(nx[1]):
            self.X[m,0] = bound[0,1]
            self.X[m,1] = bound[1,0] + (j + 0.5)*self.hx[1]
            m = m + 1
        for i in range(nx[0]):
            self.X[m,0] = bound[0,0] + (i + 0.5)*self.hx[0]
            self.X[m,1] = bound[1,1] 
            m = m + 1
        for j in range(nx[1]):
            self.X[m,0] = bound[0,0]
            self.X[m,1] = bound[1,0] + (j + 0.5)*self.hx[1]
            m = m + 1
        self.X = self.X.type(dtype)
        self.u_acc = UU(self.X,[0,0],prob).reshape(-1,1)
    def __len__(self):
        return self.size
    def __getitem__(self,index):
        xx = self.X[index]
        acc = self.u_acc[index]
        return xx,acc
class LEN():
    def __init__(self,bound):
        self.bound = bound
        self.dim = bound.shape[0]
        self.hx = bound[:,1] - bound[:,0]
        self.mu = self.dim
    def forward(self,X):
        L = 1.0
        for i in range(self.dim):
            tmp1 = (X[:,i] - self.bound[i,0])/self.hx[i]
            tmp2 = (self.bound[i,1] - X[:,i])/self.hx[i]
            L = L*(1 - (1 - tmp1)**self.mu)*(1 - (1 - tmp2)**self.mu)
        return L.reshape(-1,1)
def L1_error(u_pred,u_acc):
    u_pred = u_pred
    u_acc = u_acc
    return max(abs(u_pred - u_acc))
np.random.seed(1234)
torch.manual_seed(1234)
class Net(torch.nn.Module):
    def __init__(self, layers, dtype):
        super(Net, self).__init__()
        self.dtype = dtype
        self.layers = layers
        self.layers_hid_num = len(self.layers)-2
        fc = []
        for i in range(self.layers_hid_num+1):
            fc.append(torch.nn.Linear(self.layers[i],self.layers[i+1]))
        self.fc = torch.nn.Sequential(*fc)
        for i in range(self.layers_hid_num+1):
            self.fc[i].weight.data = self.fc[i].weight.data.type(dtype)
            self.fc[i].bias.data = self.fc[i].bias.data.type(dtype)
    def forward(self, x):
        dev = x.device
        
        for i in range(self.layers_hid_num):
            h = torch.sin(self.fc[i](x))
            temp = torch.eye(x.shape[-1],self.layers[i+1],dtype = self.dtype,device = dev)
            x = h + x@temp
        return self.fc[-1](x) 
    def total_para(self):  # count the number of parameters
        return sum([x.numel() for x in self.parameters()])
def pred_u(netg,netf,lenth,X):
    return netf.forward(X)*lenth.forward(X) + netg.forward(X)
def Loss_bd(netg,x,acc):
    res_b = (netg.forward(x) - acc)**2
    
    return torch.sqrt(res_b + 1e-9).mean()
def Loss_in(netf,x,ff,L,Lx,G,Gx):
    if x.requires_grad == False:
        x.requires_grad = True
    insetF = netf.forward(x)
    insetFx, = torch.autograd.grad(insetF,x,create_graph = True,retain_graph = True,
                                  grad_outputs = torch.ones_like(x[:,0:1]).to(x.device))
    u = insetF*L + G
    ux = insetFx*L + insetF*Lx + Gx
    res_i = 0.5*((ux**2).sum(1).reshape(-1,1) - 2*ff*u).mean() 
    return res_i

dtype = torch.float32
size_tr = 1200
prob = 1
nx = [60,60]
lay_wid = 15
bound = np.array([-1,1,-1,1]).reshape(2,2)
batch_in = 100
batch_bd = nx[0]
inset = INSETdata(size_tr,bound,dtype,prob)
bdset = BDSETdata(bound,nx,dtype,prob)
epochg = 30
epochf = 20

lenth = LEN(bound)

layerg = [2,lay_wid,lay_wid,lay_wid,1];netg = Net(layerg,dtype)
layerf = [2,lay_wid,lay_wid,lay_wid,1];netf = Net(layerf,dtype)
def add_argument():
    parser = argparse.ArgumentParser(description='PFNN deepspeed')
    
    parser.add_argument('--local_rank',
                        type=int,
                        default=-1,
                        help='local rank passed from distributed launcher')
    parser = deepspeed.add_config_arguments(parser)
    args = parser.parse_args()
    return args
#%%
args = add_argument()

model_engine, optimizer, trainloader, __ = deepspeed.initialize(
    args=args, model=netg, model_parameters=netg.parameters(), training_data=bdset)
for epoch in range(epochg):
    for _ in range(100):
        for i, data in enumerate(trainloader):
            x,acc = data[0].to(model_engine.local_rank), data[1].to(model_engine.local_rank)
        
            loss = Loss_bd(model_engine,x,acc)
            model_engine.backward(loss)
            model_engine.step()
        
    print("GPUs:%d:epoch:%d,loss at bound:%.3e,device:%d\n"%(torch.cuda.device_count(),epoch,loss.item(),model_engine.local_rank))


with torch.no_grad():
    netg = netg.to('cpu')
    loss = ((netg.forward(bdset.X.to('cpu')) - bdset.u_acc.to('cpu'))**2).mean()
    print("loss at bound:%.3e\n"%(loss.item()))
def Trainf(args,netg,netf,inset,epochf):
    model_engine, optimizer, trainloader, __ = deepspeed.initialize(
    args=args, model=netf, model_parameters=netf.parameters(), training_data=inset)
    for epoch in range(epochf):
        for _ in range(100):
            for i, data in enumerate(trainloader):
                x,ff,u_acc = data[0].to(model_engine.local_rank), data[1].to(model_engine.local_rank), data[2].to(model_engine.local_rank)
                if x.requires_grad == False:
                    x.requires_grad = True
                L = lenth.forward(x)
                netg = netg.to(model_engine.local_rank);G = netg.forward(x)
                Lx, = torch.autograd.grad(L,x,create_graph = True,retain_graph = True,
                                  grad_outputs = torch.ones_like(x[:,0:1]).to(x.device))
                Gx, = torch.autograd.grad(G,x,create_graph = True,retain_graph = True,
                                  grad_outputs = torch.ones_like(x[:,0:1]).to(x.device))
                loss = Loss_in(model_engine,x,ff,L,Lx,G,Gx)
                model_engine.backward(loss)
                model_engine.step()
        u_pred = pred_u(netg,netf,lenth,x)
        u_acc = u_acc.reshape(-1,1)
        print("GPUs:%d,epoch:%d,:loss at inside:%.3e,device:%d,error:%.3e\n"%(torch.cuda.device_count(),epoch,loss.item(),model_engine.local_rank,L1_error(u_pred,u_acc)))
Trainf(args,netg,netf,inset,epochf)

deepspeed xx.py --deepspeed_config config.json
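As a side note, the deepspeed launcher also accepts flags such as --num_gpus to limit how many local GPUs are used, e.g. deepspeed --num_gpus=2 xx.py --deepspeed_config config.json.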
