Following the previous post on spmm, this one runs the same kind of experiment for edge_softmax:
An experiment comparing the results of the three implementations.
Note: when there are multiple edges between the same pair of nodes, the softmax in DGL and the one in torch.sparse behave slightly differently: the sparse matrix built by torch directly accumulates the values of the parallel edges between the two nodes, while DGL treats each edge separately. See: https://github.com/dmlc/dgl/issues/2311.
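A minimal toy sketch of that difference (a hypothetical 3-node example, not part of the benchmark below): torch's COO tensor coalesces the two parallel edges into a single entry before the softmax, while DGL normalizes over all three edge values.
import torch
import dgl
rows = torch.tensor([0, 0, 0])
cols = torch.tensor([1, 1, 2])  # two parallel edges (0, 1) plus one edge (0, 2)
vals = torch.tensor([1.0, 2.0, 3.0])
# coalesce() sums the two parallel (0, 1) entries into 3.0, so the row softmax is over [3.0, 3.0]
X = torch.sparse_coo_tensor(torch.stack([rows, cols]), vals, size=(3, 3)).coalesce()
print(torch.sparse.softmax(X, dim=1).values())  # tensor([0.5000, 0.5000])
# DGL keeps both parallel edges and normalizes over all three values [1.0, 2.0, 3.0]
g = dgl.graph((cols, rows))  # same (cols, rows) convention as the script below
print(dgl.ops.edge_softmax(g, vals))  # ≈ tensor([0.0900, 0.2447, 0.6652])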
import torch
import numpy as np
import torch_sparse
import dgl
import torch_geometric.utils
import time
start_time = time.time()
n = 60000
nnz = 50000
# nnz = 50000000
np.random.seed(123)
torch.manual_seed(123)
rows = np.random.randint(0, n, nnz)
cols = np.random.randint(0, n, nnz)
values = torch.randn(nnz).cuda().requires_grad_(True)
# torch.sparse.softmax
X_sparse = torch.sparse_coo_tensor([rows, cols], values, size=(n, n)).cuda().requires_grad_(True)
# Note: cols must come first and rows second here (edge_softmax normalizes over edges sharing a destination node),
# otherwise the result will not match torch.sparse.softmax with dim=1
g = dgl.graph((cols, rows))
g = g.to("cuda:0")
g.edata['e'] = values
print("memory allocated before multi: {} GB".format(torch.cuda.memory_allocated() / 10 ** 9))
print("max memory allocated before multi: {} GB".format(torch.cuda.max_memory_allocated() / 10 ** 9))
torch.cuda.synchronize()
start_time = time.time()
# Note that the edge order of the sparse matrix returned by softmax can change relative to the input
a_sparse = torch.sparse.softmax(X_sparse, dim=1)
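# (If an element-wise comparison with the other two libraries were needed, the reordered entries could be
# recovered via a_sparse.coalesce().indices() and a_sparse.coalesce().values(); the sums below do not need this.)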
t1 = torch.sparse.sum(a_sparse)
# arguments are: values (src), index, ptr, and the number of nodes
b_sparse = torch_geometric.utils.softmax(values, torch.LongTensor(rows).cuda(), None, n)
t2 = b_sparse.sum()
c_sparse = dgl.ops.edge_softmax(g, values)
t3 = c_sparse.sum()
print("memory allocated before backward: {} GB".format(torch.cuda.memory_allocated() / 10 ** 9))
print("max memory allocated before backward: {} GB".format(torch.cuda.max_memory_allocated() / 10 ** 9))
t1.backward()
t2.backward()
t3.backward()
print("t1: {}".format(t1))
print("t2: {}".format(t2))
print("t3: {}".format(t3))
print("memory allocated after backward: {} GB".format(torch.cuda.memory_allocated() / 10 ** 9))
print("max memory allocated after backward: {} GB".format(torch.cuda.max_memory_allocated() / 10 ** 9))
torch.cuda.synchronize()
print("spmm and backward time is {} s".format(time.time() - start_time))
Using native torch:
Using backend: pytorch
memory allocated before softmax: 0.001000448 GB
max memory allocated before softmax: 0.001001984 GB
memory allocated before backward: 0.003001856 GB
max memory allocated before backward: 0.004612608 GB
t1: 34127.0
memory allocated after backward: 0.0022016 GB
max memory allocated after backward: 14.40420352 GB
softmax and backward time is 0.08920717239379883 s
Using PyG:
Using backend: pytorch
memory allocated before softmax: 0.000200192 GB
max memory allocated before softmax: 0.000200192 GB
memory allocated before backward: 0.001202176 GB
max memory allocated before backward: 0.0012416 GB
t2: 34127.0
memory allocated after backward: 0.000601088 GB
max memory allocated after backward: 0.003644928 GB
softmax and backward time is 0.016681909561157227 s
Using DGL:
Using backend: pytorch
memory allocated before softmax: 0.000200192 GB
max memory allocated before softmax: 0.000200192 GB
memory allocated before backward: 0.000400896 GB
max memory allocated before backward: 0.001080832 GB
t3: 34127.0
memory allocated after backward: 0.000601088 GB
max memory allocated after backward: 0.001242112 GB
softmax and backward time is 0.007536411285400391 s
Using DGL with g.create_formats_() to create the CSC format in advance:
Using backend: pytorch
memory allocated before softmax: 0.000200192 GB
max memory allocated before softmax: 0.000200192 GB
memory allocated before backward: 0.000400896 GB
max memory allocated before backward: 0.001080832 GB
t3: 34127.0
memory allocated after backward: 0.000601088 GB
max memory allocated after backward: 0.001242112 GB
softmax and backward time is 0.004187822341918945 s
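The modified script for this variant is not shown; a minimal sketch of the assumed placement is to materialize the graph formats right after construction, so the CSC representation is not built lazily the first time edge_softmax needs it:
g = dgl.graph((cols, rows))
g = g.to("cuda:0")
g.create_formats_()  # pre-build all sparse formats (COO/CSR/CSC) up front
g.edata['e'] = values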
After increasing the number of edges to 50,000,000 (nnz = 50000000):
Native torch (runs out of memory during backward):
Using backend: pytorch
memory allocated before softmax: 1.0 GB
max memory allocated before softmax: 1.000014336 GB
memory allocated before backward: 2.986202112 GB
max memory allocated before backward: 4.179481088 GB
Traceback (most recent call last):
File "/home/maqy/gnn/ginn_batch_compare/GINN-1130/memory_test.py", line 50, in <module>
t1.backward()
File "/root/miniconda3/lib/python3.7/site-packages/torch/tensor.py", line 221, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph)
File "/root/miniconda3/lib/python3.7/site-packages/torch/autograd/__init__.py", line 132, in backward
allow_unreachable=True) # allow_unreachable flag
RuntimeError: CUDA out of memory. Tried to allocate 13.41 GiB (GPU 0; 14.76 GiB total capacity; 2.78 GiB already allocated; 10.65 GiB free; 3.17 GiB reserved in total by PyTorch)
Process finished with exit code 1
The 13.41 GiB it tried to allocate matches a dense 60000 × 60000 float32 matrix (60000² × 4 bytes ≈ 13.41 GiB), and the 14.4 GB peak in the small-edge torch run above is the same size, which suggests that the backward of torch.sparse.softmax materializes a dense n × n intermediate in this PyTorch version.
Using PyG:
Using backend: pytorch
memory allocated before softmax: 0.2 GB
max memory allocated before softmax: 0.2 GB
memory allocated before backward: 1.200557056 GB
max memory allocated before backward: 1.200559104 GB
t2: 60000.0
memory allocated after backward: 0.600000512 GB
max memory allocated after backward: 3.403577856 GB
softmax and backward time is 0.49458837509155273 s
Using DGL:
Using backend: pytorch
memory allocated before softmax: 0.2 GB
max memory allocated before softmax: 0.2 GB
memory allocated before backward: 0.400000512 GB
max memory allocated before backward: 0.600480256 GB
t3: 60000.0
memory allocated after backward: 0.600000512 GB
max memory allocated after backward: 1.000241152 GB
softmax and backward time is 0.32391786575317383 s
Using DGL, with the CSC format constructed in advance:
Using backend: pytorch
memory allocated before softmax: 0.2 GB
max memory allocated before softmax: 0.2 GB
memory allocated before backward: 0.400000512 GB
max memory allocated before backward: 0.600480256 GB
t3: 60000.0
memory allocated after backward: 0.600000512 GB
max memory allocated after backward: 1.000241152 GB
softmax and backward time is 0.16484856605529785 s