RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation

Reposted · last modified 2022-09-05 15:41:53

License: CC 4.0 BY-SA

Original link: https://zhuanlan.zhihu.com/p/406823590


First published 2022-09-05 15:40:06


Further reading on the usage of with torch.no_grad() in torch:

Understanding torch.no_grad in a few lines of code - Zhihu

leaf variable & with torch.no_grad & -= - 爱编程的喵喵's blog - CSDN Blog

Python torch.no_grad usage and code examples - 纯净天空

RuntimeError: one of the variables needed for gradient computation
has been modified by an inplace operation: [torch.FloatTensor [2]]
is at version 1; expected version 0 instead.
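Explanation: the gradient of b = 0.5*a*a depends directly on the value of a, so autograd has to keep a for the backward pass, and modifying a in place before calling backward() invalidates it. If instead b = a*2, the gradient does not depend on a, and no error is raised.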
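The contrast described above can be checked directly. The following is a minimal sketch, not part of the original post: the helper name `backward_fails_after_inplace` and the `+= 1.0` update are assumptions chosen for illustration. It runs each forward expression, modifies `a` in place under `torch.no_grad()`, and reports whether `backward()` then raises.

```python
import torch


def backward_fails_after_inplace(forward):
    """Return True if backward() raises after the input was updated in place.

    Hypothetical helper for illustration only.
    """
    a = torch.ones(2, requires_grad=True)
    b = forward(a)                 # build the graph from the original a
    with torch.no_grad():
        a += 1.0                   # in-place update between forward and backward
    try:
        b.sum().backward()
    except RuntimeError:
        return True
    return False


# The gradient of 0.5*a*a needs the value of a, so backward() should fail.
quadratic_fails = backward_fails_after_inplace(lambda a: 0.5 * a * a)
# The gradient of a*2 is the constant 2, so backward() should still work.
linear_fails = backward_fails_after_inplace(lambda a: a * 2)
print(quadratic_fails, linear_fails)
```

The first call is expected to report a failure and the second is not, which matches the two cases walked through in the rest of this post.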

import torch

torch.autograd.set_detect_anomaly(True)
a = torch.ones(2, requires_grad=True)
b = 0.5 * a * a
print(a, a.grad, a.requires_grad)
b.sum().backward(retain_graph=True)
print(a, a.grad, a.requires_grad)  # a.grad is tensor([1., 1.])
with torch.no_grad():
    a += a.grad  # in-place update of the leaf tensor a
    print(a, a.grad, a.requires_grad)  # a.grad is still tensor([1., 1.]), the cached gradient
# b.sum().backward(retain_graph=True)
'''
Uncommenting the line above raises:
RuntimeError: one of the variables needed for gradient computation
has been modified by an inplace operation: [torch.FloatTensor [2]]
is at version 1; expected version 0 instead.
Explanation: the gradient of b = 0.5*a*a depends directly on the value of a;
if instead b = a*2, the gradient does not depend on a and no error is raised.
'''
b = 0.5 * a * a  # rebuild the graph from the updated a
b.sum().backward(retain_graph=True)
print(a, a.grad, a.requires_grad)  # a.grad is tensor([3., 3.]): gradients accumulate
b.sum().backward(retain_graph=True)
print(a, a.grad, a.requires_grad)  # a.grad is tensor([5., 5.]): accumulated again
a.grad.zero_()                     # reset the gradient to zero
b.sum().backward(retain_graph=True)
print(a, a.grad, a.requires_grad)  # a.grad is tensor([2., 2.])

Output:
tensor([1., 1.], requires_grad=True) None True
tensor([1., 1.], requires_grad=True) tensor([1., 1.]) True
tensor([2., 2.], requires_grad=True) tensor([1., 1.]) True
tensor([2., 2.], requires_grad=True) tensor([3., 3.]) True
tensor([2., 2.], requires_grad=True) tensor([5., 5.]) True
tensor([2., 2.], requires_grad=True) tensor([2., 2.]) True
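The example above suggests a usable pattern: recompute the forward pass after every in-place update, and zero the gradient so it does not accumulate across steps. The loop below is a small sketch of that pattern under stated assumptions. The step size `lr`, the number of steps, and the use of `-=` (descent rather than the `+=` used above) are illustrative choices, not taken from the original post.

```python
import torch

a = torch.ones(2, requires_grad=True)
lr = 0.1  # hypothetical step size, chosen only for illustration

for step in range(3):
    b = 0.5 * a * a          # rebuild the graph from the current value of a
    b.sum().backward()       # d(sum(0.5*a*a))/da = a
    with torch.no_grad():
        a -= lr * a.grad     # in-place update, not tracked by autograd
    a.grad.zero_()           # clear the gradient so it does not accumulate

print(a)
```

Because `b` is rebuilt at the top of each iteration, every `backward()` call sees a graph whose saved tensors match the current version of `a`, so the in-place update never invalidates a graph that is still in use.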

import torch

torch.autograd.set_detect_anomaly(True)
a = torch.ones(2, requires_grad=True)
b = 0.5 * a  # the gradient is the constant 0.5 and does not depend on a
print(a, a.grad, a.requires_grad)
b.sum().backward(retain_graph=True)
print(a, a.grad, a.requires_grad)  # a.grad is tensor([0.5000, 0.5000])
with torch.no_grad():
    a += a.grad
    print(a, a.grad, a.requires_grad)  # a.grad is still tensor([0.5000, 0.5000]), the cached gradient
b.sum().backward(retain_graph=True)    # no error: a is not needed to compute the gradient
print(a, a.grad, a.requires_grad)  # a.grad is tensor([1., 1.]): gradients accumulate

Output:

tensor([1., 1.], requires_grad=True) None True
tensor([1., 1.], requires_grad=True) tensor([0.5000, 0.5000]) True
tensor([1.5000, 1.5000], requires_grad=True) tensor([0.5000, 0.5000]) True
tensor([1.5000, 1.5000], requires_grad=True) tensor([1., 1.]) True
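The "is at version 1; expected version 0" wording in the error refers to a per-tensor version counter that each in-place operation increments. The sketch below reads it through the internal `_version` attribute, which is an implementation detail rather than a documented public API, so treat this as an illustration only. In the linear case the counter still changes, but `backward()` succeeds because `a` is not needed to compute the gradient.

```python
import torch

a = torch.ones(2, requires_grad=True)
b = 0.5 * a                      # gradient is the constant 0.5

version_before = a._version      # internal counter, read before the update
with torch.no_grad():
    a += 1.0                     # in-place update increments the counter
version_after = a._version

b.sum().backward()               # succeeds: a is not needed for this gradient
print(version_before, version_after, a.grad)
```

So the in-place update is always recorded. The error only appears when the backward pass actually needs a tensor whose version has moved on since the forward pass.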
