Fixing "RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation"
While converting my code from single-machine single-GPU to single-machine multi-GPU training, this error suddenly appeared:
RuntimeError: one of the variables needed for gradient computation has
been modified by an inplace operation: [torch.cuda.FloatTensor [2500,
256]], which is output 0 of AsStridedBackward0, is at version 1;
expected version 0 instead. Hint: the backtrace further above shows
the operation that failed to compute its gradient. The variable in
question was changed in there or anywhere later. Good luck!
I searched online and found many suggestions. The consensus is that the code performs an in-place operation on a tensor that autograd still needs, something that may pass silently on a single GPU but surfaces once you switch to multi-GPU training. Common fixes include:
Changing x += y to x = x + y
Changing nn.ReLU(inplace=True) to nn.ReLU(inplace=False), and so on
But none of these worked for me!!
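For reference, here is a minimal sketch (not from my original code) that reproduces this class of error: sigmoid's backward pass needs its own output, so mutating that output in place breaks autograd.

import torch

x = torch.randn(3, requires_grad=True)
y = torch.sigmoid(x)    # autograd saves y to compute sigmoid's gradient
y += 1                  # in-place add bumps y's version counter
# y.sum().backward()    # would raise: "... modified by an inplace operation"

# The out-of-place rewrite allocates a new tensor instead of mutating y:
y = torch.sigmoid(x)
z = y + 1
z.sum().backward()      # works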
In the end, I added this line before my train module runs:
torch.autograd.set_detect_anomaly(True)
With this enabled, the traceback prints the exact line that caused the failure, instead of only the vague error at loss.backward().
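To show where the flag goes, here is a minimal sketch; model, criterion, optimizer, and loader are placeholder names, not from my actual code:

import torch

torch.autograd.set_detect_anomaly(True)  # enable once, before training starts

for inputs, targets in loader:
    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)
    loss.backward()  # the traceback now names the forward-pass line
                     # that produced the faulty tensor
    optimizer.step()

Note that anomaly detection adds noticeable overhead, so remove it once you have found the offending line.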
Once you know the exact offending line, look at the variable used there and add .clone() after it. For example, my offending line was:
k = self.k(key).reshape(B, L,
so I changed it to:
k = self.k(key.clone()).reshape(B, L,
Problem solved!
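.clone() works here because it returns a copy with its own memory while staying on the autograd graph, so gradients still flow through it, and later in-place writes to the original tensor can no longer invalidate what autograd saved. A rough sketch of what such a projection might look like; since my line above is truncated, the module structure, names, and shapes below are assumptions for illustration only:

import torch
import torch.nn as nn

class KeyProjection(nn.Module):  # hypothetical module, for illustration
    def __init__(self, dim, num_heads):
        super().__init__()
        self.num_heads = num_heads
        self.k = nn.Linear(dim, dim)

    def forward(self, key):
        B, L, C = key.shape
        # .clone() hands the projection its own copy of key, so any later
        # in-place modification of key cannot corrupt saved tensors
        k = self.k(key.clone()).reshape(B, L, self.num_heads, C // self.num_heads)
        return k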