Modifying the values of a torch.tensor variable

This post covers an error that appears when the code runs: because the tensor was created with requires_grad=True, modifying its data directly raises an error. It shows the correct way to change the values, the resulting output, the reason behind the behavior, and a link to a reference article.


The following error appears when running the code:

a view of a leaf Variable that requires grad is being used in an in-place operation.

The cause is that the tensor was created with requires_grad=True:

import torch

# Leaf tensors created with requires_grad=True
a = torch.ones(2, 2, requires_grad=True)
b = torch.ones(2, 2, requires_grad=True)
c = a * b  # c is the result of a differentiable operation
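
For reference (a minimal sketch; the original post does not show the failing line), a direct in-place write on a row of a reproduces the error above:

a[0] = torch.tensor([2., 1.])
# RuntimeError: a view of a leaf Variable that requires grad is being used in an in-place operation.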

Modifying the data directly raises the error above, so the correct way to change the values is:

a.data[0] = torch.tensor([2,1])

print("a", a)
print("c", c)

The output is:

a tensor([[2., 1.],
        [1., 1.]], requires_grad=True)
c tensor([[1., 1.],
        [1., 1.]], grad_fn=<MulBackward0>)

This is because a.data has requires_grad=False, so writing through it bypasses autograd's in-place check and does not change tensor a's requires_grad setting. Note that c was computed before the modification, so it still holds the original values.
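
As an aside that is not in the original article, a minimal sketch: the same in-place write can also be done inside a torch.no_grad() block, which likewise skips autograd tracking while leaving a.requires_grad untouched:

with torch.no_grad():
    a[0] = torch.tensor([3., 4.])

print(a.requires_grad)  # still True: the gradient setting is unchanged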

Reference article: link
