copy.deepcopy(train_model) fails with: Only Tensors created explicitly by the user support the deepcopy

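While training a PyTorch model, copying the model with `copy.deepcopy()` raised a RuntimeError because tensors that were not created explicitly by the user cannot be deep-copied. The problem was traced to a submodule of the model that returned self.features; changing it to return a temporary variable features fixed the error. The code before and after the change is shown below.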


Error message:

RuntimeError: Only Tensors created explicitly by the user (graph leaves) support the deepcopy protocol at the moment

Possible cause:

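During training it is common to run validation alongside the training loop. A concise way to do this is to deep-copy the model being trained with copy.deepcopy() and run validation on the copy. For example, my validation.py contains: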

 val_model = copy.deepcopy(train_model)

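However, copy.deepcopy() has a limitation here: a tensor supports the deepcopy protocol only if it is a graph leaf. If the model holds a non-leaf tensor (one produced by an operation that autograd is tracking), calling copy.deepcopy(model) can fail with the error "Only Tensors created explicitly by the user (graph leaves) support the deepcopy protocol at the moment". The full traceback is: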

  File "/home/users/xinxin.li/HAT-dev-toolchain/hat/engine/ddp_trainer.py", line 359, in _with_exception
    fn(*args)
  File "/home/users/xinxin.li/HAT-dev-toolchain/tools/train.py", line 186, in train_entrance
    trainer.fit()
  File "/home/users/xinxin.li/HAT-dev-toolchain/hat/engine/loop_base.py", line 523, in fit
    storage=self.storage,
  File "/home/users/xinxin.li/HAT-dev-toolchain/hat/engine/loop_base.py", line 73, in on_epoch_end
    cb.on_epoch_end(**kwargs)
  File "/home/users/xinxin.li/HAT-dev-toolchain/hat/callbacks/validation.py", line 207, in on_epoch_end
    self._do_val(epoch_id, model, ema_model, device, val_metrics)
  File "/home/users/xinxin.li/HAT-dev-toolchain/hat/callbacks/validation.py", line 163, in _do_val
    val_model = self._select_and_init_val_model(train_model=eval_model)
  File "/home/users/xinxin.li/HAT-dev-toolchain/hat/callbacks/validation.py", line 147, in _select_and_init_val_model
    val_model = copy.deepcopy(train_model)
  File "/home/users/xinxin.li/anaconda3/envs/python36/lib/python3.6/copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/home/users/xinxin.li/anaconda3/envs/python36/lib/python3.6/copy.py", line 280, in _reconstruct
    state = deepcopy(state, memo)
  File "/home/users/xinxin.li/anaconda3/envs/python36/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/home/users/xinxin.li/anaconda3/envs/python36/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/home/users/xinxin.li/anaconda3/envs/python36/lib/python3.6/copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/home/users/xinxin.li/anaconda3/envs/python36/lib/python3.6/copy.py", line 306, in _reconstruct
    value = deepcopy(value, memo)
  File "/home/users/xinxin.li/anaconda3/envs/python36/lib/python3.6/copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/home/users/xinxin.li/anaconda3/envs/python36/lib/python3.6/copy.py", line 280, in _reconstruct
    state = deepcopy(state, memo)
  File "/home/users/xinxin.li/anaconda3/envs/python36/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/home/users/xinxin.li/anaconda3/envs/python36/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/home/users/xinxin.li/anaconda3/envs/python36/lib/python3.6/copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/home/users/xinxin.li/anaconda3/envs/python36/lib/python3.6/copy.py", line 306, in _reconstruct
    value = deepcopy(value, memo)
  File "/home/users/xinxin.li/anaconda3/envs/python36/lib/python3.6/copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/home/users/xinxin.li/anaconda3/envs/python36/lib/python3.6/copy.py", line 280, in _reconstruct
    state = deepcopy(state, memo)
  File "/home/users/xinxin.li/anaconda3/envs/python36/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/home/users/xinxin.li/anaconda3/envs/python36/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/home/users/xinxin.li/anaconda3/envs/python36/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/home/users/xinxin.li/anaconda3/envs/python36/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/home/users/xinxin.li/anaconda3/envs/python36/lib/python3.6/copy.py", line 161, in deepcopy
    y = copier(memo)
  File "/home/users/xinxin.li/anaconda3/envs/python36/lib/python3.6/site-packages/torch/_tensor.py", line 85, in __deepcopy__
    raise RuntimeError("Only Tensors created explicitly by the user "
RuntimeError: Only Tensors created explicitly by the user (graph leaves) support the deepcopy protocol at the moment
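The failure can be reproduced outside the training framework. The following is a minimal sketch, not the code from the project above: `LeakyEncoder` is a hypothetical module invented for illustration, and it assumes a PyTorch version whose `Tensor.__deepcopy__` rejects non-leaf tensors, as in the traceback.

```python
import copy

import torch
from torch import nn


class LeakyEncoder(nn.Module):
    """Hypothetical module that keeps its forward outputs on `self`."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 4, kernel_size=3, padding=1)
        self.features = []

    def forward(self, x):
        # The conv output is tracked by autograd, so it is a non-leaf tensor.
        self.features = [self.conv(x)]
        return self.features


model = LeakyEncoder()
copy.deepcopy(model)  # fine: only parameters (graph leaves) are stored so far

out = model(torch.randn(1, 3, 8, 8))
print(out[0].is_leaf)

try:
    copy.deepcopy(model)
    error_message = None
except RuntimeError as exc:
    error_message = str(exc)
print(error_message)
```

The first deepcopy succeeds because the module holds only its parameters. After one forward pass the module also holds an activation in `self.features`, and that is the object the second deepcopy is expected to trip over.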

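How to debug:

1. Open /home/users/xinxin.li/anaconda3/envs/python36/lib/python3.6/copy.py, set a breakpoint at the line where dictionary entries are copied (`y[deepcopy(key, memo)] = deepcopy(value, memo)` in `_deepcopy_dict`, which appears in the traceback above), and print the corresponding key and value.

2. Rerun the program and look at the key and value printed just before the error. Map them back to the matching line in your network definition; that is where the error comes from.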
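If you would rather not edit copy.py inside the Python installation, the same search can be done from the outside. The sketch below is an assumption-laden alternative to the breakpoint approach, not part of the original debugging session: `find_deepcopy_failures` is a hypothetical helper that retries `copy.deepcopy` on each attribute, list item, and dict value until it reaches the innermost object that cannot be copied. The demo uses plain-Python stand-in classes so that it runs without PyTorch.

```python
import copy


def find_deepcopy_failures(obj, path="model", seen=None):
    """Return (path, type name) for the innermost objects that fail deepcopy."""
    seen = set() if seen is None else seen
    if id(obj) in seen:
        return []
    seen.add(id(obj))
    try:
        copy.deepcopy(obj)
        return []  # this subtree copies fine
    except Exception:
        pass  # something inside fails: descend to find it

    if isinstance(obj, dict):
        children = [(f"{path}[{k!r}]", v) for k, v in obj.items()]
    elif isinstance(obj, (list, tuple)):
        children = [(f"{path}[{i}]", v) for i, v in enumerate(obj)]
    elif hasattr(obj, "__dict__"):
        children = [(f"{path}.{k}", v) for k, v in vars(obj).items()]
    else:
        children = []

    failures = []
    for child_path, child in children:
        failures.extend(find_deepcopy_failures(child, child_path, seen))
    # No failing child means this object itself is the culprit.
    return failures or [(path, type(obj).__name__)]


# Stand-ins for a model that holds an uncopyable object (illustration only).
class Uncopyable:
    def __deepcopy__(self, memo):
        raise RuntimeError("no deepcopy support")


class Child:
    def __init__(self):
        self.weights = [1.0, 2.0]
        self.features = [Uncopyable()]


class Parent:
    def __init__(self):
        self.encoder = Child()
        self.name = "net"


report = find_deepcopy_failures(Parent())
print(report)  # [('model.encoder.features[0]', 'Uncopyable')]
```

Called on a real `nn.Module`, the reported path would go through internal dictionaries such as `_modules`, so expect something like `model._modules['encoder'].features[0]`. Each level repeats the deepcopy attempt, so this is a one-off diagnostic rather than something to leave in a training loop.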

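Locating my problem:

A submodule of my model stored its intermediate outputs in self.features during forward and returned self.features. Those activations are non-leaf tensors, and because they stay attached to the module as an attribute, copy.deepcopy() reaches them while copying the module's state and raises the error. After I changed the code to return a temporary variable instead, the error went away.

Code before the change: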

    def forward(self, input_image):
        self.features = []
        x = (input_image - 0.45) / 0.225
        x = self.encoder.conv1(x)
        x = self.encoder.bn1(x)
        self.features.append(self.encoder.relu(x))
        self.features.append(self.encoder.layer1(self.encoder.maxpool(self.features[-1])))
        self.features.append(self.encoder.layer2(self.features[-1]))
        self.features.append(self.encoder.layer3(self.features[-1]))
        self.features.append(self.encoder.layer4(self.features[-1]))

        return self.features

Code after the change:

    def forward(self, input_image):
        features = []
        x = (input_image - 0.45) / 0.225
        x = self.encoder.conv1(x)
        x = self.encoder.bn1(x)
        features.append(self.encoder.relu(x))
        features.append(self.encoder.layer1(self.encoder.maxpool(features[-1])))
        features.append(self.encoder.layer2(features[-1]))
        features.append(self.encoder.layer3(features[-1]))
        features.append(self.encoder.layer4(features[-1]))

        return features
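To check that this pattern really leaves the module copyable, here is a small sketch under stated assumptions: `TinyEncoder` is a hypothetical two-layer stand-in for the encoder above, and the `cache_features` option is an illustrative addition, not something from the original code. It shows one possible way to keep the last activations on the module when other code needs them, by storing detached tensors, which are graph leaves.

```python
import copy

import torch
from torch import nn


class TinyEncoder(nn.Module):
    """Hypothetical stand-in for the encoder, using a local `features` list."""

    def __init__(self, cache_features=False):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 4, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(4, 8, kernel_size=3, padding=1)
        self.cache_features = cache_features
        self.cached = []

    def forward(self, x):
        features = []  # local variable: nothing non-leaf stays on the module
        features.append(torch.relu(self.conv1(x)))
        features.append(torch.relu(self.conv2(features[-1])))
        if self.cache_features:
            # detach() returns tensors cut off from the graph (leaves).
            self.cached = [f.detach() for f in features]
        return features


model = TinyEncoder(cache_features=True)
outputs = model(torch.randn(1, 3, 8, 8))

val_model = copy.deepcopy(model)  # succeeds after a forward pass
print(all(f.is_leaf for f in model.cached))
```

The detached copies carry no gradient history, so they suit inspection or logging but not backpropagation. Code that needs gradients should use the list returned by `forward`.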

Reference: "Fixing the error that only user-created Tensors are supported when copying a Tensor or model with copy.deepcopy()", Arnold-FY-Chen's blog on CSDN
