CUDA unknown error

CUDA initialization: CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changing env variable CUDA_VISIBLE_DEVICES after program start. Setting the available devices to be zero. (Triggered internally at /opt/conda/conda-bld/pytorch_1702400366987/work/c10/cuda/CUDAFunctions.cpp:108.)

RuntimeError: CUDA error: CUDA-capable device(s) is/are busy or unavailable
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

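Environment: Ubuntu 22.04; GPU driver reports Driver Version: 535.171.04, CUDA Version: 12.2; the installed cuda-toolkit is version 520.61.05, i.e. CUDA 11.8. After installing CUDA, calling torch.cuda.is_available() produced the errors above. Following reference 1, I ran sudo apt install nvidia-modprobe, but the problem persisted. The notable symptom is that everything works right after boot and the error only appears after the machine has been up for a while.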
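To observe the "works after boot, fails later" pattern, a small check like the one below can be run once right after boot and again later. This is only a sketch: the helper name check_cuda is made up for illustration, and it assumes the "CUDA initialization: CUDA unknown error" text is emitted as a Python warning while torch.cuda.is_available() returns False.

```python
import warnings


def check_cuda():
    """Return (available, messages).

    available is None when torch cannot be imported; messages holds the text
    of any warnings raised during the check.
    """
    try:
        import torch
    except ImportError:
        return None, ["torch is not installed"]

    # Record warnings instead of letting them scroll past on stderr.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        available = torch.cuda.is_available()
    return available, [str(w.message) for w in caught]


if __name__ == "__main__":
    available, messages = check_cuda()
    print("CUDA available:", available)
    for message in messages:
        print("warning:", message)
```

If the first run after boot reports True and a later run reports False together with the warning text, that matches the behaviour described here.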

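Next, I completely uninstalled CUDA as described in reference 4 and reinstalled it, downloaded cuda-samples from reference 2, and ran the post-installation tests described in reference 3. The tests passed, but the problem came back later in the same way: fine right after boot, broken again after the machine had been left idle for a while.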

I eventually found that the cause was the system going to sleep and the monitor blanking (the display is driven by the discrete GPU).

systemctl status sleep.target

○ sleep.target - Sleep
     Loaded: loaded (/lib/systemd/system/sleep.target; static)
     Active: inactive (dead) since Wed 2024-10-30 17:16:13 CST; 3min 10s ago
       Docs: man:systemd.special(7)

The Loaded field shows loaded; it needs to be changed to masked:

systemctl mask sleep.target suspend.target hibernate.target hybrid-sleep.target
Created symlink /etc/systemd/system/sleep.target → /dev/null.
Created symlink /etc/systemd/system/suspend.target → /dev/null.
Created symlink /etc/systemd/system/hibernate.target → /dev/null.
Created symlink /etc/systemd/system/hybrid-sleep.target → /dev/null.
 

Reboot, then confirm the state again with systemctl status sleep.target:

○ sleep.target
     Loaded: masked (Reason: Unit sleep.target is masked.)
     Active: inactive (dead)

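The unit is now masked. In addition, under Settings > Power, I set the screen-off option to Never and disabled automatic suspend. The error has not come back since, so the problem is solved.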
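The same confirmation can be scripted for all four targets at once. The sketch below is not part of the original fix. It relies on systemctl is-enabled printing "masked" for a masked unit; the helper names query_state and unmasked are hypothetical, and any unit whose state cannot be read is reported as "unknown".

```python
import subprocess

# The four targets masked earlier with `systemctl mask`.
SLEEP_TARGETS = [
    "sleep.target",
    "suspend.target",
    "hibernate.target",
    "hybrid-sleep.target",
]


def query_state(unit):
    """Return the state printed by `systemctl is-enabled`, or "unknown"."""
    try:
        proc = subprocess.run(
            ["systemctl", "is-enabled", unit],
            capture_output=True,
            text=True,
            check=False,  # a masked unit makes systemctl exit non-zero
        )
    except OSError:
        # systemctl is missing or cannot be executed on this machine.
        return "unknown"
    return proc.stdout.strip() or "unknown"


def unmasked(states):
    """Return the units whose reported state is anything other than "masked"."""
    return [unit for unit, state in states.items() if state != "masked"]


if __name__ == "__main__":
    states = {unit: query_state(unit) for unit in SLEEP_TARGETS}
    for unit, state in states.items():
        print(f"{unit}: {state}")
    remaining = unmasked(states)
    if remaining:
        print("Not masked:", ", ".join(remaining))
    else:
        print("All sleep targets are masked.")
```

Any unit listed under "Not masked" can still be triggered, so it would need another systemctl mask call.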

References:

1. [Solved, without rebooting] RuntimeError: CUDA unknown error - this may be due to an incorrectly set up environme... (优快云 blog)

2. https://github.com/NVIDIA/cuda-samples/releases

3. CUDA Learning 3: Downloading, compiling, and running the samples (优快云_0036_20231126) (优快云 blog)

4. Cleanly uninstalling CUDA + cuDNN on Ubuntu (优快云 blog)
