Error:
UserWarning: CUDA initialization: Unexpected error from cudaGetDeviceCount(). Did you run some cuda functions before calling NumCudaDevices() that might have already set an error? Error 500: named symbol not found (Triggered internally at ../c10/cuda/CUDAFunctions.cpp:108.)
return torch._C._cuda_getDeviceCount() > 0
The fix is described in this issue:
https://github.com/NVIDIA/nvidia-container-toolkit/issues/520
The issue says:
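NVIDIA drivers for Windows version 555.xx and newer add a library named libnvdxgdmal.so.1, which must be mapped into the container for CUDA to keep working in containers under WSL2.
nvidia-container-toolkit must be updated so that it includes a reference to this new library.
If the library is missing from the container because nvidia-container-toolkit has not been updated while a 555.xx or newer driver is in use, CUDA initialization returns error 500 "named symbol not found" (CUDA_ERROR_NOT_FOUND).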
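One quick way to test for the missing-library case from inside the container is to ask the dynamic loader for the library directly. The following is a minimal sketch, not an official diagnostic: the helper name can_load is hypothetical, and it assumes the library would be on the loader's default search path if it had been mapped in. A failed load can have other causes, so treat the result as a hint only.

```python
import ctypes

# Library name taken from the issue; the rest of this script is illustrative.
LIB_NAME = "libnvdxgdmal.so.1"


def can_load(lib_name: str) -> bool:
    """Return True if the dynamic loader can find and load lib_name."""
    try:
        ctypes.CDLL(lib_name)
        return True
    except OSError:
        # Raised when the library is not found or cannot be loaded.
        return False


if __name__ == "__main__":
    if can_load(LIB_NAME):
        print(f"{LIB_NAME} is loadable in this environment")
    else:
        print(f"{LIB_NAME} could not be loaded; it may not be mapped into the container")
```

Run it inside the container that shows the warning. If the library cannot be loaded there while the Windows driver is 555.xx or newer, an outdated nvidia-container-toolkit is a likely cause.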
- If you see this symptom when using Docker CE on Linux under WSL2, update nvidia-container-toolkit to 1.14.4 or newer.
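To check whether the installed toolkit meets the 1.14.4 minimum, the version string can be parsed and compared numerically. This is a sketch under assumptions: it relies on the nvidia-ctk command line tool that ships with the toolkit printing a version of the form X.Y.Z when called with --version, and the function names are hypothetical. Run it in the WSL2 distribution where Docker is installed, not inside the container.

```python
import re
import shutil
import subprocess

MIN_VERSION = (1, 14, 4)  # minimum version named in the issue


def parse_version(text):
    """Extract the first X.Y.Z version found in text, or None if absent."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def is_new_enough(text, minimum=MIN_VERSION):
    """Return True if the version found in text is at least minimum."""
    version = parse_version(text)
    return version is not None and version >= minimum


def installed_toolkit_output():
    """Return the output of `nvidia-ctk --version`, or None if it is not on PATH."""
    if shutil.which("nvidia-ctk") is None:
        return None
    result = subprocess.run(
        ["nvidia-ctk", "--version"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout + result.stderr


if __name__ == "__main__":
    output = installed_toolkit_output()
    if output is None:
        print("nvidia-ctk not found; nvidia-container-toolkit may not be installed")
    elif is_new_enough(output):
        print("nvidia-container-toolkit is 1.14.4 or newer")
    else:
        print("nvidia-container-toolkit looks older than 1.14.4; update it")
```

Comparing integer tuples rather than raw strings avoids mistakes such as treating 1.9.10 as newer than 1.14.4.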
Tutorial for installing the nvidia-container-toolkit package:
https://blog.youkuaiyun.com/qq_50247813/article/details/145615120
In short, installing nvidia-container-toolkit, or updating it to 1.14.4 or newer, resolves the error.