CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changing env variable CUDA_VISIBLE_DEVICES after program start. Setting the available devices to be zero.
First, check which processes are holding /dev/nvidia-uvm open:
sudo lsof /dev/nvidia-uvm
This shows two python processes still using the device:
COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
python  51120   xx  10u    CHR  505,0      0t0  614 /dev/nvidia-uvm
python  51120   xx  18u    CHR  505,0      0t0  614 /dev/nvidia-uvm
python  51259   xx  10u    CHR  505,0      0t0  614 /dev/nvidia-uvm
python  51259   xx  18u    CHR  505,0      0t0  614 /dev/nvidia-uvm
Then kill those processes:
sudo kill -9 51120 51259
After that, unload the nvidia_uvm kernel module:
sudo rmmod nvidia_uvm
That fixed the problem.
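
The PID lookup can be scripted instead of read off the lsof output by hand. The following is only a minimal sketch: the helper name uvm_pids is hypothetical, and it assumes the default lsof column layout, where the PID is the second field and the first line is a header.

```shell
#!/bin/sh
# Hypothetical helper: read lsof output on stdin and print each PID once.
# Assumes the default lsof layout (header line first, PID in column 2).
uvm_pids() {
    awk 'NR > 1 { print $2 }' | sort -un
}

# Possible usage (needs root and a loaded NVIDIA driver, so left commented out):
#   pids=$(sudo lsof /dev/nvidia-uvm | uvm_pids)
#   [ -n "$pids" ] && sudo kill -9 $pids
#   sudo rmmod nvidia_uvm
```

Note that kill -9 gives the processes no chance to clean up, so check that they are really stale before running the commented-out part.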