nvidia-smi fails with: NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
Reference: "After rebooting, the nvidia-smi command fails and cannot find the graphics driver"
Rebuild the driver
sudo apt-add-repository ppa:graphics-drivers/ppa
sudo apt-get update
sudo apt-get install dkms
cd /usr/src
ls
You should see a directory named after the driver version. If there is none, the driver is not installed and has to be reinstalled.
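As a rough sketch, the check above can be scripted instead of reading the ls output by eye. It assumes the driver source directory follows the usual nvidia-<version> naming; nvidia_src_versions is a hypothetical helper name, not part of any tool.

```shell
#!/bin/sh
# Print the version of each NVIDIA driver source tree found under a directory.
# The directory defaults to /usr/src. Assumes directories are named
# nvidia-<version>, for example nvidia-470.57.02.
nvidia_src_versions() {
    src_dir="${1:-/usr/src}"
    for d in "$src_dir"/nvidia-*; do
        # Skip the unexpanded pattern when nothing matches.
        [ -d "$d" ] || continue
        name="${d##*/}"          # strip the leading path
        echo "${name#nvidia-}"   # strip the "nvidia-" prefix
    done
}

nvidia_src_versions /usr/src
```

If this prints nothing, the driver source is missing. Otherwise the printed version is the value to pass to -v in the dkms install step.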
sudo dkms install -m nvidia -v 470.57.02
The dkms step may fail with:
ERROR: Cannot create report: [Errno 17] File exists: '/var/crash/nvidia-kernel-source-xxx-server.0.crash'
Error! Bad return status for module build on kernel: x.x.x-xx-generic (x86_64)
Consult /var/lib/dkms/nvidia/<version>/build/make.log for more information.
Open that file to read the error details (<version> is the driver version)
sudo gedit /var/lib/dkms/nvidia/<version>/build/make.log
It turns out to be a compile error on Ubuntu:
You are building kernel with non-retpoline compiler, please update your compiler
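If the log is long, or no desktop is available for gedit, the interesting lines can be filtered out on the command line. This is a minimal sketch; build_log_problems is a hypothetical helper, and the search pattern is only a guess at what is worth looking at.

```shell
#!/bin/sh
# Print the lines of a DKMS make.log that mention an error or retpoline,
# each prefixed with its line number. The log path is passed as $1.
build_log_problems() {
    grep -n -i -E 'error|retpoline' "$1"
}

# Usage: pass the make.log path named in the DKMS error message, e.g.
#   build_log_problems /path/from/the/dkms/error/make.log
```

grep exits with a non-zero status when nothing matches, so an empty result means the log contains neither keyword.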
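The cause is a mismatch between the kernel version and the gcc version. Check the kernel version with uname -r and the gcc version with gcc -v. If necessary, uninstall and reinstall gcc (this is risky and can easily break the system) until gcc -v reports the correct version (for Ubuntu 18.04, preferably 7.3).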
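One way to see which compiler the running kernel was built with is /proc/version, which can then be compared against the installed gcc. The sketch below assumes the older "gcc version X.Y.Z" wording in /proc/version (newer kernels format this differently), and both function names are hypothetical.

```shell
#!/bin/sh
# Extract the gcc version from a /proc/version style line read on stdin.
# Assumes the older "gcc version X.Y.Z" wording; prints nothing otherwise.
kernel_gcc_version() {
    sed -n 's/.*gcc version \([0-9][0-9.]*\).*/\1/p'
}

# Succeed when two version strings share the same major number (7.5.0 vs 7).
same_major() {
    [ "${1%%.*}" = "${2%%.*}" ]
}

if [ -r /proc/version ] && command -v gcc >/dev/null 2>&1; then
    built_with="$(kernel_gcc_version < /proc/version)"
    installed="$(gcc -dumpversion)"
    if [ -z "$built_with" ]; then
        echo "could not parse the compiler from /proc/version"
    elif same_major "$built_with" "$installed"; then
        echo "kernel gcc $built_with matches installed gcc $installed"
    else
        echo "kernel gcc $built_with differs from installed gcc $installed"
    fi
fi
```

Only the major version is compared because gcc -dumpversion may print just the major number on some builds. A matching major version is a rough signal, not proof that the module will compile.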
If that still does not fix it, or the driver directory has disappeared from /usr/src, you can remove all NVIDIA drivers and reinstall them
sudo apt-get remove --purge 'nvidia*'
sudo ubuntu-drivers autoinstall
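autoinstall installs the recommended driver version automatically. You can also install a different version manually: run ubuntu-drivers devices to list the supported drivers, then replace sudo ubuntu-drivers autoinstall above with sudo apt install nvidia-driver-xxx.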
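To pick the package name out of the ubuntu-drivers devices listing without reading it by eye, something like the following may help. It assumes each candidate is printed as a line of the form "driver : <package> - ... recommended", which may differ between releases; recommended_driver is a hypothetical helper name.

```shell
#!/bin/sh
# Read "ubuntu-drivers devices" style output on stdin and print the package
# marked as recommended. Assumes lines such as:
#   driver   : nvidia-driver-470 - distro non-free recommended
recommended_driver() {
    awk '$1 == "driver" && /recommended/ { print $3 }'
}

# Only query the real tool when it is installed.
if command -v ubuntu-drivers >/dev/null 2>&1; then
    ubuntu-drivers devices | recommended_driver
fi
```

The printed name is what autoinstall would be expected to choose. Any other package from the same listing can be passed to sudo apt install instead.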