Recently, while deploying an application that uses an NVIDIA GPU in a Docker container, I ran into the following error:
Error response from daemon: could not select device driver "nvidia" with capabilities: [[gpu]]
Cause:
nvidia-container-toolkit is not installed
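To confirm this diagnosis before changing anything, a quick check along the following lines may help. It is only a minimal sketch: the helper names have_cmd and toolkit_status are made up for illustration, and it assumes that the presence of nvidia-container-runtime on PATH is a good enough indicator that the toolkit is installed.

```shell
# Hypothetical helper: succeeds if the given command is found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Hypothetical helper: prints "installed" or "missing" depending on whether
# the nvidia-container-runtime binary (shipped with the toolkit) is on PATH.
toolkit_status() {
    if have_cmd nvidia-container-runtime; then
        echo "installed"
    else
        echo "missing"
    fi
}

toolkit_status
```

If this prints "missing", the steps below apply.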
The fix on Ubuntu is as follows:
1. Add the NVIDIA apt repository and its signing key
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
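The sed stage in the command above is the least obvious part: it rewrites each repository entry so that apt verifies it against the keyring that was just saved. The sketch below applies the same substitution to a single sample line without touching the system. The sample line is an assumption about what the downloaded list file looks like, not a copy of it.

```shell
# Illustrative repository entry (assumed format; the real file may differ).
sample='deb https://nvidia.github.io/libnvidia-container/stable/deb/$(ARCH) /'

# Keyring path used in step 1.
keyring=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg

# Same substitution as in step 1: insert a [signed-by=...] option after "deb".
rewritten=$(printf '%s\n' "$sample" | sed "s#deb https://#deb [signed-by=${keyring}] https://#g")

printf '%s\n' "$rewritten"
# prints: deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://nvidia.github.io/libnvidia-container/stable/deb/$(ARCH) /
```

Using # as the sed delimiter avoids having to escape the slashes in the URL and the keyring path.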
2. Refresh the package lists
sudo apt-get update
3. Install the toolkit
sudo apt-get install -y nvidia-container-toolkit
4. Verify
which nvidia-container-runtime
If the output is /usr/bin/nvidia-container-runtime, the installation succeeded.
After that, the error no longer appears.
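If the error persists after the install, Docker may still need to be pointed at the NVIDIA runtime and restarted. The following is a hedged sketch of those follow-up steps, not something the steps above required in my case. It assumes a systemd-managed Docker daemon and that the ubuntu image can be pulled. The run and post_install helpers and the DRY_RUN variable are made up for illustration. By default the script only prints the commands, and setting DRY_RUN=0 executes them.

```shell
# DRY_RUN=1 (the default) only prints each command; DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical wrapper: print the command in dry-run mode, run it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper grouping the follow-up steps.
post_install() {
    # Register the NVIDIA runtime in Docker's daemon configuration
    # (nvidia-ctk is installed as part of nvidia-container-toolkit).
    run sudo nvidia-ctk runtime configure --runtime=docker
    # Restart the daemon so it picks up the new configuration.
    run sudo systemctl restart docker
    # Smoke test: run nvidia-smi in a throwaway container with all GPUs attached.
    run sudo docker run --rm --gpus all ubuntu nvidia-smi
}

post_install
```

When run with DRY_RUN=0, the last command should print the usual nvidia-smi table if the container can see the GPU.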
Reference for other environments