Keeping Multiple CUDA Versions Side by Side and Switching Between Them Manually

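This document describes in detail how to install and switch between CUDA versions on Linux, going from 9.0 to 10.1: the runfile installation steps, symlink management, environment variable setup, and a library file problem encountered along the way together with its fix. After the switch, a TensorRT dependency error was also resolved so that the program ran normally.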

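Because I use Paddle, and different Paddle versions require different CUDA versions, I kept the old CUDA 9.0 installation for possible future use and installed CUDA 10.1 alongside it.
First, download the CUDA 10.1 runfile from the official NVIDIA site.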

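Running it is very simple:
sudo sh cuda_10.1.105_418.39_linux.run
At the start, the installer warns that a symlink already exists at /usr/local/cuda; choose yes to continue anyway. The symlink is not deleted, however. It is still there, and nvcc -V still reports CUDA 9.0 as the current version. Continue to the next step.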
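
Before the component selection that comes next, it helps to know which driver version is already installed. The following is a minimal sketch, assuming GNU sort (for -V) and that nvidia-smi is on PATH; version_ge is a hypothetical helper name, and 418.39 is taken from the runfile name above.

```shell
# version_ge A B: succeed when version A is greater than or equal to version B.
# Relies on the version-sort option (-V) of GNU sort.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Compare the installed driver with the 418.39 driver bundled in the runfile.
# The block is skipped entirely when nvidia-smi is not available.
if command -v nvidia-smi >/dev/null 2>&1; then
    installed=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
    if version_ge "$installed" 418.39; then
        echo "driver $installed is new enough: deselect Driver in the installer"
    else
        echo "driver $installed is older than 418.39: consider updating the driver"
    fi
fi
```

If the first message is printed, the existing driver can be kept and only the toolkit needs to be installed.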

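At the second selection screen, if the installed GPU driver is already newer than 418 (the driver version bundled with CUDA 10.1), remember to press Enter to deselect the driver so that it is not installed, and install only the items labelled CUDA 10.1.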
When the terminal prints the following summary, the installation is complete.

===========
= Summary =
===========

Driver:   Not Selected
Toolkit:  Installed in /usr/local/cuda-10.1/
Samples:  Installed in /home/meroke/, but missing recommended libraries

Please make sure that
 -   PATH includes /usr/local/cuda-10.1/bin
 -   LD_LIBRARY_PATH includes /usr/local/cuda-10.1/lib64, or, add /usr/local/cuda-10.1/lib64 to /etc/ld.so.conf and run ldconfig as root

To uninstall the CUDA Toolkit, run cuda-uninstaller in /usr/local/cuda-10.1/bin

Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-10.1/doc/pdf for detailed information on setting up CUDA.
***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 418.00 is required for CUDA 10.1 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
    sudo <CudaInstaller>.run --silent --driver

Logfile is /var/log/cuda-installer.log

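At this point, check the end of ~/.bashrc: the CUDA entries must use /usr/local/cuda, not /usr/local/cuda-10.1 or /usr/local/cuda-9.0. Paths with a specific version number are unnecessary. Going through the /usr/local/cuda symlink makes it much easier to switch CUDA versions.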

export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
export CUDA_HOME=/usr/local/cuda

After installation, /usr/local/ already contains cuda-10.1.
Next, repoint the symlink:

cd /usr/local/
sudo rm cuda  # remove the existing symlink (plain rm cannot delete a real directory by mistake)
sudo ln -s cuda-10.1 cuda  # create the new symlink

Running nvcc -V again now reports CUDA 10.1:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Fri_Feb__8_19:08:17_PST_2019
Cuda compilation tools, release 10.1, V10.1.105
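
Besides nvcc -V, the symlink itself can be inspected directly. This is a small sketch, assuming GNU readlink (for -f) and the default /usr/local layout used in this post; cuda_active_dir and cuda_list_installed are hypothetical helper names.

```shell
# cuda_active_dir [LINK]: print the real directory behind the CUDA symlink.
# LINK defaults to /usr/local/cuda.
cuda_active_dir() {
    readlink -f "${1:-/usr/local/cuda}"
}

# cuda_list_installed [PREFIX]: print every versioned toolkit directory
# (cuda-9.0, cuda-10.1, ...) found under PREFIX, which defaults to /usr/local.
cuda_list_installed() {
    for d in "${1:-/usr/local}"/cuda-*/; do
        if [ -d "$d" ]; then
            basename "$d"
        fi
    done
}
```

Calling cuda_list_installed shows which toolkits are available, and cuda_active_dir shows which of them /usr/local/cuda currently resolves to.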

Likewise, to switch back to CUDA 9.0, repeat the same steps with the other directory:

cd /usr/local/
sudo rm cuda  # remove the existing symlink (plain rm cannot delete a real directory by mistake)
sudo ln -s cuda-9.0 cuda  # create the new symlink
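
Since the two switches differ only in the version number, they can be wrapped in one function. The sketch below is illustrative rather than a definitive tool: switch_cuda is a hypothetical name, the prefix argument exists mainly so the function can be tried in a scratch directory, and writing under /usr/local still requires root.

```shell
# switch_cuda VERSION [PREFIX]: point PREFIX/cuda at PREFIX/cuda-VERSION.
# PREFIX defaults to /usr/local.
switch_cuda() {
    version=$1
    prefix=${2:-/usr/local}
    target="$prefix/cuda-$version"

    # Refuse to switch to a toolkit that is not installed.
    if [ ! -d "$target" ]; then
        echo "switch_cuda: $target does not exist" >&2
        return 1
    fi

    # Refuse to touch PREFIX/cuda when it is a real directory, not a symlink.
    if [ -e "$prefix/cuda" ] && [ ! -L "$prefix/cuda" ]; then
        echo "switch_cuda: $prefix/cuda is not a symlink" >&2
        return 1
    fi

    # -s: symbolic link, -f: replace the existing link,
    # -n: treat the existing link as a file instead of following it.
    ln -sfn "cuda-$version" "$prefix/cuda"
}
```

Using ln -sfn replaces the link in a single command, so there is no separate rm step that could hit a real directory by mistake.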

One last small problem I ran into:

Running demo.py from the SDK still failed:

(torch) meroke@meroke-W650KJ1-KK1:~/Develop_Tool/EasyEdge/python$ python demo.py ../RES /home/meroke/Pictures/pic.png
TensorRT dynamic library (libnvinfer.so) that Paddle depends on is not configured correctly. (error code is libcudnn.so.7: cannot open shared object file: No such file or directory)
  Suggestions:
  1. Check if TensorRT is installed correctly and its version is matched with paddlepaddle you installed.
  2. Configure TensorRT dynamic library environment variables as follows:
  - Linux: set LD_LIBRARY_PATH by `export LD_LIBRARY_PATH=...`
  - Windows: set PATH by `set PATH=XXX;2021-08-12 06:52:46 INFO [EasyEdge] [demo.py:42] 140536971970368: Init paddlefluid engine...
2021-08-12 06:52:46 DEBUG [EasyEdge] [auth.cpp:109] 140536971970368: Local license is ok.

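The error points at libcudnn.so.7, yet I was sure the file existed in my CUDA directory, and LD_LIBRARY_PATH in ~/.bashrc was correct. So the problem had to be the link itself: its properties showed that the link was broken. Either the link from CUDA 9.0 had never been replaced, or the link was created incorrectly. Either way, the fix is simple: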

cd /usr/local/cuda/lib64
sudo rm  libcudnn.so.7
sudo ln -s libcudnn.so.7.6.5 libcudnn.so.7

Problem solved. demo.py now produces the following output:

(torch) meroke@meroke-W650KJ1-KK1:~/Develop_Tool/EasyEdge/python$ python demo.py ../RES /home/meroke/Pictures/7.png
2021-08-12 07:02:47 INFO [EasyEdge] [demo.py:42] 140638167918400: Init paddlefluid engine...
2021-08-12 07:02:47 DEBUG [EasyEdge] [auth.cpp:109] 140638167918400: Local license is ok.
{'index': 3, 'confidence': 0.9268730282783508, 'label': 'can', 'x1': 0.15698016317267166, 'y1': 0.609623156095806, 'x2': 0.6120336432206003, 'y2': 0.9052477384868421}
{'index': 5, 'confidence': 0.9135696291923523, 'label': 'battary', 'x1': 0.17566856585050883, 'y1': 0.46550364243356807, 'x2': 0.4247710077386153, 'y2': 0.6359258450959858}
{'index': 5, 'confidence': 0.8809165954589844, 'label': 'battary', 'x1': 0.45790566896137436, 'y1': 0.31615176953767476, 'x2': 0.6696055060938785, 'y2': 0.5543973822342722}
{'index': 4, 'confidence': 0.8791264891624451, 'label': 'bottle', 'x1': 0.12966020483719676, 'y1': 0.02947319181341874, 'x2': 0.7486397090711092, 'y2': 0.41465071627968236}
{'index': 3, 'confidence': 0.4161497950553894, 'label': 'can', 'x1': 0.8082389329609118, 'y1': 0.4525331697965923, 'x2': 0.9983552631578947, 'y2': 0.8579419788561369}
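
The broken libcudnn.so.7 link was spotted through the file properties dialog; the same check can be done from the terminal for every library at once. This is a minimal sketch, assuming GNU find (the -xtype test is a GNU extension); list_broken_links is a hypothetical helper name.

```shell
# list_broken_links [DIR]: print every symlink directly inside DIR whose
# target no longer exists. DIR defaults to /usr/local/cuda/lib64.
# -H makes find follow DIR itself when DIR is a symlink, while the entries
# inside it are still examined as links; -xtype l then matches only links
# that cannot be resolved.
list_broken_links() {
    find -H "${1:-/usr/local/cuda/lib64}" -maxdepth 1 -xtype l
}
```

Empty output means every link in the directory resolves. For any link that is listed, ls -l shows where it points, which makes it clear which versioned file it should be relinked to.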