Installing and configuring CUDA, cuDNN, and related components on Ubuntu 22.04

1. Install Ubuntu 22.04 as usual, then run sudo apt update and sudo apt upgrade to bring all packages up to date.

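2. Install CUDA

Download the CUDA offline (local) installer from the address below, choosing the options that match your CPU architecture and system: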

https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=22.04&target_type=deb_local

The page then shows the following commands:

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin
sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/12.4.1/local_installers/cuda-repo-ubuntu2204-12-4-local_12.4.1-550.54.15-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu2204-12-4-local_12.4.1-550.54.15-1_amd64.deb
sudo cp /var/cuda-repo-ubuntu2204-12-4-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda-toolkit-12-4

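The file fetched by the first wget is small, so there is no need to download it separately. The package fetched by the second wget is about 3.6 GB, so a download accelerator such as Xunlei helps. Once it has been downloaded, run the commands above in order.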
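Because the package is large and may have been fetched with a separate download tool, it can be worth checking that the file is complete before running dpkg -i. The following is only a sketch: file_at_least is a hypothetical helper name, and the 3000000000-byte threshold is an assumed lower bound derived from the "about 3.6 GB" figure above, not an official size.

```shell
# Sketch: check that a downloaded installer looks complete before installing it.
# file_at_least is a hypothetical helper, not part of any NVIDIA tooling.

# file_at_least FILE MIN_BYTES: succeeds when FILE exists and has at least MIN_BYTES bytes.
file_at_least() {
    [ -f "$1" ] || return 1
    [ "$(wc -c < "$1")" -ge "$2" ]
}

deb=cuda-repo-ubuntu2204-12-4-local_12.4.1-550.54.15-1_amd64.deb
# 3000000000 is an assumed lower bound for the roughly 3.6 GB package.
if file_at_least "$deb" 3000000000; then
    echo "$deb looks complete"
else
    echo "$deb is missing or truncated; wget -c can resume a partial download"
fi
```

A size check only catches truncated downloads. If NVIDIA publishes a checksum for the release, comparing against it is the stronger test.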

The cuda-toolkit-12-4 package does not include the GPU driver, so install the driver as well:

sudo apt-get install -y cuda-drivers

3. Install cuDNN

Installing cuDNN is similar to installing CUDA. Open the page below and choose the options that match your setup:

https://developer.nvidia.com/cudnn-downloads?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=22.04&target_type=deb_local

The page then shows the following commands:

wget https://developer.download.nvidia.com/compute/cudnn/9.1.1/local_installers/cudnn-local-repo-ubuntu2204-9.1.1_1.0-1_amd64.deb
sudo dpkg -i cudnn-local-repo-ubuntu2204-9.1.1_1.0-1_amd64.deb
sudo cp /var/cudnn-local-repo-ubuntu2204-9.1.1/cudnn-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cudnn

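4. After installation, configure the system environment variables so that the compiler and loader can find the CUDA and cuDNN headers and libraries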

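Open /etc/profile in an editor:

sudo vi /etc/profile

Then append the following lines to the end of the file and reload it: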

export PATH=/usr/local/cuda/bin:$PATH
export CPATH=$CPATH:/usr/include:/usr/local/cuda/include
export LIBRARY_PATH=$LIBRARY_PATH:/usr/lib/x86_64-linux-gnu:/usr/local/cuda/lib64
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/x86_64-linux-gnu:/usr/local/cuda/lib64

source /etc/profile
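One detail about the export lines above: when a variable such as LD_LIBRARY_PATH starts out empty, the form $VAR:/new/dir produces a leading colon, and both the dynamic loader and GCC treat an empty entry as the current directory. Sourcing the profile repeatedly also appends the same directories again. The sketch below shows one way to avoid both. path_append is a hypothetical helper name and is not something the CUDA packages provide.

```shell
# Sketch: append a directory to a colon-separated list only if it is absent,
# without creating an empty entry when the list starts out empty.
# path_append is a hypothetical helper name.

# Usage: NEW=$(path_append "$OLD" /some/dir)
path_append() {
    case ":$1:" in
        *":$2:"*) printf '%s\n' "$1" ;;      # already present: leave unchanged
        "::")     printf '%s\n' "$2" ;;      # empty list: no leading colon
        *)        printf '%s\n' "$1:$2" ;;   # otherwise append at the end
    esac
}

LD_LIBRARY_PATH=$(path_append "${LD_LIBRARY_PATH:-}" /usr/local/cuda/lib64)
export LD_LIBRARY_PATH
```

The same helper could be applied to CPATH and LIBRARY_PATH with /usr/local/cuda/include and /usr/local/cuda/lib64 respectively.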

5. Verify

Run nvidia-smi to check that the GPU driver is installed correctly. If it is, the output should look similar to this:

Sat May 11 15:27:56 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 9999 ...    Off |   00000000:01:00.0  On |                  N/A |
| 30%   35C    P8             20W /  250W |      64MiB / 102400MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      1035      G   /usr/lib/xorg/Xorg                             54MiB |
|    0   N/A  N/A      1106      G   /usr/bin/gnome-shell                            7MiB |
+-----------------------------------------------------------------------------------------+
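For scripted checks, reading the full table is awkward. nvidia-smi can also print individual fields through its --query-gpu and --format options. The sketch below compares the reported driver version against a threshold. version_ge is a hypothetical helper, and using 550.54.15 (the driver version embedded in the installer file name above) as the threshold is an assumption for illustration, not an official minimum.

```shell
# Sketch: compare the installed driver version against an assumed threshold.
# version_ge is a hypothetical helper; it relies on GNU sort -V.

# version_ge A B: succeeds when dotted version A is greater than or equal to B.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

required=550.54.15   # assumption: taken from the installer file name used earlier
if command -v nvidia-smi >/dev/null 2>&1; then
    installed=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
    if version_ge "$installed" "$required"; then
        echo "driver $installed is new enough"
    else
        echo "driver $installed is older than $required"
    fi
else
    echo "nvidia-smi not found; the driver is probably not installed"
fi
```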

If there is a problem with the installed driver, you may see a message like this:

root@server:~# nvidia-smi
Failed to initialize NVML: Driver/library version mismatch
NVML library version: 550.54

In that case, fix it as follows:

sudo apt-get remove --purge '^nvidia-.*'
sudo rm /etc/modprobe.d/blacklist-nvidia.conf
sudo rm /lib/modprobe.d/blacklist-nvidia.conf
sudo apt-get update
sudo apt-get install nvidia-driver-550

After installing that specific driver version, reboot the system and the problem should be resolved.
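Before purging packages, it may help to confirm where the mismatch comes from. One common cause (an assumption here, not something this setup confirmed) is that a package upgrade replaced the driver files on disk while the old kernel module is still loaded, in which case a reboot alone may be enough. The sketch below compares the two versions. extract_driver_version is a hypothetical helper, and the module name nvidia passed to modinfo is assumed to match the installed driver packaging.

```shell
# Sketch: compare the loaded NVIDIA kernel module with the one installed on disk.
# extract_driver_version is a hypothetical helper name.

# Reads text on stdin and prints the first token that looks like a
# driver version such as 550.54.15.
extract_driver_version() {
    grep -oE '[0-9]+\.[0-9]+(\.[0-9]+)?' | head -n 1
}

if [ -r /proc/driver/nvidia/version ]; then
    loaded=$(extract_driver_version < /proc/driver/nvidia/version)
    # Assumption: the kernel module is named "nvidia".
    ondisk=$(modinfo -F version nvidia 2>/dev/null || true)
    echo "loaded kernel module: $loaded"
    echo "module on disk:       ${ondisk:-unknown}"
    if [ -n "$ondisk" ] && [ "$loaded" != "$ondisk" ]; then
        echo "versions differ; a reboot is likely needed"
    fi
else
    echo "no NVIDIA kernel module is loaded"
fi
```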

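Run nvcc --version to check the CUDA version. sudo is not needed here, and it can even hide nvcc because sudo may reset PATH. Normal output looks like this:

nvcc --version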
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Thu_Mar_28_02:18:24_PDT_2024
Cuda compilation tools, release 12.4, V12.4.131
Build cuda_12.4.r12.4/compiler.34097967_0
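If the release number is needed in a script, it can be pulled out of this output. The sketch below assumes the "release X.Y," wording shown above. nvcc_release is a hypothetical helper name.

```shell
# Sketch: extract the toolkit release (for example 12.4) from nvcc --version.
# nvcc_release is a hypothetical helper; it assumes a line containing "release X.Y,".

# Reads nvcc --version output on stdin and prints the release number.
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}

if command -v nvcc >/dev/null 2>&1; then
    echo "CUDA toolkit release: $(nvcc --version | nvcc_release)"
else
    echo "nvcc not on PATH; check that /usr/local/cuda/bin was added to PATH"
fi
```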

Checking the cuDNN version is less convenient. One way is to write a small program. For example, create main.cpp with the following content:

#include <cudnn.h>
#include <iostream>

int main() {
    std::cout << "cuDNN Version: " << cudnnGetVersion() << std::endl;
    return 0;
}

Compile it with:

g++ main.cpp -o cudnn_test -I/usr/include -I/usr/local/cuda/include -L/usr/lib/x86_64-linux-gnu -L/usr/local/cuda/lib64 -lcudnn -lcudart

The build should produce an executable named cudnn_test. Running it prints the cuDNN version:

./cudnn_test
cuDNN Version: 90101
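The number is an encoded version rather than a readable one. Assuming the cuDNN 9 encoding of major * 10000 + minor * 100 + patch, 90101 corresponds to 9.1.1, which matches the package installed above. A small sketch of the decoding follows; decode_cudnn_version is a hypothetical helper name, and older cuDNN releases used a different multiplier, so the function only applies to 9.x values.

```shell
# Sketch: decode the integer printed by cudnnGetVersion() into major.minor.patch.
# decode_cudnn_version is a hypothetical helper.
# Assumption: cuDNN 9 encoding, major*10000 + minor*100 + patch.
decode_cudnn_version() {
    echo "$(( $1 / 10000 )).$(( $1 % 10000 / 100 )).$(( $1 % 100 ))"
}

decode_cudnn_version 90101   # prints 9.1.1
```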

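6. Install GPU support for Docker

Install Docker itself by following the official Docker documentation; that part is skipped here. The steps below add GPU support so that containers can use the host's GPU. Run the following commands in order: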

# Set up the stable repository and GPG key
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list

# Install the nvidia-docker2 package and restart the Docker service
sudo apt-get update
sudo apt-get install -y nvidia-docker2
sudo systemctl restart docker
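After the restart, a quick way to see whether Docker picked up the change is to look at the runtimes it reports. The sketch below assumes that nvidia-docker2 registers a runtime named nvidia in /etc/docker/daemon.json. has_nvidia_runtime is a hypothetical helper name, and the check is a plain text match rather than real JSON parsing.

```shell
# Sketch: check whether Docker reports a runtime named "nvidia".
# has_nvidia_runtime is a hypothetical helper; it does a text match, not JSON parsing.

# Reads the output of: docker info --format '{{json .Runtimes}}'
has_nvidia_runtime() {
    grep -q '"nvidia"'
}

if command -v docker >/dev/null 2>&1; then
    if docker info --format '{{json .Runtimes}}' 2>/dev/null | has_nvidia_runtime; then
        echo "nvidia runtime is registered with Docker"
    else
        echo "nvidia runtime not found; check /etc/docker/daemon.json"
    fi
else
    echo "docker is not installed"
fi
```

A missing entry does not necessarily mean --gpus all will fail, so treat this as a hint and rely on an actual container run for the real test.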

Once installation is complete, pull an official image to test access to the GPU:

 docker pull nvidia/cuda:12.4.1-cudnn-runtime-ubuntu20.04

After the image has been pulled, try running it:

 docker run --gpus all nvidia/cuda:12.4.1-cudnn-runtime-ubuntu20.04 nvidia-smi

If everything works, the output should look like this:


==========
== CUDA ==
==========

CUDA Version 12.4.1

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

Sat May 11 07:46:25 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX xxxx ...    Off |   00000000:01:00.0  On |                  N/A |
| 30%   36C    P8             21W /  250W |      64MiB /   xxxxMiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+

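That completes the installation. One last word: shame on anyone who plagiarizes this article. What is worse, some fool even sent me private messages accusing me of stealing it from someone else. This article was first published at http://blog.youkuaiyun.com/peihexian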
