Enabling NVLink on Dual RTX 2080 Ti Cards

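This article describes how to enable NVLink for two RTX 2080 Ti cards on Ubuntu 22.04.
First, make sure CUDA is installed and its environment variables are configured.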
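
The prerequisite can be checked quickly from the shell. This is a minimal sketch, assuming that `nvcc` (shipped with the CUDA toolkit) and `nvidia-smi` (shipped with the driver) are the tools that matter here; `missing_tools` is a hypothetical helper name, not part of any NVIDIA tooling.

```shell
# Print every tool name from the argument list that is not found on PATH.
# (missing_tools is a hypothetical helper introduced for this sketch.)
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# nvcc comes with the CUDA toolkit, nvidia-smi with the driver.
missing=$(missing_tools nvcc nvidia-smi)
if [ -n "$missing" ]; then
  echo "Not found on PATH:" $missing
else
  # The last line of `nvcc --version` carries the toolkit build string.
  nvcc --version | tail -n 1
fi
```

If `nvcc` is reported as missing even though CUDA is installed, the toolkit's `bin` directory is probably not on `PATH` yet.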

1. Enable

nvidia-smi -pm 1
sudo reboot
nvidia-smi topo -m

The result looks like this:

(base) root@myd-gpu:~# nvidia-smi topo -m
        GPU0    GPU1    CPU Affinity    NUMA Affinity   GPU NUMA ID
GPU0     X      NV2     0-11,24-35      0               N/A
GPU1    NV2      X      12-23,36-47     1               N/A

Legend:

  X    = Self
  SYS  = Connection traversing PCIe as well as the SMP interconnect between NUMA nodes (e.g., QPI/UPI)
  NODE = Connection traversing PCIe as well as the interconnect between PCIe Host Bridges within a NUMA node
  PHB  = Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)
  PXB  = Connection traversing multiple PCIe bridges (without traversing the PCIe Host Bridge)
  PIX  = Connection traversing at most a single PCIe bridge
  NV#  = Connection traversing a bonded set of # NVLinks
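
In the matrix above, the `NV2` entries mean the two GPUs are connected through a bonded set of two NVLinks. The same check can be scripted. The following is a sketch only: `nvlink_entries` is a hypothetical helper, and it assumes the topology output keeps the layout shown above, with GPU rows starting with `GPU<n>`.

```shell
# Read `nvidia-smi topo -m` output on stdin and print the distinct NV# link
# types found in the GPU rows. Prints nothing if no row has an NV# entry.
# (nvlink_entries is a hypothetical helper introduced for this sketch.)
nvlink_entries() {
  awk '/^GPU[0-9]+/ {
         for (i = 2; i <= NF; i++)
           if ($i ~ /^NV[0-9]+$/) print $i
       }' | sort -u
}

# Only query the driver when nvidia-smi is actually available.
if command -v nvidia-smi >/dev/null 2>&1; then
  links=$(nvidia-smi topo -m | nvlink_entries)
  if [ -n "$links" ]; then
    echo "NVLink detected:" $links
  else
    echo "No NVLink entry in the topology matrix"
  fi
fi
```

For a per-link view, `nvidia-smi nvlink --status` lists each link and its state.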

2. Test

Download the official samples

git clone https://github.com/NVIDIA/cuda-samples.git

Build and run

pip install cmake
cd cuda-samples/Samples/5_Domain_Specific/p2pBandwidthLatencyTest
mkdir build && cd build
cmake ..
make -j$(nproc)
./p2pBandwidthLatencyTest

Check the results

(base) root@myd-gpu:~/cuda-samples/Samples/5_Domain_Specific/p2pBandwidthLatencyTest/build# ./p2pBandwidthLatencyTest 
[P2P (Peer-to-Peer) GPU Bandwidth Latency Test]
Device: 0, NVIDIA GeForce RTX 2080 Ti, pciBusID: 4, pciDeviceID: 0, pciDomainID:0
Device: 1, NVIDIA GeForce RTX 2080 Ti, pciBusID: 81, pciDeviceID: 0, pciDomainID:0
Device=0 CAN Access Peer Device=1
Device=1 CAN Access Peer Device=0

***NOTE: In case a device doesn't have P2P access to other one, it falls back to normal memcopy procedure.
So you can see lesser Bandwidth (GB/s) and unstable Latency (us) in those cases.

P2P Connectivity Matrix
     D\D     0     1
     0       1     1
     1       1     1
Unidirectional P2P=Disabled Bandwidth Matrix (GB/s)
   D\D     0      1 
     0 541.95   5.67 
     1   5.72 536.94 
Unidirectional P2P=Enabled Bandwidth (P2P Writes) Matrix (GB/s)
   D\D     0      1 
     0 523.63  47.11 
     1  47.11 536.57 
Bidirectional P2P=Disabled Bandwidth Matrix (GB/s)
   D\D     0      1 
     0 535.84   8.49 
     1   8.44 533.98 
Bidirectional P2P=Enabled Bandwidth Matrix (GB/s)
   D\D     0      1 
     0 534.00  94.18 
     1  94.13 533.34 
P2P=Disabled Latency Matrix (us)
   GPU     0      1 
     0   1.48  16.92 
     1  14.64   1.34 

   CPU     0      1 
     0   3.10   9.39 
     1   9.35   3.30 
P2P=Enabled Latency (P2P Writes) Matrix (us)
   GPU     0      1 
     0   1.34   1.46 
     1   1.53   1.34 

   CPU     0      1 
     0   2.96   2.60 
     1   2.73   3.30 

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.
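
To put the matrices in perspective, the GPU0 to GPU1 figures with P2P enabled can be divided by the figures with P2P disabled. This is a small sketch using numbers copied from the run above; `ratio` is a hypothetical helper, and, as the sample's own note says, these figures are indicative rather than a rigorous benchmark.

```shell
# ratio A B: print A divided by B, rounded to two decimals.
# (ratio is a hypothetical helper introduced for this sketch.)
ratio() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# GPU0 -> GPU1 values taken from the output above:
# P2P enabled vs. disabled bandwidth (GB/s), disabled vs. enabled latency (us).
echo "Unidirectional bandwidth gain: $(ratio 47.11 5.67)x"
echo "Bidirectional bandwidth gain:  $(ratio 94.18 8.49)x"
echo "GPU-side latency reduction:    $(ratio 16.92 1.46)x"
```

On this machine, enabling P2P raises GPU-to-GPU bandwidth by roughly 8x in one direction and 11x in both directions, and cuts GPU-side latency by a similar factor.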

RTX 2080 Ti NVLink refers to connecting two RTX 2080 Ti cards with NVIDIA's NVLink technology for multi-GPU parallel computing and communication. NVLink is a high-speed, low-latency interconnect that offers higher bandwidth and lower latency than a conventional PCIe link, which improves multi-GPU performance. The two cards are physically joined by an NVLink bridge, which enables efficient data transfer and cooperative computing. Applications that support multi-GPU acceleration can therefore spread their workload across both GPUs and improve overall performance.

NCCL, the NVIDIA Collective Communications Library, implements collective communication across multiple GPUs. It optimizes communication over interconnects such as PCIe, NVLink, and InfiniBand to maximize the efficiency of data transfer between GPUs.

When configuring NVLink on the RTX 2080 Ti, the nvidia-smi command can be used to inspect the GPU connection topology. It shows how the GPUs are connected, for example through PCIe or NVLink, so the topology confirms whether the two cards are actually linked over NVLink. [1][2][3]

References
[1] NVIDIA-Turing-Architecture-Whitepaper (NVIDIA Turing architecture whitepaper): https://download.youkuaiyun.com/download/weixin_40878684/10682852
[2][3] A 4x RTX 2080 Ti deep learning workstation is feasible - NCCL: https://blog.youkuaiyun.com/danteLiujie/article/details/102901154