[RDMA] Master Index of RDMA Learning Resources

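Contents

RDMA Technology Sharing

RDMA Technology Explained

RDMA Programming

RDMA Networking

RoCE | iWARP

Performance Optimization

Configuration and Feature Tuning

QoS and Flow Control

Commands and Testing

Official Sites, Documentation, and Resources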


Author: bandaoyu. Updated continuously. Original post: https://blog.youkuaiyun.com/bandaoyu/article/details/120485737

RDMA Technology Sharing

1. RDMA Overview
https://blog.youkuaiyun.com/bandaoyu/article/details/112859853
1. RDMA Overview - Zhihu
2. Comparing Socket-Based and RDMA-Based Communication
https://blog.youkuaiyun.com/bandaoyu/article/details/112861399
3. RDMA Basic Elements and Programming Fundamentals
https://blog.youkuaiyun.com/bandaoyu/article/details/112861431
4. RDMA Operation Types | WRITE | READ
https://blog.youkuaiyun.com/bandaoyu/article/details/112861454
5. RDMA Basic Service Types
https://blog.youkuaiyun.com/bandaoyu/article/details/112861469
6. RDMA: Memory Region
https://blog.youkuaiyun.com/bandaoyu/article/details/112861488
7. RDMA: Protection Domain
https://blog.youkuaiyun.com/bandaoyu/article/details/113115845
8. RDMA: Address Handle
https://blog.youkuaiyun.com/bandaoyu/article/details/113116613
9. RDMA: Queue Pair
https://blog.youkuaiyun.com/bandaoyu/article/details/113118302
10. RDMA: Completion Queue
https://zhuanlan.zhihu.com/p/259650980
11. RDMA: Shared Receive Queue
https://blog.youkuaiyun.com/bandaoyu/article/details/113120391
12. RDMA: Verbs | OFED
https://blog.youkuaiyun.com/bandaoyu/article/details/113125244
13. RDMA: User-Space and Kernel-Space Interaction
https://blog.youkuaiyun.com/bandaoyu/article/details/113125473
14. RDMA: Memory Window
https://blog.youkuaiyun.com/bandaoyu/article/details/120485072
15. RDMA: RoCE & Soft-RoCE
https://blog.youkuaiyun.com/bandaoyu/article/details/120485632
16. RDMA: DDP (Direct Data Placement)
https://blog.youkuaiyun.com/bandaoyu/article/details/120485693

17. RDMA: RDMAP (Remote Direct Memory Access Protocol)

https://blog.youkuaiyun.com/bandaoyu/article/details/125234164?spm=1001.2014.3001.5501

18. RDMA: MPA (Marker PDU Aligned framing)

https://blog.youkuaiyun.com/bandaoyu/article/details/125234209?spm=1001.2014.3001.5501

19. RDMA: iWARP & Soft-iWARP

https://blog.youkuaiyun.com/bandaoyu/article/details/125234243?spm=1001.2014.3001.5501

20. RDMA: Pyverbs (Python Verbs)

https://blog.youkuaiyun.com/bandaoyu/article/details/125234422?spm=1001.2014.3001.5502

21. RDMA: Memory Address Fundamentals

https://blog.youkuaiyun.com/bandaoyu/article/details/125234262?spm=1001.2014.3001.5502

22. RDMA: Establishing QP Connections with the Socket API

https://blog.youkuaiyun.com/bandaoyu/article/details/125234310?spm=1001.2014.3001.5502

23. RDMA: Establishing QP Connections with the CM API

https://blog.youkuaiyun.com/bandaoyu/article/details/125234340?spm=1001.2014.3001.5502

RDMA Technology Explained

[RDMA] Technology Explained (1): RDMA Overview

https://blog.youkuaiyun.com/bandaoyu/article/details/112859853

[RDMA] Technology Explained (2): Send Receive Operations

https://blog.youkuaiyun.com/bandaoyu/article/details/112859932

[RDMA] Technology Explained (3): Understanding the RDMA Scatter Gather List

https://blog.youkuaiyun.com/bandaoyu/article/details/112859981

[RDMA] Technology Explained (4): RDMA Verbs and Programming Steps

https://blog.youkuaiyun.com/bandaoyu/article/details/112860396

RDMA Programming

[RDMA] Getting Started with RDMA Programming

[RDMA] Getting Started with RDMA Programming (work in progress) - bandaoyu's blog, CSDN

[RDMA] RDMA Programming Example (rdma_cm API):

https://blog.youkuaiyun.com/bandaoyu/article/details/116062334

[RDMA] RDMA SEND/WRITE Programming Example (IBV Verbs):

https://blog.youkuaiyun.com/bandaoyu/article/details/115988785

https://blog.youkuaiyun.com/bandaoyu/article/details/112852477

Notes on Verbs Programming

https://blog.youkuaiyun.com/bandaoyu/article/details/124327417

[RDMA] Differences Between rdma_cm and verbs | Differences Between libibverbs and librdmacm:

https://blog.youkuaiyun.com/bandaoyu/article/details/115668933

https://blog.youkuaiyun.com/bandaoyu/article/details/120723270

Writing RDMA Programs with the Socket API?

https://blog.youkuaiyun.com/bandaoyu/article/details/120726746

RDMA Networking

RoCE | iWARP

https://blog.youkuaiyun.com/bandaoyu/article/details/117560876

In iWARP mode, it appears that connections can only be established with librdmacm and not with libibverbs alone:

Connecting Queue Pairs - RDMAmojo

Performance Optimization

Configuration and Feature Tuning

[RDMA] MTU Considerations for RoCE-Based Applications | Probing MTU Settings in the Network

https://blog.youkuaiyun.com/bandaoyu/article/details/116706925

[Translation] RoCE or iWARP for Low Latency?

https://blog.youkuaiyun.com/bandaoyu/article/details/119001100

How InfiniBand Works, and Optimizing Small-Message Communication Performance

https://blog.youkuaiyun.com/bandaoyu/article/details/119204643

How IBV_SEND_INLINE and IBV_SEND_SIGNALED Work | Optimizing RDMA Small-Message Performance

https://blog.youkuaiyun.com/bandaoyu/article/details/119207147

Working with Unsignaled Completions | IBV_SEND_SIGNALED

https://blog.youkuaiyun.com/bandaoyu/article/details/119145598

Improving Redis Performance with InfiniBand | UC vs. RC Latency | RC vs. UD Performance

https://blog.youkuaiyun.com/bandaoyu/article/details/117081940

Tips and Tricks for Optimizing RDMA Code

https://blog.youkuaiyun.com/bandaoyu/article/details/120713020

fork(): Performance Impact of Using ibv_fork_init

https://blog.youkuaiyun.com/bandaoyu/article/details/124327417?spm=1001.2014.3001.5501

QP Count and RDMA Performance (translated excerpt) | Connection Count and QP Count

https://blog.youkuaiyun.com/bandaoyu/article/details/122947096

Performance Evaluation of NUMA Effects on RDMA One-Sided Operations

https://blog.youkuaiyun.com/bandaoyu/article/details/146393071

QoS and Flow Control

Lossless Networks and PFC (Priority-based Flow Control) | ECN

https://blog.youkuaiyun.com/bandaoyu/article/details/115346857

RoCE Network QoS | Setting the PFC Priority at the Application Layer | ToS | Priority | TC

https://blog.youkuaiyun.com/bandaoyu/article/details/115633835

Configuring PFC for RoCE v1 (configuration only, no theory)

https://blog.youkuaiyun.com/bandaoyu/article/details/115582637

ZTR (Zero Touch RoCE): No PFC or ECN Configuration Required

https://blog.youkuaiyun.com/bandaoyu/article/details/145120618

Low-Latency Networking in Practice: Baidu Advanced Project | PFC+ECN

https://blog.youkuaiyun.com/bandaoyu/article/details/118498539

Optimization Theory Guide

https://download.youkuaiyun.com/download/bandaoyu/33184815

Commands and Testing

Common InfiniBand (IB) Commands | Command History Log

https://blog.youkuaiyun.com/bandaoyu/article/details/115798693

RDMA Communication Test Tools | RDMA Information Query Tools

https://blog.youkuaiyun.com/bandaoyu/article/details/115798045

RDMA Packet Capture | ibdump Usage

https://blog.youkuaiyun.com/bandaoyu/article/details/115791233

InfiniBand NIC Installation | InfiniBand Connection and Status Diagnostic Tools | Verifying That an RDMA NIC Works

https://blog.youkuaiyun.com/bandaoyu/article/details/115906185

Error Log

https://blog.youkuaiyun.com/bandaoyu/article/details/116539866

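Official Sites, Documentation, and Resources

[RDMA] Documentation, Tutorials, and Related Knowledge: https://blog.youkuaiyun.com/bandaoyu/article/details/112861368

https://www.freesion.com/article/8223180236/

The official Mellanox community has much of what you are likely to need:

Mellanox Interconnect Community

When programming, the official manuals are what really help. The Mellanox NIC programming manuals can be found at: https://docs.nvidia.com/networking/software/index.html

https://docs.nvidia.com/networking/software/adapter-software/index.html

Some have already been translated:
RDMA Programming Manual 1.7 (Chinese and English): https://download.youkuaiyun.com/download/bandaoyu/87354001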

RDMA Academic and Experimental Research

"Scalable RDMA RPC on Reliable Connection with Efficient Resource Sharing": http://storage.cs.tsinghua.edu.cn/papers/eurosys19-scalerpc.pdf/

(Youmin Chen, Youyou Lu, Jiwu Shu: http://storage.cs.tsinghua.edu.cn/misc/cym/cym-cv-ch.pdf/)

"FaSST: Fast, Scalable and Simple Distributed Transactions with Two-sided (RDMA) Datagram RPCs": https://www.cs.cmu.edu/~dga/papers/fasst_osdi.pdf

"StaR: Breaking the Scalability Limit for RDMA": https://icnp21.cs.ucr.edu/papers/icnp21camera-paper30.pdf

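The original text below is from: Installing and Using InfiniBand NICs: A Summary - 山河故人abin - cnblogs (博客园)

I have installed and used InfiniBand NICs several times recently, and each time I had to hunt for the relevant material, so I decided to write this summary for easy reference.

1. Basics

First, you need to understand what RDMA is. Here are a few resources:

A Comprehensive, Accessible Introduction to RDMA

RDMA Technology Explained (1): RDMA Overview

RDMA Technology Explained (2): RDMA Send Receive Operations

Next, you need to understand how it is implemented. These two give a first impression:

RDMA Programming: The Event Notification Mechanism

RDMA read and write with IB verbs

I also downloaded a Chinese version, but I find the English version reads better. Chinese version download:

Baidu Cloud: Baidu Netdisk, extraction code: rm8i

Lanzou Cloud: https://wwa.lanzous.com/iXUd6jm7qla password: 4aps

Projects to consult when getting started with RDMA programming:

GitHub - tarickb/the-geek-in-the-corner: Sample code from thegeekinthecorner.com

GitHub - jcxue/RDMA-Tutorial: A tutorial on RDMA based programming using code examples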

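### RDMA Driver Development Tutorial

RDMA (Remote Direct Memory Access) driver development requires a solid understanding of the hardware and of kernel modules. To achieve efficient RDMA communication, developers need to know the network protocol stack, device drivers, and the operating system's memory management.

#### 1. Setting Up the Development Environment

Before studying RDMA driver development, prepare an environment that supports RDMA. If no physical RDMA NIC or InfiniBand card is available, a software implementation such as Soft-iWARP can be used to build a virtual RDMA environment[^1]. Soft-iWARP is an open-source project that provides RDMA-like functionality over ordinary Ethernet. You can clone the project from GitHub and install and configure it by following the related tutorials.

#### 2. Learning Materials and Community Resources

- **Official documentation**: The OpenFabrics Alliance provides documentation on RDMA technology and its technical details, and is an important source for understanding the underlying principles[^3].
- **Blog posts**: The RDMAmojo blog, maintained by Dotan Barak, is dedicated to RDMA technology and programming practice and is a good source of practical information for more advanced learners.
- **Academic papers**: A Google Scholar search turns up a large body of research on RDMA performance analysis and application scenarios[^3]. These papers usually cover recent results and technical trends.
- **Books**: Few books address RDMA driver development directly, but *The Innovator's Dilemma* offers insight into how disruptive technologies develop, which may help in understanding how RDMA gradually spreads into different fields[^2].

#### 3. Programming Interfaces and APIs

Becoming familiar with the RDMA programming interface is a key step. On Linux, RDMA operations rely mainly on the API provided by the libibverbs library. With it, an application posts work requests to the NIC directly from user space, and the NIC can then access the remote host's memory without involving the remote CPU, which reduces latency and increases throughput. You also need to understand how the verbs API is used, including basic objects such as the Protection Domain and the Completion Queue.

```c
#include <stdio.h>
#include <stdlib.h>
#include <infiniband/verbs.h>

int main(void)
{
    int num_devices = 0;

    /* Get the list of all available IB devices. */
    struct ibv_device **dev_list = ibv_get_device_list(&num_devices);
    if (!dev_list || num_devices == 0) {
        fprintf(stderr, "No IB devices found\n");
        if (dev_list)
            ibv_free_device_list(dev_list);
        return EXIT_FAILURE;
    }

    /* Open the first device found. */
    struct ibv_context *context = ibv_open_device(dev_list[0]);
    if (!context) {
        fprintf(stderr, "Couldn't open device %s\n",
                ibv_get_device_name(dev_list[0]));
        ibv_free_device_list(dev_list);
        return EXIT_FAILURE;
    }

    /* Release resources in reverse order of acquisition. */
    ibv_close_device(context);
    ibv_free_device_list(dev_list);
    return EXIT_SUCCESS;
}
```

This code shows how to obtain the list of InfiniBand devices on the local machine and try to open the first one found. It is the first step in initializing any InfiniBand-based application.

#### 4. Driver Development Basics

When writing an RDMA driver, the following aspects must be considered:

- **Memory registration**: To perform zero-copy transfers, user-space buffers must be registered with the kernel so that the pages are pinned and the NIC's DMA engine can access them safely.
- **Connection management**: Establishing and maintaining reliable connections between endpoints is essential. This may involve the services of RoCE (RDMA over Converged Ethernet) or of the InfiniBand sublayers.
- **Error handling**: Design a robust error recovery mechanism to cope with network failures and other abnormal conditions.

### How It Works

At its core, RDMA lets one node read or write memory on another node without involving the remote CPU, which reduces both latency and CPU overhead during data transfer. This makes RDMA particularly suitable for high-performance computing (HPC), cloud computing, and large-scale distributed storage. Concretely, when a sender wants to write data to a receiver, the sender's NIC (Network Interface Card) reads the data from registered local memory and sends it to the receiver's NIC together with the target address and length. The receiver's NIC then places the data directly at the specified location in the receiver's memory. On the data path, neither side needs to call into the operating system kernel, and the receiver's CPU is not interrupted.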
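To make the data path described above more concrete, here is a toy, in-process model of what the receiving NIC does when an RDMA WRITE arrives: it checks the key presented by the remote peer, checks that the target range lies inside the registered region, and only then copies the payload. This is a sketch for illustration only. The names `toy_mr` and `toy_rdma_write` are hypothetical and are not part of libibverbs, and real hardware addresses the region by virtual address rather than by the offset used here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy model of a registered memory region on the receiving side
 * (hypothetical type, not a libibverbs structure). */
struct toy_mr {
    uint8_t *base;    /* start of the registered buffer */
    size_t   length;  /* size of the registered buffer in bytes */
    uint32_t rkey;    /* key a remote peer must present */
};

/* Toy model of the receiving NIC handling an incoming RDMA WRITE.
 * Returns 0 if the data was placed, -1 if the request was rejected. */
int toy_rdma_write(struct toy_mr *mr, uint32_t rkey, size_t offset,
                   const void *payload, size_t len)
{
    assert(mr != NULL && payload != NULL);

    /* Reject requests that present the wrong key. */
    if (rkey != mr->rkey)
        return -1;

    /* Reject requests that fall outside the registered region.
     * Written this way to avoid overflow in offset + len. */
    if (offset > mr->length || len > mr->length - offset)
        return -1;

    /* Place the data directly into the target buffer. */
    memcpy(mr->base + offset, payload, len);
    return 0;
}
```

In this model the application that owns the buffer never runs any code during the write, which mirrors the point made above: the checks and the copy are the NIC's job, not the CPU's.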