P2P non-blocking

License: CC 4.0 BY-SA

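This article introduces the basic concepts of non-blocking communication in MPI and its advantages in parallel computing, covers how to use the non-blocking send and receive functions, and uses an example program to show how computation can overlap with communication to improve efficiency.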

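1. Non-blocking communication

(1) Non-blocking sender

(a) The send call returns almost immediately, so the sender can go on to other work. It returns regardless of whether the data has been copied from the application buffer to the system buffer, or has reached the system buffer or application buffer on the receiver side.

(b) The MPI library completes the send operation at a time of its own choosing.

(c) The sender must not modify the memory it has just handed to the send (the send buffer) right after the call returns, because that is unsafe. It is free to work on any other memory.

(d) Before modifying the send buffer, the sender must confirm that the send has completed by calling one of the *wait* functions.

(e) Non-blocking sends make it possible to overlap computation with communication and exploit possible performance gains.

(2) Non-blocking receiver (same principle as the non-blocking sender: the receive buffer must be left alone until the receive is confirmed complete)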

2. Functions

MPI_Isend (&buf,count,datatype,dest,tag,comm,&request)

Identifies an area in memory to serve as a send buffer. Processing continues immediately without waiting for the message to be copied out from the application buffer. A communication request handle is returned for handling the pending message status. The program should not modify the application buffer until subsequent calls to MPI_Wait or MPI_Test indicate that the non-blocking send has completed.

MPI_Irecv (&buf,count,datatype,source,tag,comm,&request)

Identifies an area in memory to serve as a receive buffer. Processing continues immediately without actually waiting for the message to be received and copied into the application buffer. A communication request handle is returned for handling the pending message status. The program must use calls to MPI_Wait or MPI_Test to determine when the non-blocking receive operation completes and the requested message is available in the application buffer.

3. Example

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]){
        int totalNumTasks, rankID;

        MPI_Init(&argc, &argv);
        MPI_Comm_size(MPI_COMM_WORLD, &totalNumTasks);
        MPI_Comm_rank(MPI_COMM_WORLD, &rankID);
        //get the host where this process is running
        int  nameLength;
        char processor_name[MPI_MAX_PROCESSOR_NAME];
        MPI_Get_processor_name(processor_name,&nameLength);

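        //neighbors in the ring: rank 0 wraps around to the last rank, and the last rank wraps around to rank 0
        int prevRankID = rankID - 1;
        int nextRankID = rankID + 1;
        if(rankID == 0)  prevRankID = totalNumTasks - 1;
        if(rankID == (totalNumTasks - 1)) nextRankID = 0;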

        int count = 1;
        MPI_Request request[4];

        //tag1 messages travel towards the next rank: receive from prev, send to next
        char recvBuf1;
        char sendBuf1 = 'R';
        int tag1 = 1;
        MPI_Irecv(&recvBuf1, count, MPI_CHAR, prevRankID, tag1, MPI_COMM_WORLD, &request[0]);
        MPI_Isend(&sendBuf1, count, MPI_CHAR, nextRankID, tag1, MPI_COMM_WORLD, &request[1]);

        //tag2 messages travel towards the previous rank: receive from next, send to prev
        char recvBuf2;
        char sendBuf2 = 'L';
        int tag2 = 2;
        MPI_Irecv(&recvBuf2, count, MPI_CHAR, nextRankID, tag2, MPI_COMM_WORLD, &request[2]);
        MPI_Isend(&sendBuf2, count, MPI_CHAR, prevRankID, tag2, MPI_COMM_WORLD, &request[3]);
        //After the non-blocking sends and receives are posted, the process can do other work,
        //as long as it leaves the application buffers alone.
        //Here the application buffers are recvBuf1, sendBuf1, recvBuf2 and sendBuf2.
        //Work on any other memory can overlap with the communication.
        printf("My rankID = %d on Processor = %s, I can do something here.....\n", rankID, processor_name);
        //MPI_Waitall blocks until all four requests complete; after it returns, the application buffers are safe to reuse
        MPI_Status status[4];
        MPI_Waitall(4, request, status);
        printf("My rankID = %d, recvBuf1 = %c && source = %d && tag = %d\n", 
                rankID, recvBuf1, status[0].MPI_SOURCE, status[0].MPI_TAG);

        printf("My rankID = %d, recvBuf2 = %c && source = %d && tag = %d\n", 
                rankID, recvBuf2, status[2].MPI_SOURCE, status[2].MPI_TAG);
        printf("My rankID = %d, Now, my application buffer is safe to reuse.\n", rankID);
        MPI_Finalize();
        return 0;
}

4. Compile and run

[amao@amao991 mpi-study]$ mpicc p2pNonBlockingOnWhichProcessor.c 
[amao@amao991 mpi-study]$ mpiexec -n 3 -f machinefile ./a.out 
My rankID = 0 on Processor = amao991, I can do something here.....
My rankID = 2 on Processor = amao992, I can do something here.....
My rankID = 1 on Processor = amao991, I can do something here.....
My rankID = 0, recvBuf1 = R && source = 2 && tag = 1
My rankID = 0, recvBuf2 = L && source = 1 && tag = 2
My rankID = 0, Now, my application buffer is safe to reuse.
My rankID = 1, recvBuf1 = R && source = 0 && tag = 1
My rankID = 1, recvBuf2 = L && source = 2 && tag = 2
My rankID = 1, Now, my application buffer is safe to reuse.
My rankID = 2, recvBuf1 = R && source = 1 && tag = 1
My rankID = 2, recvBuf2 = L && source = 0 && tag = 2
My rankID = 2, Now, my application buffer is safe to reuse.

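5. Summary

(1) In this example the 3 processes form a bidirectional ring. Each process sends 'R' to its next neighbor and 'L' to its previous neighbor, and receives one character from each of them.

(2) Because the sends and receives are non-blocking, each call returns right away and the process can go on to execute other statements (a single printf stands in for that work here), as long as it does not touch the send and receive buffers.

(3) Finally, before reading the receive buffers, the process must call MPI_Waitall to make sure the receives have completed.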