MPI Parallel Computing and Matrices (2)

This post walks through a parallel matrix-vector multiplication implemented with MPI. It first shows how the rows of matrix A are distributed to the worker processes and how each worker multiplies its rows with the vector B. It then explains how the master process collects the results and prints the final output. Finally, it discusses a problem that can show up at run time.
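Concretely, with rows = 3 and cols = 2 the code fills A[j][i] = i + j and B[i] = i, so A = [0 1; 1 2; 2 3] and B = (0, 1), giving C = A·B = (0·0 + 1·1, 1·0 + 2·1, 2·0 + 3·1) = (1, 2, 3). Keeping this expected result in mind makes the questions at the end easier to follow.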

1 The body of code

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

const int rows = 3; // the rows of the matrix
const int cols = 2; // the cols of the matrix

int main(int argc, char **argv)
{
    int rank, size, anstag;
    int A[rows][cols], B[cols], C[rows];
    int buf[cols], ans;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (size < 2) {
        printf("Error: too few processors!\n");
        MPI_Abort(MPI_COMM_WORLD, 99);
    }

    if (0 == rank) {
        // initialize A[j][i] = i + j and B[i] = i
        for (int i = 0; i < cols; i++) {
            B[i] = i;
            for (int j = 0; j < rows; j++) {
                A[j][i] = i + j;
            }
        }

        // output matrix A and vector B
        printf("\t\tmatrix A\n");
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                printf("%d\t", A[i][j]);
            }
            printf("\n");
        }

        printf("\t\tvector B\n");
        for (int i = 0; i < cols; i++)
            printf("%d\t", B[i]);
        printf("\n");

        // broadcast the B vector to all slave processors
        MPI_Bcast(B, cols, MPI_INT, 0, MPI_COMM_WORLD);

        // partition the A matrix among the slave processors row by row;
        // the row index k doubles as the message tag
        for (int i = 1; i < size; i++) {
            for (int k = i - 1; k < rows; k += size - 1) {
                for (int j = 0; j < cols; j++) {
                    buf[j] = A[k][j];
                }
                MPI_Send(buf, cols, MPI_INT, i, k, MPI_COMM_WORLD);
            }
        }
    }
    else {
        MPI_Bcast(B, cols, MPI_INT, 0, MPI_COMM_WORLD);
        // every slave processor receives its rows of A and multiplies them with the B vector
        for (int k = rank - 1; k < rows; k += size - 1) {
            MPI_Recv(buf, cols, MPI_INT, 0, k, MPI_COMM_WORLD, &status);
            ans = 0;
            for (int j = 0; j < cols; j++) {
                ans += buf[j] * B[j]; // accumulate the dot product
            }
            // send the result back, tagged with the row index
            MPI_Send(&ans, 1, MPI_INT, 0, k, MPI_COMM_WORLD);
        }
    }

    if (0 == rank) {
        // receive the results from all slave processors;
        // the tag identifies which row a result belongs to
        printf("\t\toutput C vector\n");
        for (int i = 0; i < rows; i++) {
            MPI_Recv(&ans, 1, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG,
                     MPI_COMM_WORLD, &status);
            anstag = status.MPI_TAG;
            C[anstag] = ans;
            printf("%d\t", C[i]); // printed as results arrive -- see question 3 below
        }
        printf("\n");
    }

    MPI_Finalize();
    return 0;
}
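To build and run the program (assuming an MPI implementation such as Open MPI or MPICH is installed; the file name matvec.c is arbitrary):

mpicc matvec.c -o matvec
mpirun -np 3 ./matvec

With 3 processes, rank 0 acts as the master, ranks 1 and 2 are the slaves, and the correct output is the C vector 1 2 3.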

(image: sample program output)

2 Questions

1) How should we understand this part of the code, which partitions the A matrix among the slave processors?


for (int i = 1; i < size; i++) {
    for (int k = i - 1; k < rows; k += size - 1) {
        for (int j = 0; j < cols; j++) {
            buf[j] = A[k][j];
        }
        MPI_Send(buf, cols, MPI_INT, i, k, MPI_COMM_WORLD);
    }
}

**************************************************************

A[rows][cols] is partitioned row by row, and buf[cols] holds one row at a time.

In MPI_Send(buf, cols, MPI_INT, i, k, MPI_COMM_WORLD):

i ----- the destination (rank) of this message

k ----- the tag of this message; here it is the row index

For a fixed destination i, the tags are k = i-1, i-1+(size-1), i-1+2(size-1), and so on. In other words, the rows of A are dealt out round-robin to the size-1 slave processors, each slave receiving every (size-1)-th row starting from row rank-1.
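A quick way to see this mapping is to replay the send loop on its own and print which slave gets which row (a standalone sketch; rows and size mirror a run with mpirun -np 3):

#include <stdio.h>

int main(void)
{
    const int rows = 3;
    const int size = 3; // one master plus two slaves

    // same loop structure as the send loop above
    for (int i = 1; i < size; i++)
        for (int k = i - 1; k < rows; k += size - 1)
            printf("row %d -> slave rank %d (tag %d)\n", k, i, k);
    return 0;
}

It prints that rows 0 and 2 go to rank 1 and row 1 goes to rank 2.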

2) After MPI_Finalize(), is the whole parallel environment shut down?


MPI_Finalize() frees the resources acquired by MPI_Init(), but it does not mean that only one process keeps working after the call: every process continues to execute the statements that follow MPI_Finalize(); it simply may not make any further MPI calls.
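A minimal sketch to check this (every rank prints, not only rank 0):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Finalize();
    // every process still executes this line; it just must not call MPI any more
    printf("rank %d is still running after MPI_Finalize()\n", rank);
    return 0;
}

Run it with mpirun -np 4 and four lines appear, one per process.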

For more details on this question, see this forum thread: http://bbs.youkuaiyun.com/topics/391038020

3 More information

(image: output of an anomalous run, in which the printed C vector is wrong)

Sometimes the run comes out like the one above. So what is wrong with the code?
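A likely cause (my reading of the code above, not stated in the original post): the master receives the results with MPI_ANY_SOURCE and MPI_ANY_TAG, so they can arrive in any order, yet printf("%d\t", C[i]) prints slot i in the same iteration; if the result for row i has not arrived yet, an uninitialized value is printed. Gathering everything first and printing only afterwards avoids this:

// master's collection phase: gather all results first ...
for (int i = 0; i < rows; i++) {
    MPI_Recv(&ans, 1, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG,
             MPI_COMM_WORLD, &status);
    C[status.MPI_TAG] = ans; // the tag is the row index
}
// ... and only print once every row is in place
for (int i = 0; i < rows; i++)
    printf("%d\t", C[i]);
printf("\n");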

 
