MPI Maelstrom
Time Limit: 1000MS | Memory Limit: 10000K
Total Submissions: 11250 | Accepted: 6931
Description
BIT has recently taken delivery of their new supercomputer, a 32 processor Apollo Odyssey distributed shared memory machine with a hierarchical communication subsystem. Valentine McKee's research advisor, Jack Swigert, has asked her to benchmark the new system.
``Since the Apollo is a distributed shared memory machine, memory access and communication times are not uniform,'' Valentine told Swigert. ``Communication is fast between processors that share the same memory subsystem, but it is slower between processors that are not on the same subsystem. Communication between the Apollo and machines in our lab is slower yet.''
``How is Apollo's port of the Message Passing Interface (MPI) working out?'' Swigert asked.
``Not so well,'' Valentine replied. ``To do a broadcast of a message from one processor to all the other n-1 processors, they just do a sequence of n-1 sends. That really serializes things and kills the performance.''
``Is there anything you can do to fix that?''
``Yes,'' smiled Valentine. ``There is. Once the first processor has sent the message to another, those two can then send messages to two other hosts at the same time. Then there will be four hosts that can send, and so on.''
``Ah, so you can do the broadcast as a binary tree!''
``Not really a binary tree -- there are some particular features of our network that we should exploit. The interface cards we have allow each processor to simultaneously send messages to any number of the other processors connected to it. However, the messages don't necessarily arrive at the destinations at the same time -- there is a communication cost involved. In general, we need to take into account the communication costs for each link in our network topologies and plan accordingly to minimize the total time required to do a broadcast.''
Input
The input will describe the topology of a network connecting n processors. The first line of the input will be n, the number of processors, such that 1 <= n <= 100.
The rest of the input defines an adjacency matrix, A. The adjacency matrix is square and of size n x n. Each of its entries will be either an integer or the character x. The value of A(i,j) indicates the expense of sending a message directly from node i to node j. A value of x for A(i,j) indicates that a message cannot be sent directly from node i to node j.
Note that for a node to send a message to itself does not require network communication, so A(i,i) = 0 for 1 <= i <= n. Also, you may assume that the network is undirected (messages can go in either direction with equal overhead), so that A(i,j) = A(j,i). Thus only the entries on the (strictly) lower triangular portion of A will be supplied.
The input to your program will be the lower triangular section of A. That is, the second line of input will contain one entry, A(2,1). The next line will contain two entries, A(3,1) and A(3,2), and so on.
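The input format described above can be read roughly as follows. This is only a sketch: the function name `readTopology` and the sentinel `kNoLink` are hypothetical names chosen for illustration, and the matrix is stored 1-indexed to match the statement's A(i,j) notation.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical sentinel for "no direct link" (an assumption, not part of the
// problem statement). It is small enough that kNoLink + kNoLink fits in int.
const int kNoLink = 0x3f3f3f3f;

// Reads n and the strictly lower triangle of A from `in`, then mirrors each
// entry so the returned (n+1) x (n+1) matrix is symmetric. Index 0 is unused.
std::vector<std::vector<int>> readTopology(std::istream& in) {
    int n = 0;
    in >> n;
    std::vector<std::vector<int>> a(n + 1, std::vector<int>(n + 1, kNoLink));
    for (int i = 1; i <= n; ++i) a[i][i] = 0;  // A(i,i) = 0
    for (int i = 2; i <= n; ++i) {
        for (int j = 1; j < i; ++j) {
            std::string tok;
            if (!(in >> tok)) return a;  // input ended early: stop reading
            if (tok != "x") a[i][j] = a[j][i] = std::stoi(tok);
        }
    }
    return a;
}
```

Reading each entry as a whitespace-separated token means the line breaks in the input do not matter; only the order of the entries does.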
Output
Your program should output the minimum communication time required to broadcast a message from the first processor to all the other processors.
Sample Input
5
50
30 5
100 20 50
10 x x 10
Sample Output
35
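The sample answer can be checked by hand. From processor 1, the cheapest routes are 1→5 (10), 1→5→4 (20), 1→3 (30) and 1→3→2 (35), so the last processor receives the message at time 35. The sketch below reproduces that check; it uses Floyd-Warshall purely for brevity, which is a choice made here and not something the problem prescribes. The names `broadcastTime`, `sampleMatrix` and `kInf` are hypothetical.

```cpp
#include <algorithm>
#include <vector>

const int kInf = 0x3f3f3f3f;  // hypothetical "no link" sentinel; kInf + kInf fits in int

// Returns the largest shortest-path distance from node 1, given a 1-indexed
// symmetric cost matrix that uses kInf for missing links.
// All-pairs shortest paths are computed with Floyd-Warshall.
int broadcastTime(std::vector<std::vector<int>> d) {
    int n = static_cast<int>(d.size()) - 1;
    for (int k = 1; k <= n; ++k)
        for (int i = 1; i <= n; ++i)
            for (int j = 1; j <= n; ++j)
                d[i][j] = std::min(d[i][j], d[i][k] + d[k][j]);
    int worst = 0;
    for (int j = 1; j <= n; ++j) worst = std::max(worst, d[1][j]);
    return worst;
}

// The sample network written out in full; row 0 and column 0 are unused.
std::vector<std::vector<int>> sampleMatrix() {
    const int X = kInf;
    return {
        {0, 0, 0, 0, 0, 0},
        {0, 0, 50, 30, 100, 10},
        {0, 50, 0, 5, 20, X},
        {0, 30, 5, 0, 50, X},
        {0, 100, 20, 50, 0, 10},
        {0, 10, X, X, 10, 0},
    };
}
```

Calling `broadcastTime(sampleMatrix())` yields 35, matching the sample output.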
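// Problem summary: the statement is long, and I expected a hard problem, but most of it turns out to be filler. You are first given n, the number of processors, followed by an adjacency matrix. The matrix is incomplete: only the lower triangle is given, and the diagonal (i == j) is omitted. A[i][j] is the time needed to send a message from processor i to processor j. For the missing entries: when i == j, a processor sending to itself takes time 0; entries in the upper triangle follow by symmetry, that is, A[i][j] equals A[j][i].
// The question: what is the minimum time needed for a message sent from processor 1 to reach every other processor?
// Approach: compute the shortest path from processor 1 to every node, using the transmission time as the edge weight. Because every processor can forward to all its neighbours simultaneously, each node receives the message at its shortest-path distance, and the broadcast finishes when the farthest node receives it. So scan the shortest distances and output the largest one. (It is just one shortest-path run plus a for loop.)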
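#include<iostream>
#include<cstdio>
#include<cstdlib>   // atoi
#include<queue>
#include<algorithm>
#include<cstring>
#include<vector>
#include<cmath>
#define maxn 105
#define INF 0x3f3f3f3f
using namespace std;
int G[maxn][maxn];  // adjacency matrix of the network
int dis[maxn];      // dis[i]: shortest known time from processor 1 to i
int visit[maxn];    // visit[i] != 0 once dis[i] is final
int n, m;
// O(n^2) Dijkstra from processor 1 over the dense matrix G.
void dijkstra(){
    fill(dis, dis+maxn, INF);
    memset(visit, 0, sizeof(visit));
    dis[1] = 0;
    while(true){
        int v = 0;
        // Among the nodes whose distance is not yet final, pick the one closest to the source.
        for(int i = 1; i <= n; i++){
            if(!visit[i] && (v == 0 || dis[i] < dis[v]))
                v = i;
        }
        if(v == 0){  // no such node: every distance is final
            break;
        }
        visit[v] = 1;  // dis[v] is now final
        // Relax the edges leaving v. INF + INF still fits in an int, so this cannot overflow.
        for(int i = 1; i <= n; i++){
            if(dis[i] > dis[v] + G[v][i])
                dis[i] = dis[v] + G[v][i];
        }
    }
    return;
}
int main(){
    char str[20];
    while(~scanf("%d", &n)){
        // Initialise the graph: 0 on the diagonal, INF (no link) elsewhere.
        for(int i = 1; i <= n; i++){
            for(int j = 1; j <= n; j++){
                if(i == j)
                    G[i][j] = 0;
                else
                    G[i][j] = INF;
            }
        }
        // Read the strictly lower triangle and mirror it; 'x' means no direct link.
        for(int i = 2; i <= n; i++){
            for(int j = 1; j <= i-1; j++){
                scanf("%19s", str);  // width limit keeps the read inside str
                if(str[0] != 'x'){
                    G[i][j] = G[j][i] = atoi(str);
                }
            }
        }
        dijkstra();
        // The broadcast finishes when the farthest processor receives the message.
        int ans = -INF;
        for(int i = 1; i <= n; i++){
            ans = max(ans, dis[i]);
        }
        cout << ans << endl;
    }
    return 0;
}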