MPI Maelstrom
| Time Limit: 1000MS | Memory Limit: 10000K |
| Total Submissions: 4821 | Accepted: 2935 |
Description
BIT has recently taken delivery of their new supercomputer, a 32 processor Apollo Odyssey distributed shared memory machine with a hierarchical communication subsystem. Valentine McKee's research advisor, Jack Swigert, has asked her to benchmark the new system.
``Since the Apollo is a distributed shared memory machine, memory access and communication times are not uniform,'' Valentine told Swigert. ``Communication is fast between processors that share the same memory subsystem, but it is slower between processors that are not on the same subsystem. Communication between the Apollo and machines in our lab is slower yet.''
``How is Apollo's port of the Message Passing Interface (MPI) working out?'' Swigert asked.
``Not so well,'' Valentine replied. ``To do a broadcast of a message from one processor to all the other n-1 processors, they just do a sequence of n-1 sends. That really serializes things and kills the performance.''
``Is there anything you can do to fix that?''
``Yes,'' smiled Valentine. ``There is. Once the first processor has sent the message to another, those two can then send messages to two other hosts at the same time. Then there will be four hosts that can send, and so on.''
``Ah, so you can do the broadcast as a binary tree!''
``Not really a binary tree -- there are some particular features of our network that we should exploit. The interface cards we have allow each processor to simultaneously send messages to any number of the other processors connected to it. However, the messages don't necessarily arrive at the destinations at the same time -- there is a communication cost involved. In general, we need to take into account the communication costs for each link in our network topologies and plan accordingly to minimize the total time required to do a broadcast.''
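This is a single-source shortest path problem. Compute the shortest path from the source to every other vertex. Because the message can travel along several paths from the source at the same time, the answer is the largest of these shortest-path lengths.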
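To make the "largest of the shortest paths" idea concrete, here is a minimal sketch. It assumes a connected network with non-negative costs stored in an adjacency matrix, and it uses a plain O(n^2) Dijkstra rather than the SPFA used in the accepted code. The names `broadcastTime` and `kNoLink` are hypothetical and numbering starts at 0 here.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical marker for "no direct link"; not a name from the original post.
const int kNoLink = -1;

// Returns the broadcast time from processor 0: the largest of the
// shortest-path distances from processor 0 to every processor.
// cost[u][v] is the link cost, or kNoLink when u and v are not connected.
// Assumes all costs are non-negative.
int broadcastTime(const std::vector<std::vector<int> >& cost) {
    const int n = static_cast<int>(cost.size());
    assert(n >= 1);
    const int kInf = 0x3f3f3f3f;
    std::vector<int> dist(n, kInf);
    std::vector<bool> done(n, false);
    dist[0] = 0;
    for (int round = 0; round < n; ++round) {
        // Pick the closest processor that has not been finalized yet.
        int u = -1;
        for (int v = 0; v < n; ++v) {
            if (!done[v] && (u == -1 || dist[v] < dist[u])) u = v;
        }
        if (dist[u] == kInf) break;  // remaining processors are unreachable
        done[u] = true;
        // Relax every link leaving u.
        for (int v = 0; v < n; ++v) {
            if (cost[u][v] != kNoLink && dist[u] + cost[u][v] < dist[v]) {
                dist[v] = dist[u] + cost[u][v];
            }
        }
    }
    return *std::max_element(dist.begin(), dist.end());
}
```

As an illustration, take five processors with links 1-0 (50), 2-0 (30), 2-1 (5), 3-0 (100), 3-1 (20), 3-2 (50), 4-0 (10) and 4-3 (10). The shortest distances from processor 0 are 0, 35, 30, 20 and 10, so the function returns 35.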
AC Code:
#include <iostream>
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cstring>
#include <algorithm>
#include <queue>
using namespace std;
#define src 1
#define maxN 100
#define inf 0xfffffff
#define _dest int
#define _cost int
vector<pair<_dest, _cost> > g[maxN + 2];
int dist[maxN + 2]; // dist[i] is the length of the shortest path from src to vertex i
bool inQue[maxN + 2];
int n;
void SPFA()
{
    queue<int> que;
    memset(inQue, false, sizeof(inQue));
    que.push(src);
    inQue[src] = true;
    while (! que.empty())
    {
        int cur = que.front();
        que.pop();
        inQue[cur] = false;
        for (vector<pair<_dest, _cost> >::const_iterator it = g[cur].begin();
             it != g[cur].end(); ++it)
        {
            // Relax the edge cur -> it->first
            if (dist[it->first] > it->second + dist[cur])
            {
                dist[it->first] = it->second + dist[cur];
                if (! inQue[it->first])
                {
                    que.push(it->first);
                    inQue[it->first] = true;
                }
            }
        }
    }
}
int main()
{
    char s[10];
    int cost;
    while (scanf("%d", &n) == 1)
    {
        for (int i = 1; i <= n; ++i)
        {
            dist[i] = inf;
            g[i].clear(); // discard edges left over from a previous test case
        }
        dist[src] = 0;
        // The input is the strict lower triangle of the cost matrix; "x" means no link
        for (int i = 2; i <= n; ++i)
        {
            for (int j = 1; j < i; ++j)
            {
                scanf("%9s", s);
                if (s[0] != 'x')
                {
                    cost = atoi(s);
                    g[i].push_back(make_pair(j, cost));
                    g[j].push_back(make_pair(i, cost));
                    // The code below is wrong: an edge connecting vertices u and v
                    // is not necessarily the shortest path between them
                    // if (j == src)
                    // {
                    //     dist[i] = cost;
                    // }
                }
            }
        }
        SPFA();
        // The answer is the largest shortest-path distance from src
        int ans = 0;
        for (int i = 1; i <= n; ++i)
        {
            if (ans < dist[i])
            {
                ans = dist[i];
            }
        }
        printf("%d\n", ans);
    }
    return 0;
}
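The input loop in the code above reads only the strict lower triangle of a symmetric cost matrix, with the token x standing for a missing link. The sketch below isolates that parsing step so it can be checked on its own. It is a hypothetical helper, not part of the accepted solution: the name `parseLowerTriangle`, the use of -1 for "no link", and 0-based numbering are all assumptions made for illustration.

```cpp
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper, not part of the original solution.
// Parses "n" followed by the strict lower triangle of the cost matrix,
// where the token "x" means the two processors have no direct link.
// Returns an n-by-n symmetric matrix with -1 for "no link" and 0 on the
// diagonal, or an empty matrix if n cannot be read.
std::vector<std::vector<int> > parseLowerTriangle(const std::string& text) {
    std::istringstream in(text);
    int n = 0;
    if (!(in >> n) || n < 0) {
        return std::vector<std::vector<int> >();
    }
    std::vector<std::vector<int> > cost(n, std::vector<int>(n, -1));
    for (int i = 0; i < n; ++i) cost[i][i] = 0;
    std::string token;
    for (int i = 1; i < n; ++i) {
        // Row i holds the costs between processor i and processors 0..i-1.
        for (int j = 0; j < i && (in >> token); ++j) {
            if (token != "x") {
                cost[i][j] = cost[j][i] = std::atoi(token.c_str());
            }
        }
    }
    return cost;
}
```

For example, the text "5\n50\n30 5\n100 20 50\n10 x x 10\n" yields a 5-by-5 matrix in which entry [4][3] is 10 and entries [4][1] and [4][2] stay at -1.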
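This post discusses how to optimize the broadcast operation of the Message Passing Interface (MPI) on a distributed shared memory machine, reducing the total communication time with an improved broadcast algorithm. The algorithm applies the single-source shortest path idea and takes the communication costs of the network topology into account.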