BZOJ3835 [POI2014] Supercomputer: Slope Optimization

Latest recommended article published on 2019-03-22 12:57:00

weixin_33781606

CC 4.0 BY-SA license

Original link: http://www.cnblogs.com/Khada-Jhin/p/9681408.html

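This post solves the POI 2014 Supercomputer problem: a program whose instructions form a tree is executed in parallel, and for each given number of processing units we need the minimum running time. It describes the optimal strategy and gives an implementation based on slope optimization.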

Problem Description

Byteasar has designed a supercomputer of novel architecture. It may comprise many (identical) processing units. Each processing unit can execute a single instruction per time unit.

The programs for this computer are not sequential but rather have a tree structure. Each instruction may have zero, one, or multiple subsequent instructions, for which it is the parent instruction.

The instructions of the program can be executed in parallel on all available processing units. Moreover, they can be executed in many orders: the only restriction is that an instruction cannot be executed unless its parent instruction has been executed before. For example, as many subsequent instructions of an instruction that has been executed already can be executed in parallel as there are processing units.

Byteasar has a certain program to run. Since he likes utilizing his resources optimally, he is wondering how the number of processing units would affect the running time. He asks you to determine, for a given program and number of processing units, the minimum execution time of the program on a supercomputer with this many processing units.

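In short: given a rooted tree with N nodes whose root is node 1.

There are Q queries, each giving a value K. Traverse the whole tree in as few operations as possible and output that minimum number of operations.

Each operation may visit at most K unvisited nodes, and the parent of each of them must have been visited in an earlier operation.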

Input

In the first line of standard input, there are two integers, N and Q (1 <= N, Q <= 1 000 000), separated by a single space, that specify the number of instructions in Byteasar's program and the number of running time queries (for different numbers of processing units).

In the second line of input, there is a sequence of Q integers, K1, K2, …, KQ (1 <= Ki <= 1 000 000), separated by single spaces: Ki is the number of processing units in Byteasar's i-th query.

In the third and last input line, there is a sequence of N-1 integers, A2, A3, …, AN (1 <= Ai < i), separated by single spaces: Ai specifies the number of the parent instruction of the instruction number i. The instructions are numbered with successive integers from 1 to N, where the instruction no. 1 is the first instruction of the program.

Output

Your program should print one line consisting of Q integers, separated by single spaces, to the standard output: the i-th of these numbers should specify the minimum execution time of the program on a supercomputer with Ki processing units.

Sample Input

20 1
3
1 1 1 3 4 3 2 8 6 9 10 12 12 13 14 11 11 11 11

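Sample Output

8

Hint

One optimal schedule for the sample, with one line per time unit (instruction 1 has to run first, alone):

1

2 3 4

5 6 7

8 10

9 12

11 13 14

15 16 17

18 19 20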
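The schedule in the hint can be checked mechanically against the rules of the problem. The sketch below is not part of the original solution: `isValidSchedule` and its parameters are hypothetical names, and it assumes the schedule starts with a time unit that runs instruction 1 alone.

```cpp
#include <vector>

// Hypothetical checker (not part of the original post).
// parent[v] is the parent of node v for v = 2..n; parent[0] and parent[1] are unused.
// steps[t] lists the nodes executed in time unit t.
bool isValidSchedule(const std::vector<int>& parent,
                     const std::vector<std::vector<int>>& steps, int k) {
    int n = (int)parent.size() - 1;
    std::vector<int> doneAt(n + 1, -1);  // time unit in which each node ran, -1 if not yet
    for (int t = 0; t < (int)steps.size(); ++t) {
        if ((int)steps[t].size() > k) return false;  // more than k processing units used
        for (int v : steps[t]) {
            if (v < 1 || v > n || doneAt[v] != -1) return false;  // unknown or repeated node
            // The parent must have run in a strictly earlier time unit.
            if (v != 1 && (doneAt[parent[v]] == -1 || doneAt[parent[v]] == t)) return false;
            doneAt[v] = t;
        }
    }
    for (int v = 1; v <= n; ++v) {
        if (doneAt[v] == -1) return false;  // some node was never executed
    }
    return true;
}
```

Called with the sample's parent array, the eight time units {1}, {2,3,4}, {5,6,7}, {8,10}, {9,12}, {11,13,14}, {15,16,17}, {18,19,20} and k = 3, the checker accepts the schedule; with k = 2 it rejects it because several time units use three processing units.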

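Ideally every step would execute exactly k instructions, but that is clearly not always possible. The optimal strategy is therefore to execute as many instructions as possible in each step while keeping as many instructions as possible available for the next step.

So in each step, prefer executing child instructions over finishing the current level before moving on to the next one: as long as the bottom level has not been reached, executing a child never reduces the number of instructions available in the next step.

When nothing deeper can be executed, go back and execute the instructions left over earlier. Under this strategy the first few levels take one step each, and the remaining levels together take ⌈size/k⌉ steps, where size is the number of nodes in those remaining levels.

It can be proved that the optimal answer is ans = max{ i + ⌈sum[i]/k⌉ } over all 0 <= i <= maximum depth, where i is a depth (the root has depth 1) and sum[i] is the number of nodes whose depth is greater than i.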
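To make the formula concrete, here is a minimal sketch that evaluates the maximum directly for a single k, in time proportional to the maximum depth. It is only a reference implementation for checking small cases, not the optimized solution; `bruteForceAnswer` and its parameter layout are assumptions introduced here. Depths are computed in one pass, which works because the input guarantees Ai < i.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper (not in the original post): evaluates
//   max over i of ( i + ceil(deeper_i / k) )
// where deeper_i is the number of nodes with depth greater than i.
// parent[v] is the parent of node v for v = 2..n; parent[0] and parent[1] are unused.
int bruteForceAnswer(const std::vector<int>& parent, int k) {
    int n = (int)parent.size() - 1;
    std::vector<int> depth(n + 1, 0), cnt(n + 2, 0);
    int maxDepth = 1;
    depth[1] = 1;  // the root has depth 1
    cnt[1] = 1;
    for (int v = 2; v <= n; ++v) {
        depth[v] = depth[parent[v]] + 1;  // valid because parent[v] < v
        cnt[depth[v]]++;
        maxDepth = std::max(maxDepth, depth[v]);
    }
    int best = 0;
    int deeper = 0;  // number of nodes with depth greater than i
    for (int i = maxDepth; i >= 0; --i) {
        best = std::max(best, i + (deeper + k - 1) / k);
        deeper += cnt[i];  // afterwards: nodes with depth greater than i - 1
    }
    return best;
}
```

On the sample tree with k = 3 this returns 8: for example i = 1 gives 1 + ⌈19/3⌉ = 8, and no other depth gives more.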

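Evaluating this maximum directly costs O(n) per query, which is clearly too slow. However, some depths i can never attain the maximum, and once a depth stops being optimal while k is swept in one direction, it never becomes optimal again for the remaining values of k.

So slope optimization (a convex hull over the candidate depths plus a monotone pointer) brings the total precomputation down to O(n). Only the answers for 1 <= k <= n are computed; for k > n the answer is simply the maximum depth.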
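The reason a convex hull applies can be sketched as follows. This is my reading of the code rather than a derivation given in the original post; the symbol $s_i$ (number of nodes with depth greater than $i$) and $D$ (maximum depth) are notation introduced here.

```latex
\begin{aligned}
\mathrm{ans}(k) &= \max_{0 \le i \le D} \left( i + \left\lceil \frac{s_i}{k} \right\rceil \right)
                 = \left\lceil \frac{\max_{0 \le i \le D} \left( k\,i + s_i \right)}{k} \right\rceil, \\[4pt]
k\,i + s_i &= (k, 1) \cdot (i, s_i)
  \quad\Longrightarrow\quad \text{the maximizer is a vertex of the upper convex hull of the points } (i, s_i), \\[4pt]
\text{for } j < i:\quad k\,i + s_i \ge k\,j + s_j
  &\iff k \ge \frac{s_j - s_i}{i - j}.
\end{aligned}
```

The last line says that a deeper index i beats a shallower index j only while k is at least the (absolute) slope between the two points. As k decreases from n to 1 the optimal hull vertex can therefore only move toward smaller depths, which is why a single pointer over the hull suffices.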

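#include<cstdio>
#include<algorithm>
using namespace std;
const int N=1000010;
int n,Q,x;
int tot;
int head[N];   // head[x]: first edge in the child list of x
int to[N];
int nxt[N];    // renamed from next, which is ambiguous with std::next
int k[N];
int dep;       // maximum depth; the root has depth 1
int d[N];      // d[x]: depth of node x
int stk[N];    // explicit stack for the traversal
int sum[N];    // sum[i]: number of nodes with depth greater than i
int c[N];      // c[i]: number of nodes at depth i
int q[N];      // convex hull of candidate depths
int ans[N];    // ans[i]: answer for k=i
int l=1,r;
void add(int x,int y)
{
    tot++;
    nxt[tot]=head[x];
    head[x]=tot;
    to[tot]=y;
}
// Counts the nodes on every level. Uses an explicit stack because a
// chain-shaped tree has depth n, which may overflow the call stack.
void dfs(int x,int depth)
{
    int top=0;
    stk[++top]=x;
    d[x]=depth;
    while(top)
    {
        x=stk[top--];
        dep=max(dep,d[x]);
        c[d[x]]++;
        for(int i=head[x];i;i=nxt[i])
        {
            d[to[i]]=d[x]+1;
            stk[++top]=to[i];
        }
    }
}
// Returns ceil(x/y).
int calc(int x,int y)
{
    return x/y+(x%y>0);
}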
int main()
{
    scanf("%d%d",&n,&Q);
    for(int i=1;i<=Q;i++)
    {
        scanf("%d",&k[i]);
    }
    for(int i=2;i<=n;i++)
    {
        scanf("%d",&x);
        add(x,i);
    }
    dfs(1,1);
    // sum[i]: number of nodes with depth greater than i
    for(int i=dep;i>=0;i--)
    {
        sum[i]=sum[i+1]+c[i+1];
    }
    // Build the upper convex hull of the points (i,sum[i]), from i=dep down to 0.
    for(int i=dep;i>=0;q[++r]=i--)
    {
        while(l<r&&1ll*(q[r-1]-q[r])*(sum[i]-sum[q[r]])>=1ll*(q[r]-i)*(sum[q[r]]-sum[q[r-1]]))
        {
            r--;
        }
    }
    // Sweep k from n down to 1; the best hull point only moves toward smaller depths.
    for(int i=n;i>=1;i--)
    {
        while(l<r&&1ll*i*(q[l]-q[l+1])<=1ll*(sum[q[l+1]]-sum[q[l]]))
        {
            l++;
        }
        ans[i]=q[l]+calc(sum[q[l]],i);
    }
    for(int i=1;i<=Q;i++)
    {
        // With k>n processing units every level finishes in one step.
        printf("%d",k[i]>n?dep:ans[k[i]]);
        if(i!=Q)
        {
            printf(" ");
        }
    }
    return 0;
}

Reposted from: https://www.cnblogs.com/Khada-Jhin/p/9681408.html