Performance analysis
General speedup formula
$Speedup = \frac{T_{serial}}{T_{parallel}}$
- Linear speedup: $S_p = p$
- Superlinear speedup: $S_p > p$
- Usually we get a speedup $S_p < p$, because parallel programs have overheads:
  - extra computation needed in the parallel version
  - communication time between processes
Execution time components
Inherently sequential computations: $\sigma(n)$
Potentially parallel computations: $\varphi(n)$
Communication operations (required by the parallel part): $\kappa(n,p)$
Speedup:

$\psi(n,p) \leqslant \frac{\sigma(n)+\varphi(n)}{\sigma(n)+\varphi(n)/p+\kappa(n,p)}$
Efficiency
$E = \frac{Speedup}{p}$
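As a quick illustration of the two definitions above, here is a minimal Python sketch that turns measured run times into speedup and efficiency. The function names and the timings (120 s serial, 20 s on 8 processors) are made up for the example.

```python
def speedup(t_serial: float, t_parallel: float) -> float:
    """Speedup = T_serial / T_parallel."""
    if t_serial <= 0 or t_parallel <= 0:
        raise ValueError("times must be positive")
    return t_serial / t_parallel


def efficiency(t_serial: float, t_parallel: float, p: int) -> float:
    """Efficiency E = Speedup / p, where p is the number of processors."""
    if p < 1:
        raise ValueError("p must be at least 1")
    return speedup(t_serial, t_parallel) / p


# Hypothetical measurements: 120 s serially, 20 s on 8 processors.
s = speedup(120.0, 20.0)
e = efficiency(120.0, 20.0, 8)
print(f"speedup = {s:.2f}, efficiency = {e:.2f}")
```

With these numbers the speedup is 6 on 8 processors, so the efficiency is 0.75: a sublinear speedup, which is the usual case.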
Amdahl’s Law
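Amdahl’s Law says: if a program consists of a serial part and a parallel part, then as the number of processors increases, the execution time of the parallel part keeps shrinking and eventually tends to 0, while the time of the serial part stays the same. The serial part therefore bounds the speedup. (Overheads are ignored in the calculation.)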

Let the inherently sequential fraction be $f = \frac{\sigma(n)}{\sigma(n)+\varphi(n)}$
$0 \leqslant f \leqslant 1$
Speedup:

$\psi \leqslant \frac{1}{f+(1-f)/p}$
The result is the maximum parallel speedup on p processors; the actual speedup will not exceed the value given by Amdahl’s Law.
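For reference, the bound can be derived from the general speedup model by dropping the communication term. This is the usual textbook argument, written with the symbols defined earlier and assuming $\kappa(n,p) \geqslant 0$:

$$
\psi(n,p) \leqslant \frac{\sigma(n)+\varphi(n)}{\sigma(n)+\varphi(n)/p+\kappa(n,p)} \leqslant \frac{\sigma(n)+\varphi(n)}{\sigma(n)+\varphi(n)/p}
$$

Dividing numerator and denominator by $\sigma(n)+\varphi(n)$, and using $f = \frac{\sigma(n)}{\sigma(n)+\varphi(n)}$ so that $1-f = \frac{\varphi(n)}{\sigma(n)+\varphi(n)}$:

$$
\psi \leqslant \frac{1}{f+(1-f)/p}, \qquad \lim_{p\to\infty}\frac{1}{f+(1-f)/p} = \frac{1}{f}
$$

The limit explains why the serial fraction alone caps the speedup once the processor count is no longer a constraint.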
Examples:
- 95% of a program’s execution time occurs inside a loop that can be executed in parallel. What is the maximum speedup we should expect from a parallel version of the program executing on 8 CPUs?
Solution: "maximum speedup" means Amdahl’s Law applies.
95% parallel -> $f = 0.05$

$\psi \leqslant \frac{1}{0.05+(1-0.05)/8} \approx 5.9$
Variation: if "of the program executing on 8 CPUs?" is deleted from the question, then $p \to \infty$.
Speedup: $\psi \leqslant \frac{1}{0.05} = 20$
- 20% of a program’s execution time is spent within inherently sequential code. What is the limit to the speedup achievable by a parallel version of the program?
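Solution: "limit to the speedup" asks for the limiting speedup as $p \to \infty$, so use Amdahl’s Law with $f = 0.2$.

$\psi \leqslant \frac{1}{0.2} = 5$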
Roughly speaking, the smaller the inherently sequential fraction, the better the achievable performance.
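The Amdahl exercises above can be checked numerically. The sketch below is only an illustration: `amdahl_speedup` is a name chosen here, and passing `math.inf` for `p` is a convenience for the "no limit on processors" case.

```python
import math


def amdahl_speedup(f: float, p: float) -> float:
    """Amdahl bound 1 / (f + (1 - f) / p).

    f: inherently sequential fraction, 0 <= f <= 1
    p: number of processors (math.inf gives the limiting speedup)
    """
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must lie in [0, 1]")
    if p < 1:
        raise ValueError("p must be at least 1")
    denom = f + (1.0 - f) / p
    # f == 0 with unlimited processors: the bound is unbounded.
    return math.inf if denom == 0.0 else 1.0 / denom


# 95% parallel on 8 CPUs, then with unlimited CPUs.
print(f"{amdahl_speedup(0.05, 8):.2f}")
print(f"{amdahl_speedup(0.05, math.inf):.2f}")
# 20% inherently sequential, unlimited CPUs.
print(f"{amdahl_speedup(0.20, math.inf):.2f}")
```

The three calls correspond to the 8-CPU question, its variation with $p \to \infty$, and the 20% question.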
Limitations of Amdahl’s Law
Ignores $\kappa(n,p)$
Overestimates speedup achievable
Amdahl Effect
Typically $\kappa(n,p)$ has lower complexity than $\varphi(n)/p$.
As n increases, $\varphi(n)/p$ dominates $\kappa(n,p)$.
As n increases, speedup increases.
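Simply adding more CPUs does not necessarily improve system performance effectively. Only by raising the proportion of the system that is parallelized, while increasing the number of processors sensibly, can the largest speedup be obtained for the smallest investment.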
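To see the Amdahl effect in numbers, here is a small sketch that evaluates the general speedup bound for growing n at a fixed p = 16. The cost functions are assumptions picked only for illustration ($\sigma(n)=n$, $\varphi(n)=n^2$, $\kappa(n,p)=n\log_2 p$); they do not come from any particular program.

```python
import math


def sigma(n: int) -> float:
    """Hypothetical inherently sequential cost."""
    return float(n)


def phi(n: int) -> float:
    """Hypothetical parallelizable cost."""
    return float(n) ** 2


def kappa(n: int, p: int) -> float:
    """Hypothetical communication cost, of lower complexity than phi(n)/p."""
    return n * math.log2(p)


def speedup_bound(n: int, p: int) -> float:
    """(sigma + phi) / (sigma + phi/p + kappa), the general speedup bound."""
    return (sigma(n) + phi(n)) / (sigma(n) + phi(n) / p + kappa(n, p))


for n in (100, 1_000, 10_000):
    print(f"n = {n:>6}: speedup bound on 16 processors = {speedup_bound(n, 16):.2f}")
```

Under these assumed costs the bound climbs from about 9 toward 16 as n grows, because $\varphi(n)/p$ increasingly outweighs both $\sigma(n)$ and $\kappa(n,p)$.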
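Gustafson-Barsis’s Law
If the execution time is held constant, the problem size grows with the number of processors; in other words, the inherently serial part takes up a smaller share of the computation, so the speedup increases.
To address the limitation above, Gustafson’s Law also describes the relationship among the number of processors, the serial fraction, and the speedup.
How to tell which law to use: if the execution time is fixed and the problem size scales, use Gustafson-Barsis’s Law; if the problem size is fixed, use Amdahl’s Law.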
Speedup formula:

$\psi \leqslant p+(1-p)s$

where $s = \frac{\sigma(n)}{\sigma(n)+\varphi(n)/p}$ (the fraction of the actual parallel execution time that is spent in serial code)
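As a sketch of where this bound comes from (the standard derivation, ignoring communication overhead $\kappa(n,p)$ as in the Amdahl case): with $s$ defined as above, the parallel share of the parallel execution time is

$$
1-s = \frac{\varphi(n)/p}{\sigma(n)+\varphi(n)/p}
$$

so $\sigma(n) = \left(\sigma(n)+\varphi(n)/p\right)s$ and $\varphi(n) = \left(\sigma(n)+\varphi(n)/p\right)(1-s)\,p$. Substituting into the speedup bound gives

$$
\psi \leqslant \frac{\sigma(n)+\varphi(n)}{\sigma(n)+\varphi(n)/p} = s+(1-s)\,p = p+(1-p)\,s
$$

Note that $s$ is measured on the parallel run, whereas Amdahl’s $f$ is measured on the serial run, so the two fractions are not interchangeable.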

In computer architecture, Gustafson’s Law gives the theoretical speedup in latency of the execution of a task at fixed execution time that can be expected of a system whose resources are improved.
Problem size is an increasing function of p
Predicts scaled speedup
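A minimal sketch of the scaled speedup calculation, assuming the usual Gustafson-Barsis form $\psi = p+(1-p)s$. The function name and the sample values (64 processors, 10% serial time) are invented for the example.

```python
def scaled_speedup(p: int, s: float) -> float:
    """Gustafson-Barsis scaled speedup: p + (1 - p) * s.

    p: number of processors
    s: fraction of the *parallel* execution time spent in serial code
    """
    if p < 1:
        raise ValueError("p must be at least 1")
    if not 0.0 <= s <= 1.0:
        raise ValueError("s must lie in [0, 1]")
    return p + (1 - p) * s


# Hypothetical case: 64 processors, 10% of the parallel run time is serial.
print(f"{scaled_speedup(64, 0.10):.1f}")
```

With s = 0 the function returns p (linear speedup), and with s = 1 it returns 1 (no speedup), which matches the two extremes of the formula.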
Examples:
- An application running on 10 processors spends 3% of its time in serial code. What is the scaled speedup of the application?

$\psi = 10+(1-10)(0.03) = 9.73$

$s$ is given directly in this problem, so it does not need to be computed.

- What is the maximum fraction of a program’s parallel execution time that can be spent in serial code if it is to achieve a scaled speedup of 7 on 8 processors?

$7 = 8+(1-8)s \Rightarrow s = \frac{1}{7} \approx 0.14$
- A parallel program executing on 32 processors spends 5% of its time in sequential code. What is the scaled speedup of this program?
Sequential code is the serial part, so $s = 0.05$
$\psi = 32+(1-32)s = 30.45$
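The question about reaching a scaled speedup of 7 on 8 processors runs the formula backwards. Rearranging $\psi = p+(1-p)s$ gives $s = \frac{p-\psi}{p-1}$, which the sketch below implements; the function name is an assumption made for illustration.

```python
def max_serial_fraction(target_speedup: float, p: int) -> float:
    """Largest s with p + (1 - p) * s >= target_speedup, i.e. (p - target) / (p - 1)."""
    if p < 2:
        raise ValueError("p must be at least 2")
    if not 1.0 <= target_speedup <= p:
        raise ValueError("target_speedup must lie in [1, p]")
    return (p - target_speedup) / (p - 1)


# Scaled speedup of 7 on 8 processors.
print(f"{max_serial_fraction(7, 8):.4f}")
# Round trip: a scaled speedup of 30.45 on 32 processors.
print(f"{max_serial_fraction(30.45, 32):.4f}")
```

The first call gives 1/7, about 0.14, and the second recovers the 5% serial fraction from the 32-processor exercise.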
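This article examines performance analysis in parallel computing, covering speedup, efficiency, and the application of Amdahl’s Law and Gustafson-Barsis’s Law. Worked examples show how the inherently sequential and potentially parallel parts of a computation affect the speedup for different numbers of processors.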



