Introduction to Parallel Programming: 2.6 Performance Analysis

Latest recommended article published on 2024-01-07 15:16:41


License: CC 4.0 BY-SA


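This article covers performance analysis in parallel computing: speedup, efficiency, Amdahl's law, and Gustafson-Barsis's law. Worked examples show how the inherently sequential and potentially parallel parts of a computation affect the speedup for different numbers of processors.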

Performance analysis

General speedup formula

$Speedup=\frac{T_{serial}}{T_{parallel}}$

Linear Speedup: $S_p=p$
Superlinear Speedup: $S_p>p$

Usually the speedup is $S_p<p$, because parallel programs have overheads:

Extra computation needed in the parallel version
Communication time between processes

Execution time components

Inherently sequential computations: $\sigma(n)$
Potentially parallel computations: $\varphi(n)$
Communication operations (in the parallel part): $\kappa(n,p)$
Speedup:
$\psi(n,p)\leqslant\frac{\sigma(n)+\varphi(n)}{\sigma(n)+\varphi(n)/p+\kappa(n,p)}$
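As a rough illustration, the three execution time components can be plugged into the bound $(\sigma+\varphi)/(\sigma+\varphi/p+\kappa)$. The function name `speedup_bound` and the sample numbers are assumptions made for this sketch, not values from the article.

```python
def speedup_bound(sigma, phi, kappa, p):
    """Upper bound on speedup from the three time components.

    sigma: time of the inherently sequential part
    phi:   time of the potentially parallel part (on one processor)
    kappa: communication time on p processors
    p:     number of processors
    """
    if p < 1:
        raise ValueError("p must be at least 1")
    return (sigma + phi) / (sigma + phi / p + kappa)


# Hypothetical timings (in arbitrary units), chosen only for illustration.
print(speedup_bound(sigma=10, phi=90, kappa=5, p=8))
```

With zero communication cost and a single processor the bound is 1, as expected for a serial run.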

Efficiency

$E=\frac{Speedup}{p}$
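A minimal sketch of the two definitions, speedup as $T_{serial}/T_{parallel}$ and efficiency as speedup divided by $p$. The function names and the timings below are hypothetical and serve only to show the arithmetic.

```python
def speedup(t_serial, t_parallel):
    """Speedup = T_serial / T_parallel."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, p):
    """Efficiency = Speedup / p, where p is the number of processors."""
    return speedup(t_serial, t_parallel) / p


# Hypothetical measurement: 24 s serially, 4 s on 8 processors.
print(speedup(24.0, 4.0))        # 6.0, below the linear speedup of 8
print(efficiency(24.0, 4.0, 8))  # 0.75
```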

Amdahl’s Law

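Amdahl's law says that if a program has a parallel part and a serial part, then as the number of processors grows, the parallel execution time keeps shrinking and eventually approaches 0, while the serial time stays the same. The serial part therefore bounds the speedup. (Overheads are ignored in the calculation.)
Let $f$ be the inherently sequential fraction, $f=\frac{\sigma(n)}{\sigma(n) + \varphi(n)}$, with $0\leqslant f \leqslant 1$.
Speedup:
$\psi\leqslant\frac{1}{f+(1-f)/p}$

This is the maximum parallel speedup on p processors; the actual speedup will not exceed the value given by Amdahl's law.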

Examples:

95% of a program’s execution time occurs inside a loop that can be executed in parallel. What is the maximum speedup we should expect from a parallel version of the program executing on 8 CPUs?
Solution: "maximum speedup" calls for Amdahl's law.
95% parallel -> f=0.05
$\psi \leqslant \frac{1}{0.05+0.95/8}\approx 5.9$

Variation: if "of the program executing on 8 CPUs?" is deleted from the question, p tends to infinity.
Speedup $\psi \leqslant \frac{1}{0.05}=20$
20% of a program’s execution time is spent within inherently sequential code. What is the limit to the speedup achievable by a parallel version of the program?
Solution: "limit to the speedup" again calls for Amdahl's law with p tending to infinity: f=0.2, so $\psi \leqslant \frac{1}{0.2}=5$

The smaller the inherently sequential fraction, the higher the achievable speedup.
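The Amdahl's law examples above can be checked with a short sketch. It assumes the usual form of the law, $\psi\leqslant 1/(f+(1-f)/p)$, with limit $1/f$ as p tends to infinity; the function names `amdahl_speedup` and `amdahl_limit` are made up for this illustration.

```python
import math


def amdahl_speedup(f, p):
    """Maximum speedup on p processors for sequential fraction f."""
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must be between 0 and 1")
    if p < 1:
        raise ValueError("p must be at least 1")
    return 1.0 / (f + (1.0 - f) / p)


def amdahl_limit(f):
    """Speedup limit as p tends to infinity: 1 / f."""
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must be between 0 and 1")
    return math.inf if f == 0 else 1.0 / f


print(round(amdahl_speedup(0.05, 8), 2))  # 95% parallel on 8 CPUs
print(amdahl_limit(0.05))                 # same program, unlimited CPUs
print(amdahl_limit(0.2))                  # 20% inherently sequential
```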

Limitations of Amdahl’s Law

Ignores $\kappa (n,p)$
Overestimates speedup achievable

Amdahl Effect

Typically $\kappa (n,p)$ has lower complexity than $\varphi (n)/p$.
As n increases, $\varphi (n)/p$ dominates $\kappa (n,p)$.
As n increases, speedup increases.
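The Amdahl effect can be seen numerically with made-up cost functions. The choices below, $\sigma(n)=n$, $\varphi(n)=n^2$ and $\kappa(n,p)=n\log_2 p$, are purely hypothetical; they were picked only so that the communication term has lower complexity than $\varphi(n)/p$.

```python
import math


def sigma(n):
    # Hypothetical sequential cost (assumption for illustration).
    return n


def phi(n):
    # Hypothetical parallelizable cost (assumption for illustration).
    return n ** 2


def kappa(n, p):
    # Hypothetical communication cost, lower complexity than phi(n) / p.
    return n * math.log2(p)


def speedup_bound(n, p):
    """(sigma + phi) / (sigma + phi / p + kappa) for problem size n."""
    return (sigma(n) + phi(n)) / (sigma(n) + phi(n) / p + kappa(n, p))


p = 16
for n in (10, 100, 1000, 10000):
    # The bound grows with n and approaches p from below.
    print(n, round(speedup_bound(n, p), 2))
```

For a fixed p, the printed bound rises as n grows because the $\varphi(n)/p$ term increasingly outweighs both the sequential and the communication terms.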

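Simply adding more CPUs does not necessarily improve system performance. Only when the parallelizable share of the work is increased and processors are added sensibly at the same time can the largest speedup be obtained for the smallest investment.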

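Gustafson-Barsis’s Law

If the execution time is held constant, the problem size grows with the number of processors, so the inherently serial part takes up a smaller share of the computation and the speedup increases.
To address the limitations above, Gustafson's law also relates the number of processors, the serial fraction, and the speedup.
How to tell which law to use: if the time is fixed and the problem size varies, use Gustafson-Barsis’s Law.

Speedup formula:
$\psi(n,p)\leqslant\frac{\sigma(n)+\varphi(n)}{\sigma(n)+\varphi(n)/p}$
Let $s=\frac{\sigma(n)}{\sigma(n)+\varphi(n)/p}$ (the fraction of the parallel execution time spent in the serial part)

$\psi\leqslant p+(1-p)s$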
In computer architecture, Gustafson’s Law gives the theoretical speedup in latency of the execution of a task at fixed execution time that can be expected of a system whose resources are improved.

Problem size is an increasing function of p
Predicts scaled speedup
Examples:

An application running on 10 processors spends 3% of its time in serial code. What is the scaled speedup of the application?
$\psi =10+(1-10)(0.03)=9.73$
s is given directly in the problem, so it does not need to be computed.
What is the maximum fraction of a program’s parallel execution time that can be spent in serial code if it is to achieve a scaled speedup of 7 on 8 processors?
$7=8+(1-8)s$, so $s=\frac{1}{7}\approx 0.14$
A parallel program executing on 32 processors spends 5% of its time in sequential code. What is the scaled speedup of this program?
Sequential code is the serial part, so s=0.05
$\psi =32+(1-32)(0.05)=30.45$
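The three Gustafson-Barsis examples can be reproduced with a small sketch of $\psi=p+(1-p)s$ and its rearrangement for $s$. The function names `scaled_speedup` and `max_serial_fraction` are hypothetical helpers introduced here.

```python
def scaled_speedup(s, p):
    """Scaled speedup p + (1 - p) * s.

    s: fraction of the parallel execution time spent in serial code
    p: number of processors
    """
    if not 0.0 <= s <= 1.0:
        raise ValueError("s must be between 0 and 1")
    return p + (1 - p) * s


def max_serial_fraction(target, p):
    """Solve target = p + (1 - p) * s for s (requires p > 1)."""
    if p <= 1:
        raise ValueError("p must be greater than 1")
    return (p - target) / (p - 1)


print(round(scaled_speedup(0.03, 10), 2))   # 10 processors, 3% serial
print(round(max_serial_fraction(7, 8), 4))  # scaled speedup 7 on 8 processors
print(round(scaled_speedup(0.05, 32), 2))   # 32 processors, 5% serial
```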