Please indicate the source if you want to reprint: http://blog.youkuaiyun.com/gaoxiangnumber1.
5.1 Capabilities and Limitations of Optimizing Compilers
Compilers must be careful to apply only safe optimizations to a program, meaning that the resulting program will have the exact same behavior as would an unoptimized version for all possible cases the program may encounter.
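Consider two procedures, twiddle1 and twiddle2, that both appear to add twice the value stored at the location designated by yp to the value stored at the location designated by xp. twiddle2 is more efficient: it requires only three memory references (read *xp, read *yp, write *xp), whereas twiddle1 requires six (two reads of *xp, two reads of *yp, and two writes of *xp).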
Consider, however, the case in which xp and yp are equal. Function twiddle1 then adds the value at xp to itself twice, doubling it each time.
The result is that the value at xp is increased by a factor of 4. Function twiddle2, on the other hand, adds twice the value at xp to itself in a single step.
The result is that the value at xp is increased by a factor of 3. The compiler knows nothing about how twiddle1 will be called, so it must assume that arguments xp and yp can be equal. It therefore cannot generate code in the style of twiddle2 as an optimized version of twiddle1. This case, in which two pointers may designate the same memory location, is known as memory aliasing.
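The aliasing case can be checked with a short sketch. The function bodies below are not given in these notes; they are an assumption, reconstructed from the memory-reference counts described above (two read-modify-write steps for twiddle1, one for twiddle2).

```c
/* Assumed body: adds the value at yp to the value at xp twice.
 * Each statement reads *xp, reads *yp, and writes *xp. */
void twiddle1(long *xp, long *yp)
{
    *xp += *yp;
    *xp += *yp;
}

/* Assumed body: adds twice the value at yp to the value at xp.
 * One read of *xp, one read of *yp, one write of *xp. */
void twiddle2(long *xp, long *yp)
{
    *xp += 2 * *yp;
}
```

With distinct locations the two functions agree: starting from 5 at xp and 3 at yp, both leave 11 at xp. When the same address is passed for both arguments and the value starts at 5, twiddle1 leaves 20 (a factor of 4) while twiddle2 leaves 15 (a factor of 3).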
Code involving function calls can be optimized by a process known as inline substitution (or simply “inlining”), where the function call is replaced by the code for the body of the function.
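As a rough illustration of inlining, the sketch below shows a function and a hand-inlined version of its caller. The names counter, f, func1, and func1_inlined are hypothetical and chosen only for this example; a real compiler performs the substitution internally rather than at the source level.

```c
long counter = 0;   /* hypothetical global state modified by f */

/* Returns the current counter value, then increments it (a side effect). */
long f(void)
{
    return counter++;
}

/* Calls f four times and sums the results. */
long func1(void)
{
    return f() + f() + f() + f();
}

/* func1 after hand-inlining: each call is replaced by the body of f. */
long func1_inlined(void)
{
    long t = counter++;
    t += counter++;
    t += counter++;
    t += counter++;
    return t;
}
```

Starting from counter equal to 0, both versions return 6 and leave counter at 4. Once the calls are gone, the four updates of counter are visible in one place, which gives the compiler the chance to combine them.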
5.2 Expressing Program Performance
We introduce the metric cycles per element (CPE) as a way to express program performance.
The sequencing of activities by a processor is controlled by a clock providing a regular signal of some frequency, usually expressed in gigahertz (GHz), billions of cycles per second. A “4 GHz” processor has a clock that runs at 4.0 × 10^9 cycles per second.
The time required by a procedure that loops over a set of elements can be characterized as a constant plus a factor proportional to the number of elements processed.
We refer to the coefficient of the term proportional to the number of elements as the effective number of cycles per element (CPE). By this measure, psum2, with a CPE of 6.50, is superior to psum1, with a CPE of 10.0.
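One way to obtain such a coefficient from measurements is a least-squares fit of cycles = constant + CPE × n. The sketch below is only an illustration of that idea: the type cpe_fit and the function fit_cpe are hypothetical names, and the notes do not say which fitting procedure was actually used.

```c
#include <stddef.h>

/* Result of fitting cycles = intercept + cpe * n (hypothetical type). */
typedef struct {
    double intercept;   /* constant overhead, in cycles */
    double cpe;         /* slope: cycles per element */
} cpe_fit;

/* Ordinary least-squares fit over count measurements.
 * Precondition: count >= 2 and n holds at least two distinct values,
 * otherwise the denominator below is zero. */
cpe_fit fit_cpe(const double *n, const double *cycles, size_t count)
{
    double sum_n = 0.0, sum_c = 0.0, sum_nn = 0.0, sum_nc = 0.0;
    for (size_t i = 0; i < count; i++) {
        sum_n  += n[i];
        sum_c  += cycles[i];
        sum_nn += n[i] * n[i];
        sum_nc += n[i] * cycles[i];
    }
    double denom = (double)count * sum_nn - sum_n * sum_n;
    cpe_fit fit;
    fit.cpe = ((double)count * sum_nc - sum_n * sum_c) / denom;
    fit.intercept = (sum_c - fit.cpe * sum_n) / (double)count;
    return fit;
}
```

For synthetic measurements generated from 100 + 6.5n (for example n = 10, 50, 100, 200 with cycle counts 165, 425, 750, 1400), the fit recovers a CPE of 6.5 and a constant of 100. The constant 100 is made up for this example and is not a measured value.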
5.3 Program Example
5.4 Eliminating Loop Inefficiencies
5.5 Reducing Procedure Calls
5.6 Eliminating Unneeded Memory References