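- Why use llvm::make_unique?
std::make_unique was only added in C++14; C++11 provides std::make_shared but no std::make_unique, so LLVM supplied its own llvm::make_unique for code built as C++11.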
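Loop optimization
- Move loop-invariant computations out of the loop
- Eliminate redundant induction variables
Loop-invariant code motion, before and after: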
Before:
FOR index := 1 TO 10000 DO
BEGIN
  x := y * z ;
  j := index * 3 ;
END
After:
t := y * z
FOR index := 1 TO 10000 DO
BEGIN
  x := t
  j := index * 3
END
---------------------
Source: https://blog.youkuaiyun.com/qq_29674357/article/details/78564033
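The transformation above can be sketched in JavaScript. This is only an illustration: the function names `beforeHoisting` and `afterHoisting` are made up for this note, and a real compiler performs the rewrite on its intermediate representation rather than on source text.

```javascript
// Before: y * z is recomputed on every iteration, although neither y nor z
// changes inside the loop (the expression is loop-invariant).
function beforeHoisting(y, z) {
  let x = 0;
  let j = 0;
  for (let index = 1; index <= 10000; index++) {
    x = y * z;
    j = index * 3;
  }
  return { x, j };
}

// After: the invariant product is computed once, outside the loop.
// j still depends on index, so it has to stay inside.
function afterHoisting(y, z) {
  const t = y * z;
  let x = 0;
  let j = 0;
  for (let index = 1; index <= 10000; index++) {
    x = t;
    j = index * 3;
  }
  return { x, j };
}
```

Both functions return the same values for the same inputs; only the number of multiplications differs.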
Tail call optimization
Concept: a tail call is a function call that is the last step of the calling function.
function f(x){
  return g(x);
}
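Neither of the following two cases is a tail call.
// Case 1: after calling g there is still another operation (the assignment),
// so this is not a tail call, even though the semantics are identical.
function f(x){
  let y = g(x);
  return y;
}
// Case 2: there is also an operation after the call (the addition),
// even though everything is written on one line.
function f(x){
  return g(x) + 1;
}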
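Because a tail call is the last operation of a function, the outer function's call record (stack frame) no longer needs to be kept: the call site, local variables, and similar information will never be used again, so the inner function's call record can simply replace the outer one.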
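- The case of tail recursion
Recursion is memory-hungry, because hundreds or thousands of call records have to be kept at the same time, which easily leads to a stack overflow. With tail recursion only one call record exists at a time, so a stack overflow cannot occur, provided the compiler or runtime actually performs tail call optimization.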
Compare two functions that compute a factorial (the first is ordinary recursion, the second is tail-recursive):
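// Ordinary recursion: the multiplication happens after the recursive call
// returns, so every frame has to be kept.
function factorial(n) {
  if (n === 1) return 1;
  return n * factorial(n - 1);
}
factorial(5) // 120
// Tail recursion: the running product is passed along in total,
// so the recursive call is the last step.
function factorial(n, total) {
  if (n === 1) return total;
  return factorial(n - 1, n * total);
}
factorial(5, 1) // 120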
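To make "replace the outer call record with the inner one" concrete, here is a hand-written sketch of what tail call elimination does to the tail-recursive version: the call turns into a parameter update plus a jump back to the top. The names `factorialTail` and `factorialLoop` are made up for this note, and the loop is written by hand; it is not the output of any particular engine.

```javascript
// Tail-recursive form: the recursive call is the last step.
function factorialTail(n, total) {
  if (n === 1) return total;
  return factorialTail(n - 1, n * total);
}

// The same computation with the tail call eliminated by hand.
// Only one frame is ever in use, no matter how large n is.
function factorialLoop(n, total) {
  while (true) {
    if (n === 1) return total;
    // Same effect as calling factorialTail(n - 1, n * total),
    // but the current frame's parameters are reused.
    total = n * total;
    n = n - 1;
  }
}
```

For small inputs the two agree, for example `factorialLoop(5, 1)` and `factorialTail(5, 1)` both give 120. For a very large `n` the loop still finishes (the product overflows to `Infinity` in floating point), whereas the recursive form depends on the runtime performing tail call optimization.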
SSA
https://blog.youkuaiyun.com/qq_29674357/article/details/78731713
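- Why use SSA?
First, it simplifies many compiler optimizations; second, many optimizations produce better results when run on SSA form.
https://wiki.aalto.fi/display/t1065450/LLVM+SSA (an introduction with LLVM IR examples)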
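As a rough illustration of what SSA form means, the sketch below renames straight-line code so that every variable is assigned exactly once. The statement shape `{ dest, op, args }`, the function name `toSSA`, and the `name.N` numbering scheme are all assumptions made for this note; branches and phi nodes, which real SSA construction has to handle, are left out.

```javascript
// Rename straight-line statements into SSA form.
// Each statement is { dest, op, args }; args are variable names (strings)
// or numeric literals. Control flow and phi nodes are not handled.
function toSSA(statements) {
  const version = new Map(); // variable name -> latest version number

  // A use refers to the latest definition of the variable, if there is one.
  // Literals and never-defined names (inputs) are left unchanged.
  const current = (arg) =>
    typeof arg === "string" && version.has(arg)
      ? `${arg}.${version.get(arg)}`
      : arg;

  return statements.map(({ dest, op, args }) => {
    const renamedArgs = args.map(current); // rename uses before the new definition
    const next = (version.get(dest) || 0) + 1; // every definition gets a fresh name
    version.set(dest, next);
    return { dest: `${dest}.${next}`, op, args: renamedArgs };
  });
}

const ssa = toSSA([
  { dest: "x", op: "const", args: [1] },
  { dest: "x", op: "add", args: ["x", 2] },
  { dest: "y", op: "mul", args: ["x", "a"] },
]);
```

After renaming, the second assignment to `x` defines `x.2` and reads `x.1`, so each name has exactly one definition and each use points at it directly.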
LLVM Type
- @ and % in LLVM
Global identifiers (functions, global variables) begin with the ‘@’ character. Local identifiers (register names, types) begin with the ‘%’ character.
- Function Type
The function type can be thought of as a function signature. It consists of a return type and a list of formal parameter types. The return type of a function type is a void type or first class type — except for label and metadata types.
Syntax:
<returntype> (<parameter list>)
Examples:
i32 (i32)               Function taking an i32, returning an i32.
float (i16, i32 *) *    Pointer to a function that takes an i16 and a pointer to i32, returning float.
i32 (i8*, ...)          A vararg function that takes at least one pointer to i8 (char in C) and returns an integer. This is the signature for printf in LLVM.
{i32, i32} (i32)        A function taking an i32, returning a structure containing two i32 values.
- Integer Type
The integer type is a very simple type that simply specifies an arbitrary bit width for the integer type desired. Any bit width from 1 bit to 2^23-1 (about 8 million) can be specified.
Syntax:
iN
Examples:
i1          A single-bit integer.
i32         A 32-bit integer.
i1942652    A really big integer of over 1 million bits.
- Floating-Point Types
Type         Description
half         16-bit floating-point value
float        32-bit floating-point value
double       64-bit floating-point value
fp128        128-bit floating-point value (112-bit mantissa)
x86_fp80     80-bit floating-point value (X87)
ppc_fp128    128-bit floating-point value (two 64-bits)
The binary formats of half, float, double, and fp128 correspond to the IEEE-754-2008 specifications for binary16, binary32, binary64, and binary128 respectively.
- Array Type
Syntax:
[<# elements> x <elementtype>]
The number of elements is a constant integer value; elementtype may be any type with a size.
Examples:
[40 x i32]               Array of 40 32-bit integer values.
[41 x i32]               Array of 41 32-bit integer values.
[4 x i8]                 Array of 4 8-bit integer values.
Here are some examples of multidimensional arrays:
[3 x [4 x i32]]          3x4 array of 32-bit integer values.
[12 x [10 x float]]      12x10 array of single precision floating-point values.
[2 x [3 x [4 x i16]]]    2x3x4 array of 16-bit integer values.
- Vector Type
Syntax:
< <# elements> x <elementtype> >
Examples:
<4 x i32>      Vector of 4 32-bit integer values.
<8 x float>    Vector of 8 32-bit floating-point values.
<2 x i64>      Vector of 2 64-bit integer values.
<4 x i64*>     Vector of 4 pointers to 64-bit integer values.
- Poison Value
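Poison values model results that are not well defined, such as the result of an overflowing operation:
%add = add nsw i32 %x, 1
is roughly equivalent to: if (%x + 1) overflows then %add = poison else %add = add i32 %x, 1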
https://stackoverflow.com/questions/34190997/the-poison-value-and-undefined-value-in-llvm
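The difference between a plain `add` and `add nsw` can be modelled in a few lines of JavaScript. This is a toy model only: `POISON`, `addI32`, and `addNswI32` are names made up for this note, and a real poison value also propagates through the instructions that use it, which the sketch does not attempt to show.

```javascript
// Hypothetical marker standing in for LLVM's poison value.
const POISON = Symbol("poison");

// Plain `add i32`: signed overflow wraps around (two's complement).
function addI32(x, y) {
  return (x + y) | 0; // `| 0` truncates the result to a signed 32-bit integer
}

// `add nsw i32`: the same sum, but signed overflow yields poison instead
// of a wrapped value. Inputs are assumed to already be valid i32 values,
// so x + y is exact in double precision.
function addNswI32(x, y) {
  const exact = x + y;
  const wrapped = exact | 0;
  return wrapped === exact ? wrapped : POISON;
}
```

With `x = 2147483647`, `addI32(x, 1)` wraps to `-2147483648`, while `addNswI32(x, 1)` returns the poison marker; for operands that do not overflow the two functions agree.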
LLVM IR
- alloca
https://stackoverflow.com/questions/45507294/llvm-ir-alloca-instruction
The alloca instruction reserves space on the stack frame of the current function. The amount of space is determined by element type size, and it respects a specified alignment. The first instruction, %a.addr = alloca i32, align 4, allocates a 4-byte stack element, which respects a 4-byte alignment. A pointer to the stack element is stored in the local identifier, %a.addr. The alloca instruction is commonly used to represent local (automatic) variables.