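Representation of Physical Registers in LLVM
How does LLVM define and describe physical registers? Before answering that, let us look at the data structures involved. Because most of these definitions live in .td files, we first use llvm-tblgen to emit the complete definitions.
Generating register information with llvm-tblgen
The discussion below refers to TableGen output, so the commands used to generate it are listed here first.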
$ llvm-tblgen --gen-register-info -I ./llvm/lib/Target/X86 -I ./llvm/include -I ./llvm/lib/Target ./llvm/lib/Target/X86/X86.td -o outreg.inf
$ llvm-tblgen --gen-register-bank -I ./llvm/lib/Target/X86 -I ./llvm/include -I ./llvm/lib/Target ./llvm/lib/Target/X86/X86.td -o outreg.bank
$ llvm-tblgen -I ./llvm/lib/Target/X86 -I ./llvm/include -I ./llvm/lib/Target ./llvm/lib/Target/X86/X86.td -o out.info
Getting to know MCRegisterInfo
LLVM accesses and iterates over physical-register information mainly through MCRegisterInfo.
class MCRegisterInfo {
  const MCRegisterDesc *Desc; // Pointer to the descriptor array
  unsigned NumRegs; // Number of entries in the array
  MCRegister RAReg; // Return Address register
  MCRegister PCReg; // Program Counter register
  const MCRegisterClass *Classes; // Pointer to the regclass array
  unsigned NumClasses; // Number of entries in the array
  unsigned NumRegUnits; // Number of regunits.
  const MCPhysReg (*RegUnitRoots)[2]; // Pointer to regunit root table.
  const MCPhysReg *DiffLists; // Pointer to the difflists array
  const LaneBitmask *RegUnitMaskSequences; // Pointer to lane mask sequences
                                           // for register units.
  const char *RegStrings; // Pointer to the string table.
  const char *RegClassStrings; // Pointer to the class strings.
  const uint16_t *SubRegIndices; // Pointer to the subreg lookup
                                 // array.
  // struct SubRegCoveredBits { uint16_t Offset; uint16_t Size; };
  const SubRegCoveredBits *SubRegIdxRanges; // Pointer to the subreg covered
                                            // bit ranges array.
  unsigned NumSubRegIndices; // Number of subreg indices.
  const uint16_t *RegEncodingTable; // Pointer to array of register
                                    // encodings.
  ...
public:
  /// DiffListIterator - Base iterator class that can traverse the
  /// differentially encoded register and regunit lists in DiffLists.
  /// Don't use this class directly, use one of the specialized sub-classes
  /// defined below.
  class DiffListIterator {
    uint16_t Val = 0;
    const MCPhysReg *List = nullptr;
  };
  ...
};
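The `Val` and `List` members of `DiffListIterator` hint at how differential encoding works: instead of storing register numbers directly, the table stores the difference between consecutive entries. The following is a minimal, self-contained sketch of that idea, not LLVM's actual implementation. It assumes 16-bit entries with wraparound arithmetic and a delta of 0 as the end-of-list marker; `decodeDiffList` and the sample table are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: decode a differentially encoded list.
// Start is the initial value (for example, the register the list belongs to).
// List points at 16-bit deltas; a delta of 0 is assumed to end the list.
std::vector<uint16_t> decodeDiffList(uint16_t Start, const uint16_t *List) {
  assert(List && "diff list must not be null");
  std::vector<uint16_t> Values;
  uint16_t Val = Start;
  while (uint16_t Delta = *List++) {
    // Unsigned 16-bit wraparound lets a large delta act as a negative step.
    Val = static_cast<uint16_t>(Val + Delta);
    Values.push_back(Val);
  }
  return Values;
}

// Made-up sample data: step by -2 (encoded as 65534), then by +3, then stop.
static const uint16_t SampleDiffList[] = {65534, 3, 0};
```

With these made-up inputs, `decodeDiffList(50, SampleDiffList)` yields 48 and then 51. Storing deltas rather than absolute numbers allows registers with the same relative layout to share a single list.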
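How registers are described
The description of each register is stored in the Desc table, which has NumRegs entries.
class MCRegisterInfo {
  const MCRegisterDesc *Desc; // Pointer to the descriptor array
  unsigned NumRegs; // Number of entries in the array
  ...
};
The description of a single register looks like this: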
struct MCRegisterDesc {
  // extern const char X86RegStrings[] = {
  //   /* 0 */ "XMM10\0"
  //   /* 6 */ "YMM10\0"
  //   /* 12 */ "ZMM10\0" ... }
  uint32_t Name; // Offset into the X86RegStrings string table, e.g. 6 for YMM10
  uint32_t SubRegs; // Sub-register set (offset into DiffLists)
  uint32_t SuperRegs; // Super-register set (offset into DiffLists)
  // Offset into MCRI::SubRegIndices of a list of sub-register indices for each
  // sub-register in SubRegs.
  uint32_t SubRegIndices; // Offset of this register's sub-register indices within MCRI::SubRegIndices
  // RegUnits - Points to the list of register units.
  // The low 4 bits hold the Scale; the remaining bits are an offset into DiffLists. See MCRegUnitIterator.
  uint32_t RegUnits;
  /// Index into list with lane mask sequences. The sequence contains a lanemask
  /// for every register unit.
  uint16_t RegUnitLaneMasks;
};
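Two of these fields are not plain values: `Name` is an offset into a string table, and `RegUnits` packs two numbers into one word. The sketch below shows how such fields could be decoded, based only on the comments in the struct above. `MiniRegStrings`, `regNameAt`, `UnpackedRegUnits`, and `unpackRegUnits` are hypothetical names for illustration and are not part of LLVM.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// A tiny stand-in for the generated string table, holding the three entries
// quoted in the struct comment. Names are stored back to back, each one
// terminated by '\0'.
static const char MiniRegStrings[] = "XMM10\0" "YMM10\0" "ZMM10\0";

// Hypothetical helper: resolve a Name offset to the register name.
std::string regNameAt(uint32_t NameOffset) {
  assert(NameOffset < sizeof(MiniRegStrings) && "offset out of range");
  // Constructing from a char pointer stops at the first '\0'.
  return std::string(MiniRegStrings + NameOffset);
}

// Hypothetical view of the packed RegUnits field.
struct UnpackedRegUnits {
  uint32_t Scale;  // low 4 bits
  uint32_t Offset; // remaining bits: offset into DiffLists
};

UnpackedRegUnits unpackRegUnits(uint32_t RegUnits) {
  return {RegUnits & 0xFu, RegUnits >> 4};
}
```

For example, `regNameAt(6)` returns "YMM10", matching the offset noted in the struct comment, and a `RegUnits` value of `(37u << 4) | 2u` unpacks to a Scale of 2 and an Offset of 37.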
How the X86 backend initializes MCRegisterInfo
outreg.inf contains a function that initializes MCRegisterInfo. From it, the tables generated by llvm-tblgen can easily be matched to the members of MCRegisterInfo:
static inline void InitX86MCRegisterInfo(MCRegisterInfo *RI, unsigned RA,
                                         unsigned DwarfFlavour = 0,
                                         unsigned EHFlavour = 0,
                                         unsigned PC = 0) {
  RI->InitMCRegisterInfo(X86RegDesc, 292, RA, PC, X86MCRegisterClasses, 126,
                         X86RegUnitRoots, 173, X86RegDiffLists, X86LaneMaskLists,
                         X86RegStrings, X86RegClassStrings,
                         X86SubRegIdxLists, 11, X86SubRegIdxRanges, X86RegEncodingTable);
  ...
}
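The pattern in this function is that the generated file owns the static tables, while the init call only records pointers to them together with their sizes. A reduced sketch of that wiring is shown here, using hypothetical names (`MiniDesc`, `MiniRegisterInfo`, `initMiniRegisterInfo`) and a made-up three-entry table rather than the real X86 data:

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical, reduced descriptor: only the name offset is kept.
struct MiniDesc {
  uint32_t Name; // offset into the string table
};

// Hypothetical, reduced register info: it stores pointers to static tables
// and never copies them.
class MiniRegisterInfo {
  const MiniDesc *Desc = nullptr;
  unsigned NumRegs = 0;
  const char *RegStrings = nullptr;

public:
  void init(const MiniDesc *D, unsigned N, const char *Strings) {
    Desc = D;
    NumRegs = N;
    RegStrings = Strings;
  }
  unsigned getNumRegs() const { return NumRegs; }
  std::string getName(unsigned RegNo) const {
    assert(RegNo < NumRegs && "register number out of range");
    return std::string(RegStrings + Desc[RegNo].Name);
  }
};

// Made-up stand-ins for the generated tables.
static const char MiniStrings[] = "XMM10\0" "YMM10\0" "ZMM10\0";
static const MiniDesc MiniRegDesc[] = {{0}, {6}, {12}};

// Mirrors the shape of a generated init function: pass tables and sizes.
inline void initMiniRegisterInfo(MiniRegisterInfo *RI) {
  RI->init(MiniRegDesc, 3, MiniStrings);
}
```

After `initMiniRegisterInfo(&RI)`, a call such as `RI.getName(1)` indexes the descriptor table first and then follows the `Name` offset into the string table, which is the same two-step lookup the real tables imply.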
Reg and RegUnit
Reg
A Reg is a register as we normally define it in a backend's .td files (for X86: X86RegisterInfo.td).
RCX, for example, is defined mainly through the X86Reg class.
lib/Target/X86/X86RegisterInfo.td
let SubRegIndices = [sub_32bit] in {
  def RCX : X86Reg<"rcx", 1, [ECX]>, DwarfRegNum<[2, -2, -2]>;
}
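A Reg's enum value is determined by the sorted order of the register names, where each name is compared as a string part followed by a numeric part: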
namespace llvm {
namespace X86 {
enum {
NoRegister,
AH = 1,
AL = 2, .