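This article covers how angr builds its Data Flow Graph (DFG).
The DFG analysis recovers the data flow inside each basic block of a CFG: it builds one DFG for every basic block.
The DFGs are available through the dictionary self.dfgs, where each key is a basic block address and each value is the DFG for that block.
param cfg: the CFG used to obtain all the basic blocks.
param annocfg: an AnnotatedCFG built from a backward slice, used to build the DFG only on the whitelisted statements.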
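To make the shape of self.dfgs concrete, here is a minimal toy model in plain Python. It is not angr code: plain adjacency dicts stand in for the graph objects, and the addresses, node labels, and helper names (make_toy_dfgs, edges) are invented for illustration only.

```python
# Toy model only: plain dicts stand in for the graph objects the analysis builds.
# The addresses and node labels below are made up for illustration.

def make_toy_dfgs():
    """Return {block_address: graph}, each graph being {source: set(destinations)}."""
    return {
        0x400000: {"GET(rax)": {"t0"}, "t0": {"t1 = Add64(t0, 0x1)"}},
        0x400010: {"t2": {"PUT(rbx) = t2"}},
    }

def edges(graph):
    """Flatten an adjacency dict into a sorted list of (source, destination) pairs."""
    return sorted((src, dst) for src, dsts in graph.items() for dst in dsts)

dfgs = make_toy_dfgs()
for addr in sorted(dfgs):
    # One graph per basic block, looked up by the block's address.
    print(hex(addr), edges(dfgs[addr]))
```

The point of the sketch is only the lookup pattern: one graph per basic block, retrieved by block address.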
Constructor:
def __init__(self, cfg=None, annocfg=None):
    """
    Build a Data Flow Graph (DFG) for every basic block of a CFG

    The DFGs are available in the dict self.dfgs where the key
    is a basic block addr and the value a DFG.

    :param cfg: A CFG used to get all the basic blocks
    :param annocfg: An AnnotatedCFG built from a backward slice used to only build the DFG on the whitelisted statements
    """
    if cfg is None:
        self._cfg = self.project.analyses.CFGAccurate()
    else:
        self._cfg = cfg
    self._annocfg = annocfg
    self.dfgs = self._construct()
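If no CFG is passed in, the constructor builds one itself (CFGAccurate is the older name of the analysis that later angr releases call CFGEmulated).
It then calls _construct() to build the DFGs. The function is fairly long, but it is the core of the data-flow construction, so let's walk through it.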
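Before reading the real source, the overall shape of such a per-block loop can be sketched with a toy IR. This is only an illustration under stated assumptions, not angr's implementation: statements are hypothetical (tag, destination, sources) tuples, only a made-up "IMark" tag is skipped, and the optional whitelist mimics the role of annocfg.

```python
# Toy sketch, not angr code: statements are (tag, destination, sources) tuples
# invented for this example; real VEX statements are objects.

IGNORED_TAGS = {"IMark"}  # assumption: only instruction markers are skipped here

def build_block_dfg(statements, whitelist=None):
    """Build {source: set(destinations)} for one basic block.

    whitelist: optional set of statement indices to keep (None keeps all),
    mimicking the role of the annotated CFG.
    """
    graph = {}
    for idx, (tag, dest, sources) in enumerate(statements):
        if tag in IGNORED_TAGS:
            continue
        if whitelist is not None and idx not in whitelist:
            continue
        # Data flows from every operand of the statement to its destination.
        for src in sources:
            graph.setdefault(src, set()).add(dest)
    return graph

def build_dfgs(blocks, whitelists=None):
    """Map every block address to its own data-flow graph."""
    whitelists = whitelists or {}
    return {addr: build_block_dfg(stmts, whitelists.get(addr))
            for addr, stmts in blocks.items()}

blocks = {
    0x1000: [
        ("IMark", None, []),
        ("WrTmp", "t0", ["rax"]),
        ("WrTmp", "t1", ["t0", "0x1"]),
        ("Put", "rbx", ["t1"]),
    ],
}
dfgs = build_dfgs(blocks)
print(dfgs[0x1000])
```

Passing a whitelist such as {0x1000: {1}} keeps only statement 1 of that block, which is the same idea as restricting the DFG to the statements selected by a backward slice.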
def _construct(self):
    """
    We want to build the type of DFG that's used in "Automated Ident. of Crypto
    Primitives in Binary Code with Data Flow Graph Isomorphisms." Unlike that
    paper, however, we're building it on Vex IR instead of assembly instructions.
    """
    cfg = self._cfg
    p = self.project
    dfgs = {}

    l.debug("Building Vex DFG...")

    for node in cfg.nodes():  # iterate over every CFG node
        try:
            if node.simprocedure_name == None:
                irsb = p.factory.block(node.addr).vex  # lift the node's block to a VEX IRSB
            else:
                l.debug("Cannot process SimProcedures, ignoring %s" % node.simprocedure_name)
                continue
        except Exception as e:
            l.debug(e)
            continue

        tmpsnodes = {}
        storesnodes = {}
        putsnodes = {}
        statements = irsb.statements  # all statements of the IRSB
        dfg = DiGraph()

        for stmt_idx, stmt in enumerate(statements):  # iterate over every statement
            # We want to skip over certain types, such as Imarks
            if self._need_to_ignore(node.addr, stmt, stmt_idx):
                continue

            # break statement down into sub-expressions