Data Structure Analysis (DSA)
Full Paper: [Data Structure Analysis: A Fast and Scalable Context-Sensitive Heap Analysis (2003)]
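The DSA algorithm (short for Data Structure Analysis) is a context-sensitive, interprocedural data structure analysis proposed by Chris Lattner, the founder of LLVM, in the papers and theses from his master's and doctoral work. Its strength is that it can analyze complex languages with pointer types, such as C, while remaining efficient in practice (the paper "Data Structure Analysis: An Efficient Context-Sensitive Heap Analysis" is available at http://llvm.org/pubs/).
The implementation of DSA currently lives in LLVM's poolalloc project (the poolalloc source can be obtained via SVN from http://llvm.org/svn/llvm-project/poolalloc/trunk). poolalloc is a powerful pool allocation framework built on DSA, and Chris Lattner's papers describe it in detail. According to the source comments, the DSA implementation is not yet stable and is still changing heavily.
DSA is implemented on top of LLVM's intermediate representation (LLVM IR), which preserves as much type information as possible. This is an important precondition for DSA to work (see http://llvm.org/docs/LangRef.html for a detailed description of LLVM IR).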
Summary
This paper describes a scalable heap analysis algorithm, Data Structure Analysis, designed to enable analyses and transformations of programs at the level of entire logical data structures. Data Structure Analysis attempts to identify disjoint instances of logical program data structures and their internal and external connectivity properties (without trying to categorize their “shape”). To achieve this, Data Structure Analysis is fully context-sensitive (in the sense that it names memory objects by entire acyclic call paths), is field-sensitive, builds an explicit model of the heap, and is robust enough to handle the full generality of C. Despite these aggressive features, the algorithm is both extremely fast (requiring 2-7 seconds for C programs in the range of 100K lines of code) and scalable in practice. It has three features we believe are novel: (a) it incrementally builds a precise program call graph during the analysis; (b) it distinguishes complete and incomplete information in a manner that simplifies analysis of libraries or other portions of programs; and (c) it uses speculative field-sensitivity in type-unsafe programs in order to preserve efficiency and scalability. Finally, it shows that the key to achieving scalability in a fully context-sensitive algorithm is the use of a unification-based approach, a combination that has been used before but whose importance has not been clearly articulated.
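To make the "unification-based" and "speculative field-sensitivity" ideas in the abstract more concrete, here is a minimal toy sketch. It is not the poolalloc implementation or its API: the names `Node`, `find`, `add_edge`, `unify`, `collapse`, and `points_to` are all hypothetical, and the model is reduced to nodes with per-offset outgoing edges. Merging two nodes merges the targets of matching fields recursively, and a node that is assumed to have been accessed in a type-inconsistent way is collapsed so that all of its fields are treated as one.

```python
class Node:
    """Toy DS-graph node (hypothetical): a memory object with per-field edges."""

    def __init__(self, name):
        self.name = name
        self.parent = self       # union-find parent pointer
        self.fields = {}         # field offset -> target Node
        self.collapsed = False   # True once fields are no longer kept apart


def find(node):
    """Return the representative of node's equivalence class (path halving)."""
    while node.parent is not node:
        node.parent = node.parent.parent
        node = node.parent
    return node


def add_edge(node, offset, target):
    """Record that the field at `offset` of `node` may point to `target`."""
    node = find(node)
    if node.collapsed:
        offset = 0               # a collapsed node has a single merged field
    existing = node.fields.get(offset)
    if existing is None:
        node.fields[offset] = find(target)
    else:
        unify(existing, target)  # one target per field: merge instead of adding


def collapse(node):
    """Give up field-sensitivity for this node and merge all field targets."""
    node = find(node)
    if node.collapsed:
        return
    node.collapsed = True
    old, node.fields = node.fields, {}
    for target in old.values():
        add_edge(node, 0, target)


def unify(a, b):
    """Merge two nodes and, recursively, the targets of their matching fields."""
    a, b = find(a), find(b)
    if a is b:
        return a
    must_collapse = a.collapsed or b.collapsed
    b.parent = a
    moved, b.fields = b.fields, {}
    for offset, target in moved.items():
        add_edge(a, offset, target)
    if must_collapse:
        collapse(a)
    return find(a)


def points_to(node, offset):
    """Return the representative target of a field, or None if there is none."""
    node = find(node)
    target = node.fields.get(0 if node.collapsed else offset)
    return find(target) if target is not None else None


# Two list cells, each with a payload at offset 0 and a `next` pointer at offset 8.
list1, list2 = Node("list1"), Node("list2")
data1, next1, next2 = Node("data1"), Node("next1"), Node("next2")
add_edge(list1, 0, data1)
add_edge(list1, 8, next1)
add_edge(list2, 8, next2)

unify(list1, list2)          # e.g. both cells flow into the same pointer
fields_kept_apart = points_to(list1, 0) is not points_to(list1, 8)

collapse(list1)              # assumed: a type-inconsistent access was seen
fields_merged = find(data1) is find(next1)
```

In this sketch every field has at most one target, so the graph never grows when nodes are merged; that is the property the abstract credits for scalability. The collapse step trades precision for safety only on nodes where the type-based field assumption is known to fail.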
1. Overview
Alias Analysis

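The DSA algorithm, proposed by Chris Lattner, is an efficient, context-sensitive data structure analysis that is particularly suited to complex languages such as C and C++. This article introduces the principles of DSA in depth, including its three core features: construction of a precise program call graph, the distinction between complete and incomplete information, and speculative field-sensitivity for type-unsafe programs. DSA builds data structure graphs (DS graphs) in order to analyze and transform disjoint instances of complex data structures such as lists, heaps, or graphs. The algorithm runs in three phases: local graph construction, bottom-up analysis, and top-down analysis. Experimental results show that DSA performs well in practice, handling large programs quickly and with low memory consumption.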
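The ordering of the bottom-up and top-down phases can be sketched on a toy call graph. This is only an illustration under simplifying assumptions: the call graph is given up front and is acyclic (recursion is ignored), and the function names in `call_graph` as well as the helper `callees_first` are hypothetical. The sketch shows the visiting order only, not the graph cloning and merging done in each phase.

```python
def callees_first(call_graph, roots):
    """Post-order over an acyclic call graph: each callee precedes its callers."""
    order, seen = [], set()

    def visit(fn):
        if fn in seen:
            return
        seen.add(fn)
        for callee in call_graph.get(fn, ()):
            visit(callee)
        order.append(fn)         # appended only after all of its callees

    for root in roots:
        visit(root)
    return order


# Hypothetical program: main calls parse and eval, and both call lookup.
call_graph = {
    "main": ["parse", "eval"],
    "parse": ["lookup"],
    "eval": ["lookup"],
    "lookup": [],
}

# Bottom-up phase: a function is processed after the functions it calls,
# so callee information is available when the caller is handled.
bottom_up_order = callees_first(call_graph, ["main"])

# Top-down phase: the reverse order, so caller information is available
# when each callee is handled.
top_down_order = list(reversed(bottom_up_order))
```

The local phase needs no particular order, since each function's local graph is built without looking at any other function.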



