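1. Introduction
In the previous section we covered Python compilation and disassembly, and saw that the dis module can disassemble the PyCodeObject extracted from a compiled pyc file into human-readable bytecode. In this section we walk through the source code of the dis module in detail.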
2. How the dis module works
Official documentation: https://docs.python.org/2/library/dis.html
The dis module supports the analysis of CPython bytecode by disassembling it. The CPython bytecode which this module takes as an input is defined in the file Include/opcode.h and used by the compiler and the interpreter.
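In other words, the dis module supports the analysis of Python bytecode by disassembling it. Its input can be either already compiled binary data or Python source code.
dis can translate a Python source file, a class or function held in memory, or a deserialized PyCodeObject into the corresponding bytecode listing for analysis.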
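2.1 Disassembling a source file with dis:
When a source file is given to the dis module as input, dis compiles it and directly prints the bytecode of that file as text.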
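As a rough illustration of the source-file case, the sketch below compiles a piece of source text and disassembles the resulting module-level code object. It assumes Python 3, where dis.dis accepts a file= keyword (in Python 2 the listing always goes to standard output). The sample source and the helper name disassemble_source are hypothetical and not part of the dis module. From the command line, running `python -m dis your_script.py` should produce a similar listing for a real file.

```python
import dis
import io

# Hypothetical sample source; any valid Python text works here.
SOURCE = "x = 1\ny = x + 2\nprint(y)\n"


def disassemble_source(source, filename="<example>"):
    """Compile source text to a module-level code object and return its listing."""
    code = compile(source, filename, "exec")
    buffer = io.StringIO()
    # The file= keyword is a Python 3 feature (3.4 and later).
    dis.dis(code, file=buffer)
    return buffer.getvalue()


print(disassemble_source(SOURCE))
```

The exact opcode names in the listing depend on the interpreter version, so the output is best read as an example rather than a fixed reference.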
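2.2 Disassembling an in-memory class or function with dis:
Passing an in-memory class or function, or even an ordinary variable of a type that dis can handle, to the dis function of the dis module also prints the compiled bytecode for that object.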
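The in-memory case can be sketched as follows, again assuming Python 3 for the file= keyword. The names add, Greeter, and listing_of are made up for this example. For a class, dis walks the methods it finds and prefixes each listing with a "Disassembly of ..." line.

```python
import dis
import io


def add(a, b):
    return a + b


class Greeter:
    def greet(self, name):
        return "hello " + name


def listing_of(obj):
    """Return the dis listing for a function, method, class, or code object."""
    buffer = io.StringIO()
    dis.dis(obj, file=buffer)
    return buffer.getvalue()


print(listing_of(add))      # bytecode of a single function
print(listing_of(Greeter))  # bytecode of every method defined on the class
```

Objects that carry no bytecode, such as a plain integer, are rejected by dis.dis with a TypeError, so "ordinary variable" here only covers types that dis knows how to disassemble.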
2.3 Disassembling a PyCodeObject with dis:
This is the form most often used when reverse engineering Python programs or analyzing pyc files.
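A minimal sketch of this workflow, assuming Python 3: a code object is serialized with marshal, deserialized again, and then handed to dis. This stands in for the body of a pyc file, which stores a marshalled code object after a header whose length depends on the Python version. Header handling is deliberately left out here, and the helper name roundtrip_and_disassemble is hypothetical.

```python
import dis
import io
import marshal


def roundtrip_and_disassemble(source):
    """Serialize a code object with marshal, load it back, and disassemble it."""
    code = compile(source, "<example>", "exec")
    payload = marshal.dumps(code)      # bytes comparable to a pyc body (no header)
    restored = marshal.loads(payload)  # the deserialized code object
    buffer = io.StringIO()
    dis.dis(restored, file=buffer)     # file= requires Python 3.4 or later
    return restored, buffer.getvalue()


restored, listing = roundtrip_and_disassemble("answer = 6 * 7\n")
print(type(restored).__name__)
print(listing)
```

When working with a real pyc file, the header bytes have to be skipped before calling marshal.loads, and the file should be loaded with the same Python version that produced it, since the marshal format and the bytecode both change between versions.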
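2.4 Calling dis with no arguments:
If dis.dis is called with no arguments, it disassembles the last traceback by default, that is, the code that was on the stack when the current Python shell last raised an error.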
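The no-argument form relies on the traceback that an interactive shell remembers after an uncaught error, which is generally not available inside a script. The sketch below therefore passes a traceback explicitly to dis.distb, which is what dis.dis() falls back to when it receives no argument. It assumes Python 3, and the functions divide and disassemble_failure are hypothetical examples.

```python
import dis
import io
import sys


def divide(a, b):
    return a / b


def disassemble_failure():
    """Trigger an error and return the disassembly of the frame that raised it."""
    try:
        divide(1, 0)
    except ZeroDivisionError:
        tb = sys.exc_info()[2]      # traceback of the exception being handled
        buffer = io.StringIO()
        dis.distb(tb, file=buffer)  # disassembles the innermost frame's code
        return buffer.getvalue()


print(disassemble_failure())
```

In the listing, the instruction that was executing when the exception occurred is marked with an arrow (-->), which makes it easy to see which bytecode triggered the error.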
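3. A tour of the dis module
The dis module provides a number of functions and attributes; their usage is summarized in the table below:
Function or attribute
Description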
dis.dis([bytesource])
Disassemble the bytesource object. bytesource can denote either a module, a class, a method, a function, or a code object. For a module, it disassembles all functions. For a class, it disassembles all methods. For a single code sequence, it prints one line per bytecode instruction. If no object is provided, it disassembles the last traceback.
dis.distb([tb])
Disassembles the top-of-stack function of a traceback, using the last traceback if none was passed. The instruction causing the exception is indicated.
dis.disassemble(code[, lasti])
Disassembles a code object, indicating the last instruction if lasti was provided.
dis.disco(code[, lasti])
A synonym for disassemble(). It is more convenient to type, and kept for compatibility with earlier Python releases.
dis.findlinestarts(code)
This generator function uses the co_firstlineno and co_lnotab attributes of the code obj