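The class file is the input to the JVM, and it is what makes the JVM independent of both the platform and the source language. The Java Virtual Machine Specification defines the structure of the class file.
- Overview of the class file format
Let's first look at the ClassFile structure (taken from the Java Virtual Machine Specification):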
ClassFile {
    u4             magic;
    u2             minor_version;
    u2             major_version;
    u2             constant_pool_count;
    cp_info        constant_pool[constant_pool_count-1];
    u2             access_flags;
    u2             this_class;
    u2             super_class;
    u2             interfaces_count;
    u2             interfaces[interfaces_count];
    u2             fields_count;
    field_info     fields[fields_count];
    u2             methods_count;
    method_info    methods[methods_count];
    u2             attributes_count;
    attribute_info attributes[attributes_count];
}
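The items of the ClassFile structure have the following meanings:
magic (magic number), minor_version (minor version number), major_version (major version number)
constant_pool_count (constant pool count)
constant_pool[] (constant pool)
access_flags (access flags)
this_class (index of this class)
super_class (index of the superclass)
interfaces_count (interface count)
interfaces[] (interface table)
fields_count (field count)
fields[] (field table)
methods_count (method count)
methods[] (method table)
attributes_count (attribute count)
attributes[] (attribute table)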
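To make the ordering of the first few items concrete, here is a minimal sketch that reads magic, minor_version, major_version and constant_pool_count from the start of a class file. The class name ClassFileHeader and its parse method are hypothetical helpers, not a JDK API, and the ten sample bytes are an assumption modelled on a Java 8 class file whose constant pool has 52 entries.

```java
import java.nio.ByteBuffer;

// Hypothetical helper (not a JDK API): reads the first four ClassFile items.
final class ClassFileHeader {
    final int magic;
    final int minorVersion;
    final int majorVersion;
    final int constantPoolCount;

    private ClassFileHeader(int magic, int minorVersion, int majorVersion, int constantPoolCount) {
        this.magic = magic;
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
        this.constantPoolCount = constantPoolCount;
    }

    static ClassFileHeader parse(byte[] classBytes) {
        // ByteBuffer is big-endian by default, which matches the class file byte order.
        ByteBuffer buf = ByteBuffer.wrap(classBytes);
        int magic = buf.getInt();                 // u4 magic
        if (magic != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file: bad magic number");
        }
        int minor = buf.getShort() & 0xFFFF;      // u2 minor_version
        int major = buf.getShort() & 0xFFFF;      // u2 major_version
        int cpCount = buf.getShort() & 0xFFFF;    // u2 constant_pool_count
        return new ClassFileHeader(magic, minor, major, cpCount);
    }

    public static void main(String[] args) {
        // Assumed sample: magic, minor 0, major 52 (0x34), constant_pool_count 53 (0x35).
        byte[] firstTenBytes = {
            (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE,
            0x00, 0x00, 0x00, 0x34, 0x00, 0x35
        };
        ClassFileHeader h = parse(firstTenBytes);
        System.out.printf("magic=%08X minor=%d major=%d constant_pool_count=%d%n",
                h.magic, h.minorVersion, h.majorVersion, h.constantPoolCount);
    }
}
```

The mask `& 0xFFFF` is needed because Java has no unsigned 16-bit type; without it a u2 value above 32767 would come out negative.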
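The raw bytes of the class file look like this (hex dump figure; the dump begins with CA FE BA BE):
- A class file is a stream of 8-bit bytes in which all data items are packed tightly, one after another, in a fixed order. For example, the first byte of the dump above, CA, is 1100 1010 in binary: 8 bits, i.e. one byte.
- A data item that occupies more than 8 bits is split into several bytes and stored with the high-order byte first (big-endian).
- The class file format has only two kinds of data: unsigned numbers and tables.
- Unsigned numbers: the basic data types. u1, u2, u4 and u8 denote unsigned values of 1, 2, 4 and 8 bytes. For example, the u4 magic on the first line of ClassFile corresponds to the bytes CA FE BA BE.
- Tables: composite data types made up of unsigned numbers and other tables; their names conventionally end in "_info".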
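The big-endian rule above can be illustrated with a few lines of code. This is only a sketch: BigEndian and readUnsigned are hypothetical names chosen for the example, and the four input bytes are the magic number mentioned above.

```java
// Hypothetical helper: assembles u1/u2/u4 values from bytes stored high-order byte first.
final class BigEndian {
    // Combines `width` bytes starting at `offset` into one value.
    // Note: a u8 with its top bit set would show up as a negative long.
    static long readUnsigned(byte[] data, int offset, int width) {
        long value = 0;
        for (int i = 0; i < width; i++) {
            // Shift what we have so far one byte to the left, then append the next byte.
            value = (value << 8) | (data[offset + i] & 0xFF);
        }
        return value;
    }

    public static void main(String[] args) {
        byte[] magic = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE};
        System.out.println(Long.toHexString(readUnsigned(magic, 0, 1))); // u1
        System.out.println(Long.toHexString(readUnsigned(magic, 0, 2))); // u2
        System.out.println(Long.toHexString(readUnsigned(magic, 0, 4))); // u4
        System.out.println(Integer.toBinaryString(0xCA));                // the 8 bits of CA
    }
}
```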
- Reading a class file
Create the following code:
public class Test {
private static String params = "Hello World";
public static void main(String[] args) {
System.out.println("params = " + params);
}
}
Inspect the generated class file by disassembling it with javap (verbose mode, javap -v):
Classfile /F:/crazy/freedom/ifd-jvm/target/classes/cn/crazy/dreamer/Test.class
Last modified 2020-10-12; size 861 bytes
MD5 checksum 03940161a851df384ccc9ab26db9046c
Compiled from "Test.java"
public class cn.crazy.dreamer.Test
minor version: 0
major version: 52 // depends on the JDK version (52 = Java 8)
flags: ACC_PUBLIC, ACC_SUPER // access flags: public, super
Constant pool: // the constant pool
#1 = Methodref #12.#30 // java/lang/Object."<init>":()V
#2 = Fieldref #31.#32 // java/lang/System.out:Ljava/io/PrintStream;
#3 = Class #33 // java/lang/StringBuilder
#4 = Methodref #3.#30 // java/lang/StringBuilder."<init>":()V
#5 = String #34 // params =
#6 = Methodref #3.#35 // java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
#7 = Fieldref #11.#36 // cn/crazy/dreamer/Test.params:Ljava/lang/String;
#8 = Methodref #3.#37 // java/lang/StringBuilder.toString:()Ljava/lang/String;
#9 = Methodref #38.#39 // java/io/PrintStream.println:(Ljava/lang/String;)V
#10 = String #40 // Hello World
#11 = Class #41 // cn/crazy/dreamer/Test
#12 = Class #42 // java/lang/Object
#13 = Utf8 params
#14 = Utf8 Ljava/lang/String;
#15 = Utf8 <init>
#16 = Utf8 ()V
#17 = Utf8 Code
#18 = Utf8 LineNumberTable
#19 = Utf8 LocalVariableTable
#20 = Utf8 this
#21 = Utf8 Lcn/crazy/dreamer/Test;
#22 = Utf8 main
#23 = Utf8 ([Ljava/lang/String;)V
#24 = Utf8 args
#25 = Utf8 [Ljava/lang/String;
#26 = Utf8 MethodParameters
#27 = Utf8 <clinit>
#28 = Utf8 SourceFile
#29 = Utf8 Test.java
#30 = NameAndType #15:#16 // "<init>":()V
#31 = Class #43 // java/lang/System
#32 = NameAndType #44:#45 // out:Ljava/io/PrintStream;
#33 = Utf8 java/lang/StringBuilder
#34 = Utf8 params =
#35 = NameAndType #46:#47 // append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
#36 = NameAndType #13:#14 // params:Ljava/lang/String;
#37 = NameAndType #48:#49 // toString:()Ljava/lang/String;
#38 = Class #50 // java/io/PrintStream
#39 = NameAndType #51:#52 // println:(Ljava/lang/String;)V
#40 = Utf8 Hello World
#41 = Utf8 cn/crazy/dreamer/Test
#42 = Utf8 java/lang/Object
#43 = Utf8 java/lang/System
#44 = Utf8 out
#45 = Utf8 Ljava/io/PrintStream;
#46 = Utf8 append
#47 = Utf8 (Ljava/lang/String;)Ljava/lang/StringBuilder;
#48 = Utf8 toString
#49 = Utf8 ()Ljava/lang/String;
#50 = Utf8 java/io/PrintStream
#51 = Utf8 println
#52 = Utf8 (Ljava/lang/String;)V
{
public cn.crazy.dreamer.Test(); // no-argument constructor
descriptor: ()V
flags: ACC_PUBLIC
Code:
stack=1, locals=1, args_size=1
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
LineNumberTable:
line 11: 0
LocalVariableTable:
Start Length Slot Name Signature
0 5 0 this Lcn/crazy/dreamer/Test;
public static void main(java.lang.String[]); // the main method
descriptor: ([Ljava/lang/String;)V
flags: ACC_PUBLIC, ACC_STATIC
Code:
stack=3, locals=1, args_size=1
0: getstatic #2 // Field java/lang/System.out:Ljava/io/PrintStream;
3: new #3 // class java/lang/StringBuilder
6: dup
7: invokespecial #4 // Method java/lang/StringBuilder."<init>":()V
10: ldc #5 // String params =
12: invokevirtual #6 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
15: getstatic #7 // Field params:Ljava/lang/String;
18: invokevirtual #6 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
21: invokevirtual #8 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
24: invokevirtual #9 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
27: return
LineNumberTable:
line 16: 0
line 17: 27
LocalVariableTable:
Start Length Slot Name Signature
0 28 0 args [Ljava/lang/String;
MethodParameters:
Name Flags
args
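static {}; // static field initialization: the initializer of a non-constant static field is compiled into the class initializer (<clinit>), which javap shows as static {}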
descriptor: ()V
flags: ACC_STATIC
Code:
stack=1, locals=0, args_size=0
0: ldc #10 // String Hello World
2: putstatic #7 // Field params:Ljava/lang/String;
5: return
LineNumberTable:
line 13: 0
}
SourceFile: "Test.java"
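The listing above is produced by a JDK tool (javap), which parses the binary .class file according to the Java Virtual Machine Specification and prints it in a form that is easier for developers to read.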
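As a rough idea of what such a tool does with the constant pool, the sketch below walks the entries and prints them in a javap-like form. ConstantPoolDump is a hypothetical class, it handles only the tags that occur in the listing above plus the numeric ones, and the sample bytes in main are hand-written for the example rather than taken from Test.class.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: lists constant pool entries, dispatching on each entry's tag.
final class ConstantPoolDump {
    static List<String> dump(byte[] classBytes) {
        List<String> lines = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(classBytes))) {
            in.readInt();                       // magic
            in.readUnsignedShort();             // minor_version
            in.readUnsignedShort();             // major_version
            int count = in.readUnsignedShort(); // constant_pool_count
            for (int i = 1; i < count; i++) {   // indices run from 1 to count - 1
                int tag = in.readUnsignedByte();
                String text;
                switch (tag) {
                    case 1:  text = "Utf8 " + in.readUTF(); break; // u2 length + modified UTF-8
                    case 3:  text = "Integer " + in.readInt(); break;
                    case 4:  text = "Float " + in.readFloat(); break;
                    case 5:  text = "Long " + in.readLong(); break;
                    case 6:  text = "Double " + in.readDouble(); break;
                    case 7:  text = "Class #" + in.readUnsignedShort(); break;
                    case 8:  text = "String #" + in.readUnsignedShort(); break;
                    case 9:  text = "Fieldref " + pair(in, '.'); break;
                    case 10: text = "Methodref " + pair(in, '.'); break;
                    case 11: text = "InterfaceMethodref " + pair(in, '.'); break;
                    case 12: text = "NameAndType " + pair(in, ':'); break;
                    default:
                        throw new IllegalArgumentException("tag not handled in this sketch: " + tag);
                }
                lines.add("#" + i + " = " + text);
                if (tag == 5 || tag == 6) {
                    i++; // long and double entries occupy two constant pool indices
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    // Reads two u2 indices and formats them the way javap does, e.g. #12.#30 or #15:#16.
    private static String pair(DataInputStream in, char separator) throws IOException {
        return "#" + in.readUnsignedShort() + separator + "#" + in.readUnsignedShort();
    }

    public static void main(String[] args) {
        // Hand-written sample: header, constant_pool_count = 3, then Utf8 "Test" and Class #1.
        byte[] sample = {
            (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 52, 0, 3,
            1, 0, 4, 'T', 'e', 's', 't',
            7, 0, 1
        };
        dump(sample).forEach(System.out::println);
    }
}
```

DataInputStream.readUTF is used for Utf8 entries because it reads a two-byte length followed by modified UTF-8, which is the layout of CONSTANT_Utf8_info.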
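javap emits an informal "virtual machine assembly language" in the following format:
<index> <opcode> [<operand1> [<operand2>...]] [<comment>]
- <index> is the index of the instruction's opcode in the array that holds the bytes of the current method's Java Virtual Machine code; equivalently, it is the byte offset from the start of the method (the 0 in the code above)
- <opcode> is the mnemonic of the instruction (ldc in the code above)
- <operandN> are the operands of the instruction (#10 in the code above)
- <comment> is the end-of-line comment (// String Hello World in the code above)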
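The format above can be reproduced by a very small disassembler. The following is a sketch under stated assumptions: MiniDisassembler is a hypothetical class, it knows only four opcodes (ldc, getstatic, putstatic, return), and the six input bytes in main are the encoding of the static {} method from the listing, written out by hand.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: prints "<index>: <opcode> [<operand>]" for a handful of opcodes.
final class MiniDisassembler {
    static List<String> disassemble(byte[] code) {
        List<String> lines = new ArrayList<>();
        int pc = 0; // byte offset from the start of the method, i.e. the <index> column
        while (pc < code.length) {
            int opcode = code[pc] & 0xFF;
            switch (opcode) {
                case 0x12: // ldc: followed by a one-byte constant pool index
                    lines.add(pc + ": ldc #" + (code[pc + 1] & 0xFF));
                    pc += 2;
                    break;
                case 0xB2: // getstatic: followed by a two-byte constant pool index
                case 0xB3: // putstatic: same operand layout
                    int index = ((code[pc + 1] & 0xFF) << 8) | (code[pc + 2] & 0xFF);
                    String name = (opcode == 0xB2) ? "getstatic" : "putstatic";
                    lines.add(pc + ": " + name + " #" + index);
                    pc += 3;
                    break;
                case 0xB1: // return: no operands
                    lines.add(pc + ": return");
                    pc += 1;
                    break;
                default:
                    throw new IllegalArgumentException(
                            "opcode not handled in this sketch: 0x" + Integer.toHexString(opcode));
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // ldc #10; putstatic #7; return
        byte[] clinit = {0x12, 0x0A, (byte) 0xB3, 0x00, 0x07, (byte) 0xB1};
        disassemble(clinit).forEach(System.out::println);
    }
}
```

The indices 0, 2 and 5 fall out of the instruction lengths: ldc takes two bytes and putstatic three, which is why the index column in javap output is not consecutive.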
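Points to keep in mind when reading a class file:
- constant_pool_count: constant pool indices start at 1, not 0. Valid indices run from 1 to constant_pool_count - 1, so the count is one greater than the highest index (the listing above ends at #52, so the count is 53)
- Constant types are distinguished by a tag, and the info structure that follows the tag differs from type to type
- In descriptors, L introduces an object type (L<class name>;), [ introduces an array, and V means void
- stack: the maximum depth of the operand stack while the method executes
- locals: the storage required for local variables, measured in slots (a slot is the smallest unit the virtual machine uses when allocating memory for local variables)
- args_size: the number of arguments. An instance method is implicitly passed this, so even a no-argument instance method (such as the constructor above) has args_size=1, and locals reserves one slot for it. For the static main method, args_size=1 counts the args parameter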
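The descriptor notation mentioned in the list (L, [ and V) is easy to decode mechanically. The sketch below turns a field or method descriptor into a Java-like type string; DescriptorDecoder and decode are hypothetical names, and the output layout "returnType (paramTypes)" is simply a choice made for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: decodes JVM descriptors such as ([Ljava/lang/String;)V.
final class DescriptorDecoder {
    private final String descriptor;
    private int pos;

    private DescriptorDecoder(String descriptor) {
        this.descriptor = descriptor;
    }

    static String decode(String descriptor) {
        DescriptorDecoder d = new DescriptorDecoder(descriptor);
        if (!descriptor.startsWith("(")) {
            return d.nextType(); // field descriptor: a single type
        }
        d.pos = 1; // skip '('
        List<String> params = new ArrayList<>();
        while (descriptor.charAt(d.pos) != ')') {
            params.add(d.nextType());
        }
        d.pos++; // skip ')'
        String returnType = d.nextType();
        return returnType + " (" + String.join(", ", params) + ")";
    }

    // Reads one type starting at pos and advances pos past it.
    private String nextType() {
        char c = descriptor.charAt(pos++);
        switch (c) {
            case 'V': return "void";
            case 'B': return "byte";
            case 'C': return "char";
            case 'D': return "double";
            case 'F': return "float";
            case 'I': return "int";
            case 'J': return "long";
            case 'S': return "short";
            case 'Z': return "boolean";
            case '[': return nextType() + "[]"; // array of whatever type follows
            case 'L': {                         // object type: L<class name>;
                int end = descriptor.indexOf(';', pos);
                String name = descriptor.substring(pos, end).replace('/', '.');
                pos = end + 1;
                return name;
            }
            default:
                throw new IllegalArgumentException("unexpected descriptor character: " + c);
        }
    }

    public static void main(String[] args) {
        System.out.println(decode("([Ljava/lang/String;)V"));
        System.out.println(decode("Ljava/io/PrintStream;"));
        System.out.println(decode("(Ljava/lang/String;)Ljava/lang/StringBuilder;"));
    }
}
```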
Taking the following fragment as an example, let's read through it:
static {};
descriptor: ()V
flags: ACC_STATIC
Code:
stack=1, locals=0, args_size=0
0: ldc #10 // String Hello World
2: putstatic #7 // Field params:Ljava/lang/String;
5: return
LineNumberTable:
line 13: 0
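ldc #10: pushes the constant "Hello World" onto the operand stack (#10 is its index in the constant pool)
putstatic #7: pops that value and stores it in the static field params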
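In source terms, this means a static field initializer behaves like an assignment inside a static block. The two hypothetical classes below (names invented for this example) are a sketch of that equivalence; for a non-constant static field like this one, both forms are expected to compile to the same ldc / putstatic / return sequence in static {}.

```java
// Form used in the article: the initializer sits on the field declaration.
final class FieldInitializerForm {
    private static String params = "Hello World";

    static String value() {
        return params;
    }
}

// Equivalent form: the assignment is written explicitly in a static block.
final class StaticBlockForm {
    private static String params;

    static {
        params = "Hello World";
    }

    static String value() {
        return params;
    }
}

final class ClinitDemo {
    public static void main(String[] args) {
        System.out.println(FieldInitializerForm.value());
        System.out.println(StaticBlockForm.value());
    }
}
```

Running javap -v on both classes is a quick way to compare their static {} sections.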
Now a rough read of the disassembled main method:
public static void main(java.lang.String[]);
descriptor: ([Ljava/lang/String;)V
flags: ACC_PUBLIC, ACC_STATIC
Code:
stack=3, locals=1, args_size=1
0: getstatic #2 // Field java/lang/System.out:Ljava/io/PrintStream;
3: new #3 // class java/lang/StringBuilder
6: dup
7: invokespecial #4 // Method java/lang/StringBuilder."<init>":()V
10: ldc #5 // String params =
12: invokevirtual #6 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
15: getstatic #7 // Field params:Ljava/lang/String;
18: invokevirtual #6 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
21: invokevirtual #8 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
24: invokevirtual #9 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
27: return
LineNumberTable:
line 16: 0
line 17: 27
LocalVariableTable:
Start Length Slot Name Signature
0 28 0 args [Ljava/lang/String;
MethodParameters:
Name Flags
args
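getstatic #2: pushes the static field System.out (a PrintStream) onto the operand stack
new #3: creates a StringBuilder object (not yet initialized) and pushes a reference to it
dup: duplicates the reference on top of the stack
invokespecial #4: calls the constructor (<init>) to initialize the StringBuilder object
ldc #5: pushes the string constant "params = "
invokevirtual #6: calls StringBuilder.append
getstatic #7: reads the static field params
invokevirtual #6: calls StringBuilder.append again, this time with the value of params
invokevirtual #8: calls StringBuilder.toString
invokevirtual #9: calls PrintStream.println with the resulting string
return: returns from main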
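Read back into source form, the instruction sequence corresponds to an explicit StringBuilder chain. The sketch below places the two forms side by side; ConcatDemo and its method names are hypothetical. It reflects the Java 8 (major version 52) output shown in this article; javac in JDK 9 and later typically compiles the + operator to an invokedynamic call instead, so the bytecode there looks different even though the result is the same.

```java
// Hypothetical demo: string concatenation as written, and as the Java 8 bytecode performs it.
final class ConcatDemo {
    private static String params = "Hello World";

    // What the Java source says.
    static String viaPlus() {
        return "params = " + params;
    }

    // What the disassembled main method does, written out by hand.
    static String viaBuilder() {
        StringBuilder sb = new StringBuilder(); // new #3, dup, invokespecial #4
        sb.append("params = ");                 // ldc #5, invokevirtual #6
        sb.append(params);                      // getstatic #7, invokevirtual #6
        return sb.toString();                   // invokevirtual #8
    }

    public static void main(String[] args) {
        System.out.println(viaPlus());    // getstatic #2 ... invokevirtual #9
        System.out.println(viaBuilder());
    }
}
```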
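- Running javap from IDEA
Having to change into the directory that contains the .class file every time you run javap is tedious; javap can be run from IDEA instead.
Go to "Settings" -> "Tools" -> "External Tools" and click "+" to add an external tool.
On the edit page, configure it as shown in the figure:
Right-click the Java file (it must already have been compiled) and proceed as shown in the figure; the class information is then printed in IDEA's console window.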
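The class file format is described in detail in the Java Virtual Machine Specification. It is worth downloading the official documents from the Oracle site (https://docs.oracle.com/javase/specs/index.html); alternatively, buy a book on the JVM, such as the Chinese edition of The Java Virtual Machine Specification (Java 虚拟机规范, translated by 爱飞翔, 周志明 et al.).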