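I. What Are Serialization and Deserialization
Serialization is the process of converting an object into a sequence of bytes; deserialization is the process of restoring an object from that byte sequence.
A class whose instances are serialized must implement the Serializable or Externalizable interface.
Object serialization has two main uses:
1. Persisting an object's byte sequence to disk, usually in a file;
2. Transmitting an object's byte sequence over a network.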
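The round trip described above can be sketched as follows. This is a minimal illustration, not part of the original article: the `RoundTripDemo` and `User` names are hypothetical, and an in-memory byte array stands in for the file or network connection.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class RoundTripDemo {
    // Hypothetical example class; any Serializable type behaves the same way.
    static class User implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final int age;

        User(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // Object -> byte sequence
    static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    // Byte sequence -> object
    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        User copy = (User) deserialize(serialize(new User("alice", 30)));
        System.out.println(copy.name + " " + copy.age); // prints "alice 30"
    }
}
```

Replacing the byte array streams with `FileOutputStream`/`FileInputStream` or a socket's streams gives the two uses listed above.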
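II. What Is serialVersionUID
The serialization runtime associates a version number with each serializable class. During deserialization it is used to check whether the class described in the stream and the locally loaded class are compatible.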
The serialization runtime associates with each serializable class a version number, called a serialVersionUID, which is used during deserialization to verify that the sender and receiver of a serialized object have loaded classes for that object that are compatible with respect to serialization. If the receiver has loaded a class for the object that has a different serialVersionUID than that of the corresponding sender's class, then deserialization will result in an InvalidClassException. A serializable class can declare its own serialVersionUID explicitly by declaring a field named "serialVersionUID" that must be static, final, and of type long:
ANY-ACCESS-MODIFIER static final long serialVersionUID = 42L;
If a serializable class does not explicitly declare a serialVersionUID, then the serialization runtime will calculate a default serialVersionUID value for that class based on various aspects of the class, as described in the Java™ Object Serialization Specification. However, it is strongly recommended that all serializable classes explicitly declare serialVersionUID values, since the default serialVersionUID computation is highly sensitive to class details that may vary depending on compiler implementations, and can thus result in unexpected InvalidClassExceptions during deserialization. Therefore, to guarantee a consistent serialVersionUID value across different java compiler implementations, a serializable class must declare an explicit serialVersionUID value. It is also strongly advised that explicit serialVersionUID declarations use the private modifier where possible, since such declarations apply only to the immediately declaring class; serialVersionUID fields are not useful as inherited members. Array classes cannot declare an explicit serialVersionUID, so they always have the default computed value, but the requirement for matching serialVersionUID values is waived for array classes.
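If a class does not declare one explicitly, its serialVersionUID is a hash computed from the class's name, interfaces, methods, and fields using the Secure Hash Algorithm (SHA) as defined by the National Institute of Standards. For a class that does not implement Serializable, the serialVersionUID is 0L.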
Java™ Object Serialization Specification:
The lookupAny method behaves like the lookup method, except that it returns the descriptor for any class, regardless of whether it implements Serializable. The serialVersionUID of a class that does not implement Serializable is 0L.
The getSerialVersionUID method returns the serialVersionUID of this class. Refer to Section 4.6, "Stream Unique Identifiers." If not specified by the class, the value returned is a hash computed from the class's name, interfaces, methods, and fields using the Secure Hash Algorithm (SHA) as defined by the National Institute of Standards.
There are two common ways to choose a serialVersionUID:
1. Use the fixed default value 1L, for example:
private static final long serialVersionUID = 1L;
2. Use a value generated from the class name, interface names, methods, fields, and so on, for example:
private static final long serialVersionUID = 4603642343377807741L;
To read the serialVersionUID of a class:
ObjectStreamClass objectStreamClass = ObjectStreamClass.lookup(User.class);
objectStreamClass.getSerialVersionUID();
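A self-contained version of the lookup above, as a rough sketch. The nested `User` and `Plain` classes are hypothetical stand-ins; the example also exercises the `lookupAny` behavior quoted from the specification, where a class that does not implement Serializable reports 0L.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

class SerialVersionUidLookup {
    // Hypothetical serializable class with an explicit serialVersionUID.
    static class User implements Serializable {
        private static final long serialVersionUID = 42L;
        String name;
    }

    // Hypothetical class that does not implement Serializable.
    static class Plain {
    }

    // lookupAny returns a descriptor for any class, serializable or not.
    static long uidOf(Class<?> cl) {
        return ObjectStreamClass.lookupAny(cl).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println(uidOf(User.class));                       // prints 42
        System.out.println(ObjectStreamClass.lookup(Plain.class));   // prints null
        System.out.println(uidOf(Plain.class));                      // prints 0
    }
}
```

`lookup` returns null for `Plain` because it only produces descriptors for classes that implement Serializable or Externalizable.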
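Declaring serialVersionUID explicitly serves two purposes:
1. When different versions of a class should remain compatible for serialization, give all versions the same serialVersionUID.
2. When different versions of a class should not be compatible for serialization, give each version a different serialVersionUID.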
III. Serialization and Deserialization
1. The ObjectStreamClass class
ObjectStreamClass is serialization's descriptor for classes. It contains the name and serialVersionUID of the class. The ObjectStreamClass for a specific class loaded in this Java VM can be found or created using the lookup method.
Calling lookup constructs the descriptor. Its constructor uses reflection to find the private writeObject and readObject methods declared by the class (as well as readObjectNoData, writeReplace, and readResolve) and caches them in the ObjectStreamClass:
```java
// Excerpt from java.io.ObjectStreamClass#ObjectStreamClass(java.lang.Class<?>)
if (externalizable) {
    cons = getExternalizableConstructor(cl);
} else {
    cons = getSerializableConstructor(cl);
    writeObjectMethod = getPrivateMethod(cl, "writeObject",
        new Class<?>[] { ObjectOutputStream.class },
        Void.TYPE);
    readObjectMethod = getPrivateMethod(cl, "readObject",
        new Class<?>[] { ObjectInputStream.class },
        Void.TYPE);
    readObjectNoDataMethod = getPrivateMethod(
        cl, "readObjectNoData", null, Void.TYPE);
    hasWriteObjectData = (writeObjectMethod != null);
}
writeReplaceMethod = getInheritableMethod(
    cl, "writeReplace", null, Object.class);
readResolveMethod = getInheritableMethod(
    cl, "readResolve", null, Object.class);
```
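To make the reflective lookup of writeObject and readObject concrete, here is a small sketch of a class that declares both hooks. The `Account` class and its fields are hypothetical, and reversing the string is only a toy stand-in for whatever custom encoding a real class might apply.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class CustomHooksDemo {
    // Hypothetical class with custom serialization hooks.
    static class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        String owner;
        transient String token; // skipped by default serialization

        Account(String owner, String token) {
            this.owner = owner;
            this.token = token;
        }

        // Must be private, return void, and take an ObjectOutputStream to be found.
        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject(); // writes the non-transient field "owner"
            out.writeUTF(new StringBuilder(token).reverse().toString()); // toy encoding
        }

        // Must read back exactly what writeObject wrote, in the same order.
        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            token = new StringBuilder(in.readUTF()).reverse().toString();
        }
    }

    static Account roundTrip(Account account) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(account);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Account) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Account copy = roundTrip(new Account("alice", "secret"));
        System.out.println(copy.owner + " " + copy.token); // prints "alice secret"
    }
}
```

Without the two hooks, `token` would come back as null because it is transient.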
2. Serialization: writeObject()
Before writeObject() can be called, an ObjectOutputStream has to be created through its constructor, which looks like this:
public ObjectOutputStream(OutputStream out) throws IOException {
    verifySubclass();
    // bout is the underlying container for the byte data
    bout = new BlockDataOutputStream(out);
    handles = new HandleTable(10, (float) 3.00);
    subs = new ReplaceTable(10, (float) 3.00);
    enableOverride = false;
    writeStreamHeader(); // write the stream header
    bout.setBlockDataMode(true); // drain buffered bytes and switch to block data mode
    if (extendedDebugInfo) {
        debugInfoStack = new DebugTraceInfoStack();
    } else {
        debugInfoStack = null;
    }
}
// Writes the serialization magic number and the version number to the underlying byte container
protected void writeStreamHeader() throws IOException {
    bout.writeShort(STREAM_MAGIC);
    bout.writeShort(STREAM_VERSION);
}
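The header written by writeStreamHeader can be observed directly. The sketch below (class and method names are hypothetical) creates an ObjectOutputStream, writes no objects, and dumps the bytes that reach the underlying stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

class StreamHeaderDemo {
    // Returns what an ObjectOutputStream emits when no object is written.
    static byte[] headerOnly() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            // Intentionally empty: the constructor has already written the header.
        }
        return bytes.toByteArray();
    }

    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // STREAM_MAGIC is 0xACED and STREAM_VERSION is 5
        System.out.println(toHex(headerOnly())); // prints "ac ed 00 05"
    }
}
```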
Next, writeObject is called to write the object into the byte stream. The underlying implementation is:
private void writeObject0(Object obj, boolean unshared) throws IOException {
    // ...
    try {
        // ...
        Object orig = obj;
        // get the Class object of the object being serialized
        Class<?> cl = obj.getClass();
        ObjectStreamClass desc;
        for (;;) {
            Class<?> repCl;
            // create the ObjectStreamClass that describes cl
            desc = ObjectStreamClass.lookup(cl, true);
            // other code omitted
        }
        // ...
        // write the object differently depending on its actual type
        // remaining cases
        if (obj instanceof String) {
            writeString((String) obj, unshared);
        } else if (cl.isArray()) {
            writeArray(obj, desc, unshared);
        } else if (obj instanceof Enum) {
            writeEnum((Enum<?>) obj, desc, unshared);
        } else if (obj instanceof Serializable) {
            // the object being serialized implements Serializable
            writeOrdinaryObject(obj, desc, unshared);
        } else {
            if (extendedDebugInfo) {
                throw new NotSerializableException(
                    cl.getName() + "\n" + debugInfoStack.toString());
            } else {
                throw new NotSerializableException(cl.getName());
            }
        }
    } finally {
        depth--;
        bout.setBlockDataMode(oldMode);
    }
}
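The type dispatch in writeObject0 can be probed from the outside. The following sketch uses hypothetical names (`canSerialize`, `Plain`, `Holder`) and shows that strings, arrays, and enums are accepted, while an object that falls through to the final else branch triggers NotSerializableException, even when it is only reachable as a field of a serializable object.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class NotSerializableDemo {
    // Hypothetical class that does not implement Serializable.
    static class Plain {
        int x = 1;
    }

    // Serializable itself, but it references a non-serializable object.
    static class Holder implements Serializable {
        private static final long serialVersionUID = 1L;
        Plain plain = new Plain();
    }

    // Returns false when writeObject rejects the object graph.
    static boolean canSerialize(Object obj) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(canSerialize("text"));           // prints true
        System.out.println(canSerialize(new int[] {1, 2})); // prints true
        System.out.println(canSerialize(Thread.State.NEW)); // prints true
        System.out.println(canSerialize(new Plain()));      // prints false
        System.out.println(canSerialize(new Holder()));     // prints false
    }
}
```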
private void writeOrdinaryObject(Object obj, ObjectStreamClass desc, boolean unshared) throws IOException {
    // ...
    try {
        // verify that instances of the class may be serialized
        desc.checkSerialize();
        // ...
        // write the marker byte for an object
        bout.writeByte(TC_OBJECT);
        // write the class descriptor: class name, serialVersionUID, flags, and field descriptions
        writeClassDesc(desc, false);
        handles.assign(unshared ? null : obj);
        // write the instance data of the object being serialized
        // Externalizable extends Serializable. An Externalizable class writes only the fields it chooses,
        // and nothing is serialized by default. It must implement two methods: writeExternal to write
        // the fields and readExternal to read them back.
        if (desc.isExternalizable() && !desc.isProxy()) {
            writeExternalData((Externalizable) obj);
        } else {
            writeSerialData(obj, desc);
        }
    } finally {
        // ...
    }
}
private void writeSerialData(Object obj, ObjectStreamClass desc) throws IOException {
    ObjectStreamClass.ClassDataSlot[] slots = desc.getClassDataLayout();
    for (int i = 0; i < slots.length; i++) {
        ObjectStreamClass slotDesc = slots[i].desc;
        if (slotDesc.hasWriteObjectMethod()) {
            // ...
            try {
                curContext = new SerialCallbackContext(obj, slotDesc);
                bout.setBlockDataMode(true);
                // if the class declares a writeObject method, write the data the way that method specifies
                slotDesc.invokeWriteObject(obj, this);
                bout.setBlockDataMode(false);
                bout.writeByte(TC_ENDBLOCKDATA);
            } finally {
                // ...
            }
            // ...
        } else {
            defaultWriteFields(obj, slotDesc);
        }
    }
}
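The Externalizable branch taken in writeOrdinaryObject can be illustrated with a short sketch. The `Point` class is hypothetical; it writes only two of its three fields, so the third is not restored. Note that an Externalizable class needs a public no-argument constructor.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

class ExternalizableDemo {
    // Hypothetical class that controls exactly which fields are written.
    public static class Point implements Externalizable {
        private static final long serialVersionUID = 1L;
        int x;
        int y;
        String label;

        // Required: deserialization calls this public no-arg constructor.
        public Point() {
        }

        Point(int x, int y, String label) {
            this.x = x;
            this.y = y;
            this.label = label;
        }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeInt(x);
            out.writeInt(y);
            // label is deliberately not written
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
            x = in.readInt();
            y = in.readInt();
        }
    }

    static Point roundTrip(Point point) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(point);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Point) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Point copy = roundTrip(new Point(3, 4, "origin offset"));
        System.out.println(copy.x + " " + copy.y + " " + copy.label); // prints "3 4 null"
    }
}
```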
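3. Deserialization: readObject()
Deserialization parses the binary data in the serialized byte stream. Note that it first reads the type code, then resolves the corresponding class from the class descriptor and compares the serialVersionUID stored in the stream with the serialVersionUID of the local class. If the two differ, deserialization fails with an error such as: java.io.InvalidClassException: com.serializable.User; local class incompatible: stream classdesc serialVersionUID = 4316517312099648406, local class serialVersionUID = 6977261730381530552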
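The mismatch error can be reproduced in a single program by patching the serialVersionUID inside the serialized bytes, which simulates a sender that used a different class version. This is only a sketch: the names are hypothetical, and the byte offset assumes the usual layout for a top-level ordinary object (4-byte stream header, TC_OBJECT, TC_CLASSDESC, 2-byte name length, class name, then the 8-byte serialVersionUID) with an ASCII class name.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;

class UidMismatchDemo {
    // Hypothetical class whose local serialVersionUID is 1.
    static class User implements Serializable {
        private static final long serialVersionUID = 1L;
        String name = "alice";
    }

    static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    // Flips the lowest bit of the UID stored in the stream (assumed layout, see above).
    static byte[] withDifferentUid(byte[] data, Class<?> cl) {
        byte[] copy = data.clone();
        int nameLength = cl.getName().getBytes(StandardCharsets.UTF_8).length;
        int uidOffset = 4 + 1 + 1 + 2 + nameLength; // header, TC_OBJECT, TC_CLASSDESC, length, name
        copy[uidOffset + 7] ^= 0x01;                // last byte of the big-endian long
        return copy;
    }

    // Returns "ok" on success, or the exception message on a class mismatch.
    static String readAndDescribe(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            in.readObject();
            return "ok";
        } catch (InvalidClassException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] good = serialize(new User());
        System.out.println(readAndDescribe(good));
        System.out.println(readAndDescribe(withDifferentUid(good, User.class)));
    }
}
```

The second line printed contains "local class incompatible" together with the stream and local serialVersionUID values, in the same form as the error quoted above.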
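4. Serializing and deserializing a singleton
Add a readResolve method that returns the singleton instance; otherwise deserialization returns a new object, which breaks the singleton guarantee. (To see this, first leave out readResolve and check whether the object before serialization and the object after deserialization are the same instance.)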
import java.io.ObjectStreamException;
import java.io.Serializable;

public class MySingleton implements Serializable {
    String name;

    private MySingleton() {
        name = "Singleton";
        System.out.println("MySingleton is creating");
    }

    private static final MySingleton INSTANCE = new MySingleton();

    public static MySingleton getInstance() { return INSTANCE; }

    private Object readResolve() throws ObjectStreamException {
        // instead of the object we're on,
        // return the class variable INSTANCE
        System.out.println("MySingleton read resolve");
        return INSTANCE;
    }
}
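The check suggested above (compare the instance before and after a round trip, with and without readResolve) might look like the following. Both nested singleton classes are hypothetical and exist only for the comparison.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SingletonRoundTrip {
    // Singleton that protects itself with readResolve.
    static class SafeSingleton implements Serializable {
        private static final long serialVersionUID = 1L;
        private static final SafeSingleton INSTANCE = new SafeSingleton();

        private SafeSingleton() {
        }

        static SafeSingleton getInstance() {
            return INSTANCE;
        }

        // Replaces the freshly deserialized object with the existing instance.
        private Object readResolve() {
            return INSTANCE;
        }
    }

    // Same class without readResolve: deserialization creates a second object.
    static class LeakySingleton implements Serializable {
        private static final long serialVersionUID = 1L;
        private static final LeakySingleton INSTANCE = new LeakySingleton();

        private LeakySingleton() {
        }

        static LeakySingleton getInstance() {
            return INSTANCE;
        }
    }

    static Object roundTrip(Object obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip(SafeSingleton.getInstance()) == SafeSingleton.getInstance());   // prints true
        System.out.println(roundTrip(LeakySingleton.getInstance()) == LeakySingleton.getInstance()); // prints false
    }
}
```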
Underlying implementation:
java.io.ObjectInputStream
/**
 * Underlying readObject implementation.
 */
private Object readObject0(boolean unshared) throws IOException {
    // ... (code omitted; not relevant here)
    byte tc;
    // peek at the next byte of the stream to get the type code
    while ((tc = bin.peekByte()) == TC_RESET) {
        bin.readByte();
        handleReset();
    }
    depth++;
    try {
        switch (tc) {
            // ... (code omitted; not relevant here)
            case TC_OBJECT:
                return checkResolve(readOrdinaryObject(unshared));
            // ... (code omitted; not relevant here)
            default:
                throw new StreamCorruptedException(String.format("invalid type code: %02X", tc));
        }
    } finally {
        depth--;
        bin.setBlockDataMode(oldMode);
    }
}
private Object readOrdinaryObject(boolean unshared)
    throws IOException
{
    if (bin.readByte() != TC_OBJECT) {
        throw new InternalError();
    }
    // read the class descriptor
    ObjectStreamClass desc = readClassDesc(false);
    // verify that the class supports deserialization
    desc.checkDeserialize();
    Class<?> cl = desc.forClass();
    if (cl == String.class || cl == Class.class
            || cl == ObjectStreamClass.class) {
        throw new InvalidClassException("invalid class descriptor");
    }
    // instantiate an object of the class
    Object obj;
    try {
        obj = desc.isInstantiable() ? desc.newInstance() : null;
    } catch (Exception ex) {
        throw (IOException) new InvalidClassException(
            desc.forClass().getName(),
            "unable to create instance").initCause(ex);
    }
    passHandle = handles.assign(unshared ? unsharedMarker : obj);
    ClassNotFoundException resolveEx = desc.getResolveException();
    if (resolveEx != null) {
        handles.markException(passHandle, resolveEx);
    }
    // populate the fields of obj
    if (desc.isExternalizable()) {
        readExternalData((Externalizable) obj, desc);
    } else {
        readSerialData(obj, desc);
    }
    handles.finish(passHandle);
    // if the class defines a readResolve method, invoke it and return its result in place of obj
    if (obj != null &&
        handles.lookupException(passHandle) == null &&
        desc.hasReadResolveMethod())
    {
        Object rep = desc.invokeReadResolve(obj);
        if (unshared && rep.getClass().isArray()) {
            rep = cloneArray(rep);
        }
        if (rep != obj) {
            handles.setObject(passHandle, obj = rep);
        }
    }
    return obj;
}
IV. The static and transient Keywords
Serialization saves only the state of an object. Methods and class variables are not saved, so the values of static variables are never serialized, and fields marked transient are skipped as well. This is why writeObject and readObject cannot obtain static or transient fields from the stream:
private static ObjectStreamField[] getDefaultSerialFields(Class<?> cl) {
    Field[] clFields = cl.getDeclaredFields();
    ArrayList<ObjectStreamField> list = new ArrayList<>();
    // only fields that are neither static nor transient are serialized
    int mask = Modifier.STATIC | Modifier.TRANSIENT;
    for (int i = 0; i < clFields.length; i++) {
        if ((clFields[i].getModifiers() & mask) == 0) {
            list.add(new ObjectStreamField(clFields[i], false, true));
        }
    }
    int size = list.size();
    return (size == 0) ? NO_FIELDS :
        list.toArray(new ObjectStreamField[size]);
}
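The effect of this filter is visible through the public API. In the sketch below, the `Session` class and its fields are hypothetical; `ObjectStreamClass.getFields()` lists the fields that default serialization will write, and a round trip shows what happens to the static and transient ones.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.ObjectStreamField;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

class FieldFilterDemo {
    // Hypothetical class mixing instance, static, and transient fields.
    static class Session implements Serializable {
        private static final long serialVersionUID = 1L;
        static int openSessions = 0; // class state, never written to the stream
        String user;                 // ordinary instance state, written
        transient String password;   // explicitly excluded

        Session(String user, String password) {
            this.user = user;
            this.password = password;
        }
    }

    // Names of the fields that default serialization writes for cl.
    static List<String> serializableFieldNames(Class<?> cl) {
        List<String> names = new ArrayList<>();
        for (ObjectStreamField field : ObjectStreamClass.lookup(cl).getFields()) {
            names.add(field.getName());
        }
        return names;
    }

    static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    static Session deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Session) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Session.openSessions = 1;
        byte[] data = serialize(new Session("alice", "secret"));
        Session.openSessions = 7; // changed after the bytes were produced
        Session copy = deserialize(data);

        System.out.println(serializableFieldNames(Session.class)); // prints [user]
        System.out.println(copy.user + " " + copy.password);       // prints "alice null"
        System.out.println(Session.openSessions);                  // prints 7, not 1
    }
}
```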
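V. Other Ways to Implement Serialization
1. Wrap the object as a JSON or XML string for transmission
2. Use Google's ProtoBuf
protobuf is Google's platform-neutral, language-neutral, extensible format for serializing structured data. It defines a compact, extensible binary protocol format
that is well suited to data storage and to exchanging data between different applications and different languages. As long as both sides implement the same protocol format, that is, the same proto file is compiled for each language and added to each project, one language can parse data that another language serialized with protobuf.
3. dubbo RPC serialization options:
dubbo serialization: an efficient Java serialization implementation from Alibaba that is not yet mature; Alibaba does not recommend using it in production
hessian2 serialization: hessian is an efficient, cross-language binary serialization format. What dubbo uses is not the original hessian2 but hessian lite, a version modified by Alibaba; it is the serialization enabled by default for dubbo RPC
json serialization: there are currently two implementations, one based on Alibaba's fastjson library and one based on the simple JSON library built into dubbo. Neither is particularly mature, and a text format such as JSON generally performs worse than the two binary formats above.
java serialization: uses the Java serialization built into the JDK; its performance is poor.
VI. Serialization Frameworks
kryo VS hessian VS Protostuff VS java
Reference: 深入学习java序列化 (In-Depth Study of Java Serialization)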