Introduction:
Java serialization was initially used to support remote method invocation (RMI), allowing argument objects to be passed between two virtual machines.
RMI works best when the two VMs contain compatible versions of the class being transmitted, and can reliably transmit a binary representation of the object based on its internal state. When an object is serialized, it must also serialize the objects to which its fields refer - resulting in what is commonly called an object graph of connected components. Although the transient keyword can be used to control the extent to which the serialization process penetrates the object graph, this level of control is seldom enough.
Many have tried to use Java's serialization to achieve the so-called "long-term persistence" of data - where the serialized form of a Java data structure is written to a file for later use. One such area is the development tools domain, in which designs must be saved for later use. Because the logic that saves and restores serialized objects is based on the internal structure of the constituent classes, any changes to those classes between the time that the object was saved and when it was retrieved may cause the deserialization process to fail outright; for example, a field was added or removed, existing fields were renamed or reordered, or the class's superclass or package was altered. Such changes are to be expected during the development process, and any mechanism that relies on the internal structure of all classes being identical between versions to work has the odds stacked against it. Over the last few years the "versioning issues" associated with Java's serialization mechanism have indeed proved to be insurmountable and have led to widespread abandonment of Java's serialization as a viable long-term persistence strategy in the development tools space.
To tackle Java serialization problems, a Java Specification Request (JSR 57) was created, titled "Long-Term Persistence for JavaBeans." JSR 57 is included in JRE 1.4 and is part of the "java.beans" package. This article describes the mechanism with which the JSR solved the problems of long-term persistence, and how you can take control of the way that the XMLEncoder generates archives to represent the data in your application.
We'll start this section by dispelling two popular myths that have grown up around XML serialization: that it can only be used for JavaBeans and that all JavaBeans are GUI widgets. In fact, the XMLEncoder can support any public Java class; these classes don't have to be JavaBeans and they certainly don't have to be GUI widgets. The only constraint that the encoder places on the classes it can archive is that there must be a means to create and configure each instance through public method calls. If the class implements the getter/setter paradigm of the JavaBeans specification, the encoder can achieve its goal automatically - even for a class it knows nothing about. On top of this default behavior, the XMLEncoder comes with a small but very powerful API that allows it to be "taught" how to save instances of any class - even if they don't use any of the JavaBeans design patterns. In fact, most of the Swing classes deviate from the JavaBeans specification in some way and yet the XMLEncoder handles them via a set of rules with which it comes preconfigured. The XMLEncoder is currently spec'ed to provide automatic support for all subclasses of Component in the SDK and all of their property types (recursively). This means that as well as being able to serialize all of the AWT and Swing GUI widgets, the XMLEncoder can also serialize: primitive values (int, double, etc.), strings, dates, arrays, lists, hashtables (including all Collection classes), and many other classes that you might not think of as having anything to do with JavaBeans. The support for all these classes is not "hard-wired" into the XMLEncoder; instead it is provided to the encoder through the API that it exposes for general use. The variety in the APIs among even the small subset of classes mentioned earlier should give some idea of the generality and scope of the persistence techniques we will cover in the next sections.
Background
When problems are encountered with an object stream, they're hard to correct because the format is binary. An XML document is human readable, and therefore easier for a user to examine and manipulate when problems arise. To serialize objects to an XML document, use the class java.beans.XMLEncoder; to read objects, use the class java.beans.XMLDecoder.
One reason object streams are brittle is that they rely on the internal shape of the class remaining unchanged between encoding and decoding. The XMLEncoder takes a completely different approach here: instead of storing a bit-wise representation of the field values that make up an object's state, the XMLEncoder stores the steps necessary to create the object through its public API. There are two key factors that make XML files written this way remarkably robust when compared with their serialized counterparts.
First, many changes to a class's internal implementation can be made while preserving backward compatibility in its public APIs. In public libraries, this is often a requirement of new releases, as breaking a committed public API would break all the third-party code that had used the library in its older form. As a result, many software vendors have internal policies that prevent their developers from knowingly "breaking" any of the public APIs in new releases. While exceptions inevitably arise, they are on a much smaller scale than the internal changes that are made to the private implementations of the classes within the library. In this way, the XMLDecoder derives much of its resilience to versioning by aligning its requirements with those of developers who program against APIs directly.
The second reason for the stability of the decoding process as implemented by the XMLDecoder is just as important. If you were to take an instance of any class, choose an arbitrary member variable, and set it to null - the behavior of that instance would be completely undefined in all subsequent operations - and a bug-free implementation would be entitled to fail catastrophically under these circumstances. This is exactly what happens when a field is added to a new version of a class and this causes people to cross their fingers when trying to deserialize an instance of a class that was written out with an older version. The XMLEncoder, by contrast, doesn't store a list of private fields but a program that represents the object's state. Here's an XML file representing a window with the title "Test":
XML archives, written by XMLEncoder, have exactly the same information as a Java program - they're just written using an XML encoding rather than a Java one. Here's what the above program would look like in Java:
JFrame f = new JFrame();
f.setTitle("Test");
f.setVisible(true);
When a backward compatibility issue arises in one of the classes in the archive, it may cause one of the earlier statements to fail. A new version of the class might, for example, choose not to define the "setTitle()" method. When this happens, the XMLDecoder detects that this method is now missing from the class and doesn't try to call it. Instead, it issues a warning, ignores the offending statement, and continues with the other statements in the file. The critical point is that not calling the "setTitle()" method does not violate the contract of the implementation (as deleting an instance variable would), and the resulting instance should be a valid and fully functional Java object. If the resulting Java object fails in any way, an ordinary Java program could be written against its API to demonstrate a genuine bug in its implementation.
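The tolerance described above can be observed directly. What follows is a minimal sketch, not taken from the article: it decodes a hand-written archive for a java.util.ArrayList (chosen here only because it needs no display) in which one statement names a method that does not exist. The class name TolerantDecodeDemo, the method name noSuchMethod, and the archive text itself are assumptions made for illustration.

```java
import java.beans.XMLDecoder;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class TolerantDecodeDemo {

    // Hand-written archive (an assumption for illustration): the second
    // statement calls a method that ArrayList does not define.
    static final String ARCHIVE =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
        + "<java version=\"1.4.0\" class=\"java.beans.XMLDecoder\">"
        + "<object class=\"java.util.ArrayList\">"
        + "<void method=\"add\"><string>Test</string></void>"
        + "<void method=\"noSuchMethod\"><string>ignored</string></void>"
        + "<void method=\"add\"><string>After</string></void>"
        + "</object>"
        + "</java>";

    // Problems reported by the decoder during the most recent decode() call.
    static final List<Exception> problems = new ArrayList<>();

    static Object decode() {
        problems.clear();
        ByteArrayInputStream in =
            new ByteArrayInputStream(ARCHIVE.getBytes(StandardCharsets.UTF_8));
        // The third constructor argument is an ExceptionListener; recoverable
        // problems are handed to it rather than thrown to the caller.
        XMLDecoder decoder = new XMLDecoder(in, null, problems::add);
        try {
            return decoder.readObject();
        } finally {
            decoder.close();
        }
    }

    public static void main(String[] args) {
        Object list = decode();
        System.out.println("decoded: " + list);
        System.out.println("problems reported: " + problems.size());
    }
}
```

Collecting the reported exceptions in a list, rather than relying on the decoder's default listener, makes it easy for a tool to show the user which statements were skipped.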
The vendors of popular Java libraries tend to devote significant resources toward programs to manage demonstrable bugs of this kind and enlist the support of the development community to work toward their eradication - Sun's "BugParade" is a well-known example. As a result of these kinds of programs, bugs that can be demonstrated by simple "setup code" tend to be rare in mature libraries. Once again, the XMLDecoder benefits here as it's able to ride on the coattails of the Java developer by using the public APIs of the classes instead of relying on special privileges to circumvent them.
Encoding of JavaBeans
To illustrate the XMLEncoder, this article shows serialization based on a number of scenarios using an example Person class. These range from simple JavaBeans encoding through nondefault construction and custom initialization.
In the simplest scenario, the class Person has String fields for firstName and lastName, together with get and set methods.
public class Person {
    private String firstName;
    private String lastName;

    public String getFirstName() { return firstName; }
    public String getLastName() { return lastName; }
    public void setFirstName(String str) { firstName = str; }
    public void setLastName(String str) { lastName = str; }
}
The following code creates an encoder and serializes a Person.
FileOutputStream os = new FileOutputStream("C:/cust.xml");
XMLEncoder encoder = new XMLEncoder(os);
Person p = new Person();
p.setFirstName("John");
encoder.writeObject(p);
encoder.close();
The XML file created shows that the Person class has been encoded, and that its firstName property is the string "John".
When the file is decoded with the XMLDecoder, the Person class will be instantiated with its default constructor, and the firstName property set by calling the method setFirstName("John").
FileInputStream is = new FileInputStream("C:/cust.xml");
XMLDecoder decoder = new XMLDecoder(is);
Person p = (Person) decoder.readObject();
decoder.close();
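The two fragments above can be combined into a single round trip. The following is a minimal sketch under a few assumptions: it writes to an in-memory byte array instead of C:/cust.xml so that it runs anywhere, and it nests Person inside a hypothetical PersonRoundTrip class so the example fits in one file.

```java
import java.beans.XMLDecoder;
import java.beans.XMLEncoder;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

class PersonRoundTrip {

    // Same bean as in the article, nested and declared public static so the
    // encoder and decoder can reach it through its public API.
    public static class Person {
        private String firstName;
        private String lastName;
        public String getFirstName() { return firstName; }
        public String getLastName() { return lastName; }
        public void setFirstName(String str) { firstName = str; }
        public void setLastName(String str) { lastName = str; }
    }

    // Serializes a bean to an XML archive held in memory.
    static byte[] encode(Object bean) {
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        XMLEncoder encoder = new XMLEncoder(os);
        encoder.writeObject(bean);
        encoder.close(); // close() completes the document and flushes it
        return os.toByteArray();
    }

    // Reads the first object back from an XML archive.
    static Object decode(byte[] xml) {
        XMLDecoder decoder = new XMLDecoder(new ByteArrayInputStream(xml));
        try {
            return decoder.readObject();
        } finally {
            decoder.close();
        }
    }

    public static void main(String[] args) {
        Person p = new Person();
        p.setFirstName("John");
        Person copy = (Person) decode(encode(p));
        System.out.println("firstName after round trip: " + copy.getFirstName());
    }
}
```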
Leveraging the encoder and decoder for custom serialization requires an understanding of the JavaBeans component model. This describes a class's interface in terms of a set of properties, each of which can have a get and set method. To determine the set of operations required to re-create an object, the XMLEncoder creates a prototype instance using its default constructor and then compares the value of each property between this and the object being serialized. If a value doesn't match, the encoder adds it to the graph of objects to be serialized, and so on until it has a complete set of the objects and properties required to re-create the original object being serialized. When the encoder reaches objects that can't be broken down any further, such as Java's strings, ints, or doubles, it writes these values directly to the XML document as tag values. For a complete list of these primitive values and their associated tags, see http://java.sun.com/products/jfc/tsc/articles/persistence3/index.html.
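One consequence of the prototype comparison is that properties still holding their default values leave no trace in the archive. The sketch below, which assumes a hypothetical PrototypeComparisonDemo wrapper class and an in-memory stream, encodes a Person with only firstName set and returns the XML as text so it can be inspected.

```java
import java.beans.XMLEncoder;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class PrototypeComparisonDemo {

    // The article's bean, nested here so the sketch is a single file.
    public static class Person {
        private String firstName;
        private String lastName;
        public String getFirstName() { return firstName; }
        public String getLastName() { return lastName; }
        public void setFirstName(String str) { firstName = str; }
        public void setLastName(String str) { lastName = str; }
    }

    // Encodes the bean and returns the archive as text.
    static String toXml(Object bean) {
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        XMLEncoder encoder = new XMLEncoder(os); // writes UTF-8 by default
        encoder.writeObject(bean);
        encoder.close();
        return new String(os.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Person p = new Person();
        p.setFirstName("John"); // lastName keeps the prototype's value (null)
        System.out.println(toXml(p));
    }
}
```

Because lastName matches the value found on a freshly constructed Person, the archive mentions only the firstName property.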
To serialize an object, XMLEncoder uses the Strategy pattern, and delegates the logic to an instance of java.beans.PersistenceDelegate. The persistence delegate is given the object being serialized and is responsible for determining which API methods can be used to re-create the same instance in the VM in which it will be decoded. The XMLEncoder then executes the API to create the prototype instance that it gives to the delegate, together with the original object being serialized, so the delegate can determine the API methods to re-create the nondefault state.
The method XMLEncoder.setPersistenceDelegate(Class objectClass, PersistenceDelegate delegate) is used to set a customized delegate for an object class. To illustrate this we'll change the original Person class so that it no longer conforms to the standard JavaBeans model, and show how persistence delegates can be used to teach the XMLEncoder to successfully serialize each instance.
Constructor Arguments
One of the patterns that can be taught to the XMLEncoder is how to create an instance where there is no zero-argument constructor. The following is an example of this in which a Person must be constructed with its firstName and lastName as arguments.
public Person(String aFirstName, String aLastName) {
    firstName = aFirstName;
    lastName = aLastName;
}
In the absence of any customized delegate, the XMLEncoder uses the class java.beans.DefaultPersistenceDelegate. This expects the instance to conform to the JavaBeans component model with a zero-argument constructor and JavaBeans properties controlling its state. For the Person whose property values are supplied as constructor arguments, an instance of DefaultPersistenceDelegate can be created with the list of property names that represent the constructor arguments.
XMLEncoder e = new XMLEncoder(os);
Person p = new Person("John", "Smith");
e.setPersistenceDelegate(Person.class,
    new DefaultPersistenceDelegate(
        new String[] { "firstName", "lastName" }));
e.writeObject(p);
e.close();
When the XMLEncoder creates the XML for the Person object, it uses the supplied instance of the DefaultPersistenceDelegate, queries the values of the firstName and lastName properties, and creates the following XML document.
The result is a record of the Object's state but written in such a way that the XMLDecoder can locate and call the public constructor of the Person object just as a Java program would. In the previous XML document where the Person was a standard JavaBeans component, the nondefault properties were specified with named tags that contained the argument values.
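To see the constructor-argument delegate at work end to end, here is a minimal sketch. It assumes a hypothetical ConstructorArgsDemo wrapper class, an in-memory stream in place of a file, and a Person that exposes only getters; none of these details come from the article's listings.

```java
import java.beans.DefaultPersistenceDelegate;
import java.beans.XMLDecoder;
import java.beans.XMLEncoder;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

class ConstructorArgsDemo {

    // A Person with no zero-argument constructor and no setters.
    public static class Person {
        private final String firstName;
        private final String lastName;
        public Person(String aFirstName, String aLastName) {
            firstName = aFirstName;
            lastName = aLastName;
        }
        public String getFirstName() { return firstName; }
        public String getLastName() { return lastName; }
    }

    static byte[] encode(Person p) {
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        XMLEncoder e = new XMLEncoder(os);
        // The property names are listed in constructor-argument order; the
        // delegate reads each one through its getter.
        e.setPersistenceDelegate(Person.class,
            new DefaultPersistenceDelegate(
                new String[] { "firstName", "lastName" }));
        e.writeObject(p);
        e.close();
        return os.toByteArray();
    }

    // No delegate is registered on the decoding side.
    static Person decode(byte[] xml) {
        XMLDecoder d = new XMLDecoder(new ByteArrayInputStream(xml));
        try {
            return (Person) d.readObject();
        } finally {
            d.close();
        }
    }

    public static void main(String[] args) {
        Person copy = decode(encode(new Person("John", "Smith")));
        System.out.println(copy.getFirstName() + " " + copy.getLastName());
    }
}
```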
Although custom encoding rules can be supplied to the XMLEncoder, this is not true of the XMLDecoder. The XML document represents the API steps to re-create the serialized objects in a target VM. One advantage of not having custom decoder rules is that only the environment that serializes the objects requires customization, whereas the target environment just requires the classes with unchanged APIs. This makes it ideal for the following scenario - serialization of an object graph within a development tool that has access to design-time customization, where the XML document will be read in a runtime environment that does not have access to the persistence delegates used during encoding.
Custom Instantiation
In addition to a class being constructed with property values as arguments, custom instantiation can include use of factory methods. An example of this would be if Person's constructor were package protected and instances of the Person class could only be created by calling a static createPerson() method defined in a PersonFactory class.
Writing a persistence delegate requires a basic understanding of how the encoder creates the set of operations that will re-create the serialized objects when the stream is deserialized. The XMLEncoder uses the Command pattern to record each of the required method calls as instances of the class java.beans.Statement. Each Statement represents an API call in which a method is sent to a target, together with any arguments. Commands that are responsible for the instantiation of objects are instances of java.beans.Expression, a subclass of Statement that returns a value. Each object in the graph is represented by the Expression that creates it and a set of Statements that are used to initialize it.
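Statement and Expression can also be used on their own, which makes the model easier to picture. The sketch below is an illustration rather than anything the encoder emits: it builds a java.util.ArrayList with one Expression and initializes it with one Statement. The StatementDemo class name and the choice of ArrayList are assumptions.

```java
import java.beans.Expression;
import java.beans.Statement;
import java.util.ArrayList;

class StatementDemo {

    static Object buildList() {
        try {
            // An Expression yields a value; with a Class as the target, the
            // reserved method name "new" denotes a constructor call.
            Expression create =
                new Expression(ArrayList.class, "new", new Object[0]);
            Object list = create.getValue();

            // A Statement is a method call on a target; any return value
            // is discarded.
            Statement add =
                new Statement(list, "add", new Object[] { "Test" });
            add.execute();
            return list;
        } catch (Exception ex) {
            // getValue() and execute() are declared to throw Exception.
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildList());
    }
}
```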
For general control of instantiation, a subclass of the PersistenceDelegate class should be created with a specialized instantiate() method. The return value is the java.beans.Expression that indicates to the encoder which method or constructor should be used to create (or retrieve) the object. The returned Expression includes the object, the target (normally the class that defines the constructor), the method name (normally the fake name "new," which indicates a constructor call), and the argument values that the method or constructor takes.
The first argument of the instantiate() method is the instance of the Person object being serialized, and the second argument is the encoder (see Listing 1).
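A delegate of this kind might look like the following. This is a sketch and not the article's Listing 1: the FactoryDelegateDemo wrapper class, the two-argument signature of createPerson(), and the in-memory stream are all assumptions made so the example is self-contained.

```java
import java.beans.Encoder;
import java.beans.Expression;
import java.beans.PersistenceDelegate;
import java.beans.XMLDecoder;
import java.beans.XMLEncoder;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

class FactoryDelegateDemo {

    public static class Person {
        private final String firstName;
        private final String lastName;
        // Not public: instances are meant to come from PersonFactory.
        Person(String aFirstName, String aLastName) {
            firstName = aFirstName;
            lastName = aLastName;
        }
        public String getFirstName() { return firstName; }
        public String getLastName() { return lastName; }
    }

    public static class PersonFactory {
        // Hypothetical signature; the article names only the method.
        public static Person createPerson(String first, String last) {
            return new Person(first, last);
        }
    }

    static byte[] encode(Person p) {
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        XMLEncoder encoder = new XMLEncoder(os);
        encoder.setPersistenceDelegate(Person.class, new PersistenceDelegate() {
            @Override
            protected Expression instantiate(Object oldInstance, Encoder out) {
                Person person = (Person) oldInstance;
                // value, target, method name, arguments
                return new Expression(person, PersonFactory.class, "createPerson",
                    new Object[] { person.getFirstName(), person.getLastName() });
            }
        });
        encoder.writeObject(p);
        encoder.close();
        return os.toByteArray();
    }

    static Person decode(byte[] xml) {
        XMLDecoder decoder = new XMLDecoder(new ByteArrayInputStream(xml));
        try {
            return (Person) decoder.readObject();
        } finally {
            decoder.close();
        }
    }

    public static void main(String[] args) {
        Person copy = decode(encode(PersonFactory.createPerson("John", "Smith")));
        System.out.println(copy.getFirstName() + " " + copy.getLastName());
    }
}
```

Here the target of the Expression is the factory class rather than Person itself, and the method name is the factory method instead of the reserved name "new".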
When the XMLEncoder serializes the Person instance, instead of the DefaultPersistenceDelegate that uses standard JavaBeans rules for properties, it uses the anonymous inner class we registered as the persistence delegate of the Person.class. The resulting XML follows. In the
