Java Basics -- Serialization and I/O

This article takes a close look at saving and restoring object state in Java: how serialization and deserialization work, how serialization is used to save an object's state, and how that state can be restored in a later session. It also covers the pitfalls of serialization, such as version control, the impact of changes to a class's structure, and how to use serialVersionUID to keep serialized objects compatible.

Basics

      Objects can be flattened and inflated. Objects have state and behavior. Behavior lives in the class, but state lives within each individual object. So what happens when it’s time to save the state of an object? If you’re writing a game, you’re gonna need a Save/Restore Game feature. If you’re writing an app that creates charts, you’re gonna need a Save/Open File feature. If your program needs to save state, you can do it the hard way, interrogating each object, then painstakingly writing the value of each instance variable to a file, in a format you create. Or, you can do it the easy OO way – you simply flatten the object itself, and inflate it to get it back. But you’ll still have to do it the hard way sometimes, especially when the file your app saves has to be read by some other non-Java application.
      You have lots of options for how to save the state of your Java program, and what you choose will probably depend on how you plan to use the saved state.
      If your data will be used by only the Java program that generated it: Use serialization – Write a file that holds flattened (Serialized) objects. Then have your program read the serialized objects from the file and inflate them back into living, breathing, heap-inhabiting objects.
      If your data will be used by other programs: Write a plain text file – Write a file, with delimiters that other programs can parse. For example, a tab-delimited file that a spreadsheet or database application can use.
      These aren’t the only options, of course. You can save data in any format you choose. Instead of writing characters, for example, you can write your data as bytes. But regardless of the method you use, the fundamental I/O techniques are pretty much the same: write some data to something, and usually the something is either a file on disk or a stream coming from a network connection. Reading the data is the same process in reverse: read some data from either a file on disk or a network connection. And of course everything we talk about in this part is for times when you aren’t using an actual database.

Demo – Saving State

      Imagine you have a program, say, a fantasy adventure game, that takes more than one session to complete. As the game progresses, characters in the game become stronger, weaker, smarter, etc., and gather and use weapons. You don’t want to start from scratch each time you launch the game. So, you need a way to save the state of the characters, and a way to restore the state when you resume the game.
      Imagine you have three game characters to save …
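
      A minimal sketch of what such a character class might look like (the class name and fields here are assumptions for illustration; any class whose state you want to save would work the same way):

import java.io.Serializable;

public class GameCharacter implements Serializable {
	private int power;
	private String type;
	private String[] weapons;

	public GameCharacter(int power, String type, String[] weapons) {
		this.power = power;
		this.type = type;
		this.weapons = weapons;
	}

	public int getPower() { return power; }
	public String getType() { return type; }
	public String getWeapons() { return String.join(", ", weapons); }
}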

Serialization

  • Writing a serialized object to a file

//Step 1
/* Make a FileOutputStream */
/* 
 Note: If the file "MyGame.ser" doesn't exist, it will be created automatically.
 Note: FileOutputStream knows how to connect to (and create) a file. 
*/
FileOutputStream fileStream = new FileOutputStream("MyGame.ser");

//Step 2
/* Make an ObjectOutputStream */
/*
 Note: ObjectOutputStream lets you write objects, but it can't directly connect to a file. It needs to be fed a 'helper' connection stream. This is actually called 'chaining' one stream to another.
*/
ObjectOutputStream os = new ObjectOutputStream(fileStream);

//Step 3
/* Write the object */
os.writeObject(characterOne);
os.writeObject(characterTwo);
os.writeObject(characterThree);

//Step 4
/* Close the ObjectOutputStream */
/*
 Note: Closing the stream at the top closes the ones underneath, so the FileOutputStream (and the file) will close automatically.
*/
os.close();
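
      Putting the four steps together, here is a compact sketch (assuming a GameCharacter class like the one sketched earlier). The try-with-resources statement closes the ObjectOutputStream, and the FileOutputStream underneath it, automatically:

import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

public class GameSaver {
	public static void saveGame(GameCharacter one, GameCharacter two, GameCharacter three) {
		try (ObjectOutputStream os =
				new ObjectOutputStream(new FileOutputStream("MyGame.ser"))) {
			os.writeObject(one);
			os.writeObject(two);
			os.writeObject(three);
		} catch (IOException ex) {
			ex.printStackTrace();
		}
	}
}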

  • Data Moves in streams from one place to another

      The Java I/O API has connection streams that represent connections to destinations and sources such as files on disk or network sockets, and chain streams that work only if chained to other streams.
      Often, it takes at least two streams hooked together to do something useful – one to represent the connection and another to call methods on. Why two? Because connection streams are usually too low-level. FileOutputStream (a connection stream), for example, has methods for writing bytes. But we don’t want to write bytes! We want to write objects, so we need a high-level chain stream.
      OK, then why not have just a single stream that does exactly what you want? One that lets you write objects but underneath converts them to bytes? Think good OO. Each class does one thing well. FileOutputStream writes bytes to a file. ObjectOutputStream turns objects into data that can be written to a stream. So we make a FileOutputStream that lets us write to a file, and we hook an ObjectOutputStream (a chain stream) on the end of it. When we call writeObject() on the ObjectOutputStream, the object gets pumped into the stream and then moves to the FileOutputStream where it ultimately gets written as bytes to a file.
      The ability to mix and match different combinations of connection and chain streams gives you tremendous flexibility!
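
      For example (a sketch, not part of the original demo): if you wanted the bytes buffered before they hit the disk, you could slip a BufferedOutputStream between the ObjectOutputStream and the FileOutputStream:

/*
 ObjectOutputStream -> BufferedOutputStream -> FileOutputStream
 The ObjectOutputStream turns objects into bytes, the BufferedOutputStream batches
 those bytes, and the FileOutputStream writes them to the file. All three classes
 live in java.io.
*/
ObjectOutputStream os = new ObjectOutputStream(
		new BufferedOutputStream(
				new FileOutputStream("MyGame.ser")));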

  • What really happens to an object when it’s serialized?

  • What exactly IS an object’s state? What needs to be saved?

      Now it starts to get interesting. Easy enough to save the primitive values 37 and 70. But what if an object has an instance variable that’s an object reference? What about an object that has five instance variables that are object references? What if those object instance variables themselves have instance variables?
      Think about it. What part of an object is potentially unique? Imagine what needs to be restored in order to get an object that’s identical to the one that was saved. It will have a different memory location, of course, but we don’t care about that. All we care about is that out there on the heap, we’ll get an object that has the same state that the object had when it was saved.
      When an object is serialized, all the objects it refers to from instance variables are also serialized. And all the objects those objects refer to are serialized. And all the objects those objects refer to are serialized… and the best part is, it happens automatically.
      Serialization saves the entire object graph: all objects referenced by instance variables, starting with the object being serialized.

  • If you want your class to be serializable, implement Serializable.

      The Serializable interface is known as a marker or tag interface, because the interface doesn’t have any methods to implement. Its sole purpose is to announce that the class implementing it is, well, serializable. In other words, objects of that type are savable through the serialization mechanism. If any superclass of a class is serializable, the subclass is automatically serializable even if the subclass doesn’t explicitly declare implements Serializable.
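
      A quick sketch of both cases (the class names are made up for illustration):

/* Pond implements Serializable directly. There are no methods to implement. */
class Pond implements java.io.Serializable { }

/* Duck never declares 'implements Serializable', but it extends a serializable
   class, so Duck objects can be serialized too. */
class Duck extends Pond { }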


/* 
 Note: whatever goes here (the argument characterOne) MUST implement Serializable or it will fail at runtime 
 */
os.writeObject(characterOne);

  • Serialization is all or nothing.

      Either the entire object graph is serialized correctly or serialization fails. You can’t serialize an object if one of its instance variables refuses to be serialized (by not implementing Serializable).
      If you want an instance variable to be skipped by the serialization process, mark the variable with the transient keyword.


class Game implements Serializable {
	/* transient says, "don't save this variable during serialization, just skip it." */
	transient String currentID;
	/* The name variable will be saved as part of the object's state during serialization. */
	String name;
}

Deserialization

      The whole point of serializing an object is so that you can restore it back to its original state at some later date, in a different ‘run’ of the JVM (which might not even be the same JVM that was running at the time the object was serialized). Deserialization is a lot like serialization in reverse.


//Step 1
/* Make a FileInputStream */
/*
 Note: If the file "MyGame.ser" doesn't exist, you'll get an exception.
 Note: The FileInputStream knows how to connect to an existing file.
*/
FileInputStream fileStream = new FileInputStream("MyGame.ser");

//Step 2
/* Make an ObjectInputStream */
/*
 Note: ObjectInputStream lets you read objects, but it can't directly connect to a file. It needs to be chained to a connection stream, in this case a FileInputStream.
*/
ObjectInputStream os = new ObjectInputStream(fileStream);

//Step 3
/* Read the objects */
/*
 Note: Each time you say readObject(), you get the next object in the stream. So you'll read them back in the same order in which they were written. You'll get a big fat exception if you try to read more objects than you wrote.
*/
Object one = os.readObject();
Object two = os.readObject();
Object three = os.readObject();

//Step 4
/* Cast the objects */
/*
 Note: The return value of readObject() is type Object, so you have to cast it back to the type you know it really is.
*/
GameCharacter elf = (GameCharacter)one;
GameCharacter troll = (GameCharacter)two;
GameCharacter magician = (GameCharacter)three;

//Step 5
/* Close the ObjectInputStream */
/*
 Note: Closing the stream at the top closes the ones underneath, so the FileInputStream (and the file) will close automatically.
*/
os.close();

  • What happens during deserialization?

      When an object is deserialized, the JVM attempts to bring the object back to life by making a new object on the heap that has the same state the serialized object had at the time it was serialized. Well, except for the transient variables, which come back either null (for object references) or as default primitive values.

      1)The object is read from the stream;
      2)The JVM determines (through info stored with the serialized object) the object’s class type.
      3)The JVM attempts to find and load the object’s class. If the JVM can’t find and/or load the class, the JVM throws an exception and the deserialization fails.
      4)A new object is given space on the heap, but the serialized object’s constructor does not run! Obviously, if the constructor ran, it would restore the state of the object back to its original ‘new’ state, and that’s not what we want. We want the object to be restored to the state it had when it was serialized, not when it was first created.
      5)If the object has a non-serializable class somewhere up its inheritance tree, the constructor for that non-serializable class will run, along with any constructors above that (even if they’re serializable). Once the constructor chaining begins, you can’t stop it, which means all superclasses, beginning with the first non-serializable one, will reinitialize their state (see the sketch after this list).
      6)The object’s instance variables are given the values from the serialized state. Transient variables are given a value of null for object references and defaults (0, false, etc.) for primitives.
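
      A sketch of point 5 (the class names are made up for illustration). The non-serializable superclass needs an accessible no-arg constructor, and that constructor runs during deserialization, so the inherited field is reinitialized rather than restored:

/* NOT serializable: its fields are not saved, and its no-arg constructor
   WILL run when a serializable subclass is deserialized. */
class Animal {
	int age;
	Animal() { age = 1; }   // runs again during deserialization of a Dog
}

/* Serializable subclass: its own constructor does NOT run during deserialization,
   and its own field (name) is restored from the stream. The inherited 'age' field
   comes back as 1 (whatever Animal() set), not as the value it had when the Dog
   was serialized. */
class Dog extends Animal implements java.io.Serializable {
	String name;
	Dog(String name) { this.name = name; age = 10; }
}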

SerialVersionUID

  • Version Control is crucial

If you serialize an object, you must have the class in order to deserialize and use the object. OK, that’s obvious. But what might be less obvious is what happens if you change the class in the meantime? Yikes. Imagine trying to bring back a Dog object when one of its instance variables (non-transient) has changed from a double to a String. That violates Java’s type-safe sensibilities in a Big Way. But that’s not the only change that might hurt compatibility.

Changes to a class that can hurt deserialization:
Deleting an instance variable
Changing the declared type of an instance variable
Changing a non-transient instance variable to transient
Moving a class up or down the inheritance hierarchy
Changing a class (anywhere in the object graph) from Serializable to not Serializable (by removing ‘implements Serializable’ from a class declaration)
Changing an instance variable to static

Changes to a class that are usually OK:
Adding new instance variables to the class (existing objects will deserialize with default values for the instance variables they didn’t have when they were serialized)
Adding classes to the inheritance tree
Removing classes from the inheritance tree
Changing the access level of an instance variable (this has no effect on the ability of deserialization to assign a value to the variable)
Changing an instance variable from transient to non-transient (previously-serialized objects will simply have a default value for the previously-transient variables)

  • Using the serialVersionUID

      Each time an object is serialized, the object (including every object in its graph) is ‘stamped’ with a version ID number for the object’s class. The ID is called the serialVersionUID, and it’s computed based on information about the class structure. As an object is being deserialized, if the class has changed since the object was serialized, the class would have a different serialVersionUID, and deserialization will fail. But you can control this.
      If you think there is ANY possibility that your class might evolve, put a serial version ID in your class.
      When Java tries to deserialize an object, it compares the serialized object’s serialVersionUID with that of the class the JVM is using for deserializing the object. For example, if a Dog instance was serialized with an ID of, say, 23 (in reality a serialVersionUID is much longer), when the JVM deserializes the Dog object it will first compare the Dog object’s serialVersionUID with the Dog class’s serialVersionUID. If the two numbers don’t match, the JVM assumes the class is not compatible with the previously-serialized object, and you’ll get an exception during deserialization.
      So, the solution is to put a serialVersionUID in your class, and then as the class evolves, the serialVersionUID will remain the same and the JVM will say, “OK, cool, the class is compatible with this serialized object,” even though the class has actually changed.
      This works only if you’re careful with your class changes! In other words, you are taking responsibility for any issues that come up when an older object is brought back to life with a newer class.
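
      Declaring the ID is a one-liner inside the class; the value itself is up to you (you can also generate one with the JDK's serialver tool), what matters is that it stays the same as the class evolves:

import java.io.Serializable;

class Dog implements Serializable {
	/* Keep this value unchanged as the class evolves, and the JVM will treat
	   old serialized Dog objects as compatible with the new class. */
	private static final long serialVersionUID = 1L;
	private String name;
	private int size;
}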

I/O

  • Writing a String to a Text File

      Writing text data (a String, actually) is similar to writing an object except you write a String instead of an Object, and you use a FileWriter instead of a FileOutputStream (and you don’t chain it to an ObjectOutputStream).


/*
 Note: All the I/O stuff must be in a try/catch. Everything can throw an exception.
*/
try {
	/* If the file "Ftd.txt" doesn't exist, FileWriter will create it. */
	FileWriter writer = new FileWriter("Ftd.txt");
	writer.write("Hello world!");
	writer.close();
} catch(IOException ex) {
	ex.printStackTrace();
}

  • The java.io.File class

      The java.io.File class represents a file on disk, but doesn’t actually represent the contents of the file. What? Think of a File object as something more like a pathname of a file (or even a directory) rather than The Actual File Itself. The File class does not, for example, have methods for reading and writing. One VERY useful thing about a File object is that it offers a much safer way to represent a file than just using a String file name. For example, most classes that take a String file name in their constructor (like FileWriter or FileInputStream) can take a File object instead. You can construct a File object, verify that you’ve got a valid path, etc. and then give that File object to the FileWriter or FileInputStream.
      A File object represents the name and path of a file or directory on disk, for example: /Users/dqs/Desktop/Ftd.txt. But it does NOT represent, or give you access to, the data in the file.
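
      A few of the things you can do with a File object (a sketch; the file and directory names are just examples):

File f = new File("Ftd.txt");             // represents the name/path, not the contents
System.out.println(f.exists());           // does the file actually exist on disk?
System.out.println(f.getAbsolutePath());  // the full path of the file
f.delete();                               // delete the file (returns true if it worked)

File dir = new File("Chapter7");
dir.mkdir();                              // create a directory
System.out.println(dir.isDirectory());    // true if it represents a directory
for (String name : dir.list()) {          // list what's inside the directory
	System.out.println(name);
}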

  • The beauty of buffers

      If there were no buffers, it would be like shopping without a cart. You’d have to carry each thing out to your car, one soup can or toilet paper roll at a time.

BufferedWriter writer = new BufferedWriter(new FileWriter("Ftd.txt"));

      The cool thing about buffers is that they’re much more efficient than working without them. You can write to a file using FileWriter alone, by calling write(something), but FileWriter writes each and everything you pass to the file each and every time. That’s overhead you don’t want or need, since every trip to the disk is a Big Deal compared to manipulating data in memory. By chaining a BufferedWriter onto a FileWriter, the BufferedWriter will hold all the stuff you write to it until it’s full. Only when the buffer is full will the FileWriter actually be told to write to the file on disk.
      If you want to send data before the buffer is full, you do have control. Just Flush It. Calls to writer.flush() say, “send whatever’s in the buffer, now”.
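
      A sketch of the chained version, with an explicit flush (note that a BufferedWriter is chained to a Writer such as FileWriter, not to a File; as with all I/O, in real code this goes inside a try/catch like the FileWriter example above):

BufferedWriter writer = new BufferedWriter(new FileWriter("Ftd.txt"));
writer.write("first line");
writer.newLine();     // writes a line separator
writer.write("second line");
writer.flush();       // push whatever is sitting in the buffer to the FileWriter right now
writer.close();       // closing also flushes, and closes the FileWriter underneath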

  • Reading from a Text File

      Reading text from a file is simple, but this time we’ll use a File object to represent the file, a FileReader to do the actual reading and a BufferedReader to make the reading more efficient.
      The read happens by reading lines in a while loop, ending the loop when the result of a readLine() is null. That’s the most common style for reading data (pretty much anything that’s not a Serialized object): read stuff in a while loop, terminating when there’s nothing left to read (which we know because the result of whatever read method we’re using is null).

try{
	/* A FileReader is a connection stream for characters, that connects to a text file. */
	FileReader fileReader = new FileReader(new File("Ftd.txt"));
	/* Chain the FileReader to a BufferedReader for more efficient reading. */
	BufferedReader reader = new BufferedReader(fileReader);
	/* Make a String variable to hold each line as the line is read. */
	String line = null;
	while ((line = reader.readLine()) != null) {
		  /*
			This says, "Read a line of text, and assign it to the String variable 'line'. While that variable is not null (because there WAS something to read) print out the line that was just read."
			Or another way of saying it, "While there are still lines to read, read them and print them."
		  */
		  System.out.println(line);
	}
	reader.close();
} catch(IOException ex) {
	ex.printStackTrace();
}