Object Serialization in Java Or: The Persistence of Memory… Originally from: http://www.cs.unm.edu/~terran/
So you want to save your data… • Common problem: • You’ve built a large, complex object • Spam/Normal statistics tables • Game state • Database of student records • Etc… • Want to store on disk and retrieve later • Or: want to send over network to another Java process • In general: want your objects to be persistent -- outlive the current Java process
Answer I: customized file formats • Write a set of methods for saving/loading each instance of a class that you care about
    public class MyClass {
        public void saveYourself(Writer o) throws IOException { ... }
        public static MyClass loadYourself(Reader r) throws IOException { ... }
    }
Coolnesses of Approach 1: • Can produce arbitrary file formats • Know exactly what you want to store and get back; don’t store extraneous stuff • Can build file formats to interface with other code/programs • XML • Pure text file • Etc. • If your classes are nicely hierarchical, makes saving/loading simple • What will happen with inheritance?
Make Things Saveable/Loadable
    public interface Saveable {
        public void saveYourself(Writer w) throws IOException;
        // should also have this:
        //   public static Object loadYourself(Reader r) throws IOException;
        // but an interface cannot require implementing classes to provide a
        // static method (before Java 8 an interface could not declare one at
        // all; since Java 8 it can, but implementers neither inherit nor override it)
    }
Saving, cont’d
    public class MyClassA implements Saveable {
        public MyClassA(int arg) {
            // initialize private data members of A
        }
        public void saveYourself(Writer w) throws IOException {
            // write MyClassA identifier and private data on stream w
        }
        public static MyClassA loadYourself(Reader r) throws IOException {
            // parse MyClassA from the data stream r
            int data = ...; // value parsed from r
            MyClassA tmp = new MyClassA(data);
            return tmp;
        }
    }
Saving, cont’d
    public class MyClassB implements Saveable {
        // a constructor has no return type; it takes the MyClassA that loadYourself passes in
        public MyClassB(MyClassA stuff) { ... }
        private MyClassA _stuff;
        public void saveYourself(Writer w) throws IOException {
            // write ID for MyClassB
            _stuff.saveYourself(w);
            // write other private data for MyClassB
            w.flush();
        }
        public static MyClassB loadYourself(Reader r) throws IOException {
            // parse MyClassB ID from r
            MyClassA tmp = MyClassA.loadYourself(r);
            // parse other private data for MyClassB
            return new MyClassB(tmp);
        }
    }
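A minimal runnable sketch of Approach 1 may help make the recursion concrete. The classes InnerRecord and OuterRecord, the line-based text format, and the TextFormatDemo helper are all assumptions made for illustration; they are not from the original slides. Each class writes a tagged line and the container delegates to its member, which is the recursive step.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical leaf class: saved as one line, "InnerRecord <int>".
class InnerRecord {
    private static final String TAG = "InnerRecord ";
    final int value;

    InnerRecord(int value) { this.value = value; }

    void saveYourself(Writer w) throws IOException {
        w.write(TAG + value + "\n");
    }

    static InnerRecord loadYourself(BufferedReader r) throws IOException {
        String line = r.readLine();
        if (line == null || !line.startsWith(TAG)) {
            throw new IOException("expected InnerRecord but got: " + line);
        }
        try {
            return new InnerRecord(Integer.parseInt(line.substring(TAG.length())));
        } catch (NumberFormatException e) {
            throw new IOException("bad integer in: " + line, e);
        }
    }
}

// Hypothetical container class: writes its own line, then delegates to its member.
class OuterRecord {
    private static final String TAG = "OuterRecord ";
    final String name;        // assumed to contain no line breaks
    final InnerRecord inner;

    OuterRecord(String name, InnerRecord inner) {
        this.name = name;
        this.inner = inner;
    }

    void saveYourself(Writer w) throws IOException {
        w.write(TAG + name + "\n");
        inner.saveYourself(w);   // recursive step: the member saves itself
        w.flush();
    }

    static OuterRecord loadYourself(BufferedReader r) throws IOException {
        String line = r.readLine();
        if (line == null || !line.startsWith(TAG)) {
            throw new IOException("expected OuterRecord but got: " + line);
        }
        String name = line.substring(TAG.length());
        InnerRecord inner = InnerRecord.loadYourself(r);   // recursive step on load
        return new OuterRecord(name, inner);
    }
}

class TextFormatDemo {
    // Save to an in-memory string and parse it back.
    static OuterRecord roundTrip(OuterRecord original) {
        try {
            StringWriter w = new StringWriter();
            original.saveYourself(w);
            return OuterRecord.loadYourself(new BufferedReader(new StringReader(w.toString())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Even in this tiny case, every class needs its own tag, its own parser, and its own error handling, which hints at why the approach becomes tedious for large object graphs.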
Painfulnesses of Approach 1: • This is called recursive descent parsing • Actually, there are plenty of places in the real world where it’s terribly useful. • But... It’s also a pain (why?) • If all you want to do is store/retrieve data, do you really need to go to all of that effort? • Fortunately, no. Java provides a shortcut that takes a lot of the work out.
Approach 2: Enter Serialization... • Java provides the serialization mechanism for object persistence • It essentially automates the grunt work for you • Short form:
    public class MyClassA implements Serializable { ... }
    // in some other code elsewhere...
    MyClassA tmp = new MyClassA(arg);
    FileOutputStream fos = new FileOutputStream("some.obj");
    ObjectOutputStream out = new ObjectOutputStream(fos);
    out.writeObject(tmp);
    out.flush();
    out.close();
In a bit more detail... • To (de-)serialize an object, its class must implement Serializable • All of its data members must also be serializable • And so on, recursively... • Primitive types (int, char, etc.) are all serializable automatically • So are Strings, most classes in java.util, etc. • This saves/retrieves the entire object graph, including ensuring uniqueness of objects
The object graph and uniqueness [Diagram: an object graph whose nodes are a MondoHashTable, a Vector, two Entry objects, and the strings "tyromancy" and "zygopleural"]
Now some problems… • static fields are not automatically serialized • Not possible to automatically serialize them because they’re owned by an entire class, not an object • Options: • final static fields are automatically initialized (once) the first time a class is loaded • static fields initialized in the static {} block will be initialized the first time a class is loaded • But what about other static fields?
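The uniqueness guarantee can be checked with a short experiment. The sketch below is illustrative only: the Entry class and the ObjectGraphDemo helper are assumptions, and an in-memory byte buffer stands in for a file. One Entry is placed in a list twice; after a round trip, both slots of the copy should still refer to a single (new) object.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical entry type, named after the boxes in the diagram.
class Entry implements Serializable {
    final String key;
    Entry(String key) { this.key = key; }
}

class ObjectGraphDemo {
    // Serialize to an in-memory buffer and read the copy straight back.
    static Object roundTrip(Object obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // True if an object referenced twice is still one object in the copy.
    static boolean sharingSurvives() {
        Entry shared = new Entry("tyromancy");
        List<Entry> list = new ArrayList<>();
        list.add(shared);
        list.add(shared);                       // same object, second reference
        list.add(new Entry("zygopleural"));

        List<?> copy = (List<?>) roundTrip(list);
        return copy.size() == 3
            && copy.get(0) == copy.get(1)       // sharing is preserved
            && copy.get(0) != shared            // but the copy is a new object
            && copy.get(1) != copy.get(2);      // distinct entries stay distinct
    }
}
```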
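A small experiment illustrates that static state is left out. The Counter class and StaticFieldDemo helper are hypothetical names introduced for this sketch. The static field is changed between writing and reading; if statics were part of the serialized form, reading would put the old value back.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class with one instance field and one static field.
class Counter implements Serializable {
    static int created = 0;    // owned by the class, not written to the stream
    final int id;              // owned by the instance, written to the stream

    Counter(int id) {
        this.id = id;
        created++;
    }
}

class StaticFieldDemo {
    static byte[] toBytes(Object obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    static Object fromBytes(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    // Returns { restored id, value of the static field after deserialization }.
    static int[] run() {
        try {
            Counter.created = 0;
            Counter original = new Counter(7);   // created is now 1
            byte[] data = toBytes(original);
            Counter.created = 99;                // change the static after saving
            Counter copy = (Counter) fromBytes(data);
            // The constructor is not run for the copy and the static is not restored.
            return new int[] { copy.id, Counter.created };
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```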
When default serialization isn’t enough • Java allows writeObject() and readObject() methods to customize output • If a class provides these methods, the serialization/deserialization mechanism calls them instead of doing the default thing
writeObject() in action
    public class DemoClass implements Serializable {
        private int _dat = 3;
        private static int _sdat = 2;
        private void writeObject(ObjectOutputStream o) throws IOException {
            o.writeInt(_dat);
            o.writeInt(_sdat);
        }
        private void readObject(ObjectInputStream i) throws IOException, ClassNotFoundException {
            _dat = i.readInt();
            _sdat = i.readInt();
        }
    }
Things that you don’t want to save • Sometimes, you want to explicitly not store some non-static data • Computed vals that are cached simply for convenience/speed • Passwords or other “secret” data that shouldn’t be written to disk • Java provides the “transient” keyword. transient foo means don’t save foo
    public class MyClass implements Serializable {
        private int _primaryVal = 3;                         // is serialized
        private transient int _cachedVal = _primaryVal * 2;  // is not serialized; holds 0 after deserialization
    }
Issue: #0 -- non-Serializable fields • What happens if class Foo has a field of type Bar, but Bar isn’t serializable? • If you just do this:
    Foo tmp = new Foo();
    ObjectOutputStream out = new ObjectOutputStream(fos); // fos: an open FileOutputStream
    out.writeObject(tmp);
• You get a NotSerializableException • Answer: use readObject/writeObject to explicitly serialize parts that can’t be handled otherwise • Need some way to get/set necessary state
Issue: #0.5 -- non-Ser. superclasses • Suppose • class Foo extends Bar implements Serializable • But Bar itself isn’t serializable • What happens? [Diagram: Bar (not serializable) is the superclass of Foo (serializable)]
Non-Serializable superclasses, cont’d • Bar must provide a no-arg constructor • Foo must use readObject/writeObject to take care of Bar’s private data • Java helps a bit with defaultReadObject and defaultWriteObject • Order of operations (for deserialization) • Java creates a new Foo object • Java calls Bar’s no-arg constructor • Java calls Foo’s readObject • Foo’s readObject explicitly reads Bar’s state data • Foo reads its own data • Foo reads its children’s data
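The effect of transient on a cached value can be seen in a short sketch. PlainCache, RebuiltCache, and TransientDemo are hypothetical names used only here. Field initializers are not re-run when a Serializable object is deserialized, so a transient field comes back with its default value unless readObject rebuilds it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class that relies on the field initializer only.
class PlainCache implements Serializable {
    private int primaryVal = 3;
    private transient int cachedVal = primaryVal * 2;
    int cached() { return cachedVal; }
}

// Hypothetical class that rebuilds the cache while being read.
class RebuiltCache implements Serializable {
    private int primaryVal = 3;
    private transient int cachedVal = primaryVal * 2;
    int cached() { return cachedVal; }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();        // restores primaryVal
        cachedVal = primaryVal * 2;    // recompute what was deliberately not saved
    }
}

class TransientDemo {
    // Serialize to an in-memory buffer and read the copy straight back.
    static Object roundTrip(Object obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```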
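The steps above can be sketched as follows. The bodies of Bar and Foo, the barState field, and the SuperclassDemo helper are assumptions for illustration. The read order must mirror the write order; this sketch calls defaultWriteObject/defaultReadObject first and then handles Bar’s state by hand.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Not Serializable: its state is invisible to default serialization.
class Bar {
    protected int barState;
    Bar() { barState = -1; }            // run by Java during deserialization
    Bar(int barState) { this.barState = barState; }
}

class Foo extends Bar implements Serializable {
    private final String name;

    Foo(int barState, String name) {
        super(barState);
        this.name = name;
    }

    int getBarState() { return barState; }
    String getName() { return name; }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();       // Foo's own fields
        out.writeInt(barState);         // Bar's state, written by hand
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();         // Foo's own fields
        barState = in.readInt();        // overwrite what Bar() set
    }
}

class SuperclassDemo {
    // Serialize to an in-memory buffer and read the copy straight back.
    static Foo roundTrip(Foo foo) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(foo);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Foo) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Without the two private methods, the copy would come back with barState set to -1, the value assigned by Bar’s no-arg constructor.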
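In O’Reilly Java I/O • The superclass does not implement the Serializable interface and does not provide a no-argument constructor • java.lang.Object does not implement Serializable • Every class has at least one superclass that cannot be serialized • During deserialization, the no-argument constructor of the nearest ancestor that does not implement Serializable is called (hard to follow!) to set up the state of the object’s non-serializable superclasses (very complicated!) • PS: the above is copied from the book’s text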
When having a non-serializable parent • Class ZipFile does not implement Serializable, and it does not have a no-arg constructor • public class ZipFile implements java.util.zip.ZipConstants • public ZipFile(String filename) throws IOException • public ZipFile(File file) throws ZipException, IOException • What can we do? • Can anyone answer?
Issue: #1 -- Efficiency • For your MondoHashTable, you can just serialize/deserialize it with the default methods • But that’s not necessarily efficient, and may even be wrong • By default, Java will store the entire internal _table, including all of its null entries! • Now you’re wasting space/time to load/save all those empty cells • Plus, the hashCode()s of the keys may not be the same after deserialization -- should explicitly rehash them to check. • hashCode() is defined in java.lang.Object • The default implementation is typically based on the object’s identity (often its memory address), so it can differ from one run to the next
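One way to address both points is to mark the internal table transient, write only the live keys, and re-insert them on read so that each lands in the slot its current hashCode() selects. The SparseTable class below is a hypothetical, fixed-capacity stand-in for MondoHashTable, not the original code. (String’s hashCode() is specified and stable, so rehashing matters most for keys that inherit Object’s identity-based hashCode().)

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical open-addressing set of strings with a fixed number of slots.
class SparseTable implements Serializable {
    private static final int CAPACITY = 64;
    private transient String[] slots = new String[CAPACITY];  // mostly null; never written as-is
    private transient int size = 0;

    void add(String key) {
        int i = find(key);
        if (slots[i] == null) {
            if (size == CAPACITY - 1) throw new IllegalStateException("table full");
            slots[i] = key;
            size++;
        }
    }

    boolean contains(String key) { return slots[find(key)] != null; }
    int size() { return size; }

    // Linear probing: the slot holding key, or the empty slot where it belongs.
    // Terminates because at least one slot is always kept empty.
    private int find(String key) {
        int i = Math.floorMod(key.hashCode(), CAPACITY);
        while (slots[i] != null && !slots[i].equals(key)) {
            i = (i + 1) % CAPACITY;
        }
        return i;
    }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        out.writeInt(size);                    // how many keys follow
        for (String key : slots) {
            if (key != null) out.writeObject(key);   // skip the empty cells
        }
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        slots = new String[CAPACITY];          // initializers do not run on deserialization
        size = 0;
        int n = in.readInt();
        for (int j = 0; j < n; j++) {
            add((String) in.readObject());     // rehash each key on the way in
        }
    }

    // Serialize to an in-memory buffer and read the copy straight back.
    static SparseTable roundTrip(SparseTable table) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(table);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (SparseTable) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```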
Issue: #2 -- Backward compatibility • Suppose that you have two versions of class Foo: Foo v. 1.0 and Foo v. 1.1 • The public and protected members of 1.0 and 1.1 are the same; the semantics of both are the same • So Foo 1.0 and 1.1 should behave the same and be interchangeable • BUT... The private fields and implementation of 1.0 and 1.1 are different • What happens if you serialize with a 1.0 object and deserialize with a 1.1? Or vice versa?
Backward compat, cont’d. • Issue is that in code, only changes to the public or protected members matter • With serialization, all of a sudden, the private data members (and methods) count too • Serialization reaches private fields with help from the JVM, not through ordinary code in ObjectInputStream/ObjectOutputStream • This is a kind of privilege • Have to be very careful to not muck up internals in a way that’s inconsistent with previous versions • E.g., changing the meaning, but not name of some data field
Backward compat, cont’d • Example:
    // version 1.0
    public class MyClass implements Serializable {
        MyClass(int arg) { _dat = arg * 2; }
        private int _dat;
    }
    // version 1.1
    public class MyClass implements Serializable {
        MyClass(int arg) { _dat = arg * 3; } // NO-NO! same field name, different meaning
        private int _dat;
    }
Backward compat, cont’d: • Java helps as much as it can • Java tracks a “version number” of a class that changes when the class changes “substantially” • Fields changed to/from static or transient • Field or method names changed • Data types change • Class moves up or down in the class hierarchy • Trying to deserialize a class of a different version than the one currently in memory throws InvalidClassException
Yet more on backward compat • Java version number comes from names of all data and method members of a class • If they don’t change, the version number won’t change • If you want Java to detect that something about your class has changed, change a name • But, if all you’ve done is changed names (or refactored functionality), you want to be able to tell Java that nothing has changed • Can lie to Java about version number: static final long serialVersionUID = 3530053329164698194L;
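The version number that Java associates with a class can be inspected through java.io.ObjectStreamClass. In this sketch the Versioned class and VersionDemo helper are hypothetical; the constant is the one shown on the slide. When a class declares serialVersionUID, the lookup reports the declared value instead of a computed one.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical class that pins its own version number.
class Versioned implements Serializable {
    private static final long serialVersionUID = 3530053329164698194L;
    private int dat;
}

class VersionDemo {
    // The version number the serialization mechanism will use for Versioned.
    static long versionOfVersioned() {
        return ObjectStreamClass.lookup(Versioned.class).getSerialVersionUID();
    }
}
```

Renaming the private field dat would leave this value unchanged, which is exactly the “lie” described above: streams written by the old class remain readable as far as the version check is concerned.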
The detailed list of compatibility rules • You have to check the following rules • http://java.sun.com/javase/6/docs/platform/serialization/spec/version.html • One key idea is that • When restoring an object, new things are allowed, and old things should be kept
Issue: #3 -- The Singleton pattern • When you are restoring a Singleton object, you need to check whether there is an existing singleton object in the system • This is a matter of logical correctness, and you need to check and guarantee it yourself!
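One common way to make that guarantee is the readResolve hook, which lets a class substitute another object for the one just read from the stream. The Registry class and SingletonDemo helper are hypothetical names for this sketch; it is one possible approach, not the only one.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical singleton.
class Registry implements Serializable {
    static final Registry INSTANCE = new Registry();

    private Registry() { }

    // Called by the serialization mechanism after an instance has been read;
    // returning INSTANCE discards the freshly created duplicate.
    private Object readResolve() {
        return INSTANCE;
    }
}

class SingletonDemo {
    // Serialize to an in-memory buffer and read the copy straight back.
    static Object roundTrip(Object obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Without readResolve, deserialization would hand back a second Registry object even though the constructor is private.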
Default Write/Read Object • Sometimes, we want to add some additional information • For example
    public class NetworkWindow implements Serializable {
        private Socket theSocket;
        // and many other fields and methods
    }
Recover the states
    public class NetworkWindow implements Serializable {
        private transient Socket theSocket;
        // and many other fields and methods
        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();
            out.writeObject(theSocket.getInetAddress());
            out.writeInt(theSocket.getPort());
        }
        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            InetAddress ia = (InetAddress) in.readObject();
            int thePort = in.readInt();
            this.theSocket = new Socket(ia, thePort);
        }
    }
Preventing Serialization • Sometimes you don’t want objects of your class to be serialized, but your parent implements Serializable… • You can define your own private writeObject and readObject methods and have them throw exceptions • throw new NotSerializableException();
Summary • Serialization: make the thing sequential, and so writable • Serialization is difficult and technical; you need to be aware of the whole class hierarchy that you are going to serialize • You can define your own serialization process • You can add additional information when serializing • You can prevent an instance from being serialized
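A minimal sketch of this technique follows. Parent, Locked, and PreventDemo are hypothetical names. The subclass inherits Serializable from its parent but refuses both directions by throwing from its own private hooks.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical serializable parent.
class Parent implements Serializable {
    int x = 1;
}

// Hypothetical subclass that opts out of serialization.
class Locked extends Parent {
    private void writeObject(ObjectOutputStream out) throws IOException {
        throw new NotSerializableException(Locked.class.getName());
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        throw new NotSerializableException(Locked.class.getName());
    }
}

class PreventDemo {
    // True if writing the given object fails with NotSerializableException.
    static boolean writeIsRejected(Object obj) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return false;
        } catch (NotSerializableException e) {
            return true;
        } catch (IOException e) {
            return false;
        }
    }
}
```

Throwing from readObject as well guards against streams that were crafted by hand or written by an older version of the class that still allowed serialization.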