Object Serialization in Java Or: The Persistence of Memory…
So you want to save your data… • Common problem: • You’ve built a large, complex object • Spam/Normal statistics tables • Game state • Database of student records • Etc… • Want to store on disk and retrieve later • Or: want to send over network to another Java process • In general: want your objects to be persistent -- outlive the current Java process
Answer I: Homebrew file formats • You’ve got file I/O nailed, so… • Write a set of methods for saving/loading each class that you care about
public class MyClass {
    public void saveYourself(Writer o) throws IOException { … }
    public static MyClass loadYourself(Reader r) throws IOException { … }
}
Coolnesses of Approach 1: • Can produce arbitrary file formats • Know exactly what you want to store and get back/don’t store extraneous stuff • Can build file formats to interface w/ other codes/programs • XML • Tab-delimited/spreadsheet • Etc. • If your classes are nicely hierarchical, makes saving/loading simple
Saving/Loading Recursive Data Structs
public interface Saveable {
    public void saveYourself(Writer w) throws IOException;
    // should also have this
    //   public static Object loadYourself(Reader r)
    //       throws IOException;
    // but an interface can’t require implementing classes to
    // supply a static method (and before Java 8 an interface
    // couldn’t contain static methods at all)
}
Saving, cont’d
public class MyClassA implements Saveable {
    public MyClassA(int arg) {
        // initialize private data members of A
    }
    public void saveYourself(Writer w) throws IOException {
        // write MyClassA identifier and private data on
        // stream w
    }
    public static MyClassA loadYourself(Reader r) throws IOException {
        // parse MyClassA from the data stream r
        MyClassA tmp=new MyClassA(data);
        return tmp;
    }
}
Saving, cont’d
public class MyClassB implements Saveable {
    // a constructor has no return type; it takes the MyClassA
    // member so that loadYourself() can rebuild the object
    public MyClassB(MyClassA stuff) { … }
    private MyClassA _stuff;
    public void saveYourself(Writer w) throws IOException {
        // write ID for MyClassB
        _stuff.saveYourself(w);
        // write other private data for MyClassB
        w.flush();
    }
    public static MyClassB loadYourself(Reader r) throws IOException {
        // parse MyClassB ID from r
        MyClassA tmp=MyClassA.loadYourself(r);
        // parse other private data for MyClassB
        return new MyClassB(tmp);
    }
}
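To make the recursive save/load pattern concrete, here is a minimal runnable sketch. The class names (LeafA, NodeB, HomebrewDemo), the line-per-record text format, and the use of BufferedReader instead of a plain Reader are all assumptions made for illustration; they are not the format used in the slides. The sketch also assumes the label never contains a newline.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;

// Hypothetical leaf class: saved as one line of text, "A <value>".
class LeafA {
    final int value;
    LeafA(int value) { this.value = value; }

    void saveYourself(Writer w) throws IOException {
        w.write("A " + value + "\n");
    }

    static LeafA loadYourself(BufferedReader r) throws IOException {
        String line = r.readLine();
        if (line == null || !line.startsWith("A ")) {
            throw new IOException("expected an A record, got: " + line);
        }
        return new LeafA(Integer.parseInt(line.substring(2).trim()));
    }
}

// Hypothetical container: writes its own tag, then delegates to its member.
class NodeB {
    final LeafA stuff;
    final String label;
    NodeB(LeafA stuff, String label) { this.stuff = stuff; this.label = label; }

    void saveYourself(Writer w) throws IOException {
        w.write("B\n");             // ID for NodeB
        stuff.saveYourself(w);      // recursive step: the member saves itself
        w.write(label + "\n");      // NodeB's other private data
        w.flush();
    }

    static NodeB loadYourself(BufferedReader r) throws IOException {
        if (!"B".equals(r.readLine())) throw new IOException("expected a B record");
        LeafA a = LeafA.loadYourself(r);   // recursive step: the member parses itself
        String label = r.readLine();
        if (label == null) throw new IOException("missing label");
        return new NodeB(a, label);
    }
}

class HomebrewDemo {
    // Saves to an in-memory string and parses the result back.
    static NodeB roundTrip(NodeB in) {
        try {
            StringWriter w = new StringWriter();
            in.saveYourself(w);
            return NodeB.loadYourself(new BufferedReader(new StringReader(w.toString())));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        NodeB copy = roundTrip(new NodeB(new LeafA(7), "hello"));
        System.out.println(copy.stuff.value + " " + copy.label);
    }
}
```

Note that load must consume fields in exactly the order save wrote them; that coupling is the main maintenance cost of this approach.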
Painfulnesses of Approach 1: • This is called recursive descent parsing (and formatting) • We’ll use it in project 2, and there are plenty of places in the Real World (TM) where it’s terribly useful. • But... It’s also a pain in the a** • If all you want to do is store/retrieve data, do you really need to go to all of that effort? • Fortunately, no. Java provides a shortcut that takes a lot of the work out.
Approach 2: Enter Serialization... • Java provides the serialization mechanism for object persistence • It essentially automates the grunt work for you • Short form:
public class MyClassA implements Serializable { ... }
// in some other code elsewhere...
MyClassA tmp=new MyClassA(arg);
FileOutputStream fos=new FileOutputStream("some.obj");
ObjectOutputStream out=new ObjectOutputStream(fos);
out.writeObject(tmp);
out.flush();
out.close();
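The short form shows only the write side. The sketch below adds the matching read with ObjectInputStream. It writes to an in-memory byte array rather than a file so that it runs anywhere; SavedPoint and RoundTripDemo are hypothetical names chosen for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical serializable class with two primitive fields.
class SavedPoint implements Serializable {
    final int x;
    final int y;
    SavedPoint(int x, int y) { this.x = x; this.y = y; }
}

class RoundTripDemo {
    // Serializes an object into a byte array.
    static byte[] toBytes(Object o) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
                out.writeObject(o);
            } // try-with-resources flushes and closes the stream
            return buf.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Rebuilds an object from bytes produced by toBytes().
    static Object fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();   // returns Object; the caller casts
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SavedPoint p = (SavedPoint) fromBytes(toBytes(new SavedPoint(3, 4)));
        System.out.println(p.x + "," + p.y);
    }
}
```

Swapping the byte-array streams for FileOutputStream and FileInputStream gives the on-disk version from the slide.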
In a bit more detail... • To be (de-)serialized, an object’s class must implement Serializable • All of its data members must also be serializable • And so on, recursively... • Primitive types (int, char, etc.) are all serializable automatically • So are Strings, most classes in java.util, etc. • This saves/retrieves the entire object graph, and each object is written only once, so shared references stay shared after loading
The object graph and uniqueness
[Figure: object graph in which a MondoHashTable and a Vector refer to Entry objects with keys “tyromancy” and “zygopleural”]
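The uniqueness property can be checked directly. In this sketch a list refers to the same entry twice; after a round trip the two slots should still point at one object, not two copies. GraphEntry and SharedRefDemo are hypothetical names, and an ArrayList stands in for the table and vector in the figure.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;

// Hypothetical entry type holding a single key.
class GraphEntry implements Serializable {
    final String key;
    GraphEntry(String key) { this.key = key; }
}

class SharedRefDemo {
    // Copies a list by serializing it to memory and reading it back.
    @SuppressWarnings("unchecked")
    static ArrayList<GraphEntry> copyViaSerialization(ArrayList<GraphEntry> list) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
                out.writeObject(list);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buf.toByteArray()))) {
                return (ArrayList<GraphEntry>) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        GraphEntry shared = new GraphEntry("tyromancy");
        ArrayList<GraphEntry> list = new ArrayList<>();
        list.add(shared);
        list.add(shared);                          // same object, referenced twice
        list.add(new GraphEntry("zygopleural"));
        ArrayList<GraphEntry> copy = copyViaSerialization(list);
        // Sharing inside the graph is preserved in the copy...
        System.out.println(copy.get(0) == copy.get(1));
        // ...but the copy consists of new objects, distinct from the originals.
        System.out.println(copy.get(0) == shared);
    }
}
```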
Now for some subtleties... • static fields are not automatically serialized • Not possible to automatically serialize them b/c they’re owned by an entire class, not an object • Options: • final static fields are automatically initialized (once) the first time a class is loaded • static fields initialized in the static {} block will be initialized the first time a class is loaded • But what about other static fields?
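A small experiment illustrates the point about static fields. The sketch serializes an object, changes a static field, and then deserializes: the instance field comes back from the stream, while the static field keeps whatever value the class currently holds. Counter and StaticFieldDemo are hypothetical names for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class Counter implements Serializable {
    static int created = 0;   // belongs to the class; never written to the stream
    final int id;             // belongs to the object; written to the stream
    Counter(int id) { this.id = id; created++; }
}

class StaticFieldDemo {
    static byte[] toBytes(Object o) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
                out.writeObject(o);
            }
            return buf.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    static Object fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] saved = toBytes(new Counter(5));
        Counter.created = 99;                    // change the static after saving
        Counter back = (Counter) fromBytes(saved);
        // id is restored from the stream; created is whatever the class holds now.
        System.out.println(back.id + " " + Counter.created);
    }
}
```

Deserialization does not run Counter's constructor, which is why the counter is not incremented when the object is read back.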
When default serialization isn’t enough • Java allows writeObject() and readObject() methods to customize output • If a class provides these methods, the serialization/deserialization mechanism calls them instead of doing the default thing
writeObject() in action
public class DemoClass implements Serializable {
    private int _dat=3;
    private static int _sdat=2;
    private void writeObject(ObjectOutputStream o) throws IOException {
        o.writeInt(_dat);
        o.writeInt(_sdat);
    }
    private void readObject(ObjectInputStream i)
            throws IOException, ClassNotFoundException {
        _dat=i.readInt();
        _sdat=i.readInt();
    }
}
Things that you don’t want to save • Sometimes, you want to explicitly not store some non-static data • Computed vals that are cached simply for convenience/speed • Passwords or other “secret” data that shouldn’t be written to disk • Java provides the “transient” keyword: transient foo means “don’t save foo”
public class MyClass implements Serializable {
    private int _primaryVal=3; // is serialized
    private transient int _cachedVal=_primaryVal*2;
    // _cachedVal is not serialized; after deserialization it is 0
    // (the initializer is not re-run) until you recompute it
}
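One consequence of transient is easy to miss: a transient field comes back with its default value (0, false, or null), because field initializers are not re-run during deserialization. The sketch below contrasts a class that leaves the cache alone with one that rebuilds it in readObject. PlainCached, RebuiltCached, and TransientDemo are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// The cache is skipped on save and nothing restores it on load.
class PlainCached implements Serializable {
    private int primaryVal = 3;
    private transient int cachedVal = primaryVal * 2;
    int primary() { return primaryVal; }
    int cached() { return cachedVal; }
}

// Same fields, but readObject recomputes the cache after loading.
class RebuiltCached implements Serializable {
    private int primaryVal = 3;
    private transient int cachedVal = primaryVal * 2;
    int primary() { return primaryVal; }
    int cached() { return cachedVal; }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();        // restores primaryVal
        cachedVal = primaryVal * 2;    // rebuild the derived value
    }
}

class TransientDemo {
    static Object roundTrip(Object o) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
                out.writeObject(o);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buf.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        PlainCached a = (PlainCached) roundTrip(new PlainCached());
        RebuiltCached b = (RebuiltCached) roundTrip(new RebuiltCached());
        System.out.println(a.cached() + " vs " + b.cached());
    }
}
```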
Gotchas: #0 -- non Serializable fields • What happens if class Foo has a field of type Bar, but Bar isn’t serializable? • If you just do this:
Foo tmp=new Foo();
ObjectOutputStream out=new ObjectOutputStream(new FileOutputStream("some.obj"));
out.writeObject(tmp);
• You get a NotSerializableException (bummer) • Answer: use read/writeObject to explicitly serialize parts that can’t be handled otherwise (and mark the Bar field transient if you still call the default mechanism) • Need some way to get/set necessary state
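A sketch of that workaround follows. Foo marks its Bar field transient so the default mechanism skips it, then writes and reads Bar's state by hand. The details of Bar (a single int "level" with a getter) are invented for illustration; the real requirement is only that Foo has some way to get and set the state it needs.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Not Serializable. Its shape is an assumption made for this example.
class Bar {
    private final int level;
    Bar(int level) { this.level = level; }
    int getLevel() { return level; }
}

class Foo implements Serializable {
    private String name;
    private transient Bar bar;   // skipped by default serialization

    Foo(String name, Bar bar) { this.name = name; this.bar = bar; }
    String name() { return name; }
    Bar bar() { return bar; }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();       // writes name
        out.writeInt(bar.getLevel());   // writes Bar's state by hand
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();         // reads name
        bar = new Bar(in.readInt());    // rebuilds Bar from the saved state
    }
}

class NonSerializableFieldDemo {
    static Object roundTrip(Object o) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
                out.writeObject(o);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buf.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Foo back = (Foo) roundTrip(new Foo("spam", new Bar(42)));
        System.out.println(back.name() + " " + back.bar().getLevel());
    }
}
```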
Gotchas: #0.5 -- non-Ser. superclasses • Suppose • class Foo extends Bar implements Serializable • But Bar itself isn’t serializable • What happens?
Non-Serializable superclasses, cont’d • Bar must provide a no-arg constructor • Foo must use readObject/writeObject to take care of Bar’s private data • Java helps a bit with defaultReadObject and defaultWriteObject • Order of operations (for deserialization) • Java creates a new Foo object • Java calls Bar’s no-arg constructor • Java calls Foo’s readObject • Foo’s readObject explicitly reads Bar’s state data • Foo reads its own data • Foo reads its children’s data
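The following sketch walks through that sequence with a non-serializable superclass. BaseBar and SubFoo are hypothetical stand-ins for Bar and Foo, and the getter/setter on BaseBar is an assumption about how the subclass reaches the superclass state. One deliberate difference from the order listed above: the sketch calls defaultWriteObject/defaultReadObject first and handles the superclass state afterwards, which is the order the serialization documentation describes for extra data; what matters is that the read order mirrors the write order.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Not Serializable: must offer a no-arg constructor the subclass can reach.
class BaseBar {
    private int barState;
    BaseBar() { }                              // called during deserialization
    int getBarState() { return barState; }
    void setBarState(int s) { barState = s; }
}

class SubFoo extends BaseBar implements Serializable {
    private int fooState;

    SubFoo(int barState, int fooState) {
        setBarState(barState);
        this.fooState = fooState;
    }
    int getFooState() { return fooState; }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();          // SubFoo's own fields
        out.writeInt(getBarState());       // superclass state, saved by hand
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();            // SubFoo's own fields
        setBarState(in.readInt());         // superclass state, restored by hand
    }
}

class SuperclassDemo {
    static Object roundTrip(Object o) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
                out.writeObject(o);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buf.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SubFoo back = (SubFoo) roundTrip(new SubFoo(4, 9));
        System.out.println(back.getBarState() + " " + back.getFooState());
    }
}
```

Without the custom methods, the superclass field would silently come back as 0, because only the no-arg constructor runs for BaseBar.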
Gotchas: #1 -- Efficiency • For your MondoHashTable, you can just serialize/deserialize it with the default methods • But that’s not necessarily efficient, and may even be wrong • By default, Java will store the entire internal _table, including all of its null entries! • Now you’re wasting space/time to load/save all those empty cells • Plus, the hashCode()s of the keys may not be the same after deserialization -- should explicitly rehash them to check.
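One way to address both problems is to mark the internal table transient, write only the occupied cells, and re-insert every key on load so that it is rehashed. The sketch below does this for a tiny fixed-capacity string set with linear probing. SparseSet is a hypothetical stand-in, not the MondoHashTable from the course, and it assumes fewer than 64 keys.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SparseSet implements Serializable {
    private static final int CAPACITY = 64;
    private transient String[] table = new String[CAPACITY]; // not saved as-is
    private transient int size;

    int size() { return size; }

    void add(String key) {
        if (size == CAPACITY - 1) throw new IllegalStateException("table full");
        int i = Math.floorMod(key.hashCode(), CAPACITY);
        while (table[i] != null) {               // linear probing
            if (table[i].equals(key)) return;    // already present
            i = (i + 1) % CAPACITY;
        }
        table[i] = key;
        size++;
    }

    boolean contains(String key) {
        int i = Math.floorMod(key.hashCode(), CAPACITY);
        while (table[i] != null) {
            if (table[i].equals(key)) return true;
            i = (i + 1) % CAPACITY;
        }
        return false;
    }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();                // no non-transient fields here
        out.writeInt(size);                      // how many keys follow
        for (String key : table) {
            if (key != null) out.writeObject(key);   // skip the empty cells
        }
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        table = new String[CAPACITY];            // initializers are not re-run
        size = 0;
        int n = in.readInt();
        for (int k = 0; k < n; k++) {
            add((String) in.readObject());       // re-insert, which rehashes
        }
    }

    static SparseSet roundTrip(SparseSet s) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
                out.writeObject(s);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buf.toByteArray()))) {
                return (SparseSet) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

For example, adding “tyromancy” and “zygopleural” and passing the set through SparseSet.roundTrip writes two strings to the stream instead of 64 cells.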
Gotchas: #2 -- Backward compatibility • Suppose that you have two versions of class Foo: Foo v. 1.0 and Foo v. 1.1 • The public and protected members of 1.0 and 1.1 are the same; the semantics of both are the same • So Foo 1.0 and 1.1 should behave the same and be interchangeable • BUT... The private fields and implementation of 1.0 and 1.1 are different • What happens if you serialize with a 1.0 object and deserialize with a 1.1? Or vice versa?
Backward compat, cont’d. • Issue is that in code, only changes to the public or protected interfaces matter • With serialization, all of a sudden, the private data members (and methods) count too • Have to be very careful to not muck up internals in a way that’s inconsistent with previous versions • E.g., changing the meaning, but not name of some data field
Backward compat, cont’d • Example:
// version 1.0
public class MyClass {
    MyClass(int arg) { _dat=arg*2; }
    private int _dat;
}
// version 1.1
public class MyClass {
    MyClass(int arg) { _dat=arg*3; } // NO-NO!
    private int _dat;
}
Backward compat, cont’d: • Java helps as much as it can • Java tracks a “version number” of a class that changes when the class changes “substantially” • Fields changed to/from static or transient • Field or method names changed • Data types change • Class moves up or down in the class hierarchy • Trying to deserialize a class of a different version than the one currently in memory throws InvalidClassException
Yet more on backward compat • By default, Java computes the version number (the serialVersionUID) from a hash of the class’s structure: its name, its interfaces, and its field and method signatures • If they don’t change, the version number won’t change • If you want Java to detect that something about your class has changed, change a name • But, if all you’ve done is changed names (or refactored functionality), you want to be able to tell Java that nothing has changed • Can lie to Java about version number:
static final long serialVersionUID = 3530053329164698194L;
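To see which version number Java will use for a class, you can ask java.io.ObjectStreamClass. The sketch below declares an explicit serialVersionUID and reads it back; Versioned and UidDemo are hypothetical names, and the UID value is simply the one from the slide.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Declares its version number explicitly, so renaming fields or methods
// will not change it.
class Versioned implements Serializable {
    static final long serialVersionUID = 3530053329164698194L;
    private int dat;
}

// No declaration: Java computes a UID from the class's structure.
class Unversioned implements Serializable {
    private int dat;
}

class UidDemo {
    // Returns the UID that serialization will write for the given class.
    // Assumes the class is Serializable; lookup() returns null otherwise.
    static long uidOf(Class<?> c) {
        return ObjectStreamClass.lookup(c).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("declared: " + uidOf(Versioned.class));
        System.out.println("computed: " + uidOf(Unversioned.class));
    }
}
```

Because the computed value depends on the class's structure, declaring the UID yourself is the usual way to keep old serialized data readable across harmless refactorings. The flip side is that Java will then no longer flag incompatible changes for you.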