Explore the benefits and drawbacks of distributed state management, including consistency, reliability, and complexity aspects. Learn about different approaches to handle consistency, crash sensitivity, time and space overheads, and trade-offs involved. Discover real-world applications and solutions in the realm of distributed systems.
490dp • Part I: Challenges • Intermezzo: Applications • Part II: Java Object Serialization • Robert Grimm
Challenges • Pervasive computing • Vision: Focus on users and their tasks • Enabled by ubiquitous smart devices • Central question • How can devices get users’ tasks done? • They need to work together!
Distributed State “Information retained in one place that describes something, or is determined by something, somewhere else in the system” • Examples • Association between addresses and names • Sequence number to identify most recent data • File block cached in memory of client • List of clients caching a file
Why is Distributed State Good? • Performance • Not going over the network saves time • Example: Local cache of files • Coherency • Easier to coordinate based on knowledge • Example: Server notification when cache expires • Reliability • Replication makes it possible to tolerate failures • Example: Same files stored on two servers
Why is Distributed State Bad? • Consistency • Crash sensitivity • Time and space overheads • Complexity
Consistency • Problem: Keep copies consistent • Approaches • Detect stale data on use • Treat copy as hint • Example: name-to-address map • Prevent inconsistency • Require exclusive ownership before modifying • Example: all operations go through one node • Tolerate inconsistency • Make window of inconsistency small • Example: delays in network games
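The "treat copy as hint" approach can be sketched as follows. This is a minimal illustration, not code from the slides: `HintCache`, `authoritativeLookup`, and `stillValid` are hypothetical names, and a real system would usually discover a stale address by trying to use it rather than by calling a predicate.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiPredicate;
import java.util.function.Function;

// Hypothetical name-to-address cache whose entries are only hints.
class HintCache {
    private final Map<String, String> hints = new HashMap<>();
    private final Function<String, String> authoritativeLookup; // slow but correct
    private final BiPredicate<String, String> stillValid;       // checks a hint on use

    HintCache(Function<String, String> authoritativeLookup,
              BiPredicate<String, String> stillValid) {
        this.authoritativeLookup = authoritativeLookup;
        this.stillValid = stillValid;
    }

    // Use the cached address if it still works; otherwise refresh it.
    String resolve(String name) {
        String hint = hints.get(name);
        if (hint != null && stillValid.test(name, hint)) {
            return hint; // hint was right, no trip to the authority
        }
        String fresh = authoritativeLookup.apply(name);
        hints.put(name, fresh);
        return fresh;
    }

    public static void main(String[] args) {
        Map<String, String> directory = new HashMap<>();
        directory.put("printer", "10.0.0.5");
        HintCache cache = new HintCache(directory::get,
                (name, addr) -> addr.equals(directory.get(name)));
        System.out.println(cache.resolve("printer"));
        directory.put("printer", "10.0.0.9"); // the cached hint is now stale
        System.out.println(cache.resolve("printer"));
    }
}
```

The cache never has to be kept consistent; staleness only costs one extra lookup at the moment the hint is used.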
Crash Sensitivity • Problem: Mask failures • Approaches • Reconstruct state • Example: reopening files in Sprite FS • Limit degree of distribution / affected state • Example: partition files according to usage • Fully replicate state • Example: Coda file system
Time and Space Overheads • Time • Go across the network • Space • Distributed copies • Tracking distributed copies • Overheads depend on • Degree of sharing • Degree of modification
Complexity • Distributed state requires • Maintaining consistency • Masking of failures • Distributed state makes it harder to • Debug • Tune
Trade-offs • Consistency • Availability • Scalability • Complexity
No Perfect Solution • Solution needs to be “good enough”
What Does “Good Enough” Mean? • Depends on application domain • Make an informed trade-off • Examples • Cluster-based services • Porcupine • Distributed Data Structures • Disconnected storage services • Epidemic replication • Two-tier replication
Porcupine • Cluster-based email server • Assumptions • Email typically doesn’t get modified • Deleted emails may reappear (temporarily) • Eventual consistency • But, availability and scalability [Saito et al. 99]
Distributed Data Structures • Cluster-based hash table • Assumptions • Network is fast and doesn’t partition • Nodes fail infrequently • OK to return failure at storage layer • Consistency, availability, and scalability [Gribble et al. 00]
Bayou • Epidemic replication [Demers et al. 87] • Two nodes periodically synchronize state • Only pair-wise connectivity • Structured storage (database) • Eventually consistent • But, always available [Petersen et al. 97]
Coda • First-tier nodes • Fully connected • Store all data • Second-tier nodes • Often disconnected • Store subset of data • Limited consistency, but greater availability [Kistler & Satya 92, Mummert et al. 95]
Conflicts • Caused by competing updates • Detected “after the fact” • Need to be resolved automatically
Conflict Resolution Techniques • Based on data • Timestamps • Heuristics • Programs [Kumar & Satya 95, Reiher et al. 94] • Part of update: Bayou [Terry et al. 95] • Dependency check • Merge procedure
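A rough sketch of conflict resolution as part of the update, in the spirit of Bayou's dependency check and merge procedure. It is heavily simplified and hypothetical: `BayouStyleWrite` and `reserve` are illustrative names, the "database" is a plain map from time slot to owner, and the check is a Java predicate rather than a query with an expected result.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical write that carries its own conflict detection and resolution.
class BayouStyleWrite {
    private final Predicate<Map<String, String>> dependencyCheck;
    private final Consumer<Map<String, String>> update;
    private final Consumer<Map<String, String>> mergeProcedure;

    BayouStyleWrite(Predicate<Map<String, String>> dependencyCheck,
                    Consumer<Map<String, String>> update,
                    Consumer<Map<String, String>> mergeProcedure) {
        this.dependencyCheck = dependencyCheck;
        this.update = update;
        this.mergeProcedure = mergeProcedure;
    }

    // Apply the update if the check passes, otherwise run the merge procedure.
    void applyTo(Map<String, String> db) {
        if (dependencyCheck.test(db)) {
            update.accept(db);
        } else {
            mergeProcedure.accept(db);
        }
    }

    // Reserve the preferred slot; on conflict, try the alternate slot instead.
    static BayouStyleWrite reserve(String who, String preferred, String alternate) {
        return new BayouStyleWrite(
                db -> !db.containsKey(preferred),
                db -> db.put(preferred, who),
                db -> db.putIfAbsent(alternate, who));
    }

    public static void main(String[] args) {
        Map<String, String> db = new TreeMap<>();
        reserve("alice", "10:00", "11:00").applyTo(db);
        reserve("bob", "10:00", "11:00").applyTo(db); // conflicts, falls back
        System.out.println(db);
    }
}
```

Because each write is deterministic, replicas that apply the same writes in the same order end up with the same reservations.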
Morals • No perfect solution • Need to exploit application domain • Complexity grows very quickly • Beware of special case code (recovery)
Intermezzo: Applications • Team 1: Cluster-based application • Scalable Napster / Gnutella repository • Scalable document repository • Leased storage • Customizable actions when leases expire • Team 2: Roving application • Personal jukebox • PIM on steroids • Universal inbox
Java Object Serialization • Problem • Turn graph of objects into byte string • Turn byte string back into graph of objects [Figure: object graph with nodes A, B, C, D]
The Basic Idea • Write a description of each object • Keep track of each written object • <1:A <2:B <3:D>> <4:C ref(3)>> [Figure: object graph in which A refers to B and C, and both B and C refer to D]
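The effect of tracking written objects can be observed with the standard object streams. The sketch below assumes a hypothetical `Node` class and builds the diamond from the slide (A refers to B and C, both refer to D); after a round trip, B and C should still share one D rather than two copies.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical graph node, for illustration only.
class Node implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name;
    Node left;
    Node right;

    Node(String name) { this.name = name; }
}

class SharedReferenceDemo {
    // A -> B -> D and A -> C -> D, with a single shared D.
    static Node buildDiamond() {
        Node a = new Node("A");
        Node b = new Node("B");
        Node c = new Node("C");
        Node d = new Node("D");
        a.left = b;
        a.right = c;
        b.left = d;
        c.left = d;
        return a;
    }

    // Serialize to a byte array and read the graph back.
    static Node roundTrip(Node root) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(root);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Node) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Node copy = roundTrip(buildDiamond());
        // D was written once and referenced once, so identity is preserved.
        System.out.println(copy.left.left == copy.right.left); // prints "true"
    }
}
```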
All Things Serializable • Not everything is serializable • java.lang.Object • java.lang.Thread • Serializable objects implement java.io.Serializable • An empty marker interface
Default Serialization • Writes out all fields • Independent of their access controls (private, package private, protected, public) • Good style to document invariants • Use @serial tag @serial Must not be <code>null</code>
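Default Deserialization • Allocates memory for new object • No constructor of the serializable class invoked (only the no-arg constructor of the nearest non-serializable superclass runs) • Fields initialized to their default values • Reads in all fields • Independent of their access controls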
Transient Fields • Some fields shouldn’t or can’t be serialized private Object lock; • How to prevent default serialization from trying to write them out? • Declare such fields as “transient” private transient Object lock; • Restored to defaults during deserialization • null in above example
Overriding Serialization • Customize serialization by implementing private void writeObject(ObjectOutputStream) throws IOException; • Good style to document customization • Use @serialData tag
Overriding Serialization • Example: Thread-safe serialization private void writeObject(ObjectOutputStream out) throws IOException { synchronized (lock) { out.defaultWriteObject(); }}
Overriding Serialization • Example: Filter elements from a list • Declare list to be transient • In writeObject() • Invoke default serialization • Iterate over list, writing filtered elements out.writeObject(el); • Write end-of-list marker out.writeObject(Boolean.FALSE); • Alternatively, write length & elements
Overriding Deserialization • Customize deserialization by implementing private void readObject(ObjectInputStream) throws IOException, ClassNotFoundException;
Overriding Deserialization • Example: Restore lock private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException { in.defaultReadObject(); lock = new Object();}
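Putting the transient lock, the synchronized `writeObject()`, and the restoring `readObject()` together gives roughly the following. `Counter` and `CounterDemo` are hypothetical names chosen for this sketch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical thread-safe counter with a lock that is never serialized.
class Counter implements Serializable {
    private static final long serialVersionUID = 1L;
    private int count;
    private transient Object lock = new Object(); // initializer does not run on deserialization

    void increment() { synchronized (lock) { count++; } }

    int get() { synchronized (lock) { return count; } }

    /** @serialData Default fields, written while holding the lock. */
    private void writeObject(ObjectOutputStream out) throws IOException {
        synchronized (lock) {
            out.defaultWriteObject();
        }
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        lock = new Object(); // transient field came back as null
    }
}

class CounterDemo {
    static Counter roundTrip(Counter c) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(c);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Counter) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Counter c = new Counter();
        c.increment();
        c.increment();
        Counter copy = roundTrip(c);
        copy.increment(); // would throw NullPointerException without readObject()
        System.out.println(copy.get()); // prints "3"
    }
}
```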
Overriding Deserialization • Example: Restore list • In readObject() • Invoke default deserialization • Read filtered elements until end-of-list marker • Alternatively, read length & elements
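The filtered-list pattern, with its end-of-list marker, might look like this. `Registry` is a hypothetical class, and the filter (keep only elements that are themselves `Serializable`) is an assumption made for the example. The marker scheme assumes no element is itself `Boolean.FALSE`; writing a length first avoids that restriction.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical container that writes only the serializable elements of its list.
class Registry implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String name;
    private transient List<Object> elements = new ArrayList<>();

    Registry(String name) { this.name = name; }

    void add(Object el) { elements.add(el); }

    List<Object> elements() { return elements; }

    String name() { return name; }

    /** @serialData Default fields, then each kept element, then Boolean.FALSE. */
    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        for (Object el : elements) {
            if (el instanceof Serializable) { // the filter
                out.writeObject(el);
            }
        }
        out.writeObject(Boolean.FALSE); // end-of-list marker
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        elements = new ArrayList<>();
        // The marker is deserialized as a new Boolean, so compare with equals().
        for (Object el = in.readObject(); !Boolean.FALSE.equals(el); el = in.readObject()) {
            elements.add(el);
        }
    }
}

class RegistryDemo {
    static Registry roundTrip(Registry r) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(r);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Registry) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Registry r = new Registry("events");
        r.add("audit");
        r.add(new Object()); // not Serializable, filtered out
        r.add("metrics");
        System.out.println(roundTrip(r).elements()); // prints "[audit, metrics]"
    }
}
```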
Notes on Customization • Don’t perform operations that take a long time • No I/O besides accessing object stream • Swing UI elements are serializable • But are not designed for long-term storage • Declare them transient • Restore UI in application logic
The Replacements • Example: Symbols (there can only be one) private Object readResolve() throws ObjectStreamException { return intern(name);} • Done after object graph has been restored • Embedded self references are not replaced!
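A complete version of the symbol example could be written as below. The `Symbol` class and its intern table are assumptions filled in around the `readResolve()` method from the slide.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamException;
import java.io.Serializable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical interned symbol: at most one instance per name in this JVM.
final class Symbol implements Serializable {
    private static final long serialVersionUID = 1L;
    private static final Map<String, Symbol> TABLE = new ConcurrentHashMap<>();

    private final String name;

    private Symbol(String name) { this.name = name; }

    static Symbol intern(String name) {
        return TABLE.computeIfAbsent(name, Symbol::new);
    }

    String name() { return name; }

    // Replace the freshly deserialized copy with the canonical instance.
    private Object readResolve() throws ObjectStreamException {
        return intern(name);
    }
}

class SymbolDemo {
    static Symbol roundTrip(Symbol s) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(s);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Symbol) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Symbol lambda = Symbol.intern("lambda");
        System.out.println(roundTrip(lambda) == lambda); // prints "true"
    }
}
```

Without `readResolve()`, the round trip would produce a second `Symbol` named "lambda", and identity comparisons would silently break.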
Inheritance • If a superclass implements Serializable, all subclasses are also serializable • Each class in such a hierarchy serializes only its own state • Classes can control all state by implementing java.io.Externalizable • If superclass is not Serializable, a serializable subclass must handle the superclass’s state (and the superclass needs an accessible no-arg constructor)
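For the last case, a serializable subclass of a non-serializable superclass, one possible arrangement is sketched here. `Point` and `LabeledPoint` are hypothetical classes; the subclass writes and restores the inherited fields by hand.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical superclass that is NOT Serializable.
class Point {
    int x;
    int y;

    Point() { }  // invoked during deserialization of subclasses

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

// Serializable subclass that takes responsibility for the superclass's state.
class LabeledPoint extends Point implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String label;

    LabeledPoint(int x, int y, String label) {
        super(x, y);
        this.label = label;
    }

    String label() { return label; }

    /** @serialData Default fields, followed by x and y as ints. */
    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject(); // writes label only
        out.writeInt(x);
        out.writeInt(y);
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();   // Point() has already set x and y to 0
        x = in.readInt();
        y = in.readInt();
    }
}

class InheritanceDemo {
    static LabeledPoint roundTrip(LabeledPoint p) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(p);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (LabeledPoint) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        LabeledPoint copy = roundTrip(new LabeledPoint(3, 4, "home"));
        System.out.println(copy.label() + " " + copy.x + "," + copy.y); // prints "home 3,4"
    }
}
```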
Inheritance • To make a subclass of a serializable class not serializable private void writeObject( ObjectOutputStream o) throws IOException { throw new NotSerializableException( getClass().getName());} • This indicates a semantic problem!
Versioning • Problem • Classes can change • While instance is in serialized form • Solution • Let classes declare their version • Define what are compatible changes
Stream Unique Identifier (SUID) • Hash of the class • Determined by serialver tool • Accessible in Java through ObjectStreamClass.getSerialVersionUID() • Modified version declares same SUID as original version private static final long serialVersionUID = …;
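Reading a class's SUID from Java is a short call chain through `ObjectStreamClass.lookup()`. In this sketch, `Account`, its declared value of 42, and the `suidOf` helper are all made up for illustration.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical class that pins its SUID explicitly.
class Account implements Serializable {
    private static final long serialVersionUID = 42L;
    String owner;
}

class SuidDemo {
    // Returns the SUID of a serializable class.
    static long suidOf(Class<?> type) {
        ObjectStreamClass descriptor = ObjectStreamClass.lookup(type);
        if (descriptor == null) { // lookup() returns null for non-serializable classes
            throw new IllegalArgumentException(type.getName() + " is not serializable");
        }
        return descriptor.getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println(suidOf(Account.class)); // prints "42"
    }
}
```

For a class that does not declare `serialVersionUID`, the same call returns the computed hash, which is the value a later, modified version would need to declare to stay compatible.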
Incompatible Changes • Deleting fields • Moving classes up or down in the hierarchy • Changing non-static fields to static • Changing non-transient fields to transient • Changing the declared type of a field • Adding / removing access to default fields from writeObject() / readObject() • See specification!
Compatible Changes • Adding fields • Adding / removing classes • Adding Serializable • Adding / removing writeObject() / readObject() • Changing static fields to non-static • Changing transient fields to non-transient
Security • Serialized objects expose their internal state • If that state is sensitive it must be protected • Don’t serialize sensitive state • Encrypt sensitive state • Encrypt serialized objects
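The exposure is easy to demonstrate: a private `String` field shows up as readable bytes in the stream. The sketch below uses two hypothetical classes, one that serializes a password by default and one that follows the "don't serialize sensitive state" advice by marking it transient.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical class whose private password is written by default serialization.
class Credentials implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String user;
    private final String password;

    Credentials(String user, String password) {
        this.user = user;
        this.password = password;
    }
}

// Same data, but the sensitive field is excluded from the serialized form.
class SafeCredentials implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String user;
    private transient String password;

    SafeCredentials(String user, String password) {
        this.user = user;
        this.password = password;
    }
}

class ExposureDemo {
    // Serializes obj and reports whether text appears verbatim in the bytes.
    static boolean streamContains(Serializable obj, String text) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            String raw = new String(bytes.toByteArray(), StandardCharsets.ISO_8859_1);
            return raw.contains(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(streamContains(new Credentials("ann", "hunter2"), "hunter2"));     // prints "true"
        System.out.println(streamContains(new SafeCredentials("ann", "hunter2"), "hunter2")); // prints "false"
    }
}
```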
There’s More • serialPersistentFields to declare the serialized format • Useful for backwards compatibility • ObjectInputValidation to validate deserialized objects • Class descriptors • Serialized form of a class • Two versions of serialization protocol
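As one example from this list, `ObjectInputValidation` lets a class check its invariants once the whole graph has been restored. The `Range` class and the `canRoundTrip` helper are hypothetical; the demo corrupts an instance directly in memory to stand in for a tampered stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectInputValidation;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical range with the invariant lo <= hi.
class Range implements Serializable, ObjectInputValidation {
    private static final long serialVersionUID = 1L;
    int lo;
    int hi;

    Range(int lo, int hi) {
        if (lo > hi) {
            throw new IllegalArgumentException("lo > hi");
        }
        this.lo = lo;
        this.hi = hi;
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.registerValidation(this, 0); // callback runs after the graph is complete
        in.defaultReadObject();
    }

    @Override
    public void validateObject() throws InvalidObjectException {
        if (lo > hi) {
            throw new InvalidObjectException("lo > hi");
        }
    }
}

class ValidationDemo {
    // Returns false if deserialization rejects the object as invalid.
    static boolean canRoundTrip(Range r) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(r);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                in.readObject();
            }
            return true;
        } catch (InvalidObjectException e) {
            return false;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Range good = new Range(1, 5);
        Range bad = new Range(1, 5);
        bad.lo = 10; // break the invariant behind the constructor's back
        System.out.println(canRoundTrip(good)); // prints "true"
        System.out.println(canRoundTrip(bad));  // prints "false"
    }
}
```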