PLATO: Predictive Latency-Aware Total Ordering • Mahesh Balakrishnan, Ken Birman, Amar Phanishayee
Total Ordering • a.k.a. Atomic Broadcast • delivering messages to a set of nodes in the same order • messages arrive at nodes in different orders… • nodes agree on a single delivery order • messages are delivered at nodes in the agreed order
Modern Datacenters • Applications • E-tailers, Finance, Aerospace • Service-Oriented Architectures, Publish-Subscribe, Distributed Objects, Event Notification… • … Totally Ordered Multicast! • Hardware • Fast high-capacity networks • Failure-prone commodity nodes
Total Ordering in a Datacenter • Totally Ordered Multicast is used to consistently update Replicated Services • Requirement: order multicasts consistently, rapidly, robustly • [Figure labels: Updates are Totally Ordered; Replicated Service; Latency of Multicast; System Consistency]
Multicast Wishlist • Low Latency! • High (stable) throughput • Minimal, proactive overheads • Leverage hardware properties • HW Multicast/Broadcast is fast, unreliable • Handle varying data rates • Datacenter workloads have sharp spikes… and extended troughs!
State-of-the-Art • Traditional Protocols • Conservative • Latency-Overhead tradeoff • Example: Fixed Sequencer • Simple, works well • Optimistic Total Ordering: • deliver optimistically, rollback if incorrect • Why this works – No out-of-order arrival in LANs • Optimistic total ordering for datacenters?
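A fixed sequencer can be sketched as follows. This is a minimal illustration and not the protocol evaluated in these slides: the class names `FixedSequencer` and `Receiver` and their methods are hypothetical, and message loss, sequencer failure, and batching of ordering messages are ignored.

```python
class FixedSequencer:
    """Hypothetical sequencer: hands out consecutive sequence numbers."""

    def __init__(self):
        self.next_seq = 0

    def order(self, msg_id):
        seq = self.next_seq
        self.next_seq += 1
        return seq, msg_id


class Receiver:
    """Buffers payloads and delivers them strictly in sequencer order."""

    def __init__(self):
        self.payloads = {}      # msg_id -> payload, arrived but not delivered
        self.orders = {}        # seq -> msg_id, as announced by the sequencer
        self.next_deliver = 0   # next sequence number to deliver
        self.delivered = []

    def on_message(self, msg_id, payload):
        self.payloads[msg_id] = payload
        self._drain()

    def on_order(self, seq, msg_id):
        self.orders[seq] = msg_id
        self._drain()

    def _drain(self):
        # Deliver while both the next order and its payload are available.
        while self.next_deliver in self.orders:
            msg_id = self.orders[self.next_deliver]
            if msg_id not in self.payloads:
                break
            self.delivered.append(self.payloads.pop(msg_id))
            del self.orders[self.next_deliver]
            self.next_deliver += 1
```

Two receivers that see the same multicasts in different arrival orders end up with the same `delivered` list, at the cost of waiting for the sequencer's ordering message before every delivery.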
PLATO: Predictive Ordering • In a datacenter, broadcast / multicast occurs almost instantaneously • Most of the time, messages arrive in same order at all nodes. • Some of the time, messages arrive in different orders at different nodes. • Can we predict out-of-order arrival?
Reasons for Disorder: Swaps • Typical Datacenter Diameter: 50-500 microseconds • Out-of-order arrival can occur when the inter-send interval between two messages is smaller than the diameter of the network
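The swap condition can be illustrated with a toy calculation. The function `arrival_order` and the delay values below are made up for illustration; the only assumption is that every one-way delay is bounded by the network diameter.

```python
def arrival_order(send_times_us, delays_us):
    """Order in which one receiver sees the messages.

    send_times_us: {msg: send time in microseconds}
    delays_us:     {msg: one-way delay to this receiver in microseconds}
    """
    return sorted(send_times_us, key=lambda m: send_times_us[m] + delays_us[m])


# Diameter assumed to be 500 us, so every delay below is at most 500.
close_sends = {"a": 0, "b": 100}    # inter-send interval 100 us < diameter
far_sends = {"a": 0, "b": 1000}     # inter-send interval 1000 us > diameter

near_receiver = {"a": 50, "b": 50}
skewed_receiver = {"a": 400, "b": 50}

print(arrival_order(close_sends, near_receiver))
print(arrival_order(close_sends, skewed_receiver))
print(arrival_order(far_sends, skewed_receiver))
```

With sends 100 µs apart, the two receivers can disagree on the order; with sends 1000 µs apart, no delay below the 500 µs diameter can reverse them.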
Reasons for Disorder: Loss • Datacenter networks are over-provisioned: loss never occurs in the network • Datacenter nodes are cheap: loss occurs due to end-host buffer overflows caused by CPU contention
Disorder: Emulab3 • Percentage of swaps and losses goes up with data rate • At 2800 packets per sec, 2% of all packet pairs are swapped and 0.5% of packets are lost.
Predicting Disorder • Predictor: Inter-arrival time of consecutive packets into user-space • Why? • Swaps: simultaneous multicasts → low inter-arrival time • Loss: kernel buffer overflow → sequence of low inter-arrival times
Predicting Disorder • 95% of swaps and 14% of all pairs have inter-arrival times within 128 µsecs • [Figure: inter-arrival time of swaps vs. inter-arrival time of all pairs; Cornell Datacenter, 400 multicasts/sec]
PLATO Design • Heuristic: If two packets arrive within Δ µsecs, possibility of disorder • PLATO • Heuristic + Lazy Fixed Sequencer • Heuristic works → ~zero (Δ) latency • Heuristic fails → fixed sequencer latency
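The Δ heuristic can be sketched as a check on consecutive arrival timestamps. The function name `suspect_flags` is hypothetical, and flagging only the later packet of each close pair is an assumption; the slides do not say which packet of the pair is held back.

```python
def suspect_flags(arrival_times_us, delta_us):
    """Flag packets that arrive less than delta_us after their predecessor.

    arrival_times_us: arrival timestamps in microseconds, in arrival order.
    Returns one boolean per packet; True means "possibly out of order".
    """
    flags = []
    prev = None
    for t in arrival_times_us:
        flags.append(prev is not None and (t - prev) < delta_us)
        prev = t
    return flags


# Third packet arrives 50 us after the second, inside a 128 us window.
print(suspect_flags([0, 1000, 1050, 5000], delta_us=128))
```

A larger Δ flags more pairs (fewer rollbacks, more waiting on the sequencer); a smaller Δ flags fewer.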
PLATO Design • API: optdeliver, confirm, revoke • Ordering Layer: • Pending Queue: packets suspected to be out-of-order, or queued behind suspected packets • Suspicious Queue: packets optdelivered to the application, not yet confirmed
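One way the two queues and the three upcalls could fit together is sketched below. Only the names `optdeliver`, `confirm`, and `revoke` and the two queue roles come from the slide. The class `OrderingLayer`, the `RecordingApp` stand-in, the rollback policy, and the assumption that sequencer orders arrive in sequence after the packets they name are all illustrative guesses, not PLATO's actual implementation.

```python
from collections import deque


class RecordingApp:
    """Stand-in application that records the upcalls it receives."""

    def __init__(self):
        self.events = []

    def optdeliver(self, msg_id):
        self.events.append(("optdeliver", msg_id))

    def confirm(self, msg_id):
        self.events.append(("confirm", msg_id))

    def revoke(self, msg_id):
        self.events.append(("revoke", msg_id))


class OrderingLayer:
    def __init__(self, delta_us, app):
        self.delta_us = delta_us
        self.app = app
        self.last_arrival_us = None
        self.pending = deque()      # suspected, or queued behind a suspect
        self.suspicious = deque()   # optdelivered, not yet confirmed

    def on_packet(self, msg_id, now_us):
        close = (self.last_arrival_us is not None
                 and now_us - self.last_arrival_us < self.delta_us)
        self.last_arrival_us = now_us
        if close or self.pending:
            self.pending.append(msg_id)       # wait for the sequencer
        else:
            self.suspicious.append(msg_id)    # predicted in order
            self.app.optdeliver(msg_id)

    def on_order(self, msg_id):
        """Sequencer announces msg_id as next in the agreed order."""
        if self.suspicious and self.suspicious[0] == msg_id:
            self.suspicious.popleft()
            self.app.confirm(msg_id)
            return
        # Misprediction or held-back packet: undo optimistic deliveries,
        # newest first, and put them back at the head of the pending queue.
        while self.suspicious:
            wrong = self.suspicious.pop()
            self.app.revoke(wrong)
            self.pending.appendleft(wrong)
        self.pending.remove(msg_id)   # assumes the packet already arrived
        self.app.optdeliver(msg_id)
        self.app.confirm(msg_id)
```

In this sketch a packet that lands in the pending queue always waits for the sequencer, which is the "heuristic fails" path; a packet that is optdelivered and later confirmed never waits.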
Performance • [Figure: Fixed Sequencer vs. PLATO] • At small values of Δ, very low latency of delivery but more rollbacks
Performance • Latency of both Fixed-Sequencer and PLATO decreases as throughput increases
Performance • Traffic Spike: PLATO is insensitive to data rate, while Fixed Sequencer depends on data rate
Performance • Latency is as good as static Δ parameterization • Δ is varied adaptively in reaction to rollbacks
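The slides do not give the adaptation rule for Δ. The sketch below assumes one plausible policy purely for illustration: widen Δ multiplicatively on each rollback and shrink it slowly on each confirmed delivery, within fixed bounds. The class `AdaptiveDelta` and all of its parameters and default values are hypothetical.

```python
class AdaptiveDelta:
    """Assumed policy: grow on rollback, decay slowly on confirm."""

    def __init__(self, initial_us=128.0, min_us=1.0, max_us=1024.0,
                 grow=2.0, shrink=0.99):
        self.delta_us = initial_us
        self.min_us = min_us
        self.max_us = max_us
        self.grow = grow        # multiplier applied after a rollback
        self.shrink = shrink    # multiplier applied after a confirm

    def on_rollback(self):
        # A rollback means the window missed a swap: widen it.
        self.delta_us = min(self.max_us, self.delta_us * self.grow)

    def on_confirm(self):
        # Correct predictions let the window narrow again.
        self.delta_us = max(self.min_us, self.delta_us * self.shrink)
```

Such a controller would be driven by the ordering layer: call `on_rollback()` whenever a revoke is issued and `on_confirm()` whenever an optimistic delivery is confirmed.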
Conclusion • First optimistic total order protocol that predicts out-of-order delivery • Slashes ordering latency in datacenter settings • Stable at varying loads • Ordering layer of a time-critical protocol stack for Datacenters