380 likes | 552 Views
Implementing Declarative Overlays. Boon Thau Loo 1 Tyson Condie 1 , Joseph M. Hellerstein 1,2 , Petros Maniatis 2 , Timothy Roscoe 2 , Ion Stoica 1 1 University of California at Berkeley, 2 Intel Research Berkeley. Overlays Everywhere…. Overlay networks are widely used today:
E N D
Implementing Declarative Overlays Boon Thau Loo1 Tyson Condie1, Joseph M. Hellerstein1,2, Petros Maniatis2, Timothy Roscoe2, Ion Stoica1 1University of California at Berkeley, 2Intel Research Berkeley
Overlays Everywhere… • Overlay networks are widely used today: • Routing and forwarding component of large-scale distributed systems • Provide new functionality over existing infrastructure • Many examples, variety of requirements: • Packet delivery: Multicast, RON • Content delivery: CDNs, P2P file sharing, DHTs • Enterprise systems: MS Exchange Overlay networks are an integral part of many large-scale distributed systems.
Problem • Non-trivial to design, build and deploy an overlay correctly: • Iterative design process: • Desired properties Distributed algorithms and protocols Simulation Implementation Deployment Repeat… • Each iteration takes significant time and utilizes a variety of expertise
The Goal of P2 • Make overlay development more accessible: • Focus on algorithms and protocol designs, not the implementation • Tool for rapid prototyping of new overlays: • Specify overlay network at a high level • Automatically translate specification to protocol • Provide execution engine for protocol • Aim for “good enough” performance • Focus on accelerating the iterative design process • Can always hand-tune implementation later
Outline • Overview of P2 • Architecture By Example • Data Model • Dataflow framework • Query Language • Chord • Additional Benefits • Overlay Introspection • Automatic Optimizations • Conclusion
Traditional Overlay Node Traditional Overlay Node Packets Out Packets In Overlay Program
Overlay description: declarative query language Runtime dataflows maintain network state Overlay description: dataflow scripting language Planner P2 Overlay Node Packets Out Packets In Overlay Program P2 Query Processor
Advantages of the P2 Approach • Declarative Query Language • Concise/high level expression • Statically checkable (termination, correctness) • Ease of modification • Unifying framework for introspection and implementation • Automatic optimizations • Query and dataflow level
Data Model • Relational data: relational tables and tuples • Two kinds of tables: • Stored, soft state: • E.g. neighbor(Src,Dst), forward(Src,Dst,NxtHop) • Transient streams: • Network messages: message (Rcvr, Dst) • Local timer-based events: periodic (NodeID,10)
Dataflow framework • Dataflow graph • C++ dataflow elements • Similar to Click: • Flow elements (mux, demux, queues) • Network elements (cc, retry, rate limitation) • In addition: • Relational operators (joins, selections, projections, aggregation)
Outline • Overview of P2 • Architecture By Example • Data Model • Dataflow framework • Query Language • Chord in P2 • Additional Benefits • Overlay Introspection • Automatic Optimizations • Conclusion Simple ring routing example
Example: Ring Routing Each node has an address and an identifier Objects “served” by successor Each object has an identifier. Every node knows its successor
Ring State • Stored tables: • node(NAddr, N) • succ(NAddr, Succ, SAddr) node(IP58,58) succ(IP58,60,IP60) node(IP40,40) succ(IP40,58,IP58)
Example: Ring lookup • Find the responsible node for a given key k? n.lookup(k) if k in (n, n.successor) return n.successor.addr else return n.successor. lookup(k)
Ring Lookup Events • Event streams: • lookup(Addr, Req, K) • response(Addr, K, Owner) node(IP58,58) succ(IP58,60,IP60) response(IP37,59,IP60) node(IP40,40) succ(IP40,58,IP58) n.lookup(k) if k in (n, n.successor) return n.successor.addr else return n.successor. lookup(k) lookup(IP58,IP37,59) lookup(IP37,IP37,59) lookup(IP40,IP37,59)
Pseudocode Dataflow “Strands” • Pseudocode: • n.lookup(k) • if k in (n, n.successor) • return n.successor.addr • else • return n.successor. lookup(k)
Dataflow Strand Strand Elements Element2 Element1 … Actions Event Stream Elementn Event: Incoming network messages, periodic timers Condition: Process event using strand elements Action: Outgoing network messages, local table updates
Pseudocode Strand 1 • Stored tables • node(NAddr, N) • succ(NAddr, Succ, SAddr) n.lookup(k) if k in (n, n.successor) return n.successor.addr else return n.successor.lookup(k) • Event streams • lookup(Addr, Req, K) • response(Addr, K, Owner) Event: RECEIVElookup(NAddr, Req, K) Condition: node(NAddr, N) & succ(NAddr, Succ, SAddr) & K in (N, Succ] Action: SEND response(Req, K, SAddr) to Req
lookup succ node Pseudocode to Strand 1 Event:RECEIVE lookup(NAddr, Req, K) Condition: node(NAddr, N) & succ(NAddr, Succ, SAddr) & K in (N, Succ] Action:SEND response(Req, K, SAddr) to Req Dataflow strand Match lookup.Addr = succ.Addr Match lookup.Addr = node.Addr Join Join Format Response(Req,K,SAddr) Project Filter K in (N,Succ) Select Response n.lookup(k) if k in (n, n.successor) return n.successor.addr else return n.successor. lookup(k)
lookup succ node Pseudocode to Strand 2 Event: RECEIVE lookup(NAddr, Req, K) Condition: node(NAddr, N) & succ(NAddr, Succ, SAddr) & K not in (N, Succ] Action:SEND lookup(SAddr, Req, K) to SAddr Dataflow strand Join lookup.Addr = succ.Addr Join lookup.Addr = node.Addr Select K not in (N,Succ) Project lookup(SAddr,Req,K) lookup n.lookup(k) if k in (n, n.successor) return n.successor.addr else return n.successor. lookup(k)
Strand Execution lookup lookup lookup response lookup/ response lookup
Query Language: Overlog • “SQL” equivalent for overlay networks • Based on Datalog: • Declarative recursive query language • Well-suited for querying properties of graphs • Well-studied in database literature • Static analysis, optimizations, etc • Extensions: • Data distribution, asynchronous messaging, periodic timers and state modification
Query Language: Overlog Datalog rule syntax: <head> <condition1>, <condition2>, … , <conditionN>. Overlog rule syntax: <Action><event>,<condition1>, … , <conditionN>.
Query Language: Overlog Overlog rule syntax: <Action><event>,<condition1>, … , <conditionN>. Event: RECEIVE lookup(NAddr, Req, K) Condition: lookup(NAddr, Req, K) & node(NAddr, N) & succ(NAddr, Succ, SAddr) & K in (N, Succ] Action: SEND response(Req, K, SAddr) to Req • response@Req(Req, K, SAddr) lookup@NAddr(Naddr, Req, K), node@NAddr(NAddr, N), succ@NAddr(NAddr, Succ, SAddr), K in (N,Succ].
P2-Chord • Chord Routing, including: • Multiple successors • Stabilization • Optimized finger maintenance • Failure recovery • 47 OverLog rules • 13 table definitions • Other examples: • Narada, flooding, routing protocols 10 pt font
Performance Validation • Experimental Setup: • 100 nodes on Emulab testbed • 500 P2-Chord nodes • Main goals: • Validate expected network properties
Sanity Checks • Logarithmic diameter and state (“correct”) • BW-efficient: 300 bytes/s/node
Churn Performance • Metric: Consistency [Rhea at al] • P2-Chord: • P2-Chord@64mins: 97% consistency • P2-Chord@16mins: 84% consistency • P2-Chord@8min: 42% consistency • Hand-crafted Chord: • MIT-Chord@47mins: 99.9% consistency • Outperforms P2 under higher churn • Not intended to replace a carefully hand-crafted Chord
Benefits of P2 • Introspection with Queries • Automatic optimizations • Reconfigurable Transport (WIP)
Introspection with Queries With Atul Singh (Rice) and Peter Druschel (MPI) • Unifying framework for debugging and implementation • Same query language, same platform • Execution tracing/logging • Rule and dataflow level • Log entries stored as tuples and queried • Correctness invariants, regression tests as queries: • “Is the Chord ring well formed?” (3 rules) • “What is the network diameter?” (5 rules) • “Is Chord routing consistent?” (11 rules)
Join lookup.Addr = succ.Addr lookup Project lookup(SAddr,Req,K) Select K not in (N,Succ) Join lookup.Addr = node.Addr lookup response Project Response(Req,K,SAddr) Select K in (N,Succ) Join lookup.Addr = succ.Addr lookup Join lookup.Addr = node.Addr Automatic Optimizations • Application of traditional Datalog optimizations to network routing protocols (SIGCOMM 2005) • Multi-query sharing: • Common “subexpression” elimination • Caching and reuse of previously computed results • Opportunistically share message propagation across rules
Automatic Optimizations • Cost-based optimizations • Join ordering affects performance Join lookup.Addr = succ.Addr Join lookup.Addr = succ.Addr Join lookup.Addr = node.Addr lookup Project lookup(SAddr,Req,K) Select K not in (N,Succ) Join lookup.Addr = node.Addr lookup Project Response(Req,K,SAddr) response Select K in (N,Succ)
Open Questions • The role of rapid prototyping? • How good is “good enough” performance for rapid prototypes? • When do developers move from rapid prototypes to hand-crafted code? • Can we get achieve “production quality” overlays from P2?
Future Work • “Right” language • Formal data and query semantics • Static analysis • Optimizations • Termination • Correctness
Conclusion • P2: Declarative Overlays • Tool for rapid prototyping new overlay networks • Declarative Networks • Research agenda: Specify and construct networks declaratively • Declarative Routing : Extensible Routing with Declarative Queries (SIGCOMM 2005)
Thank You http://p2.cs.berkeley.edu
Latency CDF for P2-Chord Median and average latency around 1s.