PlanetenWachHundNetz: Instrumentation Infrastructure for PlanetLab
Vitus Lorenz-Meyer
Peer-to-Peer
• Distributed on the open Internet
• All participants both provide services to and receive services from others
• Not centrally administered
• Membership changes over time (churn)
• Example: file sharing (Napster, Gnutella, …)
  • Any node can publish a named file
  • Any node can obtain a file from another node that has it
  • Range of strategies to find nodes containing the desired content
Vitus Lorenz-Meyer: Thesis defense, University of Texas @ El Paso
The Problem
• P2P systems are hard to tune; tuning requires understanding complex behavior
  • Requires instrumentation & analysis
• Many P2P systems are constructed without a scalable instrumentation infrastructure
  • Instrumentation is frequently done in an ad-hoc manner
  • Data is transmitted to a single collection & analysis node
  • Inadequate for understanding the behavior of large systems of many (hundreds to millions of) nodes
• My work: development of a flexible tool to enable scalable instrumentation
  • Algorithms, data structures
Related work (1 of 2: Cousins)
• Distributed Database Management Systems
  • Select data at sources
  • Optimize joins (run near the sources, …)
  • Commercially used in non-P2P configurations
  • P2P (research): PIER, Sophia
• Sensor Networks
  • Unmanaged radio-connected nodes
  • Provide a “network” of surveillance
  • SQL, compiled into a 3-step process
  • Software communicates through the same mechanism
  • IrisNet, TAG
Related work (2 of 2: Siblings)
• Aggregation Overlays
  • Information collection subsystem
  • Nodes provide information tuples
  • Internal aggregation language
  • Computed using parallel prefix of predefined associative/commutative operations
  • Astrolabe, SDIMS, SOMO
• Google’s MapReduce
  • Data selection & aggregation in a distributed system
  • User provides a “map” and a “reduce” program
  • Not fully P2P (resource management overlay)
High-level Approach
• User-specifiable programs, as in MapReduce
• Split data collection into 3 phases
  • Generate values on all nodes
  • Aggregate pairwise throughout the system
  • Evaluate the results
• Example (average):
  • Generate: emit measured values as (val, num=1)
  • Aggregate: (val1+val2, num1+num2)
  • Evaluate: val / num
• Easy to use: the user provides 3 programs (scripts)
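
The three phases on this slide can be sketched in a few lines of Python. This is only an illustration of the generate/aggregate/evaluate split for an average, not PWHN's actual code: the function names (`gen`, `agg`, `evaluate`, `binary_aggregate`) and the sample values are assumptions made for the example, and the whole "network" is simulated in one process.

```python
def gen(measured):
    """Phase 1 (runs on every node): emit the local value with a count of 1."""
    return (measured, 1)

def agg(a, b):
    """Phase 2: combine two partial results; associative and commutative."""
    return (a[0] + b[0], a[1] + b[1])

def evaluate(total):
    """Phase 3 (runs once, at the root): turn (sum, count) into an average."""
    val, num = total
    return val / num

def binary_aggregate(items, combine):
    """Combine items pairwise, level by level, the way a balanced tree would."""
    level = list(items)
    while len(level) > 1:
        nxt = [combine(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:          # odd one out moves up unchanged
            nxt.append(level[-1])
        level = nxt
    return level[0]

# Hypothetical measurements, one per node.
node_values = [2.0, 4.0, 6.0, 8.0, 10.0]
result = evaluate(binary_aggregate([gen(v) for v in node_values], agg))
print(result)
```

Because `agg` is associative and commutative, the shape of the tree does not change the result; only the per-node load changes.
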
Illustration of Binary Aggregation (figure)
Why is this hard in P2P?
• Problem: membership churn
  • Nodes continuously enter & leave the system
• Nobody is in charge (P2P)
  • Nobody knows the membership list!
• This exposes the following challenges:
  • Finding all participating nodes
  • Constructing an (approximately) balanced tree
Building Structure Upon Anarchy: Key-Based Routing (figure)
“Chord” Routing (figure)
Chord lookup (figure)
Building a tree upon KBR (figure)
Building a tree: FTT & KBT
• KBT: maps the tree onto the key space
  • Operation associated with a target node
  • System-node/tree-node mapping: each tree node is assigned to the system node with the nearest key
  • Non-ambiguous
  • Tree useful for both dissemination & aggregation
  • Single, global tree
• FTT: finger-based tree
  • Operation associated with a “target” node
  • Systems send data to the finger closest to the target
  • Ambiguous: depends on all nodes’ finger tables
  • Tree useful only for aggregation
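
The two mapping rules on this slide can be illustrated with a small Python sketch. It is a simplification under stated assumptions, not the thesis implementation: the key space is a tiny 4-bit ring, fingers follow the usual Chord rule (the successor of n + 2^i), "closest to the target" is read as the smallest clockwise distance to the target, and the target is assumed to be a live node. All names (`kbt_owner`, `fingers`, `ftt_parent`) and the node IDs are hypothetical.

```python
RING_BITS = 4                 # assumption: tiny key space for illustration
RING = 1 << RING_BITS

def ring_distance(a, b):
    """Shortest distance between two keys on the ring, in either direction."""
    d = (a - b) % RING
    return min(d, RING - d)

def clockwise(a, b):
    """Distance travelled going clockwise from key a to key b."""
    return (b - a) % RING

def kbt_owner(tree_key, live_nodes):
    """KBT rule: a tree node (a key) is hosted by the live node with the nearest key."""
    return min(live_nodes, key=lambda n: (ring_distance(n, tree_key), n))

def fingers(node, live_nodes):
    """Chord-style finger table: successor of node + 2^i for each bit i."""
    ordered = sorted(live_nodes)
    def successor(key):
        key %= RING
        return next((n for n in ordered if n >= key), ordered[0])
    return [successor(node + (1 << i)) for i in range(RING_BITS)]

def ftt_parent(node, target, live_nodes):
    """FTT rule: send data to the finger closest to the target (None at the root)."""
    if node == target:
        return None
    return min(fingers(node, live_nodes), key=lambda f: clockwise(f, target))

live = [1, 4, 9, 11, 14]      # hypothetical live node IDs
print(kbt_owner(8, live))     # host of tree key 8
print({n: ftt_parent(n, 1, live) for n in live})  # FTT edges toward target 1
```

The KBT mapping depends only on the key and the set of live nodes, whereas each FTT edge depends on the sending node's own finger table, which is why the slide calls the FTT ambiguous.
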
Our Structure
(Figure: tree over binary key prefixes such as 101…, 001…, 110…)
• KMR: subset of a KBT, rooted at a specific node
• One tree per root
• Better load balancing
• Tree fully determined by the set of active nodes and the root
Implementation details
• PWHN-Server is layered on FreePastry
• PWHN-Client connects to a PWHN-Server and makes a query
• The callee (the contacted server) builds the tree, making itself the root
(Figure: servers (S) and a client (C))
Our Goal
• Develop a toolkit for data collection/aggregation in P2P networks
  • Useful for the PlanetLab community
• Extend MapReduce’s model to P2P
  • K.I.S.S.
  • Users provide programs for gen/agg/eval
• Use techniques from P2P
  • Construct the aggregation tree upon key-based routing
Example (1)
• First implementation:
  • Script version, flat, to test the approach
• Example 1: overall average system load
  • Gen emits (1, <1load>, <5load>, <15load>) for each server, i.e. a count of 1 plus the 1-, 5- and 15-minute load averages
  • Agg adds all numbers component-wise
  • Eval divides the last 3 numbers by the first to get the averages
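
A minimal Python sketch of Example 1 follows. It mirrors the gen/agg/eval description on the slide but is not the original script: the function names and the sample load values are made up for illustration, and the per-server tuples are folded in a single process rather than across a network.

```python
from functools import reduce

def gen(load1, load5, load15):
    """Gen: emit a count of 1 followed by the three load averages."""
    return (1, load1, load5, load15)

def agg(a, b):
    """Agg: add two tuples component-wise."""
    return tuple(x + y for x, y in zip(a, b))

def evaluate(total):
    """Eval: divide the three summed loads by the server count."""
    count, *sums = total
    return tuple(s / count for s in sums)

# Hypothetical (1-min, 5-min, 15-min) load averages of two servers.
servers = [(0.5, 1.0, 1.5), (1.5, 2.0, 2.5)]
total = reduce(agg, (gen(*loads) for loads in servers))
averages = evaluate(total)
print(averages)
```

On a Unix host, the three input values could come from Python's `os.getloadavg()`; that source is an assumption of this sketch and is not stated on the slide.
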
Example (2)
• PWHN client (Java)
  • Can start and stop the server
  • Used for specifying all programs and parameters (servers, username for flat, method)
  • Front-end for connecting to servers and making a query
  • Allows saving and graphically representing the result
Example (3)
• Graphing of queried results
  • Bar chart
  • Color bubbles on a world map
Example (4)
• Graphing of the tree
• Graphing of the paths of the query
Evaluation
• Minimize disruption
  • Minimize incoming bytes to the client
• More efficient
  • Lower average fan-in of the aggregation tree
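
To make the fan-in criterion concrete, the short Python sketch below compares a flat collection, where every node reports directly to the client, with a binary aggregation tree. The definition used here (mean number of children per node that receives at least one message) and the 8-node layouts are assumptions chosen for the example; the slides do not give the exact metric.

```python
def average_fan_in(parent_of):
    """Mean number of children per node that receives at least one message."""
    fan_in = {}
    for child, parent in parent_of.items():
        fan_in[parent] = fan_in.get(parent, 0) + 1
    return sum(fan_in.values()) / len(fan_in)

NODES = 8

# Flat collection: all nodes send their data straight to the client.
flat = {n: "client" for n in range(NODES)}

# Binary tree laid out heap-style: the parent of node n is (n - 1) // 2, node 0 is the root.
binary_tree = {n: (n - 1) // 2 for n in range(1, NODES)}

print(average_fan_in(flat))
print(average_fan_in(binary_tree))
```

In the flat case the client alone absorbs all incoming traffic, while in the tree no node has more than two children, which is the effect the two goals on this slide describe.
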
Evaluation: Fern
• Global update latency histogram (figure panels: 10 clients, 701 clients)
Summary
• PWHN: instrumentation toolkit
  • Extends MapReduce’s model to P2P
  • Uses P2P techniques (DHTs)
  • Combines FTT and KBT to be more efficient
• Conclusion: a useful tool that is more efficient than building the infrastructure into the software itself
• What did I do?
  • Survey of systems that provide aggregation in dynamic networks
  • Classification and naming of aggregation trees upon DHTs
  • Design and implementation of my own tool (KMR/PWHN)
Questions