A Constraint Satisfaction Approach to Testbed Embedding Services John Byers Dept. of Computer Science, Boston University www.cs.bu.edu/~byers Joint work with Jeffrey Considine (BU) and Ketan Mayer-Patel (UNC)
Experimental Methodologies • Simulation • “Blank slate” for crafting experiments • Fine-grained control, specifying all details • No external surprises, not especially realistic • Emulation • All the benefits of simulation, plus: running real protocols on real systems • Internet experimentation • None of the benefits of simulation, minus: unpredictability, unrepeatability, etc. • But realistic!
Our Position • All three approaches have their place. • Improving aspects of all three is essential. • Focus of recent workshops like MOME Tools • Internet experimentation is the most primitive by far. • Our question: Can we bridge over some of the attractive features of simulation and emulation into wide-area testbed experimentation? • Towards an answer: • Which services would be useful? • Outline design of a set of interesting services.
Useful Services • Canonical target testbed: PlanetLab • What services would we like to bridge over? • Abstract: repeatability, representativeness • Concrete: • specify parameters of an experiment just like in ns • locate one or more sub-topologies matching specification • run experiment • monitor it while running (“measurement blackboard”) • put it all in a cron job
Embedding Services • Topology specification • Testbed characterization • Relevant parameters unknown, but measurable • Embedding discovery • Automatically find one or more embeddings of a specified topology • Synergistic relationships between the above services: • Existing measurements guide discovery. • Discovery feeds back into the measurement process.
Emulab/Netbed • In the emulation world, Emulab and Netbed researchers have worked extensively on related problems [OSDI ’02, HotNets-I, CCR ’03] • Rich experimental specification language. • Optimization-based solver to map desired topology onto Netbed to: • balance load across Netbed processors • minimize inter-switch bandwidth • minimize interference between experiments • incorporate wide-area constraints
Wide-area challenges • Conditions change continuously on wide-area testbeds - “Measure twice, embed once”. • The space of possible embeddings is very large; finding feasible ones is the challenge. • We argue for a constraint satisfaction approach rather than optimization-based. • Pros and cons upcoming.
Specifying Topologies • N nodes in the testbed, k nodes in the specification • k × k constraint matrix C = {c_{i,j}} • Entry c_{i,j} constrains the end-to-end path between the embeddings of virtual nodes i and j. • For example, place bounds on RTTs: c_{i,j} = [l_{i,j}, h_{i,j}] represents lower and upper bounds on the target RTT. • Constraints can be multi-dimensional. • Constraints can also be placed on nodes. • More complex specifications possible...
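To make this concrete, here is a minimal Python sketch of such a constraint-matrix specification; the names TopologySpec and add_rtt_bound are illustrative inventions, not an API from the talk.

```python
INF = float("inf")

class TopologySpec:
    """Hypothetical k-node specification with pairwise RTT bounds."""
    def __init__(self, k):
        self.k = k
        # Unconstrained by default: any RTT in [0, inf) is acceptable.
        self.c = [[(0.0, INF)] * k for _ in range(k)]

    def add_rtt_bound(self, i, j, lo, hi):
        """Constrain the RTT between virtual nodes i and j to [lo, hi] ms."""
        self.c[i][j] = self.c[j][i] = (lo, hi)  # RTT bounds are symmetric

# A 3-node specification: two nearby nodes and one distant node.
spec = TopologySpec(3)
spec.add_rtt_bound(0, 1, 1.0, 10.0)    # nodes 0 and 1 within 1-10 ms
spec.add_rtt_bound(0, 2, 90.0, 100.0)  # node 2 is 90-100 ms from node 0
```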
Feasible Embeddings • Def’n: A feasible embedding is a mapping f such that for all i, j where f(i) = x and f(j) = y: l_{i,j} ≤ d(x, y) ≤ h_{i,j} • Do not need to know d(x, y) exactly, only that l_{i,j} ≤ l'(x, y) ≤ d(x, y) ≤ h'(x, y) ≤ h_{i,j} • Key point: Testbed need not be exhaustively characterized, only sufficiently well to embed.
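The definition translates directly into a check over measured bounds. A sketch, assuming a spec object like the one above and matrices lo_meas/hi_meas holding the measured bounds l'(x, y) and h'(x, y):

```python
def is_feasible(f, spec, lo_meas, hi_meas):
    """Return True if mapping f (virtual node i -> testbed node f[i])
    provably satisfies every pairwise constraint: it suffices that
    l_ij <= lo_meas[x][y] and hi_meas[x][y] <= h_ij for all i, j."""
    for i in range(spec.k):
        for j in range(i + 1, spec.k):
            lo, hi = spec.c[i][j]
            x, y = f[i], f[j]
            if lo_meas[x][y] < lo or hi_meas[x][y] > hi:
                return False
    return True
```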
Why Constraint-Based? • Simplicity: binary yes-no answer. • Allows sampling from feasible embeddings. • Admits a parsimonious set of measurements to locate a feasible embedding. • For infeasible set of constraints, hints for relaxing constraints can be provided. • Optimization approaches depend crucially on user’s setting of weights.
Hardness • Finding an embedding is as hard as subgraph isomorphism (NP-complete). • Counting or sampling from the set of feasible embeddings is #P-complete. • Approximation algorithms are not much better. • Uh-oh...
Our Approach • Brute-force search. • We’re not kidding. • The situation is not as dire as it sounds: • Several methods for pruning the search tree. • Adaptive measurements. • Many problem instances are far from the boundary between solubility and insolubility. • Off-line searches scale up to thousands of nodes. • On-line searches scale up to hundreds of nodes.
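A minimal sketch of what this search might look like: plain backtracking over assignments of virtual nodes to testbed nodes, pruning any partial mapping that already violates a pairwise constraint (the structure is illustrative, not the authors' implementation).

```python
def find_embedding(spec, N, lo_meas, hi_meas):
    """Backtracking search for one feasible embedding of spec's k
    virtual nodes onto N testbed nodes, or None if none is found."""
    f = [None] * spec.k

    def consistent(i, x):
        # Check constraints between candidate x and all assigned nodes.
        for j in range(i):
            lo, hi = spec.c[i][j]
            y = f[j]
            if lo_meas[x][y] < lo or hi_meas[x][y] > hi:
                return False
        return True

    def extend(i):
        if i == spec.k:
            return list(f)          # all virtual nodes placed
        for x in range(N):
            if x not in f and consistent(i, x):
                f[i] = x
                found = extend(i + 1)
                if found is not None:
                    return found
                f[i] = None         # backtrack and prune this subtree
        return None

    return extend(0)
```

Replacing the early `return found` with an accumulator turns the same skeleton into an enumerator, which is what sampling from the set of feasible embeddings requires.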
Adaptive Measurements • Must we make O(N²) measurements? No. • Delay: coordinate-based [Cox et al (today)] • Loss: tomography-based [Chen et al ’03] • Underlays may make them for you [Nakao et al ’03] • In our setting: • We don’t always need exact values. • Pair-wise measurements are expensive. • How do we avoid measurements? • Interactions with search. • Inferences of unmeasured paths.
Triangle Inequality Inferences • Suppose constraints are on delays, and the triangle inequality holds. • [Figure: a triangle on nodes i, j, and k, with the constraint [10, 15] on edge (i, j) and [90, 100] on edge (j, k); the bound [75, 115] on (i, k) is inferred.] • Using APSP algorithms, can compute all upper & lower bounds.
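One way to implement these inferences is a Floyd-Warshall-style pass over interval bounds; the sketch below reproduces the figure's numbers (a single pass is shown; iterating to a fixed point tightens bounds further).

```python
def tighten_bounds(lo, hi):
    """Tighten delay intervals in place using the triangle inequality:
    hi[i][j] <= hi[i][m] + hi[m][j], and
    lo[i][j] >= max(lo[i][m] - hi[m][j], lo[m][j] - hi[i][m]),
    taken over every intermediate node m."""
    n = len(lo)
    for m in range(n):
        for i in range(n):
            for j in range(n):
                hi[i][j] = min(hi[i][j], hi[i][m] + hi[m][j])
                lo[i][j] = max(lo[i][j],
                               lo[i][m] - hi[m][j],
                               lo[m][j] - hi[i][m])

# The figure's example: (i, j) in [10, 15] and (j, k) in [90, 100].
INF = float("inf")
lo = [[0, 10,   0], [10,  0,  90], [0,    90, 0]]
hi = [[0, 15, INF], [15,  0, 100], [INF, 100, 0]]
tighten_bounds(lo, hi)
assert (lo[0][2], hi[0][2]) == (75, 115)  # inferred bound on (i, k)
```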
Experimental Setup • Starting from PlanetLab production node list, we removed any hosts… • not responding to pings • with full file systems • with CPU load over 2.0 (measured with uptime) • 118 hosts remaining • Used snapshot of pings between them
Finding Cliques

Clique size        2    3     4     5     6    7    8   9  10  11
0-10ms cliques   403  936  1475  1645  1327  771  315  86  14   1
1-10ms cliques   325  501   387   142    20    0    0   0   0   0

• Biggest clique of nodes within 10 ms: • Unique 11 node clique covering 6 institutions • If 1ms lower bound added: • Twenty 6 node cliques • 5 institutions always present, only 2 others
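A sketch of how such a census could be reproduced: build the graph whose edges are host pairs with RTT in the target band, then enumerate cliques. This leans on networkx's enumerate_all_cliques, and the rtt input format is an assumption; exhaustive enumeration is only practical because these threshold graphs are small and sparse.

```python
from collections import Counter
import networkx as nx

def clique_census(rtt, lo_ms, hi_ms):
    """Count cliques of each size among hosts whose pairwise RTT lies
    in (lo_ms, hi_ms]; rtt maps (host_a, host_b) -> RTT in ms."""
    G = nx.Graph()
    G.add_edges_from((a, b) for (a, b), d in rtt.items()
                     if lo_ms < d <= hi_ms)
    return Counter(len(c) for c in nx.enumerate_all_cliques(G)
                   if len(c) >= 2)

# e.g. clique_census(rtt, 0, 10) for the 0-10ms row and
#      clique_census(rtt, 1, 10) for the 1-10ms row.
```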
Finding Groups of Cliques • 1-10ms within same clique, 20-50ms otherwise

# of cliques \ clique size      2     3     4    5   6  7
1                             325   501   387  142  20  0
2                            6898  6238  1004    0   0  0
3                           12950     0     0    0   0  0
4                               0     0     0    0   0  0
Triangle Inequality in PlanetLab • In our PlanetLab snapshot, 4.4% of all triples i, j, k violate the triangle inequality • Consider a looser version of TI, e.g. d_{i,j} ≤ α (d_{i,k} + d_{k,j}) + β • There are fewer than 1% violations if • α = 1.15, β = 1 ms • α = 1.09, β = 5 ms
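A small sketch of how the violation rate can be computed from a delay matrix; the counting convention (unordered pairs i, j with a distinct intermediate k) is an assumption.

```python
from itertools import combinations

def ti_violation_rate(d, hosts, alpha=1.0, beta=0.0):
    """Fraction of triples with d[i][j] > alpha*(d[i][k] + d[k][j]) + beta.
    alpha = 1, beta = 0 gives the plain triangle inequality."""
    violations = total = 0
    for i, j in combinations(hosts, 2):
        for k in hosts:
            if k == i or k == j:
                continue
            total += 1
            if d[i][j] > alpha * (d[i][k] + d[k][j]) + beta:
                violations += 1
    return violations / total
```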
Inference Experiments • Metric: mean range (upper bound minus lower bound) over entries of the inference matrix • Compare measurement orderings • Random • Smallest Lower Bound First (greedy) • Largest Range First (greedy)
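The three orderings amount to different selection rules over the current bound matrix; a sketch, assuming bounds are kept as a dict mapping each pair to its inferred (lo, hi) interval:

```python
import random

def next_pair(bounds, strategy):
    """Pick the next pair to measure under one of the three orderings."""
    pairs = [p for p, (lo, hi) in bounds.items() if lo < hi]
    if strategy == "random":
        return random.choice(pairs)
    if strategy == "smallest_lower_bound":
        return min(pairs, key=lambda p: bounds[p][0])
    if strategy == "largest_range":
        return max(pairs, key=lambda p: bounds[p][1] - bounds[p][0])
    raise ValueError(strategy)
```

After each measurement, the chosen pair's interval collapses to the measured value and the inference pass propagates it, so a good ordering shrinks the mean range with fewer measurements.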
Inference Results • Random order performs poorly • Largest range first performs best
Future Work • Build a full system • More interesting constraints • Better search and pruning • Synergistic search and measurement • Integration with simulation/emulation tools • Other questions • What do search findings say about PlanetLab? • Can we plan deployment?