Challenges in Evaluating Distributed Algorithms
Idit Keidar, The Technion & MIT
FuDiCo, Bertinoro, June 2002
Distributed Algorithm Evaluation
• Aspects: performance, reliability, availability, …
• Techniques:
  • Theory: define models, metrics
  • Simulations: define model, metrics
  • Experimental: choose environment, patterns, metrics, …
Significance of Metrics/Models
• Conclusions depend on the choice of model/metric
• Algorithms designed to optimize specific metrics
• Metrics should be simple to work with
• Must make simplifying assumptions
Examples of Simplifying Assumptions
• Time complexity:
  • All messages take equal time
• Reliability:
  • Bounded potential number of failures (t of n), independent of system life span
• IID models:
  • Independent failures (e.g., message loss)
  • All failures equally likely
Example: Time Complexity
• Usual metric: number of synchronous rounds
  • Or asynchronous “steps”
• Underlying assumption:
  • All rounds / steps cost the same
Performance Evaluation of a Communication Round over the Internet
Omar Bakr & Idit Keidar
To appear in PODC 02
Communication Round
• Part of many distributed algorithms and systems
  • consensus, atomic commit, replication, group membership, …
• Hosts send data to each other
  • potentially to all connected hosts
• Common metric for evaluating algorithms: number of rounds
Questions
• What is the best way to implement a communication round over the Internet?
  • decentralized vs. centralized
• How long is a communication round over the Internet?
• Are all communication rounds the same?
Prediction is Hard
• The Internet is unpredictable, diverse, …
• Different answers for different topologies, different times
• Different performance metrics
  • local running time at a host
  • overall running time from first start to last finish
“Communication Round” Primitive
• Initiated by some host
• Propagates data from every host to every other host connected to it
• Example implementations (sketched below):
  • All-to-all
  • Leader
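To make the two patterns concrete, here is a minimal sketch (an assumed structure, not the paper's implementation) of an all-to-all round and a leader-based round initiated at the leader; the message counts it prints match the figures quoted later in the talk.

# Hypothetical sketch of the two example implementations; hosts are
# plain ids and each function returns the per-round message pattern
# as (sender, receiver) pairs.

def all_to_all(hosts, initiator):
    # Round 1: the initiator sends its data to every other host.
    # Round 2: every other host, once woken up, sends its data to all hosts.
    round1 = [(initiator, h) for h in hosts if h != initiator]
    round2 = [(s, r) for s in hosts if s != initiator for r in hosts if r != s]
    return [round1, round2]

def leader_based(hosts, leader):
    # Leader is also the initiator here: announce, collect, redistribute.
    others = [h for h in hosts if h != leader]
    return [
        [(leader, h) for h in others],  # round 1: announce / send own data
        [(h, leader) for h in others],  # round 2: everyone sends data to the leader
        [(leader, h) for h in others],  # round 3: leader redistributes all data
    ]

hosts = list(range(10))
print(sum(map(len, all_to_all(hosts, 0))))    # n(n-1) = 90 messages in 2 rounds
print(sum(map(len, leader_based(hosts, 0))))  # 3(n-1) = 27 messages in 3 rounds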
Experiment I
• 10 hosts: Taiwan, Korea, US academia, ISPs
• TCP/IP
• Algorithms:
  • All-to-all
  • Leader (initiator)
  • Secondary leader (not initiator)
• Periodically initiated at each host: 650 times over 3.5 days
Overall Running Time
• Elapsed time from initiation (at the initiator) until all hosts terminate
• Requires estimating clock differences
  • Clocks are not synchronized, and they drift
  • We compute the difference over short intervals
  • Computed 3 different ways (one common approach is sketched below)
  • Accuracy within 20 ms on 90% of runs
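For illustration only, a minimal sketch of one standard way to estimate a remote clock's offset from round-trip probes (the midpoint method); the names and the probe interface are assumptions, and the paper's three estimation methods are not necessarily this one.

# probe_remote_clock is a hypothetical callback that returns the remote
# host's current time (e.g., via one small RPC or UDP probe).
import time

def estimate_offset(probe_remote_clock, samples=10):
    """Return the estimated offset (remote minus local, in seconds),
    keeping the sample with the smallest round-trip time so the error
    is bounded by half of that round trip."""
    best_rtt, best_offset = None, None
    for _ in range(samples):
        t_send = time.time()
        t_remote = probe_remote_clock()
        t_recv = time.time()
        rtt = t_recv - t_send
        offset = t_remote - (t_send + t_recv) / 2.0
        if best_rtt is None or rtt < best_rtt:
            best_rtt, best_offset = rtt, offset
    return best_offset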
Teaser: Counting Rounds
• All-to-all: 2 rounds
• Leader: 3 rounds
• Secondary leader: 4 rounds
Counting Overall Running Time from MIT
• Ping-measured latencies (IP):
  • Longest link latency: 240 ms
  • Longest link to MIT: 150 ms
• Expected all-to-all time (2 rounds): 150 + 240 = 390 ms
• Expected leader time (3 rounds): 150 + 150 + 150 = 450 ms
  (see the sketch below)
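A back-of-the-envelope version of the same arithmetic, under the assumed reading that each round costs the latency of its slowest link: the 2-round all-to-all pays the MIT link once and the longest link once, while the 3-round leader pattern crosses a link touching MIT three times.

longest_link_ms = 240        # slowest link anywhere in the topology
longest_mit_link_ms = 150    # slowest link that touches MIT

# All-to-all from MIT, 2 rounds: MIT -> all, then all -> all.
all_to_all_ms = longest_mit_link_ms + longest_link_ms   # 390 ms
# Leader at MIT, 3 rounds: every round crosses a link touching MIT.
leader_ms = 3 * longest_mit_link_ms                      # 450 ms

print(all_to_all_ms, leader_ms)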
Measured Running Times: Runs Initiated at MIT (milliseconds)

                             All-to-all       Leader        Sec. Leader
                           overall  local  overall  local  overall  local
Average (runs under 2 sec)     811    295      541    335      585    408
% runs over 2 seconds          55%     3%      13%     6%       9%     3%
What’s Going On?
• Very high loss rates on two links
  • 42% and 37%
  • Taiwan to two US ISPs
• Loss rates on other links: up to 8%
• Upon loss, TCP’s timeout is big
  • More than the round-trip time (rough model below)
• All-to-all sends messages on lossy links
  • Often delayed by loss
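A rough, assumed model of why loss hurts so much: a lost packet is resent only after TCP's retransmission timeout (RTO), which exceeds the RTT. The 3-second RTO below is an assumption, and exponential backoff and fast retransmit are ignored.

def expected_latency_ms(rtt_ms, loss, rto_ms=3000):
    # Expected number of timeouts before the message gets through
    # (geometric retransmissions), each costing one RTO.
    expected_timeouts = loss / (1.0 - loss)
    return rtt_ms + expected_timeouts * rto_ms

print(expected_latency_ms(240, 0.42))  # ~2400 ms on a 42%-loss link
print(expected_latency_ms(240, 0.08))  # ~500 ms on an 8%-loss link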
Measured Running Times: Runs Initiated at Taiwan (milliseconds)

                             All-to-all       Leader        Sec. Leader
                           overall  local  overall  local  overall  local
Average (runs under 2 sec)     866    645     1120    844      679    607
% runs over 2 seconds          54%    24%      64%    43%      13%     7%
Experiment II: Removing Taiwan
• Overall running times are much better
  • For every initiator and algorithm, fewer than 10% of runs exceed 2 seconds
• All-to-all overall still worse than the others!
  • Either Leader or Secondary Leader is best, depending on the initiator
  • Loss rates of 2%-8% are not negligible
  • All-to-all sends O(n²) messages, so it suffers
• But all-to-all has the best local running times
Probability of Delay due to Loss
• If all links had the same latency
  • assume 1% loss on all links; 10 hosts (n = 10)
  • Leader sends 3(n-1) = 27 messages
    • probability of at least one loss: 1 - 0.99^27 ≈ 24%
  • All-to-all sends n(n-1) = 90 messages
    • probability of at least one loss: 1 - 0.99^90 ≈ 60%
  (spelled out in the snippet below)
• In reality, links don’t have the same latency
  • only loss on long links matters
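The same numbers, spelled out:

# With 1% independent loss per message, the probability that at least
# one message of the round is lost.
n, loss = 10, 0.01

leader_msgs = 3 * (n - 1)       # 27 messages
all_to_all_msgs = n * (n - 1)   # 90 messages

print(1 - (1 - loss) ** leader_msgs)      # ~0.24
print(1 - (1 - loss) ** all_to_all_msgs)  # ~0.60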
Experiment Conclusions
• Message loss causes high variation in TCP link latencies
  • latency distribution has high variance, heavy tail
• Latency distribution determines the expected cost of sending O(n) concurrent messages (simulated below)
  • Constant distribution -> constant
  • Exponential distribution -> log(n) [RS94]
  • Worse for heavier tails
• Secondary leader helps
  • No triangle inequality: relaying through another host can be faster
• Different conclusions for overall vs. local running times
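A minimal simulation sketch of that observation: the round ends when the slowest of n concurrent messages arrives, so its expected cost is the expected maximum of n draws from the per-message latency distribution. The distributions and parameters here are assumed, not measurements from the experiment.

import random

def expected_max(sample, n, trials=20000):
    return sum(max(sample() for _ in range(n)) for _ in range(trials)) / trials

for n in (10, 100):
    const = expected_max(lambda: 1.0, n)                        # constant: stays 1
    expo = expected_max(lambda: random.expovariate(1.0), n)     # grows like ln(n)
    heavy = expected_max(lambda: random.paretovariate(2.0), n)  # heavy tail: grows faster
    print(n, round(const, 2), round(expo, 2), round(heavy, 2))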
And Now, Back to Theory
• Number of rounds is not a sufficient metric
  • E.g., one-to-all and all-to-all rounds have different costs
• Refine the metric:
  • Say which “kind” of rounds
• Lower bound for asynchronous consensus: 2 rounds, but of which kind?
• Paxos does it in (illustrated below):
  • 1 transmission from one host to a quorum, plus
  • 1 transmission from a quorum to all
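A tiny illustration of why the two kinds of rounds differ in cost (the latencies are made up): a transmission to all waits for the slowest host, while a transmission to a quorum waits only for the fastest majority.

latencies_ms = [20, 35, 60, 80, 150, 240]   # hypothetical latencies to the other hosts

to_all_ms = max(latencies_ms)                     # 240 ms: bounded by the slowest host
quorum = len(latencies_ms) // 2 + 1               # majority of the receivers
to_quorum_ms = sorted(latencies_ms)[quorum - 1]   # 80 ms: bounded by the quorum-th fastest

print(to_all_ms, to_quorum_ms)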
Dalgeval: Distributed Algorithm Evaluation
• Goal: develop realistic ways to evaluate distributed algorithms
• Gather data on failures and performance in various environments
• Use the data to create models for prediction
  • theoretical analysis and simulations
• Experiment to validate predictions
• Design algorithms to optimize for the correct metrics