Evaluation of a Novel Two-Step Server Selection Metric
Presented by Karthik Lakshminarayanan
11-26-2003

Problem statement
• Goal: the client wants to download content from the best of k servers, i.e. minimize the total time to transfer a document
• Issues to consider:
  • Cost of choosing the target server
    • Lightweight mechanisms are preferable
  • Stability of the ordering (over a period of time)
    • More effort can be spent on selection if stability is high
  • Nature of the content and corresponding workloads
    • Frequency of downloads and size of documents

Outline
• Problem statement
• Proposed algorithm
• Existing/possible approaches
• Methodology
• Results

“Novel” two-step server selection
• Pick the k best servers out of the entire set by using pings (k ~ 5)
  • Retain the subset for a period of n days
• Choose servers from the subset of k servers
  • Choose from this subset randomly
  • Can choose from the subset based on other metrics
• Call this Ping-twostep for convenience
• Main delay is due to network delays, not server load

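The two steps on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the class name `TwoStepSelector`, its parameters, and the mirror names are hypothetical, and the ping RTTs are assumed to be measured elsewhere and passed in as a dictionary.

```python
import random


class TwoStepSelector:
    """Hypothetical sketch of Ping-twostep: ping-ranked subset, then random choice."""

    def __init__(self, k=5, retain_days=1.0, rng=None):
        self.k = k                             # size of the retained subset (k ~ 5)
        self.retain_s = retain_days * 86400    # how long the subset is kept (n days)
        self.rng = rng or random.Random()
        self.subset = []
        self.picked_at = None

    def needs_refresh(self, now):
        """True if no subset exists yet or the retention period has elapsed."""
        return self.picked_at is None or now - self.picked_at >= self.retain_s

    def refresh(self, ping_ms, now):
        """Step 1: keep the k servers with the lowest ping RTT."""
        self.subset = sorted(ping_ms, key=ping_ms.get)[: self.k]
        self.picked_at = now

    def choose(self):
        """Step 2: pick uniformly at random from the retained subset."""
        return self.rng.choice(self.subset)


# Made-up RTTs in milliseconds for illustration only.
ping_ms = {"mirror-a": 12.0, "mirror-b": 80.5, "mirror-c": 33.1, "mirror-d": 150.0}
selector = TwoStepSelector(k=2, retain_days=1.0, rng=random.Random(0))
if selector.needs_refresh(now=0):
    selector.refresh(ping_ms, now=0)
server = selector.choose()  # one of the two lowest-RTT mirrors
```

Passing `now` explicitly keeps the sketch deterministic; a real client would presumably use a clock and its own ping measurements.
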
Selection metrics
• Dynamic metrics (adapt to network conditions)
  • Ping
  • Transfer of small files
  • Ping-twostep
• Static metrics (oblivious to network conditions)
  • Number of hops
  • Number of AS hops
  • Random
Summary: Ping-twostep performs best!

Methodology
• Six client machines (USC, UNC, UCSC, UMass, UDel, Purdue)
• 193 servers in the tucows.com mirror network
• Collected info continuously for 41 days
• Each “run” comprised
  • 5 ICMP pings
  • Traceroute
  • Transfer times of files from 10 KB to 1 MB
• More extensive set of servers than previous work

Comparison – Ping metric
• RTT is not always an indication of transfer time
  • Not surprising!
• Some oddities experienced with
  • UNC
  • Purdue
• Relative positions of ping and the 10 KB transfer (10k) vary across nodes
• Do not care about the low end of the bandwidth spectrum!

Comparison – Small file transfers
• Improves with the size of the transfer
• Low correlation between the time for small transfers and the time for large transfers

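One way to quantify how well small-transfer times predict large-transfer times is a rank correlation over the same set of servers. The sketch below uses Spearman's coefficient as an assumption; the slides do not say which statistic was used. The helper names and the sample times are hypothetical, and ties are not handled.

```python
from math import sqrt


def ranks(values):
    """Return 1-based ranks of values (smallest gets rank 1); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Made-up per-server transfer times in seconds (same server order in both lists).
small_file_s = [0.10, 0.12, 0.15, 0.20, 0.22]
large_file_s = [9.0, 4.0, 12.0, 5.0, 7.0]
rho = spearman(small_file_s, large_file_s)
```

A value of `rho` near 1 would mean the small transfer orders servers the same way as the large one; a value near 0 matches the "low correlation" finding on the slide.
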
Comparison – Static selection
• Hop count
  • Mostly equivalent to random selection when used to estimate transfer time
  • Little correlation (restricted to USA and Canada)
• AS hop count
  • Does not work well for them
  • Global IP-Anycast (GIA) uses this
    • Queried using BGP
    • Small hop counts miss many servers, large hop counts would result in too much traffic

Stability of server ranking
• 70-98% of changes in rank are between zero and ten for top servers
• Average servers experience much higher changes in rank
Rankings of top servers are stable

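The rank-change statistic on this slide can be illustrated as follows. This is a sketch under assumptions: each run is taken to be a dictionary of server to transfer time, ranks are computed within each run, and the function names are hypothetical rather than taken from the evaluated work.

```python
def rank_by_time(transfer_s):
    """Map each server to its rank in one run (0 = fastest transfer)."""
    ordered = sorted(transfer_s, key=transfer_s.get)
    return {server: rank for rank, server in enumerate(ordered)}


def rank_changes(run_a, run_b):
    """Absolute change in rank between two runs, for servers present in both."""
    ranks_a, ranks_b = rank_by_time(run_a), rank_by_time(run_b)
    return {s: abs(ranks_a[s] - ranks_b[s]) for s in ranks_a if s in ranks_b}


def fraction_within(changes, servers, limit=10):
    """Fraction of the given servers whose rank moved by at most `limit`."""
    values = [changes[s] for s in servers if s in changes]
    if not values:
        raise ValueError("none of the given servers appear in both runs")
    return sum(v <= limit for v in values) / len(values)


# Made-up transfer times in seconds for two consecutive runs.
run_1 = {"a": 1.0, "b": 2.0, "c": 3.0}
run_2 = {"a": 1.1, "b": 3.5, "c": 2.0}
changes = rank_changes(run_1, run_2)
stable_share = fraction_within(changes, ["a", "b", "c"], limit=0)
```

Calling `fraction_within` once for the top-ranked servers and once for mid-ranked servers would give the kind of comparison the slide summarizes.
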
Stability of server transfer times
• Consider different sizes of subsets of the 193 hosts
  • Number of top servers in an n-subset is a small fraction of the size of the subset (<10%)
• Little overlap of top servers across clients
• Consider a subset of servers: how many of them were ever at the top in the 41-day period?
• Caveat: they consider only the “top” server

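The "ever at the top" count for an n-subset can be sketched like this. The code assumes every server in the subset has a transfer time in every run, and all names and sample values are hypothetical.

```python
import random


def top_server(transfer_s, subset):
    """Fastest server within the subset for a single run."""
    return min(subset, key=lambda server: transfer_s[server])


def ever_top(runs, subset):
    """Set of servers that were the fastest in the subset in at least one run."""
    return {top_server(run, subset) for run in runs}


def ever_top_fraction(runs, servers, n, rng):
    """Draw a random n-subset and return the fraction of it that was ever on top."""
    subset = rng.sample(servers, n)
    return len(ever_top(runs, subset)) / n


# Made-up transfer times in seconds for two runs.
runs = [
    {"a": 1.0, "b": 2.0, "c": 3.0},
    {"a": 2.0, "b": 1.0, "c": 3.0},
]
winners = ever_top(runs, ["a", "b", "c"])
fraction = ever_top_fraction(runs, ["a", "b", "c"], n=3, rng=random.Random(0))
```

As the slide's caveat notes, this only tracks the single fastest server per run, so a server that is consistently second is never counted.
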
Ping-random
• Motivation revisited:
  • Ping technique
    • Low overhead
    • Good performance
  • Top servers stable over time
• Choosing from the small subset:
  • Random – provides load-balance
  • Ping – use ping again among that set
  • Ping-best (for comparison)

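The three ways of choosing from the small subset could be compared with a helper like the one below. It is only a sketch: the function and policy names are hypothetical, and reading "Ping-best" as "the server in the subset with the lowest actual transfer time" is an assumption, since the slide lists it only as a comparison point.

```python
import random


def second_step(subset, policy, rng=None, fresh_ping_ms=None, transfer_s=None):
    """Choose one server from the ping-selected subset under a given policy."""
    if policy == "random":
        # Spreads load across the subset.
        return (rng or random.Random()).choice(subset)
    if policy == "ping":
        # Ping the subset again and take the lowest RTT.
        return min(subset, key=lambda s: fresh_ping_ms[s])
    if policy == "best":
        # Assumed meaning of Ping-best: lowest measured transfer time in the subset.
        return min(subset, key=lambda s: transfer_s[s])
    raise ValueError(f"unknown policy: {policy}")


# Made-up measurements for a subset of three mirrors.
subset = ["a", "b", "c"]
fresh_ping_ms = {"a": 30.0, "b": 12.0, "c": 25.0}
transfer_s = {"a": 4.0, "b": 6.0, "c": 3.5}

by_random = second_step(subset, "random", rng=random.Random(0))
by_ping = second_step(subset, "ping", fresh_ping_ms=fresh_ping_ms)
by_best = second_step(subset, "best", transfer_s=transfer_s)
```

In this made-up data the ping choice and the best choice differ, which is the situation where comparing the policies becomes interesting.
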
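Performance of Ping-Random
• Ping-ping >~ Ping-random > (10k, Ping): Ping-ping is comparable to or slightly better than Ping-random, and both beat the 10 KB transfer and plain ping
• Ping-ping might not load-balance well
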
Effect of size of ping sets
• Performance is influenced greatly by the size of the ping sets chosen
• 40% of servers ever ranked first were in 20% of the pings

Effects of selection algorithms
• Load-balancing
  • Different clients have different top servers
• Oscillations
  • Respond to changing network conditions
  • “Fortunately, it is unlikely that many clients would be running tests at the same time”
• No quantitative results!

Discussion
• How do we use this in practice?
  • Useful for large file transfers
  • What about small web transfers?
    • GNP, GeoPing approaches might work
  • Set of servers is static?
• How can DHTs help in anycast?
  • DOLR network for proximity
  • Embed location information in IDs
  • Use longest-prefix matching tricks (like i3)