Evaluation of the Proximity between Web Clients and their Local DNS Servers
Z. Morley Mao, UC Berkeley (zmao@eecs.berkeley.edu)
Chuck Cranor, Fred Douglis, Michael Rabinovich, Oliver Spatscheck, and Jia Wang, AT&T Labs--Research
Motivation
• Content Distribution Networks (CDNs)
  • Try to deliver content from servers close to users
• Current server selection mechanisms
  • Use the Domain Name System (DNS)
  • Assume that clients are close to their local DNS servers (the “originator problem”)
• Verify the assumption that clients are close to their local DNS servers
Measurement setup
• Three components
  • A 1x1-pixel transparent GIF image embedded in the page
    • <img src=http://xxx.rd.example.com/tr.gif height=1 width=1>
  • A specialized authoritative DNS server
    • Allows hostnames to be wild-carded
  • An HTTP redirector
    • Always responds with “302 Moved Temporarily”
    • Redirects to a URL with the client IP address embedded
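The redirector step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the zone `cs.example.com` and the `IP10-0-0-1` label format are taken from the example hostnames on the slides, and only IPv4 clients are assumed.

```python
import ipaddress

# Zone served by the wildcard name server; taken from the slide's example.
CONTENT_ZONE = "cs.example.com"


def redirect_location(client_ip: str, path: str = "/tr.gif") -> str:
    """Build the redirect target URL with the client IP embedded in the hostname."""
    addr = ipaddress.IPv4Address(client_ip)  # raises ValueError on malformed input
    label = "IP" + str(addr).replace(".", "-")
    return f"http://{label}.{CONTENT_ZONE}{path}"


def redirect_response(client_ip: str) -> str:
    """Return the raw HTTP response a redirector of this kind might send."""
    return (
        "HTTP/1.1 302 Moved Temporarily\r\n"
        f"Location: {redirect_location(client_ip)}\r\n"
        "Content-Length: 0\r\n"
        "\r\n"
    )
```

Because every client is sent to a distinct hostname, the name lookup cannot be answered from a shared cache entry and has to reach the authoritative name server, which is what lets the two sides be matched.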
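Embedded image request sequence
Participants: client [10.0.0.1]; redirector for xxx.rd.example.com; local DNS server; name server for *.cs.example.com; content server for the image
1. Client to redirector: HTTP GET request for the image
2. Redirector to client: HTTP redirect to IP10-0-0-1.cs.example.com
3. Client to local DNS server: request to resolve IP10-0-0-1.cs.example.com
4. Local DNS server to name server: request to resolve IP10-0-0-1.cs.example.com
5. Name server to local DNS server: reply with the IP address of the content server
6. Local DNS server to client: reply with the content server IP address
7. Client to content server: HTTP GET request for the image
8. Content server to client: HTTP response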
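On the name server side, step 4 is where the association becomes visible: the queried name carries the client IP and the query's source address is the local DNS server. A hedged sketch of that pairing is below; the function name and the regular expression are assumptions based only on the hostname format shown in the sequence.

```python
import ipaddress
import re
from typing import Optional, Tuple

# Matches names such as IP10-0-0-1.cs.example.com (format from the slides).
_NAME_RE = re.compile(
    r"^IP(\d{1,3})-(\d{1,3})-(\d{1,3})-(\d{1,3})\.cs\.example\.com\.?$",
    re.IGNORECASE,
)


def association(query_name: str, ldns_ip: str) -> Optional[Tuple[str, str]]:
    """Return (client_ip, ldns_ip) for a query, or None if the name does not match."""
    match = _NAME_RE.match(query_name)
    if match is None:
        return None
    try:
        client = ipaddress.IPv4Address(".".join(match.groups()))
    except ValueError:
        return None  # label looked right but is not a valid IPv4 address
    return (str(client), ldns_ip)
```

For example, `association("IP10-0-0-1.cs.example.com", "192.0.2.53")` pairs client 10.0.0.1 with the LDNS at 192.0.2.53 (a documentation address used here purely for illustration).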
Measurement impact
• Image (43 bytes) embedded at the end of the page, so it is requested last
• Keynote measurement
[Figure: average download latency (sec)]
Proximity metrics: 1. AS clustering, 2. network clustering
• AS clustering
  • Observes whether the client and LDNS belong to the same AS
• Network clustering
  • Network clusters are based on BGP routing information, using longest prefix match
  • Observes whether the client and LDNS belong to the same network cluster
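Both checks can be illustrated with a small longest-prefix-match lookup. The sketch below is only an illustration: the prefix table, its AS numbers (from the documentation range), and the function names are made up, and a real study would load prefixes from BGP routing table dumps and use a trie rather than a linear scan.

```python
import ipaddress
from typing import Dict, Optional

# Hypothetical routing table: prefix -> origin AS number.
TABLE: Dict[ipaddress.IPv4Network, int] = {
    ipaddress.ip_network("10.0.0.0/8"): 64496,
    ipaddress.ip_network("10.1.0.0/16"): 64496,
    ipaddress.ip_network("192.0.2.0/24"): 64497,
}


def longest_prefix(ip: str, table) -> Optional[ipaddress.IPv4Network]:
    """Return the most specific prefix in the table that contains ip, if any."""
    addr = ipaddress.ip_address(ip)
    best = None
    for net in table:
        if addr in net and (best is None or net.prefixlen > best.prefixlen):
            best = net
    return best


def same_cluster(client: str, ldns: str, table) -> bool:
    """Network clustering: both addresses match the same longest prefix."""
    c, l = longest_prefix(client, table), longest_prefix(ldns, table)
    return c is not None and c == l


def same_as(client: str, ldns: str, table) -> bool:
    """AS clustering: the matched prefixes originate from the same AS."""
    c, l = longest_prefix(client, table), longest_prefix(ldns, table)
    return c is not None and l is not None and table[c] == table[l]
```

With this table, 10.1.2.3 and 10.2.0.53 fall in the same AS but in different network clusters, which shows why the AS check is the coarser of the two.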
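Proximity metric: 3. traceroute divergence
• Use the last point of divergence
• Traceroute divergence: Max(3,4)=4
[Figure: paths from a probe machine to the client and to the local DNS server; after the last point of divergence the two paths (labeled a and b) run 3 and 4 hops]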
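The Max(3,4)=4 example can be reproduced with a short sketch. It assumes each traceroute is given as a list of hop identifiers ending at the destination, and that the hop count after the last shared hop includes the destination itself; the function name and hop labels are hypothetical.

```python
from typing import Sequence, Tuple


def divergence(path_to_client: Sequence[str], path_to_ldns: Sequence[str]) -> Tuple[int, int]:
    """Return (common_hops, divergence) for two traceroutes from the same probe.

    common_hops is the length of the shared leading portion of both paths;
    divergence is the longer of the two remaining path lengths.
    """
    common = 0
    for hop_a, hop_b in zip(path_to_client, path_to_ldns):
        if hop_a != hop_b:
            break
        common += 1
    tail_client = len(path_to_client) - common
    tail_ldns = len(path_to_ldns) - common
    return common, max(tail_client, tail_ldns)


# Two shared hops, then 3 hops to the client and 4 hops to the LDNS.
to_client = ["r1", "r2", "c1", "c2", "client"]
to_ldns = ["r1", "r2", "d1", "d2", "d3", "ldns"]
common_hops, div = divergence(to_client, to_ldns)  # div is max(3, 4)
```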
Proximity metric: 4. round-trip time correlation
• Correlation between message round-trip times from a probe site to the client and to its LDNS server
• The probe site represents a potential cache server location
• A crude metric, highly dependent on the probe site
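One way to compute such a correlation is the Pearson coefficient over paired measurements. The slides do not say which coefficient was used, so this is an assumption; the sketch also assumes each client-LDNS pair contributes one (RTT to client, RTT to LDNS) point measured from the same probe site.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between two equally long series of RTTs."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)
```

A value near 1 means that, seen from this probe site, clients with a high RTT tend to have an LDNS with a high RTT as well; a different probe site can give a very different value, which matches the caveat on the slide.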
Aggregate statistics of AS/network clustering
• About 12,000 ASes
  • Close to 80% of all ASes observed
• 440,000 unique prefixes
  • 25% of all possible network clusters
Proximity analysis results: AS, network clustering
• AS clustering: coarse-grained
• Network clustering: fine-grained
• Most clients are not in the same routing entity as their LDNS
• Clients with an LDNS in the same cluster are slightly more active
Proximity analysis results: traceroute divergence
• Probe sites: NJ (UUNET), NJ (AT&T), Berkeley (calren), Columbus (calren)
• Sampled from the top half of busy network clusters
• Median divergence: 4
• Mean divergence: 5.8-6.2
• Ratio of common to disjoint path length
  • 72%-80% of the pairs traced have a common path at least as long as the disjoint path
Improved local DNS configuration
• For client-LDNS associations not in the same cluster, does an LDNS exist in the client’s cluster?
[Figure or table: results broken down by client IPs and by HTTP requests]
Clients using multiple LDNS
• A single client IP can be associated with multiple LDNS
  • First LDNS times out, second is contacted
  • LDNS assigned dynamically through a DHCP server
  • LDNS configuration with multiple IPs
  • Client IP reused by different users
  • Client IP is the address of a NAT or proxy
  • Misconfiguration
• The majority of clients (78%) are associated with a single LDNS
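Given a list of observed (client IP, LDNS IP) associations, the share of clients tied to a single LDNS can be computed as in the sketch below. The function names and the input format are assumptions for illustration, not the authors' tooling.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def ldns_per_client(associations: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Count the distinct LDNS addresses seen for each client IP."""
    seen = defaultdict(set)
    for client_ip, ldns_ip in associations:
        seen[client_ip].add(ldns_ip)
    return {client_ip: len(servers) for client_ip, servers in seen.items()}


def single_ldns_fraction(associations: Iterable[Tuple[str, str]]) -> float:
    """Fraction of client IPs associated with exactly one LDNS."""
    counts = ldns_per_client(associations)
    if not counts:
        return 0.0
    return sum(1 for n in counts.values() if n == 1) / len(counts)
```

Sorting the same counts in descending order surfaces the client IPs with unusually many LDNS, which are typically proxies or NAT addresses.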
Client IPs using a large number of LDNSs
• Common domain names (30-241 LDNS):
  *.MIL, apnc*, *bbnplanet.com, *hsacorp.net, *webcache.rcn.net, cache*.webcache.rcn.net, cache0*.proxy.aol.com, cache.brightok.net, cache*.ruh.isu.net.sa, *.onenet.net, hh*.direcpc.com, cob-cache.r.state.mn.us, mango.arctic.net, netcache.net.ca.gov, proxy.*.netsetter.com, *.nortelnetworks.com, rad.afonline.net, *.prserv.net, *.cisco.com, ss*.co.us.ibm.com, thing5.csc.com, *.wwwcache.ja.net
Example client IP using a large number of LDNSs
• Client: 216.34.56.12 (proxy.sjc.netsetter.com)
  • Using 241 LDNS
  • 753 requests
• Belongs to marketscore.com
  • Offers a free browser plug-in for web acceleration
  • Using the client’s LDNS to do name resolution on behalf of the client?
• HTTP headers
  • Via header: NetCache Network Appliance
  • X-forwarded-for: 10.104.1.115, 10.104.1.31
  • Client-ip: client IP address (dialup customers)
Impact on commercial CDNs
• Impact on server selection accuracy
• Look for clients
  • Whose LDNS responds to queries
  • With a cache server in the client’s cluster
• Are they directed to a cache server in a different cluster (“misdirected”)?
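The selection and the "misdirected" test can be written as a small predicate. This is a sketch under stated assumptions: cluster identifiers are opaque strings, the set of clusters containing a CDN cache is known in advance, and the function and parameter names are hypothetical.

```python
from typing import Optional, Set


def is_misdirected(
    client_cluster: str,
    ldns_responds: bool,
    cache_clusters: Set[str],
    returned_cache_cluster: str,
) -> Optional[bool]:
    """Classify one client for the CDN accuracy check.

    Returns None when the client is outside the studied population (its LDNS
    does not answer queries, or no cache server sits in its cluster), True
    when the CDN returned a cache in a different cluster, and False otherwise.
    """
    if not ldns_responds or client_cluster not in cache_clusters:
        return None
    return returned_cache_cluster != client_cluster
```

Returning None for clients outside the population keeps them out of both the numerator and the denominator when the misdirection rate is computed.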
Why choose a cache in a different cluster?
• Even when both client and LDNS are in the same cluster?
• Possible reasons
  • Load-balancing algorithms using different metrics, e.g., network access costs
  • Caches are different
  • Clustering too coarse-grained
  • CDN mapping inaccuracies?
Lessons from study of commercial CDNs
• AS hop count is a bad metric for closeness evaluation
  • Too coarse-grained
  • It may be better to choose a geographically closer cache server in a different AS
• For load balancing and fault tolerance, CDNs sometimes return cache servers in two different ASes
Related work
• Measurement methodology
  • IBM (Shaikh et al.)
    • Time correlation of DNS and HTTP requests from DNS and Web server logs
  • Boston University (Bestavros et al.)
    • Assigning multiple IP addresses to a Web server
  • Differences from our work
    • Our methodology: efficient, accurate, nonintrusive
  • Web bugs
• Proximity metrics
  • Cisco’s Boomerang protocol: uses latency from cache servers to the LDNS
Conclusion
• Novel technique for finding client and local DNS associations
  • Fast, non-intrusive, and accurate
• DNS-based server selection works well for coarse-grained load balancing
  • 64% of associations in the same AS
  • 16% of associations in the same NAC (network-aware cluster)
• Server selection can be inaccurate if server density is high