Content Distribution Network (CDN) Performance Punit Shah (pshah@cse.ogi.edu) CSE581 Internet Technologies OGI, OHSU 2002, Jan 16th
Papers • CDN, CDN Performance • The measured performance of CDNs. • On the use and performance of CDNs. • Analytical model for CDN performance in multi-level caching. • Web caching and content distribution: A view from the interior. CSE581, Winter 2002 | Punit Shah (pshah@cse.ogi.edu)
What is a CDN? • A CDN is a means of offloading some or all of the content delivery burden (mainly static content) from the origin server. A replica server that delivers content on behalf of the origin server is called a CDN server. • Aimed to address … • Client-perceived latency (e.g. in web browsers). • Capacity management of the server. • Caching as a side effect.
Request Redirection • There are primarily two ways to redirect requests to the CDN servers. • DNS redirection: the authoritative DNS server is controlled by the CDN infrastructure. It distributes load across the CDN servers according to some policy (e.g. round-robin, least-loaded CDN server, geographical distance) by choosing which IP address to return in the DNS reply. • URL rewriting: the main page still comes from the origin server, but the URLs of embedded objects (e.g. images, clips) are rewritten to point to one of the CDN servers. Some vendors rewrite using a hostname and some use an IP address directly. • Some vendors employ a combination of these two methods. • Finding the nearest CDN server (in terms of latency) is not simple.
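The two DNS redirection policies named on this slide (round-robin and least-loaded) can be sketched as below. This is only an illustration: the server pool, the load figures, and the function names are assumptions and do not come from any vendor's implementation.

```python
from itertools import cycle

# Hypothetical CDN server pool: IP address -> current load (e.g. open connections).
SERVERS = {"10.20.30.1": 12, "10.20.30.2": 3, "10.20.30.3": 7}

# Fixed rotation over the pool, used by the round-robin policy.
_rotation = cycle(sorted(SERVERS))


def resolve_round_robin():
    """Return the next server IP in a fixed rotation."""
    return next(_rotation)


def resolve_least_loaded(loads):
    """Return the IP of the server reporting the smallest load."""
    return min(loads, key=loads.get)


if __name__ == "__main__":
    print([resolve_round_robin() for _ in range(4)])
    print(resolve_least_loaded(SERVERS))
```

A CDN-controlled authoritative DNS server would call one of these for each lookup and return the chosen address with a short TTL, so that the decision can be revisited often.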
Full Site DNS redirection example [Diagram: the client looks up www.yahoo.com at the CDN-controlled DNS server and is given 10.20.30.1 (not 111.222.100.1, the origin server) as the IP for yahoo.com; it then sends GET index.html to 10.20.30.1 and receives <HTML> … </HTML>. Other addresses shown in the diagram: 10.20.30.2, 10.20.30.3, 10.20.30.4.] Vendors: Adero (Full), Akamai and Digital Island (Partial)
Partial DNS redirect/URL rewriting example index.html <HTML> <BODY> <A HREF="/about_us.html"> About Us </A> <IMG SRC="http://www.clearway1.net/www.yahoo.com/img1.gif"> <IMG SRC="http://www.clearway2.net/www.yahoo.com/img2.gif"> <IMG SRC="http://10.20.30.2/www.yahoo.com/img3.gif"> </BODY> </HTML> Vendors: Clearway (URL RW)
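A minimal sketch of the URL rewriting step shown on this slide: image URLs on the origin page are rewritten to point at CDN hosts, while ordinary links stay on the origin. The function name, the regular expression, the rotation over hosts, and the added `http://` scheme are assumptions made for illustration; the slide does not describe how any vendor actually performs the rewrite.

```python
import re
from itertools import cycle

# Matches IMG tags whose SRC is a site-relative path such as "/img1.gif".
_IMG_SRC = re.compile(r'(<IMG\s+SRC=")(/[^"]*)(")', re.IGNORECASE)


def rewrite_images(html, origin_host, cdn_hosts):
    """Point each site-relative IMG SRC at a CDN host, rotating through cdn_hosts."""
    hosts = cycle(cdn_hosts)

    def repl(match):
        cdn = next(hosts)
        # Keep the origin hostname in the path, as in the slide's example URLs.
        return f"{match.group(1)}http://{cdn}/{origin_host}{match.group(2)}{match.group(3)}"

    return _IMG_SRC.sub(repl, html)


if __name__ == "__main__":
    page = '<A HREF="/about_us.html">About Us</A> <IMG SRC="/img1.gif"> <IMG SRC="/img2.gif">'
    print(rewrite_images(page, "www.yahoo.com", ["www.clearway1.net", "www.clearway2.net"]))
```

Only the embedded objects move to the CDN; the `HREF` link is left untouched, so the main page is still fetched from the origin server.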
CDN performance elements • Client-perceived latency. • This is what most of the papers focus on, measuring as outsiders. • Load balancing among the CDN servers. • Number of requests offloaded from the origin server.
Analytical Model • Gadde et al. derive the CDN cacheable ratio as (Cni - Cnl)/(1 - Cnl) • where • Cni = CDN hit ratio for the client population of size ‘ni’ that forwards requests to this CDN server, for some fixed object ‘x’ • Cnl = cache hit ratio at the leaf node (e.g. a proxy) serving a client population of size ‘nl’
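The ratio on this slide can be evaluated directly. The sketch below is a plain transcription of (Cni - Cnl)/(1 - Cnl); the function name, the argument names, and the sample hit ratios are assumptions chosen for illustration and are not taken from the paper.

```python
def cdn_cacheable_ratio(c_ni, c_nl):
    """Evaluate (Cni - Cnl) / (1 - Cnl) for hit ratios given as fractions."""
    if not 0.0 <= c_ni <= 1.0:
        raise ValueError("c_ni must be a fraction between 0 and 1")
    if not 0.0 <= c_nl < 1.0:
        raise ValueError("c_nl must be a fraction in [0, 1)")
    # The denominator is the share of requests that miss at the leaf cache;
    # the numerator is the extra hit ratio the CDN adds beyond the leaf cache.
    return (c_ni - c_nl) / (1.0 - c_nl)


if __name__ == "__main__":
    # Illustrative values only: 50% hit ratio at the CDN, 25% at the leaf proxy.
    print(cdn_cacheable_ratio(0.5, 0.25))
```

With these sample values, a third of the requests that miss at the leaf cache are served by the CDN.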
Model Performance • More clients, lower CDN cache hit ratio. • If the number of clients increases further, the curve takes a bell shape, indicating cache ‘thrashing’. • The model was validated against the NLANR cache hierarchy at the ‘root’ level (treating all root-level caches as one unified cache), which had a 32% cache hit ratio in Oct 1999.
CDN Server Selection • The primary paper [Johnson et al.] focuses on how ‘good’ (good == minimal client latency) a CDN server is selected by Akamai and Digital Island. Both use partial-site DNS redirection. • Three distinct client locations in the US were used: two on the east coast and one in a western state. The clients ran different operating systems and had different Internet bandwidth. • Test Procedure • Determine the set of CDN servers (hostnames) used by the particular CDN. • Obtain the IP addresses of the CDN servers. • Identify a GIF file (3-4KB) and retrieve it from each of the CDN servers 25 times, recording the time taken. DNS lookup time is not a factor because IP addresses are used. This test was conducted at all three client sites. • Fetch the same GIF via the CDN server identified by contacting the origin server, recording the time taken. (Done with a modified gethostbyname() or a changed /etc/resolv.conf order? The TTL was quite small, tens of seconds.) These tests were also conducted at all three client sites for both vendors.
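The third step of the test procedure (fetch one small GIF from every CDN server IP 25 times and record the time) could be scripted roughly as follows. This is a sketch under stated assumptions, not the paper's tooling: the function names, the `Host` header value, and the use of the median as the summary are all hypothetical choices.

```python
import http.client
import statistics
import time


def http_fetch(ip, path, host="www.example.com"):
    """Hypothetical fetcher: plain HTTP GET by IP address, so no DNS lookup occurs."""
    conn = http.client.HTTPConnection(ip, timeout=10)
    try:
        conn.request("GET", path, headers={"Host": host})
        return conn.getresponse().read()
    finally:
        conn.close()


def time_fetches(fetch, server_ips, path, repeats=25):
    """Return {ip: median seconds} for fetching `path` from each IP `repeats` times."""
    results = {}
    for ip in server_ips:
        samples = []
        for _ in range(repeats):
            start = time.perf_counter()
            fetch(ip, path)
            samples.append(time.perf_counter() - start)
        results[ip] = statistics.median(samples)
    return results
```

Calling `time_fetches(http_fetch, ips, "/img1.gif")` at each client site would give one latency figure per CDN server; the server the CDN actually selects can then be ranked against that list.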
Results • Both vendors showed identical results. • The very best CDN server is not chosen at some locations. • Performance is highly location dependent: some locations performed much better than others, which reflects CDN server placement. • However, more than 90% of the time a reasonably good server for the particular location is chosen. • About 10% of the time, a random choice would have done better. • Conclusion: the CDN does not choose an optimal CDN server, but it avoids notably bad ones.
Another location [figure not recoverable]
Other Results • The focus is on comparing the Sep 2000 and Jan 2001 results. • CDN server selection test results are identical to what we saw earlier. • HTTP/1.1 results are better than HTTP/1.0 with parallel connections. HTTP/1.1 pipelined is faster than serial.
Load balancing and DNS Lookup Overhead • Until now we ignored DNS lookup time in order to focus on the quality of the CDN server chosen. • However, it is not an insignificant overhead, especially given the very small download times and TTLs, e.g. Adero 10 sec, Akamai and Digital Island 20 sec. For comparison, TTLs for non-CDN origin sites: cnn.com 15 min, espn.com 6 hours. • Bala et al. conducted a test to measure the DNS lookup overhead (and latency) introduced by the CDN load balancing mechanism. • Test procedure • Every 8 hours, store a (fixed) IP address for each CDN server. • During each 8-hour period, every 30 minutes, compare the newly returned IP with the stored (fixed) IP address. • Measure DNS lookup time and download time for the newly returned IP address. • Compare with the download time from the fixed IP address.
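The comparison at the end of this procedure can be summarised by bucketing each 30-minute sample. The sketch below assumes each sample is a tuple of (new IP, DNS lookup time, download time from the new IP, download time from the fixed IP); the tuple layout, the bucket names, and the `include_dns` switch are assumptions for illustration, not the paper's method.

```python
from collections import Counter


def classify_samples(fixed_ip, samples, include_dns=False):
    """Return the fraction of samples in each bucket.

    Each sample is (new_ip, dns_seconds, new_download_seconds, fixed_download_seconds).
    With include_dns=True, the DNS lookup time is charged to the new-IP download.
    """
    counts = Counter()
    for new_ip, dns_time, new_time, fixed_time in samples:
        cost = new_time + dns_time if include_dns else new_time
        if new_ip == fixed_ip:
            counts["same_ip"] += 1
        elif cost < fixed_time:
            counts["new_faster"] += 1
        elif cost > fixed_time:
            counts["new_slower"] += 1
        else:
            counts["equal"] += 1
    total = sum(counts.values())
    return {bucket: n / total for bucket, n in counts.items()} if total else {}


if __name__ == "__main__":
    # Made-up samples using documentation addresses; not measured data.
    demo = [
        ("192.0.2.1", 0.05, 0.10, 0.10),
        ("192.0.2.2", 0.05, 0.08, 0.10),
        ("192.0.2.3", 0.05, 0.20, 0.10),
        ("192.0.2.1", 0.05, 0.10, 0.10),
    ]
    print(classify_samples("192.0.2.1", demo))
    print(classify_samples("192.0.2.1", demo, include_dns=True))
```

Running it with and without `include_dns` shows how a download that is slightly faster from the new IP can still lose once the lookup time is added.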
Results • In Jan 2001, the new IP was the same as the fixed IP 15% (Fasttide) to 70% (Digital Island) of the time. • In those cases the new-IP download time is identical to the fixed-IP download time, but the DNS lookup overhead undermines overall performance. • 10% of the time, the download from the new IP address is faster, but again the DNS lookup … • 30-40% of the time (Akamai), the new download time is longer than from the fixed IP address; again the DNS lookup ... • New-IP download times are longer than fixed-IP download times. • Overall, redirection is not efficient.
Some Facts ... • CDNs are mainly used for image files (static content). • Content served by the CDNs is static in nature: only 0.3% of content changed for existing URLs, and at most 13% new URLs were introduced. • The studies are black-box performance tests, so there is no data about load balancing, only latency. • Large increase in CDN deployment between Nov 99 (only 1-2% of the top 670 sites) and Dec 2000 (25% of the popular sites). • Akamai seems to be the most popular CDN vendor. • Images are 96-98% of the CDN-served content, but only 40-46% of the CDN-served bytes. Is the rest dynamic content? • Cache hit rate for CDN-served images is 30-80%, versus only 25-60% for non-CDN-served images. • IP addresses need to be mapped to geography for better CDN server selection. • CDNs cannot be used for content that involves authentication etc.