Computing on the Edge: A Platform for Replicating Internet Applications Michael Rabinovich and Zhen Xiao AT&T Labs -- Research
Content Distribution Networks • Currently serve static pages, streaming content, applications • Can be served by client-side caches as well • Unique CDN value: • Control over content • Overload “insurance” • Distributing applications
(Some of the) Challenges • Consistency of application replicas • Mutual consistency of application components in a replica • Deciding on how many replicas to deploy and where • Deciding on how to distribute requests among replicas
ACDN Components • Replication framework • Dynamically install and uninstall applications based on demand • Maintain replica consistency • Content placement algorithms • Request distribution algorithms
Replication Framework • Inspired by work on software distribution (e.g., Marimba) • Metafile for each application containing: • A list of time-stamped files (data and executable files) • An initialization script (or a pointer to it)
FILE /home/applications/mapping/query_engine.cgi 1999.apr.14.08:46:12
FILE /home/applications/mapping/map_database 2000.oct.15.13:15:59
FILE /home/applications/mapping/user_preferences 2001.jan.30.18:00:05
SCRIPT
mkdir /home/applications/mapping/access_stats
setenv ACCESS_DIRECTORY /home/applications/mapping/access_stats
ENDSCRIPT
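To make the metafile layout concrete, here is a minimal Python sketch that reads the FILE and SCRIPT/ENDSCRIPT records shown on this slide. The function name `parse_metafile`, the error handling, and the assumption that paths contain no spaces are illustrative choices, not part of the ACDN implementation. Time stamps are kept as the strings written in the metafile.

```python
EXAMPLE_METAFILE = """\
FILE /home/applications/mapping/query_engine.cgi 1999.apr.14.08:46:12
FILE /home/applications/mapping/map_database 2000.oct.15.13:15:59
FILE /home/applications/mapping/user_preferences 2001.jan.30.18:00:05
SCRIPT
mkdir /home/applications/mapping/access_stats
setenv ACCESS_DIRECTORY /home/applications/mapping/access_stats
ENDSCRIPT
"""


def parse_metafile(text):
    """Return (files, script): files maps path -> time stamp string,
    script is the list of lines between SCRIPT and ENDSCRIPT."""
    files = {}
    script = []
    in_script = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines (assumption; the slide shows none)
        if in_script:
            if line == "ENDSCRIPT":
                in_script = False
            else:
                script.append(line)
        elif line == "SCRIPT":
            in_script = True
        elif line.startswith("FILE "):
            # Assumes exactly "FILE <path> <timestamp>" with no spaces in the path.
            _, path, stamp = line.split(None, 2)
            files[path] = stamp
        else:
            raise ValueError(f"unrecognized metafile line: {line!r}")
    if in_script:
        raise ValueError("SCRIPT block is missing ENDSCRIPT")
    return files, script


files, script = parse_metafile(EXAMPLE_METAFILE)
```

With the example above, `files` holds the three listed paths and `script` holds the `mkdir` and `setenv` lines.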
Application Metafile • A metafile is a simple static Web page • Having a metafile is sufficient to deploy the application • Having the current metafile is sufficient to bring the application replica up-to-date. • Consistency of application replicas = consistency of cached copies of the metafile
Framework Tasks • Creating a replica • Obtaining a metafile • Obtaining a tar file with all files listed in the metafile • Running the initialization script • Updating a replica • Obtaining the diff of metafiles • Obtaining files that are new • Deleting a replica • Retaining the deleted replica for some time to process residual requests before physical deletion
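The "updating a replica" step can be sketched as a comparison of two parsed metafiles. This is a hedged illustration only: `diff_metafiles` is a hypothetical helper, the inputs are assumed to be dictionaries mapping file path to time stamp string, any change in time stamp is treated as "new", and removing files that are no longer listed is an assumption the slide does not state.

```python
def diff_metafiles(old_files, new_files):
    """Compare two path -> time stamp mappings.

    Returns (to_fetch, to_remove):
      to_fetch  - paths that are absent locally or whose time stamp changed
      to_remove - paths no longer listed (assumed behavior, not from the slide)
    """
    to_fetch = sorted(
        path for path, stamp in new_files.items()
        if old_files.get(path) != stamp
    )
    to_remove = sorted(path for path in old_files if path not in new_files)
    return to_fetch, to_remove


# Usage: the replica holds `old`; the current metafile lists `new`.
old = {
    "/home/applications/mapping/query_engine.cgi": "1999.apr.14.08:46:12",
    "/home/applications/mapping/map_database": "2000.oct.15.13:15:59",
}
new = {
    "/home/applications/mapping/query_engine.cgi": "1999.apr.14.08:46:12",
    "/home/applications/mapping/map_database": "2001.jan.30.18:00:05",
    "/home/applications/mapping/user_preferences": "2001.jan.30.18:00:05",
}
to_fetch, to_remove = diff_metafiles(old, new)
```

Only the files in `to_fetch` need to be transferred, which is what keeps an update cheaper than creating a replica from scratch.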
Architecture Overview • Built on top of a standard Web server: Apache + (Fast) CGI • Uses standard HTTP throughout
Algorithms • Request distribution algorithm • Load balancing among servers that have the requested application • Proximity of servers to requesting clients • Content placement algorithm • Distributed decision - at each CDN server • Total load on the server • Percentage of requests from other regions • Prediction of load after replication or migration • Placement costs: bytes transferred during replication vs. bytes transferred during responses
Algorithmic Challenges • Convergence • Moving replicas around • Load oscillations • Responsiveness and stability • Distributed vs. Centralized Algorithms • Interplay between request distribution and content placement
Load Balancing Algorithm
Initial probabilities:
  Set prob(i) = 0 for all i
  Loop through the replicas in order of decreasing proximity
    if load(i) < LW
      prob(i) = 1.0
      exit
    else if LW <= load(i) < HW
      prob(i) = (HW - load(i)) / (HW - LW)
Adjustments to account for proximity:
  remainder = 1.0
  Loop through the replicas in order of decreasing proximity
    prob(i) = prob(i) * remainder
    remainder = remainder - prob(i)
Final probabilities:
  prob(i) = prob(i) / sum of all prob(j)
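The three passes above can be sketched in Python as follows. This is an illustrative transcription, not the authors' code: `request_probabilities` is a hypothetical name, LW and HW are presumably low and high load watermarks, replicas are assumed to be listed closest first, and the uniform fallback when every replica is at or above HW is an assumption, since the slide does not cover that case.

```python
def request_probabilities(loads, lw, hw):
    """loads: replica loads ordered closest first. Returns selection probabilities."""
    n = len(loads)
    if n == 0:
        return []
    prob = [0.0] * n

    # Pass 1: initial probabilities from load alone.
    for i, load in enumerate(loads):
        if load < lw:
            prob[i] = 1.0
            break  # a lightly loaded replica absorbs everything that reaches it
        elif load < hw:
            prob[i] = (hw - load) / (hw - lw)
        # load >= hw: prob stays 0

    # Pass 2: closer replicas get first claim on the probability mass.
    remainder = 1.0
    for i in range(n):
        prob[i] *= remainder
        remainder -= prob[i]

    # Pass 3: normalize so the probabilities sum to 1.
    total = sum(prob)
    if total == 0.0:
        # Assumption: every replica is at or above HW, so spread requests evenly.
        return [1.0 / n] * n
    return [p / total for p in prob]


# Usage: the closest replica is between the watermarks, the next one is lightly loaded.
probs = request_probabilities([90, 50, 10], lw=80, hw=100)
```

In the usage example the closest replica keeps half of the requests and the second replica takes the rest; the third is never reached because the loop exits at the first replica below LW.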
Content Placement Algorithm (run by each server)
For each app:
  If demand is below the Deletion Threshold: delete, unless this is the only replica
  If demand from another server’s region exceeds 50% of total and migration benefits are likely to exceed transfer overhead: try to migrate there
  If demand from another server’s region exceeds the Deletion Threshold and replication benefits are likely to exceed transfer overhead: replicate there
If server is overloaded: replicate or migrate some applications to the least-loaded server
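One way to read these rules as code is sketched below. Everything concrete here is an assumption made for illustration: the `AppStats` fields, the `worth_moving` cost model (response bytes served over a fixed horizon compared with the bytes needed to copy the application, loosely following the "placement costs" bullet on the Algorithms slide), and the choice to offload the busiest application by replication when the server is overloaded.

```python
from dataclasses import dataclass


@dataclass
class AppStats:
    name: str
    demand_by_region: dict    # region -> requests/sec observed at this server
    only_replica: bool
    size_bytes: int           # bytes transferred to replicate or migrate the app
    bytes_per_response: int   # average response size


def worth_moving(app, remote_demand, horizon_s):
    # Assumed cost model: bytes of responses over the horizon vs. bytes to copy the app.
    return remote_demand * app.bytes_per_response * horizon_s > app.size_bytes


def placement_actions(apps, local_region, deletion_threshold,
                      horizon_s=60.0, overloaded=False, least_loaded=None):
    """Return a list of (action, app_name, target) tuples for one server."""
    actions = []
    for app in apps:
        total = sum(app.demand_by_region.values())
        if total < deletion_threshold and not app.only_replica:
            actions.append(("delete", app.name, None))
            continue
        remote = {r: d for r, d in app.demand_by_region.items()
                  if r != local_region}
        # Migration: one remote region accounts for more than half of the demand.
        target = next((r for r, d in remote.items()
                       if d > 0.5 * total and worth_moving(app, d, horizon_s)),
                      None)
        if target is not None:
            actions.append(("migrate", app.name, target))
            continue
        # Replication: remote demand alone would justify keeping a replica there.
        for region, d in remote.items():
            if d > deletion_threshold and worth_moving(app, d, horizon_s):
                actions.append(("replicate", app.name, region))
    if overloaded and least_loaded is not None and apps:
        # Assumption: shed load by replicating the busiest app.
        busiest = max(apps, key=lambda a: sum(a.demand_by_region.values()))
        actions.append(("replicate", busiest.name, least_loaded))
    return actions
```

Because each server runs this independently on its own observations, the decision stays distributed, as the Algorithms slide describes.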
Performance of Request Distribution • Three servers with decreasing proximity to all clients • Server 1 is the closest, server 2 is the next closest, server 3 is the farthest • Start with low load, gradually increase to over server capacity, then decrease back
Load Distribution (10 min bins) [Figure: load plots labeled "Standard iDNS", "Random", and "aCDN"]
Load Distribution (1000 req/sec capacity; 100s bins) • iDNS shows severe oscillations despite DNS caching • aCDN stays within capacity; iDNS exceeds capacity [Figure: load plots labeled "aCDN" and "iDNS"]
Performance of Content Placement • 53 servers (old UUNET topology) • 10% of servers are in “hot” regions with 90% of demand • 90% of servers are in “cold” regions with 10% of demand • Reshuffle hot and cold regions every 400 seconds, see how the system adapts
Related Work • Caching application results • No replication of computation • Assumes locality of reference • Utility computing (Ejacent, vMatrix, Grid computing) • Complex • Application-agnostic • Distributed file systems. Issues: • Replicating computing environment • Ensuring mutual consistency of application components
Conclusions • Utility computing – next CDN frontier • ACDN as first approach • Builds upon existing components (IDNS) • Automatic application deployment/decommissioning • Automatic consistency maintenance