
Computing on the Edge: A Platform for Replicating Internet Applications

Michael Rabinovich and Zhen Xiao, AT&T Labs -- Research


Presentation Transcript


  1. Computing on the Edge: A Platform for Replicating Internet Applications Michael Rabinovich and Zhen Xiao AT&T Labs -- Research

  2. Content Distribution Networks • Currently serve static pages, streaming content, applications • Can be served by client-side caches as well • Unique CDN value: • Control over content • Overload “insurance” • Distributing applications

  3. (Some of the) Challenges • Consistency of application replicas • Mutual consistency of application components in a replica • Deciding on how many replicas to deploy and where • Deciding on how to distribute requests among replicas

  4. ACDN Components • Replication framework • Dynamically install and uninstall applications based on demand • Maintain replica consistency • Content placement algorithms • Request distribution algorithms

  5. Replication Framework • Inspired by work on software distribution (e.g., Marimba) • Metafile for each application containing: • A list of time-stamped files (data and executable files) • An initialization script (or a pointer to it)

     FILE /home/applications/mapping/query_engine.cgi 1999.apr.14.08:46:12
     FILE /home/applications/mapping/map_database 2000.oct.15.13:15:59
     FILE /home/applications/mapping/user_preferences 2001.jan.30.18:00:05
     SCRIPT
     mkdir /home/applications/mapping/access_stats
     setenv ACCESS_DIRECTORY /home/applications/mapping/access_stats
     ENDSCRIPT
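The metafile example on this slide could be read with a small parser along the following lines. This is only a sketch: it assumes one directive per line with `SCRIPT` and `ENDSCRIPT` on lines of their own, and that paths contain no whitespace. The slide does not specify the grammar, and the names `parse_metafile`, `parse_timestamp`, and `EXAMPLE` are hypothetical.

```python
from datetime import datetime

MONTHS = {name: number for number, name in enumerate(
    ["jan", "feb", "mar", "apr", "may", "jun",
     "jul", "aug", "sep", "oct", "nov", "dec"], start=1)}

# Example metafile text copied from the slide (line layout is assumed).
EXAMPLE = """\
FILE /home/applications/mapping/query_engine.cgi 1999.apr.14.08:46:12
FILE /home/applications/mapping/map_database 2000.oct.15.13:15:59
FILE /home/applications/mapping/user_preferences 2001.jan.30.18:00:05
SCRIPT
mkdir /home/applications/mapping/access_stats
setenv ACCESS_DIRECTORY /home/applications/mapping/access_stats
ENDSCRIPT
"""


def parse_timestamp(text):
    """Parse a stamp such as 1999.apr.14.08:46:12 (format inferred from the slide)."""
    year, month, day, clock = text.split(".")
    hour, minute, second = clock.split(":")
    return datetime(int(year), MONTHS[month.lower()], int(day),
                    int(hour), int(minute), int(second))


def parse_metafile(text):
    """Return (files, script): files maps path -> datetime, script is a list of lines."""
    files, script, in_script = {}, [], False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if in_script:
            if line == "ENDSCRIPT":
                in_script = False
            else:
                script.append(line)
        elif line == "SCRIPT":
            in_script = True
        elif line.startswith("FILE "):
            # Assumes the path itself contains no whitespace.
            _, path, stamp = line.split()
            files[path] = parse_timestamp(stamp)
        else:
            raise ValueError(f"unrecognized metafile line: {line!r}")
    return files, script


files, script = parse_metafile(EXAMPLE)
```

Keeping the file list as a path-to-timestamp mapping makes it easy to compare two versions of a metafile later.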

  6. Application Metafile • A metafile is a simple static Web page • Having a metafile is sufficient to deploy the application • Having the current metafile is sufficient to bring the application replica up-to-date. • Consistency of application replicas = consistency of cached copies of the metafile

  7. Framework Tasks • Creating a replica • Obtaining a metafile • Obtaining a tar file with all files listed in the metafile • Running the initialization script • Updating a replica • Obtaining the diff of metafiles • Obtaining files that are new • Deleting a replica • Retaining the deleted replica for some time to process residual requests before physical deletion
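The "updating a replica" task above can be illustrated with a minimal metafile diff. This is a sketch under assumptions, not the framework's actual code: each metafile is represented as a hypothetical dict from file path to timestamp string, any changed timestamp is treated as "new", and reporting files that disappeared from the metafile is an addition the slide does not mention.

```python
def diff_metafiles(old, new):
    """Compare two metafile file lists (dict: path -> timestamp string).

    Returns (to_fetch, to_remove):
      to_fetch  - paths that are new or whose timestamp changed
      to_remove - paths no longer listed (an assumption; the slide only
                  mentions obtaining files that are new)
    """
    to_fetch = sorted(path for path, stamp in new.items() if old.get(path) != stamp)
    to_remove = sorted(path for path in old if path not in new)
    return to_fetch, to_remove


# Cached copy of the metafile held by the replica.
old = {
    "/home/applications/mapping/query_engine.cgi": "1999.apr.14.08:46:12",
    "/home/applications/mapping/map_database": "2000.oct.15.13:15:59",
}
# Current metafile; the map_database stamp below is hypothetical.
new = {
    "/home/applications/mapping/query_engine.cgi": "1999.apr.14.08:46:12",
    "/home/applications/mapping/map_database": "2001.feb.01.09:00:00",
    "/home/applications/mapping/user_preferences": "2001.jan.30.18:00:05",
}

to_fetch, to_remove = diff_metafiles(old, new)
```

Here `to_fetch` lists `map_database` and `user_preferences`, and `to_remove` is empty, so the replica would only need to download two files.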

  8. Architecture Overview • Built on top of a standard Web server: Apache + (Fast) CGI • Uses standard HTTP throughout

  9. Algorithms • Request distribution algorithm • Load balancing among servers that have the requested application • Proximity of servers to requesting clients • Content placement algorithm • Distributed decision - at each CDN server • Total load on the server • Percentage of requests from other regions • Prediction of load after replication or migration • Placement costs: bytes transferred during replication vs. bytes transferred during responses

  10. Algorithmic Challenges • Convergence • Moving replicas around • Load oscillations • Responsiveness and stability • Distributed vs. Centralized Algorithms • Interplay between request distribution and content placement

  11. Load Balancing Algorithm

      Initial probabilities:
        Set prob(i) = 0 for all i
        Loop through the replicas in order of decreasing proximity
          if load(i) < LW
            prob(i) = 1.0
            exit
          else if LW <= load(i) < HW
            prob(i) = (HW - load(i)) / (HW - LW)

      Adjustments to account for proximity:
        remainder = 1.0
        Loop through the replicas in order of decreasing proximity
          prob(i) = prob(i) * remainder
          remainder = remainder - prob(i)

      Final probabilities:
        prob(i) = prob(i) / sum of all
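The three phases of this slide's pseudocode can be sketched in Python as follows. The function name `request_probabilities` and the list-based interface are hypothetical, and the uniform fallback for the case where every replica is at or above HW is an assumption, since the slide does not say what happens then.

```python
def request_probabilities(loads, lw, hw):
    """Return per-replica request probabilities.

    loads: replica loads ordered closest replica first
           (the slide's "decreasing proximity").
    lw, hw: low and high load watermarks (LW and HW on the slide).
    """
    if not loads:
        return []

    # Phase 1: initial probabilities from each replica's load.
    prob = [0.0] * len(loads)
    for i, load in enumerate(loads):
        if load < lw:
            prob[i] = 1.0
            break  # a lightly loaded replica absorbs everything left
        elif load < hw:
            prob[i] = (hw - load) / (hw - lw)
        # load >= hw: the replica keeps probability 0

    # Phase 2: closer replicas get first claim on the remaining share.
    remainder = 1.0
    for i in range(len(prob)):
        prob[i] *= remainder
        remainder -= prob[i]

    # Phase 3: normalize so the probabilities sum to 1.
    total = sum(prob)
    if total == 0:
        # Assumption: every replica is at or above HW, so spread evenly.
        return [1.0 / len(loads)] * len(loads)
    return [p / total for p in prob]


probabilities = request_probabilities([80, 60, 10], lw=50, hw=100)
```

With loads of 80, 60, and 10 and watermarks LW = 50 and HW = 100, the initial values are 0.4, 0.8, and 1.0, and the proximity adjustment turns them into roughly 0.4, 0.48, and 0.12.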

  12. Content Placement Algorithm (run by each server)

      For each app:
        If demand is below the Deletion Threshold, delete unless it is the only replica;
        If demand from another server's region exceeds 50% of total and migration benefits are likely to exceed transfer overhead, try to migrate there;
        If demand from another server's region exceeds the Deletion Threshold and replication benefits are likely to exceed transfer overhead, replicate there;
      If the server is overloaded: replicate or migrate some applications to the least-loaded server
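The per-application rules on this slide can be sketched as a single decision function. Everything about the interface is an assumption made for illustration: the `AppStats` fields, the `decide` function, and the idea that benefit and overhead are supplied as precomputed byte estimates. The overload rule is left out because it needs load information about other servers.

```python
from dataclasses import dataclass, field


@dataclass
class AppStats:
    """Hypothetical per-application statistics kept by one CDN server."""
    total_demand: float       # all requests this server saw for the app
    only_replica: bool        # True if no other server hosts the app
    transfer_overhead: float  # estimated bytes needed to copy the app
    remote_demand: dict = field(default_factory=dict)  # region -> demand
    est_benefit: dict = field(default_factory=dict)    # region -> bytes saved


def decide(app, deletion_threshold):
    """Return (action, region) following the slide's rule order."""
    # Rule 1: drop unpopular replicas, but never the last one.
    if app.total_demand < deletion_threshold and not app.only_replica:
        return ("delete", None)

    # Consider the regions with the most demand first.
    regions = sorted(app.remote_demand, key=app.remote_demand.get, reverse=True)

    def worthwhile(region):
        return app.est_benefit.get(region, 0.0) > app.transfer_overhead

    # Rule 2: migrate when one remote region dominates the demand.
    for region in regions:
        if app.remote_demand[region] > 0.5 * app.total_demand and worthwhile(region):
            return ("migrate", region)

    # Rule 3: replicate when a remote region alone could sustain a replica.
    for region in regions:
        if app.remote_demand[region] > deletion_threshold and worthwhile(region):
            return ("replicate", region)

    return ("keep", None)


stats = AppStats(total_demand=100, only_replica=False, transfer_overhead=5,
                 remote_demand={"west": 70, "east": 10},
                 est_benefit={"west": 50})
action = decide(stats, deletion_threshold=20)
```

In this example the hypothetical region "west" accounts for 70% of demand and the estimated benefit exceeds the transfer overhead, so the sketch chooses migration over replication.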

  13. Performance of Request Distribution • Three servers with decreasing proximity to all clients • Server 1 is the closest, server 2 is the next closest, server 3 is the farthest • Start with low load, gradually increase to over server capacity, then decrease back

  14. Load Distribution (10 min bins) • Figure panel labels: Standard iDNS, Random, aCDN

  15. Load Distribution (1000 req/sec capacity; 100 s bins) • iDNS shows severe oscillations despite DNS caching • aCDN stays within capacity; iDNS exceeds capacity • Figure panel labels: aCDN, iDNS

  16. Performance of Content Placement • 53 servers (old UUNET topology) • 10% of servers are in “hot” regions with 90% of demand • 90% of servers are in “cold” regions with 10% of demand • Reshuffle hot and cold regions every 400 seconds, see how the system adapts

  17. Bandwidth Consumption

  18. Related Work • Caching application results • No replication of computation • Assumes locality of reference • Utility computing (Ejacent, vMatrix, Grid computing) • Complex • Application-agnostic • Distributed file systems. Issues: • Replicating computing environment • Ensuring mutual consistency of application components

  19. Conclusions • Utility computing – next CDN frontier • ACDN as first approach • Builds upon existing components (iDNS) • Automatic application deployment/decommissioning • Automatic consistency maintenance
