Handling Flash Crowds from your Garage Peter Ward May 1, 2009
Motivation • Web Development • Cloud Computing
Purpose • To inform and explain: ways to deploy robust web services; the differences between these ways; and how they work in real life.
Key Information • The Problem • The Solution • Scaling Technologies • Flash Crowd Experiences
The Problem • Innovation by small companies and individuals. • Specific need + small website = Web Service • Services gain popularity through Slashdot, Digg and Reddit. • But large crowds crash servers, costing the service its users. • More powerful servers can handle more users, but they cost more. • The problem is often not processing speed, but disk retrieval speed and bandwidth.
A Distributed Solution • “Utility Computing”: Resources are utilities – you only need to pay for what you use (e.g. processing time, bandwidth, memory and disk space). • No powerful servers – just “virtual computers”. • Low cost for low demand, high cost for high demand – scalable.
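The pay-for-what-you-use idea can be made concrete with a small cost calculation. The following is only a sketch: the `Rates` values are hypothetical placeholders, not any provider's actual prices, and the function name `usage_cost` is invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Hypothetical unit prices in dollars (assumed, not a real price list)."""
    per_cpu_hour: float = 0.10
    per_gb_transferred: float = 0.15
    per_gb_stored: float = 0.12


def usage_cost(cpu_hours: float, gb_transferred: float, gb_stored: float,
               rates: Rates = Rates()) -> float:
    """Bill each resource by the amount used; zero usage costs nothing."""
    return (cpu_hours * rates.per_cpu_hour
            + gb_transferred * rates.per_gb_transferred
            + gb_stored * rates.per_gb_stored)


# One virtual machine for a quiet month versus ten machines during a flash crowd.
quiet_month = usage_cost(cpu_hours=720, gb_transferred=5, gb_stored=1)
busy_month = usage_cost(cpu_hours=7200, gb_transferred=500, gb_stored=1)
```

The bill follows demand in both directions, which is the sense in which the slide calls the approach scalable.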
Storage Delivery Networks • Use Case: Sites storing or providing large amounts of static content such as photos or videos. • Examples: Amazon’s S3, the Nirvanix platform. • Scope: Static HTTP • Potential for Failure: Low - the SDN is responsible for handling the load.
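From the web application's side, using an SDN mostly means pointing links to static files at the storage host instead of at your own server. A minimal sketch, assuming a hypothetical base URL (`SDN_BASE_URL`) and a helper name (`static_url`) that are not from the original slides:

```python
from urllib.parse import quote, urljoin

# Hypothetical address of the storage bucket; substitute the real one.
SDN_BASE_URL = "https://static.example.com/"


def static_url(path: str, base: str = SDN_BASE_URL) -> str:
    """Build the SDN URL for a static file, percent-encoding unsafe characters."""
    return urljoin(base, quote(path.lstrip("/")))


photo_url = static_url("/photos/cat 1.jpg")
```

Pages generated by the site would then embed `photo_url` rather than a local path, so requests for the photo never reach the garage server.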
Distributed Computing • Use Case: Dynamic content. • For dynamic content, we need to run server-side applications. • These run on virtual machines: owned by the company (large companies) or provided by “Compute Clouds” (small companies). • Applications are developed on a single machine and deployed to any number of virtual machines. • Examples: Amazon’s EC2, FlexiScale.
HTTP Redirection • A single front-end machine redirects each client, using HTTP, to one of many back-end machines. • Clients only need to contact the front-end machine once. • If a large number of users access the front-end server at once, it could become overloaded and turn new users away. • This can be mitigated using DNS Load Balancing. • Use Case: Large number of internal servers running web servers. • Examples: Hotmail. • Scope: HTTP • Potential for Failure: High - front-end failure, but can be reduced by combining approaches.
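The front-end's job can be sketched in a few lines of Python using the standard library's `http.server`. This is an illustration under stated assumptions, not how Hotmail or any other site implements it: the back-end host names are hypothetical, and round-robin is just one simple way to choose a back-end.

```python
import itertools
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical back-end machines; replace with real host names.
BACKENDS = [
    "http://backend1.example.com",
    "http://backend2.example.com",
    "http://backend3.example.com",
]


class Redirector:
    """Hands out back-ends in round-robin order."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one back-end is required")
        self._cycle = itertools.cycle(backends)

    def location_for(self, path: str) -> str:
        """Return the full URL the client should be redirected to."""
        return next(self._cycle) + path


redirector = Redirector(BACKENDS)


class FrontEndHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # 302 tells the client to repeat the request at the Location URL.
        self.send_response(302)
        self.send_header("Location", redirector.location_for(self.path))
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Run the front-end until interrupted (not called automatically)."""
    HTTPServer(("", port), FrontEndHandler).serve_forever()
```

Calling `serve()` starts the front-end. After the redirect, the client talks to the chosen back-end directly, which is why it only contacts the front-end once; it also shows why the front-end remains a single point of failure.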
IP Load Balancing • A large number of back-end servers appear as a single network address. • L4 load balancing runs at the transport level (IP address and port), choosing a server for each new connection without looking at its contents. • L7 load balancing runs at the application protocol level (e.g. HTTP), so it inspects the headers of the request and chooses a server based on what is being requested. • Use Case: Provides balancing for any number of back-end servers. • Examples: Linux Virtual Server, Microsoft Internet Security and Acceleration Server, FlexiScale • Scope: Any network server. • Potential for Failure: Medium – the front-end balancer is most susceptible to failure. Could be combined with other approaches.
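The difference between the two levels comes down to what information the balancer uses to pick a server. The sketch below is illustrative only: the function names, the hash-based L4 choice, and the path-prefix L7 routing are assumptions made for the example, not the behaviour of any product named on the slide. A real L7 balancer could inspect a header such as `Host` in the same way.

```python
import hashlib


def pick_l4(client_ip: str, client_port: int, backends: list[str]) -> str:
    """L4-style choice: uses only the connection's address and port."""
    key = f"{client_ip}:{client_port}".encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


def pick_l7(raw_request: bytes, routes: dict[str, list[str]],
            default_pool: list[str]) -> list[str]:
    """L7-style choice: reads the HTTP request line and routes by path prefix."""
    request_line = raw_request.split(b"\r\n", 1)[0].decode("ascii", "replace")
    parts = request_line.split(" ")
    path = parts[1] if len(parts) >= 2 else "/"
    for prefix, pool in routes.items():
        if path.startswith(prefix):
            return pool
    return default_pool


# Hypothetical pools: static files go to one set of servers, the rest elsewhere.
routes = {"/static/": ["static1", "static2"]}
pool = pick_l7(b"GET /static/logo.png HTTP/1.1\r\nHost: example.com\r\n\r\n",
               routes, default_pool=["app1", "app2"])
server = pick_l4("203.0.113.7", 51234, pool)
```

The L4 function never sees the request, so it works for any network protocol; the L7 function must understand HTTP, but in exchange it can send different kinds of request to different servers.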
DNS Load Balancing • DNS servers provide clients with a list of IP addresses to choose from. • To add or remove servers, the company just has to update the list of IP addresses. • Unfortunately, not all clients choose a random server from the list of addresses, which leads to uneven balancing. • Use Case: Provides balancing for any number of back-end servers. • Examples: Google, Hotmail, Yahoo and many more. • Scope: Any network server. • Potential for Failure: Medium - DNS servers are robust, but updates are typically slow.
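The uneven-balancing caveat is easy to see in a small simulation. In this sketch, `resolve_all` shows how a client might obtain the address list using the standard `socket.getaddrinfo` call, while `simulate` compares clients that pick randomly with clients that always take the first address. The function names, the two policies, and the example addresses are assumptions made for illustration.

```python
import random
import socket
from collections import Counter


def resolve_all(hostname: str, port: int = 80) -> list[str]:
    """Return the distinct IPv4 addresses the resolver reports for hostname."""
    infos = socket.getaddrinfo(hostname, port,
                               family=socket.AF_INET, type=socket.SOCK_STREAM)
    # Each entry ends with a (address, port) pair; keep first-seen order.
    return list(dict.fromkeys(info[4][0] for info in infos))


def simulate(addresses: list[str], clients: int, policy: str,
             seed: int = 0) -> Counter:
    """Count connections per server when every client follows one policy."""
    rng = random.Random(seed)
    load = Counter({address: 0 for address in addresses})
    for _ in range(clients):
        if policy == "random":
            chosen = rng.choice(addresses)
        else:  # "first": the client always uses the first address in the list
            chosen = addresses[0]
        load[chosen] += 1
    return load


# Documentation-range addresses standing in for three back-end servers.
addresses = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]
random_load = simulate(addresses, 3000, "random")
first_load = simulate(addresses, 3000, "first")
```

With the "first" policy every connection lands on one server, which is the worst case of the unevenness the slide describes. Many DNS servers rotate the order of the list between responses to soften this, but the final choice still rests with the client.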
MapCruncher • A new web authoring tool that makes it easy for non-experts to convert their own maps into AJAX-style interactive maps. • No server-side behaviour - all processing done with JavaScript. • 25 GB of data on a relatively powerful server. • After release, the service could only handle 100 requests per second, limited by slow disk retrieval speed. • Data moved to Amazon’s S3 service.
Asirra • Asirra is a CAPTCHA system which asks users to identify photos as either cats or dogs. • Written in Python. • Optimised for speed. • Virtual Machines - Amazon’s EC2. • DNS load balancing. • In the first 24 hours after release... • 75,000 real requests • 30,000 from a denial-of-service attack
InkblotPassword.com • A website that helps users generate and remember high-entropy passwords, using Rorschach-like images as a memory cue. • Written in Python • No optimisation – servers are cheap. • Nothing stored on the local disk. • DNS Load Balancing (but not automated) • Handling a Slashdot crowd cost less than $150.
Summary • Small companies have a range of options for deploying robust services with small budgets. • For applications focussed mainly on static content, a Storage Delivery Network (SDN) such as Amazon’s S3 is a good choice as it requires virtually no setup or understanding of the load distribution used internally in the SDN. • Compute Clouds are useful for deploying low-cost servers onto virtual machines, and being able to add or remove servers as needed. • Technologies such as HTTP Redirection, L4/L7 Load Balancing and DNS Load Balancing are useful when combined with virtual machines, as they allow the load of a single domain to be spread across multiple servers, for a relatively small cost.
Evaluation • The article is a well-presented analysis of the low-cost options for providing scalable web services. Whilst the writing style is very accurate and informational, the structure of the article is somewhat disjointed - there are too many irrelevant sections, and technologies have been put into sections for purely cosmetic reasons. • The case studies reported by the article are good examples of how this technology can be employed, and are useful in keeping the reader’s interest in the article. • The sources and data of the article are of the expected high quality for a professional paper.