COS497 - Cloud Computing • 4. Businesses and the Cloud
What is Cloud Computing?
Cloud computing is a new computing paradigm, involving data and/or computation outsourcing, with
• “Infinite” resources: elastic resource scalability
• On-demand: “just-in-time” provisioning of resources
• No up-front cost: pay-as-you-go
That is, use as much or as little as you need, use it only when you want, and pay only for what you use.
Here are some examples of how organizations, from research firms to large enterprises, use the Cloud today:
• A large enterprise may quickly and economically deploy new internal applications, such as HR solutions, payroll applications, inventory management solutions, and online training, to its distributed workforce.
• An e-commerce website accommodates sudden demand for a “hot” product caused by viral buzz from Facebook and Twitter without having to upgrade its infrastructure.
• A pharmaceutical research firm executes large-scale simulations using computing power provided by the cloud service provider.
• Media companies serve unlimited video, music, and other media to their worldwide customer base.
Business Definition of Cloud Computing “Cloud Computing is the transformation of IT from a product to a service”
Several characteristics are essential for a service to be considered part of the “Cloud”:
• On-demand, self-service. The ability for an end user to sign up and receive services without the long delays that have characterized traditional IT.
• Broad network access. The ability to access the service via standard platforms (desktop, laptop, mobile, etc.).
• Resource pooling (aka multi-tenancy). Cloud resources are pooled across multiple customers.
• Rapid elasticity (aka scalability). Capacity can scale to cope with demand peaks.
• Measured service. Usage is metered and billed like a utility service.
Cloud computing provides numerous economic advantages.
For clients:
• No up-front commitment in buying/leasing hardware.
• Can scale usage according to demand.
• Barriers to entry are lowered for startups.
For providers:
• Increased utilization of datacenter resources.
When capacity is under-utilized, we waste resources and capital. When capacity is not enough for demand, we generate customer dissatisfaction.
Cloud allows applications to scale – elastic resources. Use just what you need – no more, no less.
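The trade-off between fixed and elastic capacity can be illustrated with a little arithmetic. The sketch below is only an illustration: the hourly demand figures and the price per server-hour are made-up assumptions, not real provider rates, and the function names are hypothetical.

```python
# Hypothetical demand (servers needed) for 12 consecutive hours; illustrative only.
demand = [2, 2, 2, 3, 5, 8, 12, 20, 12, 8, 5, 3]
PRICE_CENTS_PER_SERVER_HOUR = 10  # assumed price, not a real provider rate


def fixed_capacity_cost(demand, capacity, price):
    """Cost of running `capacity` servers every hour, whether used or not."""
    return capacity * len(demand) * price


def elastic_cost(demand, price):
    """Cost when capacity tracks demand exactly (pay only for what is used)."""
    return sum(demand) * price


def unmet_demand(demand, capacity):
    """Server-hours of demand that a fixed capacity cannot serve."""
    return sum(max(0, d - capacity) for d in demand)


peak = max(demand)
cost_fixed = fixed_capacity_cost(demand, peak, PRICE_CENTS_PER_SERVER_HOUR)
cost_elastic = elastic_cost(demand, PRICE_CENTS_PER_SERVER_HOUR)
utilization = sum(demand) / (peak * len(demand))

print(f"provisioned for peak: {cost_fixed} cents, utilization {utilization:.0%}")
print(f"elastic:              {cost_elastic} cents")
print(f"unmet server-hours if capped at 8 servers: {unmet_demand(demand, 8)}")
```

Provisioning for the peak wastes money on idle servers, while provisioning below the peak leaves demand unserved. Elastic capacity avoids both, under the simplifying assumption that capacity can follow demand hour by hour.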
The biggest users of the Cloud are the cloud providers themselves!
- Data storage: big data!
- Text searching, analysis, processing
Google, Amazon, and Yahoo are the biggest users of the Cloud!
The Hype!
Gartner in 2009: Cloud computing revenue will soar faster than expected and will exceed $150 billion by 2013. It will represent 19% of IT spending by 2015.
IDC in 2009: “Spending on IT cloud services will triple in the next 5 years, reaching $42 billion.”
Forrester in 2010: Cloud computing will go from $40.7 billion in 2010 to $241 billion in 2020.
Companies and even federal and state governments are using cloud computing now: fedbizopps.gov
$$$
Dave Power, Associate Information Consultant at Eli Lilly and Company: “With AWS, a new server can be up and running in three minutes (it used to take Eli Lilly seven and a half weeks to deploy a server internally) and a 64-node Linux cluster can be online in five minutes (compared with three months internally). … It's just shy of instantaneous!”
Ingo Elfering, Vice President of Information Technology Strategy, Glaxo-Smith-Kline: “With Online Services, we are able to reduce our IT operational costs by roughly 30% of what we’re spending.”
Jim Swartz, CIO, Sybase: “At Sybase, a private cloud of virtual servers inside its data center has saved nearly $2 million annually since 2006 because the company can share computing power and storage resources across servers.”
Any business can harness large computing resources without buying its own machines.
What is a Cloud? • It’s a cluster! • It’s a supercomputer! • It’s a data store! • It’s superman! • None of the above • All of the above • Cloud = Lots of storage + compute cycles nearby
A single-site cloud (aka “Datacenter”) consists of
- Compute nodes (aka servers) grouped into racks.
- Network switches, connecting the racks.
- A network topology, e.g. hierarchical, between servers and to the outside world.
- Storage (backend) nodes connected to the network.
- A front-end for submitting jobs, load balancing, …
- Software services.
A geographically-distributed cloud consists of
- Multiple such sites.
- Each site perhaps with a different structure and services.
A Sample Cloud Datacenter Topology
[Figure: a core switch connected to top-of-the-rack switches, each serving a rack of servers]
Four major features:
• Massive scale.
• On-demand access: pay-as-you-go, no upfront commitment.
  - In effect, anyone can access it
• Data-intensive nature: what was MBs has now become TBs, PBs and EBs.
  - Daily logs, forensics, Web data, etc.
  - Do you know the size of a Wikipedia dump?
• New cloud programming paradigms: MapReduce/Hadoop, NoSQL/Cassandra/MongoDB and many others.
  - High in accessibility and ease of programmability
  - Lots of open-source software
A combination of one or more of these gives rise to novel and unsolved distributed computing problems in cloud computing.
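To give a feel for the MapReduce paradigm mentioned above, here is a minimal single-process sketch of a word count. It only imitates the map, shuffle, and reduce steps in plain Python. It is not the Hadoop API, and the function names are hypothetical.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values emitted for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["the cloud is elastic", "the cloud scales"])
print(counts)
```

In a real framework the map and reduce calls would run in parallel on many machines, and the shuffle would move data across the network. Here everything runs in one process to keep the idea visible.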
On-Premises IT versus Cloud Computing
On-Premises IT: provision physical space, cabling, power, cooling, networking, racks, servers, storage, certification, labour.
Cloud Computing: $0 to get started, no long-term contracts.
Elastic Capacity: Pay Only for What You Use
[Figure legend: available capacity, on-premises IT delivery]
High Growth Applications
Aperiodic Bursting Applications Even if you design your website infrastructure to handle peak loads, will it not be idle during other times?
On-Off Applications Researchers run large-scale scientific simulations using 1000s of computers. Why not rent computer time to run these simulations?
Periodic Applications Dynamic and flexible infrastructure can reduce costs and improve performance.
Example
• Many hundreds of machines are involved in a single Google search request (remember, the web is 400+ TB)
  - Google has multiple clusters (of thousands of computers each) all over the world
  - Your search query is routed to a nearby Google cluster
A Google cluster consists of “farms” of Google Web Servers, Index Servers, Document Servers, and various other servers (ads, spell checking, etc.) – many servers for different tasks. These are cheap standalone computers, rack-mounted, connected by commodity networking gear.
Within the cluster, load-balancers route your search to a lightly-loaded Google Web Server (GWS), which will coordinate the search and response. The Google index is partitioned into “shards” (64 MB blocks).
• Each shard indexes a subset of the documents (i.e. web pages).
• Each shard is replicated (there are three copies of each block of data), and can be searched by multiple computers, the “index servers”.
• Each copy is stored on a different server running on a separate power supply.
The blocks of data are distributed semi-randomly so that no two servers have the exact same collection of data blocks.
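The replication rule described here (three copies per shard, each on a different server, spread semi-randomly) can be sketched as follows. This is an illustration under stated assumptions, not Google's actual placement algorithm: the server names, the shard count, and the `place_replicas` helper are hypothetical, and the "separate power supply" constraint is not modelled.

```python
import random


def place_replicas(shard_ids, servers, copies=3, seed=0):
    """Assign each shard to `copies` distinct servers chosen at random.

    Requires copies <= len(servers); random.sample raises ValueError otherwise.
    """
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    placement = {}
    for shard in shard_ids:
        # sample() picks without replacement, so the servers are all different
        placement[shard] = rng.sample(servers, copies)
    return placement


servers = [f"server-{i}" for i in range(10)]  # hypothetical server names
placement = place_replicas(range(100), servers)
print(placement[0])
```

Because every shard lives on three different servers, the loss of any single server leaves at least two copies of each of its shards available elsewhere.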
The Google Web Server routes your search to one index server associated with each shard, through another load-balancer. When the dust has settled, the result is an ID for every document satisfying your search, rank-ordered by relevance.
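This fan-out to one index server per shard, followed by a merge of the partial results, is often called scatter-gather. A minimal sketch, assuming a toy in-memory index: the `SHARDS` data, the scores, and the `search_shard` stand-in for a remote call are all hypothetical.

```python
import heapq
import random

# Hypothetical toy index: each shard maps a term to (score, doc_id) pairs
# for the subset of documents that shard covers.
SHARDS = [
    {"cloud": [(0.9, 11), (0.4, 12)], "elastic": [(0.7, 12)]},
    {"cloud": [(0.8, 21)]},
    {"elastic": [(0.5, 31)], "cloud": [(0.2, 32)]},
]
REPLICAS_PER_SHARD = 3  # from the text: three copies of each shard


def search_shard(shard_no, replica_no, term):
    """Stand-in for a call to one index server; replicas hold identical data."""
    return SHARDS[shard_no].get(term, [])


def search(term, top_k=3, rng=random):
    partial = []
    for shard_no in range(len(SHARDS)):
        # Pick one replica per shard (a crude stand-in for load balancing).
        replica_no = rng.randrange(REPLICAS_PER_SHARD)
        partial.extend(search_shard(shard_no, replica_no, term))
    best = heapq.nlargest(top_k, partial)  # highest score first
    return [doc_id for _score, doc_id in best]


print(search("cloud"))
```

A real front end would query the shards in parallel rather than in a loop. The sequential version keeps the sketch short.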
The documents, too, are partitioned into “shards” – the partitioning is a hash on the document ID. Each shard contains the full text of a subset of the documents. Each shard can be searched by multiple computers – the “document servers”
The GWS sends appropriate document IDs to one document server associated with each relevant shard. When the dust has settled, the result is a URL, a title, and a summary for every relevant document.
Meanwhile, the ad server has done its thing, the spell checker has done its thing, etc. Finally, the GWS builds an HTTP response to your search and ships it off. Many hundreds of computers have enabled you to search 400+ TB of web data in ~100 ms.
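Hash partitioning on the document ID, and the grouping of IDs by shard that a front end needs before contacting document servers, might look like the sketch below. The shard count, the choice of SHA-256, and both function names are assumptions made for illustration. The source does not say which hash function is used.

```python
import hashlib

NUM_DOC_SHARDS = 8  # assumed shard count for the example


def shard_for(doc_id, num_shards=NUM_DOC_SHARDS):
    """Map a document ID to a shard number with a stable hash."""
    digest = hashlib.sha256(str(doc_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def group_by_shard(doc_ids):
    """Group document IDs so each relevant shard is contacted only once."""
    groups = {}
    for doc_id in doc_ids:
        groups.setdefault(shard_for(doc_id), []).append(doc_id)
    return groups


requests = group_by_shard([11, 21, 12, 31, 32])
for shard_no, ids in sorted(requests.items()):
    print(f"document shard {shard_no}: fetch {ids}")
```

A stable hash is used instead of Python's built-in `hash()` so that every machine computes the same shard for the same document ID.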
Cloud applications have
• Enormous volumes of data
• Extreme parallelism
• The cheapest imaginable components
  - Failures occur all the time
  - You could not afford to prevent this in hardware
• Software makes it
  - Fault-tolerant
  - Highly available
  - Recoverable
  - Consistent
  - Scalable
  - Predictable
  - Secure
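One simple way software can tolerate failing hardware is to retry a read on another replica. The sketch below is a hypothetical illustration of that idea only: `ReplicaDown`, `read_with_failover`, and the two fake replicas are invented for the example and do not correspond to any particular system.

```python
class ReplicaDown(Exception):
    """Raised by a replica that cannot serve the request (simulated failure)."""


def read_with_failover(replicas, key):
    """Try each replica in turn and return the first successful answer."""
    last_error = None
    for replica in replicas:
        try:
            return replica(key)
        except ReplicaDown as err:
            last_error = err  # remember the failure and try the next copy
    raise RuntimeError(f"all {len(replicas)} replicas failed for {key!r}") from last_error


def healthy(key):
    return f"value-of-{key}"


def broken(key):
    raise ReplicaDown("simulated hardware failure")


print(read_with_failover([broken, broken, healthy], "doc-42"))
```

With three copies of each block, a read still succeeds as long as at least one replica is reachable. Cheap components can therefore fail often without the failure being visible to the user.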