420 likes | 455 Views
Cloud Computing and Migrating your systems to the Cloud. Bina Ramamurthy Computer Science and Engineering Department University at Buffalo (UB) This work is partially funded by grants NSF-DUE-CCLI-Phase II: 0920335 and NSF-CI-TEAM: 1041280 bina@buffalo.edu
E N D
Cloud Computing and Migrating your systems to the Cloud MCLS Presentation, Ramamurthy Bina Ramamurthy Computer Science and Engineering Department University at Buffalo (UB) This work is partially funded by grants NSF-DUE-CCLI-Phase II: 0920335 and NSF-CI-TEAM: 1041280 bina@buffalo.edu http://www.cse.buffalo.edu/faculty/bina
Cloud: The next Generation Computer? • Vacuum tube computer • Transistors based computer • Integrated circuits • Main frame • Minicomputer • Microcomputer • Desk tops, PCs, Laptops, Palmtops • Internet and the web, web services • Mobile devices (cell phone and PDA) • Cloud computing MCLS Presentation, Ramamurthy
Evolution of Internet Computing deep web scale Data-intensive HPC, cloud web Semantic discovery ?????? Automate (discovery) MCLS Presentation, Ramamurthy Discover (intelligence) Transact Integrate Interact Inform Publish time
Evolution • Business • Automation • Remote operations • Heterogeneity • Scale • Integration (application, data) • E-commerce, … • Computing • Programming languages • RISC vs. CISC architectures • Memory capacity • Computing power • Simple programObject-orientationComponent… MCLS Presentation, Ramamurthy
Evolution (contd.) • Information technology • Internet, World-wide web & Grid • Mobile and wireless • Devices • Software, platforms • Search engines • Tremendous advances in application: it is an app-world • Environment and Society • Accessibility • Globalization (outsourcing, markets) • IT users not exclusive to Computer Science • Digital media • ipod, iphone, idog, ipad, … • Youtube, myspace, social networking • Blogs,wikies, podcasts • Facebook, orkut, twitter MCLS Presentation, Ramamurthy
It is a changed world now… • Explosive growth in applications: biomedical informatics, space exploration, business analytics, web 2.0 social networking: YouTube, Facebook • Extreme scale content generation: e-science and e-business data deluge • Extraordinary rate of digital content consumption: digital gluttony: Apple iPhone, iPad, Amazon Kindle • Exponential growth in compute capabilities: multi-core, storage, bandwidth, virtual machines (virtualization) • Very short cycle of obsolescence in technologies: Windows Vista Windows 7; Java versions; CC#; Phython • Newer architectures: web services, persistence models, distributed file systems/repositories (Google, Hadoop), multi-core, wireless and mobile • Diverse knowledge and skill levels of the workforce • You simply cannot manage this complex situation with your traditional IT infrastructure MCLS Presentation, Ramamurthy
Top Ten Largest Databases MCLS Presentation, Ramamurthy Ref: http://www.businessintelligencelowdown.com/2007/02/top_10_largest_.html 02/28/09 7
Challenges • We need to: • Address the scalability issue: large scale data, high performance computing, automation, response time, rapid prototyping, and rapid time to deployment • Effectively address (i) ever shortening cycle of obsolescence, (ii) heterogeneity and (iii) rapid changes in requirements • Transform data from diverse sources into intelligence and deliver intelligence to right people/user/systems • Spread the benefits of computing to the society/common person • Align with the needs of the business / user / non-computer specialists / community and society/ the net- generation MCLS Presentation, Ramamurthy
The Big Picture • Cloud provides computing on-demand. • You can set up and go with an instance of a Windows, Linux (various flavors), or machines with other operating systems (old and new) • It can provide attached volume storage • It can provide unlimited storage accessible by web services • It can provision a cluster of computers on-demand to perform sophisticated functions. MCLS Presentation, Ramamurthy
the cloud • Cloud computing is Internet-based computing, whereby shared resources, software and information are provided to computers and other devices on-demand, like the electricity grid. • The cloud computing is a result of numerous attempts at large scale computing with seamless access to virtually limitless resources. • on-demand computing, utility computing, ubiquitous computing, grid computing, .. • It delivers computational infrastructure and services to the masses similar to electricity and water… • Budget-wise it is a shift from capital investment (upfront cost) to operational cost MCLS Presentation, Ramamurthy
Topics for Discussion • Cloud Computing as services • Cloud Models • Capabilities of cloud (demos) • Sample Cost Analysis • Analyzing your organization needs for cloud deployment • Some general observations • Potential benefits • Limitations • UB’s involvement: A State-level Certificate Program in Data-intensive Computing (pending SUNY approval) • References • Conclusion MCLS Presentation, Ramamurthy
The Cloud Computing as Services • Typical operational models: • Infrastructure as a Service (IaaS): Amazon Web Services (aws.com) • Platform as a Service (PaaS): Windows Azure • Software as a Service (SaaS). Ex: Google docs, Turbo Tax online (Google App Engine) • Services-based application programming interface (API) • A cloud computing environment can provide one or more of the above requirements for a cost • Pay as you go model of business • When using a public cloud the model is similar to renting a property than owning one. • An organization can also evolve its own IT infrastructure to a cloud model (private cloud model) MCLS Presentation, Ramamurthy
Cloud Models • Azure, EC2 and GAE are complementary models: • In EC2 you can work at the low levels and have control over the infrastructure • Azure is abstract in that it provides a fabric of compute cycles and storage and operates at the enterprise level • GAE is a Python-based environment for design, development and deployment of applications • All have well-defined pricing model that is quite reasonable • GoGrid and Rackspace are two other cloud companies that primarily feature cloud data centers MCLS Presentation, Ramamurthy
Windows Azure • Enterprise-level on-demand capacity builder • Fabric of cycles and storage available on-request for a cost • You have to use Azure API to work with the infrastructure offered by Microsoft • There is a development environment where you develop and test the application and upload it to the production environment on the cloud. • Special features are blob storage, web roles and worker roles. • Provides Azure SQL and other features for easy migration of Windows users to the cloud. MCLS Presentation, Ramamurthy
Amazon EC2 • Amazon EC2 is one large complex web service. • EC2 provided an API for instantiating computing instances with any of the operating systems supported. • It can facilitate computations through Amazon Machine Images (AMIs) for various other models. • Signature features: S3, Cloud Management Console, MapReduce Cluster, GPU cluster, Amazon Machine Image (AMI) • Lets look at some of the features of AWS: • On-demand provisioning (machine instance, storage, applications, …) • Scalability (auto scaling / load balancing) • Availability (through redundancy) • Rapid prototyping and deployment • Elastic resources (including bandwidth) MCLS Presentation, Ramamurthy
AWS Features (demos) • Elastic Compute Cloud (EC2) • AWS Cloud Front • Amazon Machine Images (AMI) • Linux and Windows images • Simple Storage Service (S3) • Elastic Block Volumes (ESB) (mounted/attached volumes) • Load balancers • Support for Data-intensive computing : Ex: Hadoop MapReduce ; GPU computing; Other applications to come soon • Reserved and spot instances (besides the regular on-demand instances) MCLS Presentation, Ramamurthy
Google App Engine • This is more a web interface for a development environment that offers a one stop facility for design, development and deployment Java and Phython-based applications in Java and Phython. • Google offers the same reliability, availability and scalability at par with Google’s own applications • Interface is software programming based • Comprehensive programming platform irrespective of the size (small or large) • We developed an Adobe Flash Evolutionary Evolutionary Biology tool for teaching freshman biology. (NSF CI-TEAM Grant) • This scenario is a serendipitous invaluable experience for us about the capability of the cloud. MCLS Presentation, Ramamurthy
Google App Engine Load Monitoring on 10/11 MCLS Presentation, Ramamurthy
MemCache on GoogleAppEngine on 10/11 MCLS Presentation, Ramamurthy Memcache partial unavailability
Simple Cost Model: for Amazon Cloud • 0.10 c per hour if you leave the Linux instance on: after 10 hours you terminate it, you pay $1.00 • http://aws.amazon.com/ec2/pricing/ • See also http://calculator.s3.amazonaws.com/calc5.html • On page 49: • 0.10/CPU-hour : one load balancer • 0.40/CPU-hour: 2 application servers • 0.80/CPU-hour: 2 database servers $2.40+ 44.00+38.40 = 84.80 per day for a typical scenario leading to $30,952 per year. This + software licenses (if you use yours) + management tools (cloud monitoring) + labor (who prepares and loads stuff on the cloud). MCLS Presentation, Ramamurthy
Sample Scenario • Traditional: • Half rack at a reliable ISP with sufficient bandwidth to support your needs • Two good firewalls • One hardware load balancer • Two good GB Ethernet switches • Six solid, commodity business servers • The cloud option: • One medium 32-bit instance • Four large 64-bit during standard usage to meet peek demands • Assume open source software and services • Costs for time for setting up environments, monitoring services, labor for management of environment. • Table shown next gives the upfront and ongoing costs. MCLS Presentation, Ramamurthy
Cost Analysis • Costs associated with different infrastructures (I – initial, M-Monthly) Internal-I Cloud-I Internal-M Cloud-M Rack $3,000 $0 $500 $0 Switches $2,000 $0 $0 $0 Load balancer $20,000 $0 $0 $73 Servers $24,000 $0 $0 $1,206 Firewalls $3,000 $0 $0 $0 24/7 Support $0 $0 $0 $400 Mgt. software $0 $0 $100 $730 Expected labor $1,200 $1,200 $1,200 $600 Degraded.perf $0 $0 $100 $0 =============================================== Totals $53,200 $1,200 $1,900 $3009 MCLS Presentation, Ramamurthy
Cost Comparison • 112,083 (internal) vs 94,452 (cloud) • When the traffic patterns are static and steady you may not need the cloud • Cost savings are tremendous when the variance between peak and average increases, and between average and low increases. • Excellent case: our tool POP!World on the Google App Engine MCLS Presentation, Ramamurthy
Software Licenses • Cloud environments come with costs bundled with the instances for the common operating systems and software packages: • Example: Windows, MySQL, Linux versions.. • Cloud is an impetus to work with freeware and open source. • Open source is ideal for the cloud • Flexibility of the open source made amazon cloud possible • Beyond, open source best licensing model is the one charges by CPU-hour • Amazon has recently introduced a feature where you can compute with licenses you purchased; bring your own license (BYOL): This may be useful for your CARL license • Restricted software licenses are not good for cloud environment: • Per user licensing that requires validation against a server, auditing and such • Lesson: Make sure you understand the licensing for the products you use MCLS Presentation, Ramamurthy
IT infrastructure at Monroe County Library Systems (MCLS) 1. Microsoft Operating Systems. MS Win Server 2003, Win Server 2008. Couple of Solaris 8/9 devices. Linux on one Content Filter. 2. At the desktop level Windows XP, Vista and Windows 7. 3. Applications we use Office 2007 (/ 2010)….Word, Excel, Access, Power Point, Publisher, Outlook (Exchange Mail). 4. Symantec Endpoint Anti-Virus. 5. Marshall 8e6 Content Filter. 6. Major App. Our Integrated Library System is CARL. It is a TLC (The Library Corporation) product. 7. HP SAN ~12 tb capacity (previously Left-hand, purchased by HP). 8. HP shop for servers and desktops, we use 3Com (also purchased by HP) for the Switches and Core. 9. Fortinet, Fortigate 1000A Firewall. 10. WAN has ~2000 devices on it. MCLS Presentation, Ramamurthy
IT infrastructure at Monroe County Library Systems (MCLS) 1. Microsoft Operating Systems. MS Win Server 2003, Win Server 2008. Couple of Solaris 8/9 devices. Linux on one Content Filter. 2. At the desktop level Windows XP, Vista and Windows 7. 3. Applications we use Office 2007 (/ 2010)….Word, Excel, Access, Power Point, Publisher, Outlook (Exchange Mail). 4. Symantec Endpoint Anti-Virus. 5. Marshall 8e6 Content Filter. 6. Major App. Our Integrated Library System is CARL. It is a TLC (The Library Corporation) product. 7. HP SAN ~12 tb capacity (previously Left-hand, purchased by HP). 8. HP shop for servers and desktops, we use 3Com (also purchased by HP) for the Switches and Core. 9. Fortinet, Fortigate 1000A Firewall. 10. WAN has ~2000 devices on it. MCLS Presentation, Ramamurthy
Analysis of the MCLS (High level analysis) Amazon Cloud CARL on Linux server + Oracle server instances #6 S3 Media Archive #7 WS EBS SAN instance EBS MCLS Presentation, Ramamurthy EBS #7 #8 #1 #1 #7 HP, Solaris and Windows servers Amazon Firewall #2,3,8,10 Content filter #5 #4 Local desktops, wan devices, mobile devices
Roadmap to migration to cloud • Stakeholders (in IT) get familiarized with capabilities of the cloud capabilities, services, features, models and accessibility • Prototype a small project to understand the working • Evaluate the cost, benefits and assess the experiences of the MCLS users • Involve the license holders/vendors of the major applications • Design and launch the cloud deployment in stages gradually MCLS Presentation, Ramamurthy
Sample Cloud costs • Windows Server 2003 instance: • High-CPU Extra Large Instance 7 GB of memory, 20 EC2 Compute Units, 1690 GB of local instance storage, 64-bit platform, • Cloud Price: $1018.92 (Total Annual Cost) • Expected Utilization per month: 10% • Usage pricing: $1.16/hr • Monthly payment: $84.91 • Internal Price: OS license fee ($170) + Server H/W ($1486): $1656.00 • OpenSolaris 2009.06 instance: • High-Memory Extra Large Instance 17.1 GB memory, 6.5 ECU (2 virtual cores with 3.25 EC2 Compute Units each), 420 GB of local instance storage, 64-bit platform • Cloud Price: $1317.60 (Total Annual Cost) • Expected Utilization per month: 30% • Usage pricing: $0.50/hr • Monthly payment: $109.80 • Internal Price: SPARC T3-1 Server: $18,639.00 • Thanks to Raj Chakraborty, my student for the numbers. MCLS Presentation, Ramamurthy
Modified Operational Model and suggestions for cost saving • Evaluate the usage pattern and adjust the computational resource deployed: • the systems are used by only Monroe county residents or user in Monroe county, NY • Usage is almost 0 during nights (8 hours? 1/3 of the day)? • Usage pattern shows sporadic use on certain holidays? • Employee use only during work hours (8-10 hours a day?)? • It is possible to work out the cost estimates, savings and effort involved. • These will ultimately decide the degree of cloud deployment. MCLS Presentation, Ramamurthy
Potential Benefits • Scalability • Availability • Automation through web services (SOAP and REST) • Newer programming and application models • Accessibility to masses through commoditization • Organic disaster mitigation and recovery • Green Computing • Rapid prototyping and deployment for newer computing models such as “MapReduce” • Obliterates digital divide • Great concept to draw in and engage the net-generation • Not so steep learning curve (quite intuitive) • Cost saving MCLS Presentation, Ramamurthy
Scalability through Elastic Resources • Move, shift, balance. load, unload, juggle • You can request only as many resources as you want and return them when you no longer need them • Automatic load balancing possible • On-demand provisioning of resources • Most importantly turn-on and turn-off whenever • Economies of scale: Increases use/production leads to increased efficiency and lowers unit cost • Large organization have perfected the management and administration of large scale infrastructure for their own demanding operations; why not sell this expertise? MCLS Presentation, Ramamurthy
Availability • Availability: Amazon.com claims it is 99.99% available • Redundancy and multiple availability zones • a = (p – (c X d) ))/p where • a is the expected availability • c the % of likelihood that you will encounter a server loss in a given period • d expected downtime from the loss of the server • p the measurement period • If you have 40% chance of your server failing and it takes 24 hours to fix it, availability is: • (8760 –0.40X24)/8760 = 0.999 or 99.9% • Now consider other points of failures in the system: two cable outage in two hours • (8760 – ((0.4*24)+ (2.0*2)))/8760 = 99.84% • Redundancy mitigates this problem. When you have two or more physical components representing a logical component, the expected downtime of the logical component is the downtime of all the components down simultaneously • c X d now becomes (c X dn )/pn-1 (n-way redundancy) • Applying this formula to a server with a duplicate (n=2) we get 99.99% MCLS Presentation, Ramamurthy
Newer Programming and Application Models • MapReduce programming model • Google File System (GFS), Hadoop File System (HDFS), Collossus, BigTable (stores data in <key,value> pairs) • Dryad from Microsoft • Social networking management frameworks • Many more ideas in the recent Symposium on Cloud Computing (SOCC 2010, USA) • We expect emergence of many more data-intensive computing models MCLS Presentation, Ramamurthy
Accessibility to the Masses • No software to write (at least for the common user); Take a look at the logo of SaleForce.com • Buy cycles using a credit faster than you can go to a supermarket and get bread. • The cloud environment will be usable similar to how gadgets were used • You create a machine image for your commonly used applications and let the user instantiate it: uniformity, standardized use, accountability, compliance can be enforced, knowledge sharing, reusability. MCLS Presentation, Ramamurthy
Disaster Mitigation • Iceland’s volcanic ash cloud illustrates the case for cloud computing: http://cloudcomputing.sys-con.com/node/1377030 • Cloud storage and on-demand services can help in setting up the backup as well as operation capacity to help in mitigating disasters. • After the disaster has moved on, the cloud can be used to recover and restore; once the operation is restored, the cloud capacity is no longer needed and services can be terminated. MCLS Presentation, Ramamurthy
Facilitates Green Computing • The cloud hardware is well located and optimized for efficient operation. • Easy to control ramp up and ramp down the capacity, turn off and on according to green algorithms; • Organization can establish policies and enforce them using green processes such as automatic on/off. • Though on/off is available for individual PC/laptops, power saving is up to the user. MCLS Presentation, Ramamurthy
Limitations • Security of data and application • Privacy of data • Reliability • Other vulnerabilities (crash of a virtual machine vs. crash of a physical machine) : fast fail-over possible • Multi-tenancy (Ex: various organizations renting space on the same cloud provider). • State of flux (frequent updates and addition of features) • Cost computation/pricing MCLS Presentation, Ramamurthy
Certificate Program in Data-intensive Computing at UB • 5 courses: • CS2 (CSE250: Algorithms and Data Structures), • Distributed Systems (CSE486), • Data-intensive Computing (CSE487), • Course in any major area of interest (Ex: biomedical engineering) • Capstone project in the area • A limited number of scholarships ($1000 each) available for enrolling and completing this certificate. • These courses are also offered online by Enginet distance learning program. MCLS Presentation, Ramamurthy
References • Cloud Computing: Characteristics, economies and architectures: http://en.wikipedia.org/wiki/Cloud_Computing • http://docs.amazonwebservices.com/AWSEC2/latest/DeveloperGuide/ • Amazon cloud free limited trial: http://aws.amazon.com/free/ • Google App Engine (free limited trial): http://code.google.com/appengine/ • First ACM Symposium on Cloud Computing (SOCC) 2010, Indianapolis Indiana, http://research.microsoft.com/en-us/um/redmond/events/socc2010/ MCLS Presentation, Ramamurthy
Summary • Cloud computing is indeed a transformational technology that is here to stay. • It will be as impactful (if not more) as the Internet, the web and the e-commerce revolution. • Traditional IT will gradually evolve into the cloud infrastructure. • Most of all, it is poised to socialize computing and make it as widely and easily accessible as the other transformational technologies such as telephones and automobiles. MCLS Presentation, Ramamurthy
Demo details MCLS Presentation, Ramamurthy