The Cloud Cost Model and Non-Functional Attributes. Reference: Cloud Application Architectures, G. Reese, O’Reilly. Topics. Users and public want to know: Cost of using the cloud Is it reliable? Is it dependable? How about availability? 24x7? How secure is the cloud?
The Cloud Cost Model and Non-Functional Attributes Reference: Cloud Application Architectures, G. Reese, O’Reilly
Topics • Users and the public want to know: • Cost of using the cloud • Is it reliable? • Is it dependable? • How about availability? 24x7? • How secure is the cloud? • How private is the cloud?
Non-functional Attributes • We often focus on the functional requirements and come to address the non-functional attributes last. • On the other hand, users are very concerned about non-functional attributes, esp. with respect to the cloud.
Dependability • According to IFIP (International Federation for Information Processing): • Dependability includes, as special cases, such attributes as reliability, availability, safety, and security. • According to the IFIP WG on Dependable Computing and Fault Tolerance: • “the trustworthiness of a computing system which allows reliance to be justifiably placed on the service it delivers” • 20 years ago I was taught dependability = reliability + availability
Reliability • An example often used to illustrate the difference between reliability and validity in the experimental sciences involves a common bathroom scale. • If someone who is 200 pounds steps on a scale 10 times and gets readings of 15, 250, 95, 140, etc., the scale is not reliable. If the scale consistently reads "150", then it is reliable, but not valid. • If it reads "200" each time, then the measurement is both reliable and valid. (ref: wikipedia) • Of course, you are tolerant and may allow for a margin of error. (+ or – 5 lbs?)
Reliability • How well can you trust the system to protect data integrity and execute the requested operations? • Example 1: The reliability of this teaching station • Example 2: I was preparing for this lecture, and my laptop hard drive failed, destroying my presentation (reliability of my laptop) • Data corruption is another reliability problem. • Reliability on the cloud: • What if your instance goes down? Don’t store anything in the instance store. • Store your data in EBS and snapshot it frequently. • Use formal checkpoint and recovery: periodically save the state, so that you can roll back to it in case of failure
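The checkpoint-and-recovery bullet can be illustrated with a minimal Python sketch. The function names (`save_checkpoint`, `load_checkpoint`, `process`) and the JSON state layout are hypothetical, chosen only for illustration; the checkpoint path is assumed to sit on durable storage such as a mounted EBS volume rather than the instance store.

```python
import json
import os
import tempfile


def save_checkpoint(state, path):
    """Write state atomically so a crash never leaves a half-written checkpoint."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())  # push the bytes to disk before renaming
    os.replace(tmp_path, path)  # atomic rename within one filesystem


def load_checkpoint(path, default):
    """Return the last saved state, or default if no checkpoint exists yet."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return default


def process(items, path):
    """Sum items, checkpointing after each one; a restart resumes where it stopped."""
    state = load_checkpoint(path, {"next_index": 0, "total": 0})
    for i in range(state["next_index"], len(items)):
        state["total"] += items[i]
        state["next_index"] = i + 1
        save_checkpoint(state, path)
    return state["total"]
```

If the instance dies partway through, calling `process` again with the same path picks up from the last saved index instead of starting over.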
Availability • a = (p - (c × d)) / p, where • a is the expected availability • c is the likelihood (%) that you will encounter a server loss in a given period • d is the expected downtime from the loss of the server • p is the measurement period • 365 × 24 = 8760 hours in a year • If you have a 40% chance of your server failing and it takes 24 hours to fix it, availability is: • (8760 - 0.40 × 24) / 8760 = 0.999, or 99.9%
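The availability formula is easy to check in a few lines of Python. This is a small sketch; the function name `availability` and the list-of-pairs input are illustrative choices, not something taken from the reference book.

```python
HOURS_PER_YEAR = 365 * 24  # p = 8760


def availability(period_hours, failure_modes):
    """Compute a = (p - sum(c * d)) / p.

    failure_modes is a list of (c, d) pairs: likelihood of the failure
    in the period (as a fraction) and expected downtime in hours.
    """
    expected_downtime = sum(c * d for c, d in failure_modes)
    return (period_hours - expected_downtime) / period_hours


# The slide's example: 40% chance of server loss, 24 hours to fix.
single_server = availability(HOURS_PER_YEAR, [(0.40, 24)])
print(f"{single_server:.2%}")
```

The unrounded value is about 0.9989, which the slide rounds to 99.9%.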
Availability (contd.) • Now consider other points of failure in the system, e.g., two cable outages expected in the period, each lasting two hours: • (8760 - ((0.4 × 24) + (2.0 × 2))) / 8760 = 99.84% • Redundancy mitigates this problem. When two or more physical components represent one logical component, the expected downtime of the logical component is the time during which all of the components are down simultaneously • c × d now becomes (c × d^n) / p^(n-1), where n is the number of redundant components • Applying this formula to a server with a duplicate, we get 99.99%
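The redundancy term can be sketched the same way. The code below implements the expression as it reads on the slide, (c × d^n) / p^(n-1), with n taken to be the number of redundant copies; the function names are illustrative assumptions.

```python
def redundant_downtime(c, d, n, period_hours):
    """Expected downtime (hours) of a logical component backed by n copies.

    Implements the slide's expression (c * d**n) / p**(n - 1);
    with n = 1 it reduces to the plain c * d term.
    """
    return (c * d ** n) / period_hours ** (n - 1)


def availability(period_hours, downtimes):
    """a = (p - total expected downtime) / p."""
    return (period_hours - sum(downtimes)) / period_hours


P = 8760  # hours in a year
single = availability(P, [redundant_downtime(0.40, 24, 1, P)])
paired = availability(P, [redundant_downtime(0.40, 24, 2, P)])
print(f"single server: {single:.4%}, with duplicate: {paired:.4%}")
```

With a duplicate server the result exceeds 99.99%, in line with the figure quoted on the slide.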
Availability in Amazon AWS • Amazon AWS provides SLAs for S3 and EC2. • Other companies such as GoGrid and Rackspace offer better SLAs. Their delivery models are different from AWS. • Study the availability computation for a typical scenario.
Availability Zones and Regions • Very nice discussion available at: http://docs.amazonwebservices.com/AWSEC2/latest/UserGuide/index.html?using-regions-availability-zones.html • Regions: US East (Virginia), US West (California), Asia (Singapore), Asia Pacific (Tokyo), Europe (Ireland) • Regions have availability zones • Communication within an availability zone is fine • But between availability zones it is expensive • Between regions it is even more expensive • You use availability zones and regions for backup of your deployments in a given zone. • See two interesting articles: • Amazon’s explanation of the US East failure (http://aws.amazon.com/message/65648/) • Netflix’s statement on how they survived it (http://techblog.netflix.com/2011/04/lessons-netflix-learned-from-aws-outage.html)
AWS Account Activity $2498.42
Account Activity Expanded • [Screenshot of itemized charges; legible values: 50.15, 590 Hrs, 96.66]
Software Licenses • Cloud environments come with costs bundled with the instances for the common operating systems and software packages: • Example: Windows, MySQL, Linux versions • The cloud is an impetus to work with freeware and open source. • Open source is ideal for the cloud • The flexibility of open source made the Amazon cloud possible • Beyond open source, the best licensing model is one that charges by the CPU-hour • Amazon has recently introduced a feature where you can compute with licenses you purchased: bring your own license (BYOL) • Restrictive software licenses are not good for a cloud environment: • Per-user licensing that requires validation against a server, auditing, and such • Lesson: Make sure you understand the licensing for the products you use
Simple Cost Model • $0.10 per hour if you leave the Linux instance on: if you terminate it after 10 hours, you pay $1.00 • http://aws.amazon.com/ec2/pricing/ • See also http://calculator.s3.amazonaws.com/calc5.html • From the reference book: • $0.10/CPU-hour: one load balancer • $0.40/CPU-hour: 2 application servers • $0.80/CPU-hour: 2 database servers • $2.40 + $44.00 + $38.40 = $84.80 per day for a typical scenario, leading to $30,952 per year. • Add to this software licenses (if you use your own), management tools (cloud monitoring), and labor (whoever prepares and loads things onto the cloud).
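The arithmetic of this cost model can be sketched in Python. The helper names are illustrative. The load balancer and database lines are computed from the hourly rates; the application-server line uses the slide's daily figure of $44.00 as given, which is an assumption made here so that the total matches the slide rather than a value derived from the $0.40 rate.

```python
def daily_cost(hourly_rate, instances, hours=24):
    """Cost per day of running `instances` servers at `hourly_rate` each."""
    return round(hourly_rate * instances * hours, 2)


def annual_cost(daily_items, days=365):
    """Annualize a list of per-day cost lines."""
    return round(sum(daily_items) * days, 2)


load_balancer = daily_cost(0.10, 1)  # one load balancer
db_servers = daily_cost(0.80, 2)     # two database servers
app_servers = 44.00                  # daily figure taken from the slide as given

per_day = round(load_balancer + app_servers + db_servers, 2)
per_year = annual_cost([load_balancer, app_servers, db_servers])
print(per_day, per_year)
```

Licenses, monitoring tools, and labor would be added on top of this hosting figure.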
A Sample Cloud ROI Analysis • On-demand instances let you pay for compute capacity by the hour with no long-term commitments. • This frees you from the costs and complexities of planning, purchasing, and maintaining hardware, and transforms what are commonly large fixed costs into much smaller variable costs.
ROI contd. • Reserved Instances give you the option to make a one-time payment for each instance you want to reserve and in turn receive a significant discount on the hourly usage charge for that instance. • Spot Instances enable you to bid for unused Amazon EC2 capacity. Instances are charged the Spot Price, which is set by Amazon EC2 and fluctuates periodically depending on the supply of and demand for Spot Instance capacity.
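The trade-off between on-demand and reserved pricing comes down to a break-even number of hours. The sketch below assumes a simple model (one upfront fee plus a discounted hourly rate); the function names and all prices in the example are hypothetical and are not actual AWS rates.

```python
def breakeven_hours(upfront_fee, on_demand_rate, reserved_rate):
    """Hours of use at which a reserved instance starts to cost less."""
    if reserved_rate >= on_demand_rate:
        raise ValueError("reserved rate must be lower than the on-demand rate")
    return upfront_fee / (on_demand_rate - reserved_rate)


def cheaper_option(hours, upfront_fee, on_demand_rate, reserved_rate):
    """Return which pricing option costs less for the given hours of use."""
    on_demand_total = hours * on_demand_rate
    reserved_total = upfront_fee + hours * reserved_rate
    return "reserved" if reserved_total < on_demand_total else "on-demand"


# Hypothetical prices for illustration only.
UPFRONT, ON_DEMAND, RESERVED = 300.0, 0.10, 0.04
print(breakeven_hours(UPFRONT, ON_DEMAND, RESERVED))
print(cheaper_option(8760, UPFRONT, ON_DEMAND, RESERVED))
```

Under these made-up prices an instance that runs all year favors the reserved option, while one that runs only a few weeks does not.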
Scenario 1 • Traditional: • Half rack at a reliable ISP with sufficient bandwidth to support your needs • Two good firewalls • One hardware load balancer • Two good GB Ethernet switches • Six solid, commodity business servers • The cloud option: • One medium 32-bit instance • Four large 64-bit instances during standard usage, scaling to meet peak demands • Assume open source software and services • Costs for time spent setting up environments, monitoring services, and labor for management of the environment. • The next table gives the upfront and ongoing costs.
Cost Analysis • Costs associated with different infrastructures (I = initial, M = monthly)
Item                   Internal-I   Cloud-I   Internal-M   Cloud-M
Rack                   $3,000       $0        $500         $0
Switches               $2,000       $0        $0           $0
Load balancer          $20,000      $0        $0           $73
Servers                $24,000      $0        $0           $1,206
Firewalls              $3,000       $0        $0           $0
24/7 support           $0           $0        $0           $400
Mgt. software          $0           $0        $100         $730
Expected labor         $1,200       $1,200    $1,200       $600
Degraded performance   $0           $0        $100         $0
Totals                 $53,200      $1,200    $1,900       $3,009
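One way to read the totals row is as a break-even question: the internal option costs more upfront but less per month. The following Python sketch uses a deliberately simple undiscounted model (initial cost plus monthly cost times months). It is an illustrative assumption, not the method behind the comparison totals quoted in the slides, whose time horizon and discounting are not stated.

```python
def cumulative_cost(initial, monthly, months):
    """Undiscounted total spend after the given number of months."""
    return initial + monthly * months


def first_month_cloud_costs_more(internal, cloud, horizon=120):
    """First month at which cumulative cloud spend exceeds internal spend.

    internal and cloud are (initial, monthly) pairs; returns None if the
    cloud stays cheaper for the whole horizon.
    """
    for month in range(horizon + 1):
        if cumulative_cost(*cloud, month) > cumulative_cost(*internal, month):
            return month
    return None


# Totals row of the cost table: (initial, monthly).
INTERNAL = (53_200, 1_900)
CLOUD = (1_200, 3_009)

crossover = first_month_cloud_costs_more(INTERNAL, CLOUD)
print(crossover)
```

Under this simple model the cloud stays cheaper for nearly four years before the higher monthly charge overtakes the internal option's upfront cost.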
Cost Comparison • $112,083 (internal) vs. $94,452 (cloud) • When the traffic patterns are static and steady, you may not need the cloud • Cost savings become tremendous as the gap between peak and average usage, and between average and low usage, widens. • Excellent case: POP!World
Service Levels for Cloud Applications • Cloud companies provide customers with a service level agreement (SLA) that identifies key metrics (service levels) • The ability to understand and to fully trust the availability, reliability, and performance of the cloud is a key conceptual hurdle to clear before moving into the cloud.
Cost model of the cloud • An important observation for cloud users is that the cloud transfers IT cost from a capital investment to an operational cost. • That is, instead of designing elaborate plans and buying all the IT hardware and software licenses upfront, incurring a capital cost that stays on the books for as long as the company exists, a company leases or buys on-demand cloud resources and pays only for what it uses (like water, electricity, and other utilities) • Which would you prefer for a small start-up?
Google App Engine Pricing • http://www.google.com/enterprise/cloud/appengine/pricing.html • In summary, for $500 per month per account you have access to pretty much all the resources. • A fine example of operational cost: just like your subscription to Netflix!
Performance • Design your application so that logic is spread across multiple servers • Use multi-threading to exploit the multiple cores in the underlying architecture • Clustering versus independent servers: a load balancer working with a set of independent nodes is better • Mind your storage when considering performance: instance store is unpredictable, EBS is fine, and S3 is slow; however, EBS is what failed in the US East outage
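The idea of a load balancer in front of independent nodes can be sketched as simple round-robin dispatch. The class name `RoundRobinBalancer` and the node labels are hypothetical; a real deployment would use a hardware or managed load balancer rather than application code like this.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands each request to the next node in a fixed rotation."""

    def __init__(self, nodes):
        nodes = list(nodes)
        if not nodes:
            raise ValueError("at least one node is required")
        self._rotation = cycle(nodes)

    def route(self, request):
        """Return (node, request): the node chosen to handle this request."""
        return next(self._rotation), request


# Hypothetical node names for illustration.
balancer = RoundRobinBalancer(["app-1", "app-2"])
assignments = [balancer.route(f"req-{i}")[0] for i in range(4)]
print(assignments)
```

Because the nodes share no state, any one of them can be replaced without the others noticing, which is the property the slide favors over clustering.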
Security • Security issues: • Legal implications, regulatory constraints, standards, and compliance issues are different in the cloud • No perimeter security: you secure the traffic, not the infrastructure • Cloud storage is assumed to carry a high risk of exploits (unproven) • Virtualization solutions may have their own vulnerabilities.
Disaster Recovery • Disaster recovery is the art of being able to resume normal systems operations when faced with a disaster scenario. • The cloud is an ideal solution for disaster recovery plans.
Summary • EC2 instances are much less stable than physical servers • The multiplicity of availability zones can mitigate the lack of stability in an EC2 instance • The best way to improve infrastructure is to have spare parts lying around. In this respect the cloud can help. How?