Cloud Computing • Evolution to Cloud computing • Virtualization • Security • Grid vs Cloud • Examples
A Super Computer • one BIG machine with 10s of 1000s of CPUs • shared memory and IO • e.g. IBM’s Blue Gene
Cluster Computing • run a job inside an organisation on a single cluster • e.g. using Message Passing Interface (MPI)
Grid Computing • run jobs across multiple organisations • lots of heterogeneous super computers combined in a VO (Virtual Organization) • communicating with open standards
A Super Computer • one BIG machine with 10s of 1000s of CPUs, shared memory and IO (e.g. IBM’s Blue Gene) • or lots of little interconnected machines (a cluster) with separate memory and IO (e.g. Google)
Cloud Computing • run jobs in a Cloud • lots of virtual machines in a Cloud, running in consolidated data centres
Properties of Clouds • pay-per-use • instant availability • scalability (up and down) • hardware abstraction • self-provisioning (self-service) • virtualization • Internet scale • software AND economic paradigm
Service Orientation on a new scale • SaaS software as a service - Google Mail • PaaS platform as a service - Google Docs Environment • IaaS infrastructure as a service - Clouds
Grid vs Cloud • The vision is similar - computing on demand. • The target applications/domains have been slightly different. • Grid is typically used in HPC – one app/workflow/algorithm to run across multiple environments. • Cloud is typically lots of small apps running in their own environment. • less so now as Clouds become more widespread • Grid addressed interoperability by building on top of application APIs - e.g. Web Services. Applications need to be bound to that API. • Cloud addresses interoperability by building at the OS or hardware level • Applications run in their own environment
Grid vs Cloud • Grid emerged out of academia. • Cooperation model • Grid is about Virtual Organizations - interconnected sites under different administration. • Cloud emerged out of business. • Compete model • So there is not one Cloud but many. • But inter-cloud comms is on the agenda • CCIF (Cloud Computing Interoperability Forum - not sponsored by Amazon or Google) • OGF (Open Grid Forum). • Traditional Grid consumers (e.g. CERN) are now moving to Cloud • Cloud for HPC not just Web applications • WHY?
Example: Grid Job submission • if you submit a job to a site, a different site may become available first • you may be stuck in a queue for a long time • so people submit skeleton jobs • to all sites • then the skeleton invokes the real job on the site that becomes available first • other skeletons are ditched or kept to run next job in MY queue. • moving computation to data • for BIG data • but don’t trust that the app is on the site or configured to how I need it configured • so deploy the app as well
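The skeleton-job trick above can be sketched as a small simulation. This is only an illustration under stated assumptions: the site names, the queue wait times, and the function `run_with_skeletons` are all hypothetical, and no real grid scheduler is involved.

```python
from collections import deque

def run_with_skeletons(queue_wait_hours, my_jobs):
    """Simulate skeleton (placeholder) jobs submitted to every site.

    queue_wait_hours: hypothetical mapping of site name -> hours until the
        skeleton reaches the front of that site's queue.
    my_jobs: names of the real jobs waiting in MY own queue.
    Returns (site, job) pairs in the order the skeletons started.
    """
    pending = deque(my_jobs)
    assignments = []
    # Skeletons start in the order in which their sites become available.
    for site, _wait in sorted(queue_wait_hours.items(), key=lambda kv: kv[1]):
        if not pending:
            break  # nothing left to run: the remaining skeletons are ditched
        # The skeleton that just started pulls the next real job from MY queue.
        assignments.append((site, pending.popleft()))
    return assignments

sites = {"site_a": 6.0, "site_b": 0.5, "site_c": 2.0}
print(run_with_skeletons(sites, ["real_job_1", "real_job_2"]))
# [('site_b', 'real_job_1'), ('site_c', 'real_job_2')]
```

The real job never waits in the slowest queue: whichever site frees up first gets it, and the skeleton at site_a is simply discarded.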
Issues • lots of schedulers for skeleton jobs • complex • interoperability issues with deploying computation to nodes • does the OS support the binary, version etc. • Cloud can solve both these issues through virtualization • virtual nodes • virtual host environment • dynamic • now you have to manage many VMs
Virtualization benefits • Consolidate systems, workload, and operating environments: • Multiple workloads and operating systems on one physical server • reduces the costs of hardware and workload management. • New software can be tested on the hardware it will later use in production mode • without affecting production workload. • Virtual systems can be used as low-cost test systems without jeopardizing production workloads. • Multiple operating system types and releases can run on a single system. • Each virtual system can run the operating system that best matches its application or user requirements.
Virtualization benefits • Optimize resource usage: • Hypervisors can achieve high resource use by dynamically assigning virtual resources (such as processors and memory) to physical resources. • The virtual resources that they provide can exceed the physical system resources in quantity and functionality. • System virtualization enables the dynamic sharing of physical resources and resource pools. • higher resource use, especially for variable workloads whose average needs are much less than an entire dedicated resource. • Different workloads tend to have peak resource use at different times • so implementing multiple workloads in the same physical server can improve system use, price, and performance.
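The point about workloads peaking at different times can be checked with a little arithmetic. The sketch below uses made-up demand figures (the `web` and `batch` lists are assumptions, in arbitrary CPU units per time slot) and compares the capacity needed when each workload has its own server with the capacity needed when they share one.

```python
def dedicated_capacity(workloads):
    """Capacity needed if every workload gets its own server: sum of the peaks."""
    return sum(max(demand) for demand in workloads)

def consolidated_capacity(workloads):
    """Capacity needed on one shared server: peak of the combined demand."""
    return max(sum(slot) for slot in zip(*workloads))

# Hypothetical demand per time slot; the two workloads peak at different times.
web = [10, 20, 80, 30]
batch = [70, 40, 10, 20]

print(dedicated_capacity([web, batch]))     # 150
print(consolidated_capacity([web, batch]))  # 90
```

With these numbers the shared server needs 90 units rather than 150. If both workloads peaked in the same slot, the two figures would be equal and consolidation would save nothing.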
Virtualization benefits • More dynamic and flexible: • Service providers can create one virtual system or clone many virtual systems on demand, achieving dynamic resource provisioning. • Virtual systems are easier to manage than heterogeneous hardware • just managing the VMs
Hypervisors • Virtual environments • expose an abstract view of hardware to guest OS’s • Allows multiple OS’s on a single machine • Allows homogeneous machines to appear heterogeneous and vice versa • Primarily two types: • native • talks directly with the hardware • manages guest OS’s • hosted • runs in an ordinary OS • manages guest OS’s
Type 1 Hypervisor • stack, top to bottom: Apps (one per guest OS) • guest OS’s • Hypervisor • Hardware
Type 2 Hypervisor • stack, top to bottom: Apps (one per guest OS) • guest OS’s • Hypervisor • host OS • Hardware
Hypervisors • Type 1 • Efficient • Good Resource Control (safety) • Uses hardware assisted virtualization or paravirtualization • hardware assisted virtualization requires help from the CPU architecture • instructions are trapped and hooked into the Hypervisor and executed in software • high CPU overhead • paravirtualization requires changes to the guest OS, i.e. hooks into the Hypervisor API • more efficient • Type 2 • typically installed on client machines or non-dedicated machines
Challenges to Cloud takeup • Security • Most Clouds support low-risk Apps. • Web-applications • HPC computing • Security/compliance is not particularly well supported • These are shared responsibilities between customer, cloud provider, software vendor • 41 percent of companies employ someone to read their workers’ email. • data regulation, e.g. Health Insurance Portability and Accountability Act (HIPAA), or plain old paranoia • Logging, auditing - difficult to do in a virtual environment • Reliability • Feb 2008 Amazon’s S3 and EC2 went down for 3 hours. • Portability • Not one Cloud, but many • vendor lock-in • how do I move to another Cloud provider?
Challenges to Cloud takeup • Servers are still somewhere • environmental impact of massive consolidated data centres. • Where a data centre is may have side effects – e.g. the U.S. Patriot Act means your data could be accessed by the U.S. government if it happens to be in the US. • Speed • latency of running stuff that could be anywhere • To address some of these, the concept of private Clouds has emerged • get the benefits of virtualization but keep control • Also has challenges • employees want what they know – consolidation may involve downsizing applications • What about all the tech guys who look after the hardware? • hybrid Clouds (public/private) • e.g. to support spikes • can this be the beginning of de facto standards in cloud computing?
Simple Storage Service (S3) • buckets contain objects • 100 buckets per account, named on a first-come, first-served basis • objects - up to 4K of metadata - key/value • REST API uses PUT, POST, DELETE, GET for managing objects and buckets using distinct URLs. • Also has a BitTorrent API • backends are not well known • security - SSL • encryption - do it at home first. • access control - buckets associated with user account • integrity - response contains the MD5 checksum of stored data
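The integrity bullet can be illustrated with a minimal sketch: compute the MD5 of the local copy and compare it with the checksum the service reports. No S3 call is made here; `reported_md5` stands in for the value in the response, and both helper names are hypothetical.

```python
import hashlib

def md5_hex(data: bytes) -> str:
    """Hex MD5 digest of a byte string."""
    return hashlib.md5(data).hexdigest()

def is_intact(local_data: bytes, reported_md5: str) -> bool:
    """True if the checksum reported for the stored object matches the local copy."""
    return md5_hex(local_data) == reported_md5.strip().lower()

payload = b"hello world"
# Assumption: the service echoes back the checksum of what it actually stored.
reported_md5 = md5_hex(payload)

print(is_intact(payload, reported_md5))         # True
print(is_intact(b"hello w0rld", reported_md5))  # False
```

A mismatch means the stored bytes differ from the local ones, for example after corruption in transit. MD5 is adequate for this kind of accidental-damage check, but it is not a defence against deliberate tampering.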
Simple Storage Service (S3) • Security • Supports a simple authentication strategy based on the HMAC-SHA1 algorithm. • HMAC - Hash-based Message Authentication Code – create a hash from data and secret key • Every account has an Access Key ID and a Secret Access Key. • The Access Key ID is a 20-character string that’s used to uniquely identify your account • the Secret Access Key is a 40-character string that’s used to digitally sign SOAP and REST requests. • To sign a request, compute the HMAC of the request using the Secret Access Key. This HMAC is sent along with the request. • Amazon’s servers, which know your Secret Access Key, compute the same HMAC. If the two HMACs match, then the request is authorized. • Requests include a timestamp to prevent replay attacks.
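The sign-and-compare scheme described above can be sketched with Python's standard `hmac` module. This is not Amazon's actual request format: the layout produced by `string_to_sign`, the 900-second freshness window, and the secret key are all assumptions made for illustration.

```python
import base64
import hashlib
import hmac

def string_to_sign(method: str, path: str, timestamp: int) -> str:
    # Simplified layout (assumption); the real service defines its own canonical form.
    return f"{method}\n{path}\n{timestamp}"

def sign(secret_key: str, message: str) -> str:
    """Base64-encoded HMAC-SHA1 of the message under the secret key."""
    digest = hmac.new(secret_key.encode(), message.encode(), hashlib.sha1).digest()
    return base64.b64encode(digest).decode()

def server_accepts(secret_key, method, path, timestamp, signature, now, max_skew=900):
    """Server side: recompute the HMAC and reject stale or mismatching requests."""
    if abs(now - timestamp) > max_skew:
        return False  # too old (or too far in the future): treat as a possible replay
    expected = sign(secret_key, string_to_sign(method, path, timestamp))
    return hmac.compare_digest(expected, signature)  # constant-time comparison

secret = "hypothetical-secret-access-key"
sent_at = 1_000_000  # seconds; any clock shared by client and server works here
sig = sign(secret, string_to_sign("GET", "/my-bucket/report.txt", sent_at))

print(server_accepts(secret, "GET", "/my-bucket/report.txt", sent_at, sig, now=sent_at + 5))     # True
print(server_accepts(secret, "DELETE", "/my-bucket/report.txt", sent_at, sig, now=sent_at + 5))  # False
print(server_accepts(secret, "GET", "/my-bucket/report.txt", sent_at, sig, now=sent_at + 3600))  # False
```

Because the timestamp is part of the signed string, an attacker who captures a request cannot change the time without invalidating the signature, and an unmodified copy stops working once the window has passed.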
Simple Storage Service (S3) • The HMAC approach is fast, efficient, and pretty secure. • But the credentials are downloaded from the AWS Web site. • This means that anyone who knows your Amazon username and password can download your Secret Access Key. • And Amazon allows the password to be reset if you can’t remember it, by clicking on a link that’s sent to the account’s registered email address • So anyone who has control of your email system can access/delete your data
Elastic Compute Cloud (EC2) • based on hardware virtualization • uses modified Xen hypervisor • Users install a machine image (VM) • use an off-the-shelf VM or roll your own • Amazon Machine Image (AMI) format • apps exist to create these but it is not standardized or well documented. • Images are stored in S3. • A user requests an instantiation - the image is booted and the user gets back a handle to it. They have full access to it and can log in as root. • APIs - REST, SOAP, Java, Firefox plugin
EC2 • EC2 - same security as S3 but uses public/private keys and X.509 certificates • configurable firewall • Process: • 1. create a bundle (AMI) • 2. upload it to S3 • The image is encrypted with your private key and signed with your certificate • 3. register it - get back an ID • 4. launch it - using ID - get back an instance ID. • specify your keys in the request • 5. use instance ID to manipulate it - log in, shut down etc.
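The five-step process is essentially a chain of handles: a stored bundle yields an image ID, and an image ID yields an instance ID. The sketch below mimics that flow with a purely in-memory class. `MockCloud` and all of its methods are hypothetical and are not the EC2 or S3 API; encryption and signing of the bundle are left out.

```python
import itertools

class MockCloud:
    """Hypothetical in-memory stand-in for the upload/register/launch flow."""

    def __init__(self):
        self.storage = {}    # (bucket, key) -> bundle bytes
        self.images = {}     # image ID -> (bucket, key)
        self.instances = {}  # instance ID -> details
        self._counter = itertools.count(1)

    def upload(self, bucket, key, bundle):
        # Step 2: the bundle is placed in storage.
        self.storage[(bucket, key)] = bundle

    def register(self, bucket, key):
        # Step 3: registering a stored bundle returns an image ID.
        if (bucket, key) not in self.storage:
            raise KeyError("bundle must be uploaded before it is registered")
        image_id = f"image-{next(self._counter)}"
        self.images[image_id] = (bucket, key)
        return image_id

    def launch(self, image_id, key_name):
        # Step 4: launching an image returns an instance ID.
        if image_id not in self.images:
            raise KeyError("unknown image ID")
        instance_id = f"instance-{next(self._counter)}"
        self.instances[instance_id] = {"image": image_id, "key": key_name, "state": "running"}
        return instance_id

    def shut_down(self, instance_id):
        # Step 5: the instance ID is the handle for all later operations.
        self.instances[instance_id]["state"] = "stopped"

cloud = MockCloud()
cloud.upload("my-bucket", "my-image.bundle", b"...bundle bytes...")  # steps 1-2
image_id = cloud.register("my-bucket", "my-image.bundle")            # step 3
instance_id = cloud.launch(image_id, key_name="my-keypair")          # step 4
cloud.shut_down(instance_id)                                         # step 5
print(image_id, instance_id, cloud.instances[instance_id]["state"])
```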
The bottom line • EC2 billing is per running instance hour • or part thereof • typically less than $1 per hour for lots of RAM (8 – 16GB) • S3 billing is for storage and transfer • typically about $0.10 per GB • Data transfer between two instances in the same region is free • three regions: • US West • US East • EU
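"Per hour or part thereof" means usage is rounded up to whole hours before it is charged. A minimal sketch of the arithmetic follows; the rates are assumptions chosen to fit the rough figures above, prices are held in whole cents to avoid floating-point rounding, and S3 is reduced to a single per-GB rate for simplicity.

```python
import math

def ec2_cost_cents(running_hours: float, hourly_rate_cents: int) -> int:
    """Instance cost: any started hour is billed as a full hour."""
    return math.ceil(running_hours) * hourly_rate_cents

def s3_cost_cents(gigabytes: int, rate_cents_per_gb: int = 10) -> int:
    """Storage or transfer cost at a flat per-GB rate (assumed 10 cents)."""
    return gigabytes * rate_cents_per_gb

# Hypothetical usage: one instance for 2 h 15 min at 80 cents/hour, plus 50 GB in S3.
print(ec2_cost_cents(2.25, 80))  # 240 (billed as 3 hours)
print(s3_cost_cents(50))         # 500
```

An instance that runs for just over two hours therefore costs the same as one that runs for three.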