Eucalyptus is a Linux-based open-source cloud architecture for private and hybrid clouds, emphasizing security, flexibility, and compatibility with popular Cloud APIs. Its hierarchical design and component roles ensure optimal performance and resource utilization.
Contents • 1. Types of cloud computing tools • 2. Eucalyptus tool • 3. Hadoop tool • 4. KVM tool • 5. OpenQRM • 6. Cloudability tool • 7. Cloudyn tool • 8. Informatica • 9. References
What’s in a name? Elastic Utility Computing Architecture Linking Your Programs To Useful Systems • Eucalyptus is a simple open architecture for implementing cloud functionality. • It is specifically designed to be easy to install and maintain in a research setting, and to be easy to modify, instrument, and extend. • Eucalyptus can be deployed and executed without modification to the underlying infrastructure. • Eucalyptus components have well-defined interfaces, support secure communication (using WS-Security policies), and rely upon industry-standard Web-services software packages (Axis2, Apache, and Rampart).
Eucalyptus is a Linux-based open source software architecture that implements efficiency-enhancing private and hybrid clouds within an enterprise’s existing IT infrastructure. • A Eucalyptus private cloud is deployed across an enterprise’s “on-premise” data center infrastructure and is accessed by users over the enterprise intranet. Sensitive data thus stays behind the enterprise firewall, protected from external intrusion.
Why Eucalyptus • Open Source: You can download it and have the source code at your fingertips. • Modular: The Eucalyptus components have well-defined interfaces (via WSDL, since they are web services) and thus can be easily swapped out for custom components. • Distributed: Eucalyptus allows its components to be installed strategically close to the resources they use. For example, Walrus can be installed close to the storage, while the Cluster Controller can be installed close to the cluster it will manage. • Designed to Perform: Eucalyptus was designed from the ground up to be scalable and to achieve optimal performance in diverse environments (designed to overlay an existing infrastructure).
Why Eucalyptus • Flexible: Eucalyptus can be installed on a very minimal setup, yet it can also run on thousands of cores and terabytes of storage, in both cases as an overlay on top of an existing infrastructure. • Compatible: Eucalyptus is compatible with the most popular and widely used Cloud APIs currently available: Amazon EC2 and S3. • Hypervisor Agnostic: Currently Eucalyptus fully supports KVM and Xen. Additionally, the Enterprise Edition supports the proprietary VMware hypervisor.
Hierarchical Design Eucalyptus employs a hierarchical design to reflect underlying resource topologies
Eucalyptus Components • Cloud controller (CLC) • Cluster controller • Node controller • Storage controller • VMBroker (optional)
Cloud Controller (CLC) The Cloud Controller (CLC) is the entry-point into the cloud for administrators, developers, project managers, and end-users. Functions: • Monitoring the availability of resources on various components of the cloud infrastructure, including the hypervisor nodes that are used to actually provision the instances and the cluster controllers that manage the hypervisor nodes • Resource arbitration (deciding which clusters will be used for provisioning the instances) • Monitoring the running instances
Cluster Controller (CC) The Cluster Controller (CC) generally executes on a cluster front-end machine, or any machine that has network connectivity to both the nodes running NCs and to the machine running the CLC. CCs gather information about a set of VMs and schedule VM execution on specific NCs. The CC also manages the virtual instance network and participates in the enforcement of SLAs as directed by the CLC. All nodes served by a single CC must be in the same broadcast domain (Ethernet). Functions: • To receive requests from the CLC to deploy instances • To decide which NCs to use for deploying the instances • To control the virtual network available to the instances • To collect information about the NCs registered with it and report it to the CLC
Node Controller (NC) • The Node Controller (NC) is executed on every node that is designated for hosting VM instances. • An NC controls the execution, inspection, and termination of VM instances on the host where it runs, fetches and cleans up local copies of instance images (the kernel, the root file system, and the ramdisk image), and queries and controls the system software on its node (host OS and the hypervisor) in response to queries and control requests from the cluster controller. The Node Controller is also responsible for the management of the virtual network endpoint. Functions: • Collection of data related to the resource availability and utilization on the node and reporting the data to the CC • Instance life cycle management
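The division of labour between the CLC, CC, and NC described above can be sketched in a few lines of Python. This is only an illustrative model, not Eucalyptus code: the class names, the "free cores" metric, and the first-fit and most-free-cluster policies are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class NodeController:
    """Hypothetical stand-in for an NC: tracks resources on one host."""
    name: str
    free_cores: int
    instances: List[str] = field(default_factory=list)

    def run_instance(self, instance_id: str, cores: int) -> None:
        # Instance life cycle management, reduced to simple bookkeeping.
        self.free_cores -= cores
        self.instances.append(instance_id)


@dataclass
class ClusterController:
    """Hypothetical stand-in for a CC: schedules instances onto its NCs."""
    name: str
    nodes: List[NodeController]

    def free_cores(self) -> int:
        # Information collected from the NCs and reported up to the CLC.
        return sum(node.free_cores for node in self.nodes)

    def deploy(self, instance_id: str, cores: int) -> Optional[str]:
        # First-fit placement: an assumed policy, chosen for brevity.
        for node in self.nodes:
            if node.free_cores >= cores:
                node.run_instance(instance_id, cores)
                return node.name
        return None


@dataclass
class CloudController:
    """Hypothetical stand-in for the CLC: arbitrates between clusters."""
    clusters: List[ClusterController]

    def run_instance(self, instance_id: str, cores: int) -> Optional[Tuple[str, str]]:
        # Resource arbitration: try the cluster with the most free cores first.
        ranked = sorted(self.clusters, key=lambda c: c.free_cores(), reverse=True)
        for cluster in ranked:
            node_name = cluster.deploy(instance_id, cores)
            if node_name is not None:
                return cluster.name, node_name
        return None


clc = CloudController([
    ClusterController("cc-a", [NodeController("nc-a1", 2)]),
    ClusterController("cc-b", [NodeController("nc-b1", 4), NodeController("nc-b2", 8)]),
])
print(clc.run_instance("i-0001", 4))   # ('cc-b', 'nc-b1')
print(clc.run_instance("i-0002", 16))  # None: no single node has 16 free cores
```

The request enters at the top of the hierarchy and each tier only talks to the tier directly below it, which mirrors the hierarchical design of the real components.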
Storage Controller • The Storage Controller (SC) provides functionality similar to the Amazon Elastic Block Store (Amazon EBS). The SC is capable of interfacing with various storage systems (NFS, iSCSI, SAN devices, etc.). • Elastic block storage exports storage volumes that can be attached to a VM and mounted or accessed as a raw block device.
Benefits of Eucalyptus • The Eucalyptus open source private cloud gives IT organizations features that are essential to improving the efficiency of an IT infrastructure, including the following: • Data center optimization. Eucalyptus optimizes existing data center resources with consolidation through virtualization of all data center elements, including machines, storage and network. Eucalyptus is compatible with the most widely used virtualization technologies, including the Xen and KVM hypervisors. • Automated self-service. Eucalyptus automates computer resource provisioning by allowing users to access their own flexible configurations of machines, storage, and networking devices as needed through a convenient self-service Web interface. • Customizable Web interface. Eucalyptus uses universally accepted Web-based network communication protocols that allow users to access computing resources through a highly customizable Web interface.
Overview Hadoop Hadoop is a framework for running applications on large clusters built of commodity hardware. The Hadoop framework transparently provides applications both reliability and data motion. Hadoop implements a computational paradigm named Map/Reduce, where the application is divided into many small fragments of work, each of which may be executed or reexecuted on any node in the cluster.
In addition, it provides a distributed file system (HDFS) that stores data on the compute nodes, providing very high aggregate bandwidth across the cluster. Both Map/Reduce and the distributed file system are designed so that node failures are automatically handled by the framework.
HDFS Hadoop's Distributed File System is designed to reliably store very large files across machines in a large cluster. It is inspired by the Google File System. Hadoop DFS stores each file as a sequence of blocks; all blocks in a file except the last block are the same size. Blocks belonging to a file are replicated for fault tolerance. The block size and replication factor are configurable per file. Files in HDFS are "write once" and have strictly one writer at any time.
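The block layout described above is easy to work through numerically. The following Python sketch is not part of Hadoop; the function names are made up for illustration, and the 128 MB block size and replication factor of 3 are example values only, since both are configurable per file.

```python
from typing import List

MB = 1024 * 1024


def split_into_blocks(file_size: int, block_size: int) -> List[int]:
    """Return the size of each block for a file of file_size bytes.

    Every block has block_size bytes except possibly the last one.
    """
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    full_blocks, remainder = divmod(file_size, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        sizes.append(remainder)
    return sizes


def raw_storage_needed(file_size: int, replication: int) -> int:
    """Bytes consumed across the cluster when every block is replicated."""
    return file_size * replication


blocks = split_into_blocks(300 * MB, 128 * MB)
print([size // MB for size in blocks])          # [128, 128, 44]
print(raw_storage_needed(300 * MB, 3) // MB)    # 900
```

A 300 MB file therefore occupies three blocks, and with three replicas of each block the cluster stores 900 MB in total.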
Hadoop Distributed File System – Goals: • Store large data sets • Cope with hardware failure • Emphasize streaming data access
Map Reduce The Hadoop Map/Reduce framework harnesses a cluster of machines and executes user-defined Map/Reduce jobs across the nodes in the cluster. A Map/Reduce computation has two phases, a map phase and a reduce phase. The input to the computation is a data set of key/value pairs. Tasks in each phase are executed in a fault-tolerant manner: if nodes fail in the middle of a computation, the tasks assigned to them are redistributed among the remaining nodes. Having many map and reduce tasks enables good load balancing and allows failed tasks to be re-run with small runtime overhead.
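The two phases can be illustrated with a word count run in a single Python process. This is a sketch of the programming model only; it does not use the Hadoop API, and the function names and the grouping step between the phases are assumptions chosen for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(record: Tuple[int, str]) -> Iterator[Tuple[str, int]]:
    """Map: turn one (line number, text) pair into (word, 1) pairs."""
    _, line = record
    for word in line.lower().split():
        yield word, 1


def group_by_key(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group intermediate values by key, as a framework would between phases."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse all values for one key into a single count."""
    return key, sum(values)


def run_job(records: Iterable[Tuple[int, str]]) -> Dict[str, int]:
    intermediate = (pair for record in records for pair in map_phase(record))
    grouped = group_by_key(intermediate)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = run_job([(0, "the quick brown fox"), (1, "the lazy dog the end")])
print(counts["the"])  # 3
```

Because each call to map_phase and reduce_phase depends only on its own input, a framework is free to run the calls on different nodes and to re-run any of them after a failure.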
Hadoop Map/Reduce – Goals: • Process large data sets • Cope with hardware failure • High throughput
KVM virtualization and management • Kernel-based Virtual Machine (KVM) is a free, open source virtualization architecture for Linux distributions. • KVM virtualization is often compared with Xen, which is the open source hypervisor for Oracle VM, Citrix Systems Inc.'s XenServer and other platforms. KVM, which is supported by Red Hat Inc. and Canonical Ltd., is often described as a type-two hypervisor, although it resides within the Linux kernel itself.
This section explains the differences between Xen and KVM virtualization. It also covers KVM management tools and how to set up a KVM virtualization environment.
The differences between Xen and KVM virtualization • Xen and KVM are free Linux virtualization hypervisors, but their approach to open source virtualization is vastly different. Xen is a type-one, bare-metal hypervisor that's more mature than KVM. It includes several built-in management tools and supports numerous host and guest environments, as well as hardware architectures.
Conversely, the KVM virtualization architecture is relatively new. Because KVM is embedded in the Linux kernel, KVM proponents claim it's easier to manage virtual machines (VMs) and Linux updates.
KVM management tools • Two common KVM management tools are virsh and virt-manager. From the command line, virsh can streamline KVM management. Because it's a master command with numerous subcommands, the learning curve is steep. • Virt-manager, on the other hand, is a graphical user interface that simplifies the management of virtual machines in Red Hat Enterprise Linux. Virsh has a larger feature set, but virt-manager's point-and-click interface can perform most administrative tasks.
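The "master command with subcommands" shape of virsh can be seen by scripting it from Python. The sketch below only assembles and optionally runs command lines; the helper names and the domain name demo-vm are hypothetical, while list, start, and shutdown are standard virsh subcommands.

```python
import shutil
import subprocess
from typing import List, Optional


def virsh_command(subcommand: str, *args: str) -> List[str]:
    """Build the argument list for one virsh subcommand."""
    return ["virsh", subcommand, *args]


def run_virsh(subcommand: str, *args: str) -> Optional[str]:
    """Run virsh and return its standard output, or None if virsh is not installed."""
    if shutil.which("virsh") is None:
        return None
    result = subprocess.run(
        virsh_command(subcommand, *args),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


# "demo-vm" is a hypothetical domain name used only for illustration.
for command in (
    virsh_command("list", "--all"),
    virsh_command("start", "demo-vm"),
    virsh_command("shutdown", "demo-vm"),
):
    print(" ".join(command))
```

On a host with libvirt installed, run_virsh("list", "--all") would return the same table that the command prints in a terminal.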
OpenQRM • openQRM is a web-based open source datacenter management and cloud computing platform that integrates flexibly with your existing datacenter components. openQRM supports the major virtualization technologies KVM, Xen, Citrix XenServer, VMware, LXC and OpenVZ. openQRM automates provisioning, virtualization, storage and configuration management, and it takes care of high-availability.
OpenQRM For Admins Admins run the operative heart of every modern organisation: its datacenters and IT services. Many of them have learned to maintain IT infrastructure the painful "manual" way. openQRM helps to get over the daily time killers in system administration by automating the whole system lifecycle, from self-service request to automatic server and storage provisioning, application deployment, monitoring, high-availability and billing.
OpenQRM is lean, flexible and extensible • OpenQRM is often called the 'admin's swiss army knife'. It's made for headache-free datacenter management. OpenQRM consists of a minimal base system that can be extended using any combination of the over 50 plugins it comes with. Or you just write your own plugin.
OpenQRM for Users • Users create innovation and work on new products. To do so efficiently, they collaborate using numerous servers and software applications. Users are pretty good at haunting the admins for more resources - and they don't want to wait for long procurement processes, for sure. • Stop chasing after IT resources. Serve yourself in seconds instead of weeks - openQRM has a user-friendly Cloud Portal.
OpenQRM for Managers • Managers need up-to-date information on the performance of the company's IT infrastructure. They care about keeping track of every task and they make sure that system capacity, power consumption and financial processes are always aligned with the business goals. • openQRM provides automatic statistics and graphs, monitoring and SLA reporting, billing and much more to save precious time.
Cloudability • Cloudability’s new Reserved Instances Explorer helps companies keep track of and best utilize their discounted Amazon reserved instances, even across accounts. The tool can search Amazon EC2 instances by size, region, operating system and expiration date.
Many companies use Amazon EC2 discounted reserved instances as a way to save money on their cloud computing loads. But keeping track of what instances they have reserved and which are already in use can be tricky. Cloudability is attacking that problem with a new tool that helps users keep tabs on those reserved instances to make sure they use them most efficiently.
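The kind of search described above (by size, region, operating system and expiration date) can be sketched over an in-memory inventory. This is not Cloudability's API or data model; the ReservedInstance record, the find_reservations helper, and all sample values are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class ReservedInstance:
    """Hypothetical record for one reserved instance."""
    instance_id: str
    size: str
    region: str
    operating_system: str
    expires: date


def find_reservations(
    inventory: Iterable[ReservedInstance],
    size: Optional[str] = None,
    region: Optional[str] = None,
    operating_system: Optional[str] = None,
    expires_before: Optional[date] = None,
) -> List[ReservedInstance]:
    """Return the reservations that match every filter that was supplied."""
    matches = []
    for ri in inventory:
        if size is not None and ri.size != size:
            continue
        if region is not None and ri.region != region:
            continue
        if operating_system is not None and ri.operating_system != operating_system:
            continue
        if expires_before is not None and ri.expires >= expires_before:
            continue
        matches.append(ri)
    return matches


inventory = [
    ReservedInstance("ri-1", "m1.small", "us-east-1", "Linux", date(2014, 3, 1)),
    ReservedInstance("ri-2", "m1.large", "us-east-1", "Linux", date(2015, 6, 1)),
    ReservedInstance("ri-3", "m1.small", "eu-west-1", "Windows", date(2014, 1, 15)),
]
expiring = find_reservations(inventory, expires_before=date(2014, 6, 1))
print([ri.instance_id for ri in expiring])  # ['ri-1', 'ri-3']
```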
Cloudyn • These tools are designed to keep corporate IT from over-buying Amazon cloud resources. Cloudyn’s suite of services gives users a dashboard showing detailed information on all of their virtual machine instances, databases, and storage. Cloudyn then provides insights into inefficiencies and suggestions on how to get rid of them. Currently focused on Amazon’s cloud, the company will expand to include Rackspace, Microsoft Azure and GoGrid in the future.
Which type of Amazon Web Services Reserved Instance should your company deploy and for what time period for any given job? These are the kinds of questions Cloudyn’s new calculator can help you sort out for yourself, the company said.
That’s where a tool like Cloudyn’s new Reserved Instance calculator might come in handy. It will let existing or potential Amazon customers run “what if” scenarios to determine which types of Reserved Instances to buy. This type of instance, booked for 1- or 3-year periods, can be less expensive than Amazon’s on-demand instances, but the calculation is not all that straightforward given the variables: reserved instances come in configurations for light, medium and heavy utilization, for example.
“Reserved instance types vary in terms of the initial one-time fee and the later hourly rate,” said Sharon Wagner, CEO of the startup, which is based in Raanana, Israel. A lot of these factors need to be taken into account, something the calculator can do, he said.
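The trade-off between a one-time fee and a lower hourly rate comes down to a break-even calculation. The sketch below is not Cloudyn's calculator, and the prices are invented round numbers rather than Amazon's actual rates; it only shows the arithmetic a "what if" scenario would involve.

```python
from typing import Optional

HOURS_PER_YEAR = 8760


def on_demand_cost(hourly_rate: float, hours: float) -> float:
    """Total cost of running on demand for the given number of hours."""
    return hourly_rate * hours


def reserved_cost(upfront_fee: float, hourly_rate: float, hours: float) -> float:
    """Total cost of a reservation: one-time fee plus discounted hourly usage."""
    return upfront_fee + hourly_rate * hours


def break_even_hours(
    upfront_fee: float, reserved_hourly: float, on_demand_hourly: float
) -> Optional[float]:
    """Hours of use after which the reservation becomes the cheaper option.

    Returns None if the reserved hourly rate is not lower than on demand,
    in which case the reservation never pays off.
    """
    saving_per_hour = on_demand_hourly - reserved_hourly
    if saving_per_hour <= 0:
        return None
    return upfront_fee / saving_per_hour


# Hypothetical prices, chosen only to make the arithmetic easy to follow.
UPFRONT_FEE = 500.0
RESERVED_HOURLY = 0.125
ON_DEMAND_HOURLY = 0.25

print(break_even_hours(UPFRONT_FEE, RESERVED_HOURLY, ON_DEMAND_HOURLY))  # 4000.0
print(on_demand_cost(ON_DEMAND_HOURLY, HOURS_PER_YEAR))                  # 2190.0
print(reserved_cost(UPFRONT_FEE, RESERVED_HOURLY, HOURS_PER_YEAR))       # 1595.0
```

With these example numbers, an instance that runs less than 4000 hours in the year is cheaper on demand, which is why expected utilization matters when choosing a reservation type.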
Informatica • Informatica this month will roll out its next generation suite of cloud integration tools that include enhancements that address data security issues in the cloud and help enterprise IT manage data integration issues in hybrid cloud deployments. The new Informatica Cloud Data Masking service reduces the risk of data breaches during application development and testing. And the new Informatica Cloud Extend workflow service is geared toward business process creation and management in the cloud.
One of the biggest technical obstacles in the world of cloud computing is integrating cloud applications with each other and with on-premise systems. Data integration software developer Informatica released a package of software tools the company said can help businesses overcome those hurdles.
For Informatica's channel partners and systems integration allies, the new Cloud 9 toolset offers a means of building customized data integration software for customers and assembling data integration links that can be reused in multiple deployments, said Darren Cunningham, senior marketing director for Informatica's on-demand products.
References • http://s3.amazonaws.com/ppt-download/cloud2-12975033983357-phpapp02.pdf?response-content-disposition=attachment&Signature=7SqnTJjCxsZmc0Ypd7RdUXcInTg%3D&Expires=1387270532&AWSAccessKeyId=AKIAIW74DRRRQSO4NIKA • http://www.cse.buffalo.edu/~bina/CloudComputingJun28.ppt • http://ntcap.nic.in/eucalyptus-class.ppt • http://www.cs.tau.ac.il/system/faq/services/files/files-1/Hadoop.ppt