Lecture 5: Build-A-Cloud


  1. http://www.cs.columbia.edu/~sambits/ Lecture 5: Build-A-Cloud

  2. Life Cycle in a Cloud
  • Build image(s) for the software/application that we want to host on the cloud (lecture 4)
  • Request a VM – pass appropriate parameters such as resource needs and image details (lecture 3)
  • When the VM is started up, parameters are passed to it at the appropriate run levels to auto-configure the software image (lecture 4)
  • Now in this lecture:
    • Let's monitor the provisioned VM
    • Manage it at run time
    • As the workload changes, adjust the amount of requested resources

  3. What we shall learn
  • We shall put together a cloud piece by piece
    • OpenNebula as the cluster manager
    • KVM as the hypervisor on the host machines
    • Creating and managing guest VMs
    • Creating cluster application(s) using VMs
    • Application-level management
  • Interesting sub-topics we will touch on
    • Monitoring the cluster and applications in such an environment
    • An example of application-level management
    • How to add on-demand resource scaling using OpenNebula and Ganglia

  4. Cloud Setup
  • Basic Management
  • Image Management
  • VM Monitoring & Management
  • Host Monitoring & Management
  [Diagram: a private-cloud client talks to the management layer (image, VM, virtual-network and host management), which sits above the infrastructure info]

  5. Our stack for the cloud
  • OpenNebula – for managing a set of host machines that have a hypervisor on them
  • KVM – the hypervisor on the host machines
  • Ganglia – for monitoring the guest VMs
  • Glue code for implementing application management, e.g. resource scaling

  6. OpenNebula Setup
  • Install the OpenNebula management node
    • Download and compile the source on the mgmt-node (easy installation; install as the oneadmin user)
    • Set up sshd on all hosts that are to be added (also install Ruby on them)
    • Allow root of the mgmt-node password-less access to all the managed hosts
    • Set up the image repository (a shared-FS based setup is required for live migration)
    • If you do not have a Linux server, download VirtualBox and create a Linux VM on your laptop
  • OpenNebula architecture
    • Tools written on top of OpenNebula interact with the core via XML-RPC
    • The core exposes VM, host and network management APIs
    • The core stores all installation and monitoring information in an SQLite3 (or MySQL) DB
    • Most of the DB information can be accessed using XML-RPC calls
    • All the drivers are written in Ruby and run as daemons, which in turn call small shell scripts to get the work done
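
  A minimal sketch of the source install on the mgmt-node (the release name, install path and flags are assumptions; check the README of the tarball you download):

      tar xzf opennebula-1.4.0.tar.gz && cd opennebula-1.4.0
      scons                               # compile the core and the drivers
      ./install.sh -d /srv/cloud/one      # self-contained install into $ONEHOME
      export ONE_LOCATION=/srv/cloud/one  # the one* tools expect this variable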

  7. Create a Cloud
  • Start the one daemon
    • Edit $ONEHOME/etc/oned.conf for any necessary changes (quite intuitive)
    • Put login:passwd in $ONEHOME/etc/one_auth
    • “one start” does the rest
    • All the DB and logs are kept in $ONEHOME/var/
    • NOTE: if you want a fresh setup, simply stop oned, delete $ONEHOME/var/ and start the OpenNebula daemon again
  • Set up ssh on the host machines (allow oneadmin password-less entry)
    • Append the .ssh/id_rsa.pub of the admin-node to each host-server’s .ssh/authorized_keys
    • chmod 600 .ssh/authorized_keys
  • Add hosts to OpenNebula
    • Use the onehost command
    • The command is written in Ruby
    • It basically makes an XML-RPC call to the OpenNebula server’s HostAllocate method – e.g. as sketched below
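
  A hedged sketch of these steps; the host name, driver names and onehost arguments follow 1.4-era conventions and should be checked against the installed version:

      # password-less SSH from the mgmt-node to a managed host
      cat ~/.ssh/id_rsa.pub | ssh root@host01 'cat >> ~/.ssh/authorized_keys'
      ssh root@host01 'chmod 600 ~/.ssh/authorized_keys'

      one start                                    # start the oned daemon
      onehost create host01 im_kvm vmm_kvm tm_nfs  # register a KVM host (shared-FS transfer driver)
      onehost list                                 # verify the host shows up and is monitored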

  8. Configure network
  • Fixed: defines a fixed set of IP–MAC pairs
  • Ranged: defines a network range (e.g. a class C network)
  • e.g. a fixed network setup (assuming you have a set of static IP addresses allotted to you – see the template below)
  • Note: good site for help: http://www.opennebula.org/documentation:rel1.4:vgg
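
  A hedged example of a fixed virtual-network template for that static-IP case (network name, bridge and addresses are made up; the syntax follows the rel1.4 docs linked above):

      # fixed.net – register with: onevnet create fixed.net
      NAME   = "public_fixed"
      TYPE   = FIXED
      BRIDGE = br0
      LEASES = [ IP = "192.168.1.10", MAC = "50:20:20:20:20:10" ]
      LEASES = [ IP = "192.168.1.11" ]

  VM templates then refer to this network by its NAME.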

  9. How to access OpenNebula
  • All APIs can be called using XML-RPC client libraries
  • The OpenNebula command-line client (Ruby)
  • A Java client
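
  For instance, a minimal Perl XML-RPC client (the port and the exact method name/argument order are assumptions based on the HostAllocate call mentioned earlier; consult the installed version's API reference):

      use Frontier::Client;

      # oned listens for XML-RPC on port 2633 by default
      my $one = Frontier::Client->new(url => 'http://mgmt-node:2633/RPC2');
      my $res = $one->call('one.host.allocate', 'oneadmin:passwd',
                           'host01', 'im_kvm', 'vmm_kvm', 'tm_nfs');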

  10. Setup Monitoring
  • Requirements for monitoring
    • Need something which stores resource-monitoring data as a time series
    • Exposes interfaces for querying it and for simple aggregation of the data
    • Automatically archives the older data
  • How to achieve it?
    • Install Ganglia!
    • Tune the VM images to automatically report their monitoring data via Ganglia
    • Install gmond on the host-servers
  • What is Ganglia?
    • It is open-source software (BSD license)
    • Distributed monitoring of clusters and grids
    • Stores time-series data and historical data as archives (RRDs)
  • How to get Ganglia
    • Download the source code from http://ganglia.info/downloads.php
    • For some Linux distributions, RPMs are available
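
  A minimal sketch of a source build (the flag below is what enables building gmetad, which needs rrdtool installed; version and install prefix are assumptions):

      ./configure --with-gmetad   # build gmetad in addition to gmond
      make && make install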

  11. Components of Ganglia
  • It has two prime daemons
  • gmond: a multi-threaded daemon which runs on the monitored nodes
    • Collects data on the monitored nodes and broadcasts the monitored data as XML (can be accessed at port 8649)
    • Configuration script: /etc/gmond.conf
  • gmetad:
    • Periodically polls a collection of child data sources
    • Parses the collected XML and saves all numeric metrics to round-robin databases
    • Exports the aggregated XML over a TCP socket to clients (port 8651)
    • Configuration file: /etc/gmetad.conf, with one data source per cluster
  • Round-robin database
    • RRDtool is a well-known tool for creating, storing and retrieving/plotting RRD data
    • Maintains data at various granularities; e.g. the defaults are:
      • 1 hour of data averaged over 15 sec (rra[0])
      • 1 day of data averaged over 6 min (rra[1])
      • 1 week of data averaged over 42 min (rra[2])
  • The web GUI tools
    • A collection of PHP scripts, run by the webserver, that extract the Ganglia data and generate the graphs for the website
  • Additional tools
    • gmetric to add extra stats – in fact anything you like, numbers or strings, with units etc.
    • gstat to get at the Ganglia data to do anything else you like
  • Note: good site for help: http://www.ibm.com/developerworks/wikis/display/WikiPtype/ganglia
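
  For example (the metric name, value and host are made up; the flags are standard gmetric options):

      # publish a custom metric from inside a VM
      gmetric --name=tpcw_resp_ms --value=42 --type=uint32 --units=ms

      # sanity check: dump the raw XML that gmond serves on port 8649
      nc host01 8649 | head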

  12. How to get monitoring-data?
  • How to get the time-series data?
    • Ganglia stores all RRDs under /var/lib/ganglia/rrds/cluster_name/machine_ip
    • There is one .rrd file per metric
    • Data is collected at a fixed time interval (the default is 15 sec)
    • One can retrieve the complete time series of monitored data from each .rrd file using rrdtool, e.g.:
    • Get the average load_one for every 15 sec of the last hour:
    • rrdtool fetch load_one.rrd AVERAGE --start e-1h --end now -r 15
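
  Putting the path layout and the fetch together (the cluster name and machine IP are assumptions):

      # one VM's load_one series for the last hour at 15-second resolution
      rrdtool fetch /var/lib/ganglia/rrds/mycluster/10.0.0.5/load_one.rrd \
              AVERAGE --start e-1h --end now -r 15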

  13. How to get monitoring-data? …
  • How to access this data from inside a program
    • Either use an ssh library (for Perl, Python or Java) and remotely execute the rrdtool command with the correct parameters,
    • or write a small XML-RPC server which exposes a function to run rrdtool fetch queries
  • E.g. a Perl XML-RPC server (a minimal, runnable version of the sum example):

      use Frontier::Daemon;

      my ($server_ip, $server_port) = ('0.0.0.0', 8050);

      # expose a single 'sum' method; new() starts serving and does not return
      my $d = Frontier::Daemon->new(
          methods   => { sum => \&sum },
          LocalAddr => $server_ip,
          LocalPort => $server_port,
          debug     => 1,
      );

      # the first argument is an auth token by convention; the rest are operands
      sub sum {
          my ($auth, $arg1, $arg2) = @_;
          return { SUCCESS => 1, MESSAGE => $arg1 + $arg2 };
      }
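
  To actually serve the monitoring data, a method along these lines could be added to the server above and registered in its methods hash (a hedged sketch; the RRD path layout follows the previous slide and the cluster name is an assumption):

      # XML-RPC method that shells out to rrdtool fetch for one host/metric
      sub rrd_fetch {
          my ($auth, $host_ip, $metric) = @_;
          my $rrd = "/var/lib/ganglia/rrds/mycluster/$host_ip/$metric.rrd";
          my $out = `rrdtool fetch $rrd AVERAGE --start e-1h --end now -r 15`;
          return { SUCCESS => ($? == 0 ? 1 : 0), MESSAGE => $out };
      }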

  14. Create a Multi-tiered Clustered Application
  • Let us consider a two-tiered TPC-W (a web-server and database performance benchmark)
  • How to create an application on custom images (see the sketch below)
    • Create a 6-GB file using dd (the utility for converting and copying files)
    • Attach a loop-back device to it
    • Partition it into 3 (swap, boot and root)
    • Format it like a file system (say ext3)
    • Install the complete OS and application stack on the relevant partitions
    • Install gmond and configure it
    • Save it as a custom image
  • For TPC-W one will need:
    • an Apache Tomcat server,
    • a Java implementation of TPC-W,
    • a MySQL server
  • We will need a load balancer which can route HTTP packets to the various backend servers (and is also HTTP-session aware)
    • I am using HAProxy (easy to install and configure)
    • Nginx and lighttpd are other popular HTTP proxy servers
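
  A hedged sketch of those image-creation steps (device names, sizes and the partition layout are assumptions; exposing the partitions of a loop device typically needs kpartx):

      dd if=/dev/zero of=tpcw.img bs=1M count=6144   # create a 6-GB file
      losetup /dev/loop0 tpcw.img                    # attach a loop-back device
      fdisk /dev/loop0                               # partition into swap, boot, root
      kpartx -a /dev/loop0                           # expose /dev/mapper/loop0p1..3
      mkswap /dev/mapper/loop0p1                     # swap
      mkfs.ext3 /dev/mapper/loop0p2                  # boot
      mkfs.ext3 /dev/mapper/loop0p3                  # root – install the OS stack here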

  15. Installing a multi-tier application
  • Install a two-tiered application
    • Create a template for the load balancer
    • Create a template for TPCW
    • Deploy the LB-VM (using OpenNebula)
    • Deploy the TPCW-VMs (using OpenNebula)
    • Attach the TPCW application VMs to the LB-VM
    • Test with a web browser that the setup is working
  • Create a client template
    • Deploy the client VM
    • Test the client
  [Diagram: Client → LoadBalancer → TPCW-0, TPCW-1]
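
  A hedged example of what the TPCW VM template might look like (name, sizes and paths are made up; the syntax follows the rel1.4 template format):

      # tpcw.template – deploy with: onevm create tpcw.template
      NAME   = tpcw-0
      CPU    = 1
      MEMORY = 1024
      DISK   = [ source = "/srv/cloud/images/tpcw.img", target = "hda", readonly = "no" ]
      NIC    = [ NETWORK = "public_fixed" ]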

  16. Application-Level Operation
  • One needs to maintain application-level information, e.g. which VM is a load balancer and which VMs are backend servers
  • Keep the application-level knowledge in some local database
  • Application-level operation, e.g. dynamic provisioning
  • Case 1: increasing capacity using replication (see the sketch below)
    • Monitor the average utilization of the VMs over, say, 1 min (using Ganglia)
    • If the average utilization of all the VMs under the load balancer is above, say, 70%:
      • provision a new VM using OpenNebula (this kind of reactive provisioning is also supported by EC2)
      • run the post-install script to add the new VM to the application
  • Case 2: increasing capacity using migration/resizing
    • Monitor the average utilization of the VMs over, say, 1 min (using Ganglia)
    • If only one VM is over-utilized and its host does not have more resources:
      • migrate it to another host and resize it to a higher capacity (note: OpenNebula does not support this)
  • Migrate-and-resize a VM
    • Migrate the image to another host
    • Change the VM configuration file to the new configuration
    • Start the VM with the new configuration file (with more RAM and CPU)
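
  A minimal sketch of the case-1 trigger; every helper function, the threshold and the window are assumptions standing in for the glue code described above:

      use List::Util qw(sum);

      # @backend_vms comes from the local topology DB (hypothetical helpers throughout)
      my @utils = map { avg_cpu_util($_, 60) } @backend_vms;  # 1-min averages via Ganglia
      if (sum(@utils) / @utils > 0.70) {
          my $vm_id = deploy_vm('tpcw.template');   # wraps OpenNebula's VM-allocate call
          add_to_load_balancer($vm_id);             # post-install: register with HAProxy
      }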

  17. Application-Level Operations (e.g. Dynamic Provisioning) …
  • Where and how to implement the application-scaling logic
    • The application-scaling logic needs knowledge of the application topology
    • It obviously resides above the infrastructure-management layer (i.e. OpenNebula)
    • Choose a language that is easy to build in (Perl, Python, Ruby, Java etc.)
    • An XML-RPC client is required to access OpenNebula
  • Write a management program, in the language of your choice, which
    • installs the multi-tier application and stores the application topology in a local DB
    • periodically monitors the average load on each server and the proxy errors
    • implements case 1 and case 2
    • The post-install script adds the new VM to the load balancer and restarts it
  • Problem: live-resize and migrate-and-resize are not present in OpenNebula
  • Hack: create a script which does the following (very dirty, but it works – see the sketch below)
    • Migrate the current VM to the destination host
    • Alter the configuration file of this migrated VM
    • Destroy and recreate the VM
  • Neater solution
    • Add a class in include/RequestManager.h (say VirtualMachineResize, similar to the VirtualMachineMigrate class)
    • Add another method in src/rm/RequestManager.cc (say migrateResize)
    • Implement the class in src/rm/RequestManagerResize.cc (implement the resize)
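
  A hedged sketch of the "dirty" hack (the onevm sub-commands, IDs and the template edit below are 1.4-era assumptions):

      onevm migrate $VMID $DEST_HOST                     # move the VM to the destination host
      onevm shutdown $VMID                               # destroy the old instance
      sed -i 's/^MEMORY.*/MEMORY = 2048/' tpcw.template  # bump RAM in the VM template
      onevm create tpcw.template                         # recreate with the new configuration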

  18. Solution Architecture
  • Application Manager (written above OpenNebula) – the high-level control flow is:
    • Periodically monitor the workload changes and the application performance
    • Manage the current configuration and actuate configuration changes
    • Calculate the changed capacity (using some model plus feedback from the monitoring block)
    • Find the new configuration of the application
    • Go ahead and start the process of actuating the change
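
  As a skeleton, that control flow might look like this (every helper is hypothetical and the interval is an assumption):

      # application-manager control loop, written above OpenNebula
      while (1) {
          my $load = monitor_workload();         # Ganglia RRD queries
          my $cap  = required_capacity($load);   # performance model + monitoring feedback
          my $plan = new_configuration($cap);    # desired number/size of VMs
          actuate($plan);                        # OpenNebula XML-RPC calls
          sleep 60;
      }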

  19. How to use (demo!)
  • Command-line scripts
  • VM lifecycle steps
    • Creation: show the template and image naming
    • Suspension: just the command
    • Migration: migration (suspend and migrate)
    • Deletion: removing the image
  • Show Ganglia monitoring
    • Host monitoring through the VM lifecycle
    • VM monitoring
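
  The lifecycle steps map to onevm sub-commands along these lines (the VM/host IDs and exact sub-command names are 1.4-era assumptions):

      onevm create tpcw.template   # creation, from a template like the one shown earlier
      onevm suspend 42             # suspension
      onevm migrate 42 3           # migration of VM 42 to host 3
      onevm delete 42              # deletion, removing the VM and its image copy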

  20. Cloud Management using this Setup
  • Integrate OpenNebula monitoring with Ganglia and make it more efficient
  • Use monitoring for VM placement on hosts
  • Use monitoring to do reactive provisioning
