1 / 37

Virtual Appliances for Training and Education in FutureGrid

Virtual Appliances for Training and Education in FutureGrid. Renato Figueiredo Arjun Prakash, David Wolinsky ACIS Lab - University of Florida. Education and Training. Importance of experimental work in systems research Needs also to be addressed in education Complement to fundamental theory

fynn
Download Presentation

Virtual Appliances for Training and Education in FutureGrid

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Virtual Appliances for Training and Education in FutureGrid Renato Figueiredo Arjun Prakash, David Wolinsky ACIS Lab - University of Florida

  2. Education and Training Importance of experimental work in systems research • Needs also to be addressed in education • Complement to fundamental theory FutureGrid: a testbed for experimentation and collaboration • Education and training contributions: • Lower barrier to entry – pre-configured environments, zero-configuration technologies • Community/repository of hands-on executable environments: develop once, share and reuse

  3. Goals and Approach A flexible, extensible platform for hands-on, lab-oriented education on FutureGrid Focus on usability – lower entry barrier • Plug and play, open-source • Seamlessly work on local, cloud resources Virtualization + social networking to create educational sandboxes • Virtual “Grid” appliances: self-contained, pre-packaged execution environments • Group VPNs: simple management of virtual clusters by students and educators

  4. Outline • Virtual appliances • GroupVPN • Virtual cluster configuration • Example: MPI appliance

  5. What is an appliance? • Physical appliances • Webster – “an instrument or device designed for a particular use or function”

  6. What is an appliance? • Hardware/software appliances • TV receiver + computer + hard disk + Linux + user interface • Computer + network interfaces + FreeBSD + user interface

  7. What is a virtual appliance? • An appliance that packages software and configuration needed for a particular purpose into a virtual machine “image” • The virtual appliance has no hardware – just software and configuration • The image is a (big) file • It can be instantiated on hardware

  8. Virtual appliance example • Linux + Apache + MySQL + PHP A web server Another Web server LAMP image instantiate Virtualization Layer copy Repeat…

  9. Training/education in clusters • Replace LAMP with the middleware of your choice – e.g. MPI, Hadoop, Condor MPI image An MPI worker Another MPI worker instantiate Virtualization Layer copy Repeat…

  10. What about the network? • Multiple Web servers might be completely independent from each other • MPI nodes are not • Need to communicate and coordinate with each other • Each worker needs an IP address, uses TCP/IP sockets • Cluster middleware stacks assume a collection of machines, typically on a LAN (Local Area Network)

  11. Virtual machines VM image Enter virtual networks “WOWs” • Wide-area • Virtual machines (VMs) • Self-organizing overlay IP tunnels, P2P routing NOWs, COWs • Local-area • Physical machines • Self-organizing switching (e.g. Ethernet spanning tree) Installation image Switched network Physical machines

  12. Virtual cluster appliances • Virtual appliance + virtual network Virtual network MPI + Virtual Network Another MPI worker An MPIworker instantiate Virtual machine copy Repeat…

  13. Unmodified applications Connect(10.10.1.2,80) Capture/tunnel, scalable, resilient, self-configuring routing and object store 10.10.1.1 Isolated, private virtual address space 10.10.1.2 Virtual network architecture Application Virtual Router (Wide-area) Overlay network VNIC Virtual Router Application VNIC

  14. Virtual appliance clusters Virtual appliances • Encapsulate software environment in image • Virtual disk file(s) and virtual hardware configuration The Grid appliance • Encapsulates cluster software environments • Current examples: Condor, MPI, Hadoop • Homogeneous images at each node • Virtual LAN connecting nodes forms a cluster • Deploy within or across domains

  15. Grid appliance internals • Host O/S • Linux • Grid/cloud stack • MPI, Hadoop, Condor, … • Glue logic for zero-configuration • Automatic DHCP address assignment • Multicast DNS (Bonjour, Avahi) resource discovery • Shared data store - Distributed Hash Table • Interaction with VM/cloud

  16. Example – FutureGrid Eucalyptus Nimbus Appliance image Education Training

  17. Example - Archer 2. Create/join 1: Download 1: Download VPN group appliance appliance Download config Free pre Free pre - - packaged packaged Archer Archer Virtual appliances Virtual appliances - - run run on free on free VMMs VMMs ( ( VMware VMware , , VirtualBox VirtualBox , KVM) , KVM) Archer Global Archer Global Virtual Network Virtual Network 3. Boot appliances Automatic connection to group VPN – self-configuring DHCP Middleware: Condor scheduler Condor scheduler NFS file systems NFS file systems CMS, Wiki, YouTube: Community Community - - contributed contributed content: applications, content: applications, Archer seed resources – – datasets, tutorials datasets, tutorials 450 cores, 5 sites

  18. One appliance, multiple hosts • Allow same logical cluster environment to instantiate on a variety of platforms • Local desktop, clusters; FutureGrid; Amazon EC2; Science Clouds… • Avoid dependence on host environment • Make minimum assumptions about VM and provisioning software • Desktop: 1 image, VMware, VirtualBox, KVM • Para-virtualized VMs (e.g. Xen) and cloud stacks – need to deal with idiosyncrasies • Minimum assumptions about networking • Private, NATed Ethernet virtual network interface

  19. Virtual network: GroupVPN • Key techniques: • IP-over-P2P (IPOP) tunneling • GroupVPN Web 2.0/social network interface • Self-configuring • Avoid administrative overhead of typical VPNs • NAT and firewall traversal; DHCP virtual addresses • Scalable and robust • P2P routing deals with node joins and leaves • Networks are isolated • One or more private IP address spaces • Decentralized DHCP serves addresses for each space

  20. node0.ipop 10.10.0.2 node1.ipop 10.10.0.3 node2.ipop Social Network API Alice’s public keys Bob’s public keys Carol’s public key Messaging layer/information system Social network (e.g. XMPP, group site Social Network Web interface GroupVPN Overview Bootstrapping private links through Web 2.0 interfaces and IP-over-P2P overlay tunneling Overlay network (IPOP) Alice Carol Bob

  21. GroupVPN Web interface • Users can request to join a group, or create their own VPN group • E.g. instructor creates a GroupVPN for class • Determines who is allowed to connect to virtual network • Owner can authorize users to join, remove users, authorize other to admin • Actions typical of a certificate authority happen in the back-end without user having to deal with security operations • E.g. sign/revoke a VPN X.509 certificate

  22. Grid/cloud Middleware, apps GroupVPN router 10.10.1.1 “Tap” devices 10.10.1.2 GroupVPN architecture Application Virtual Router GroupVPN overlay VNIC Virtual Router Application VNIC

  23. Under the hood: overlay architecture • Bi-directional structured overlay (Brunet library) • Self-configured NAT traversal • Self-optimized links • Direct, relay • Self-healing structure Direct path Multi-hop path Overlay router Overlay router

  24. Deploying virtual clusters • Same image, per-group VPNs Group VPN Hadoop + Virtual Network Another Hadoop worker A Hadoop worker instantiate Virtual machine copy GroupVPN Credentials Repeat… (from Web site) Virtual IP - DHCP 10.10.1.1 Virtual IP - DHCP 10.10.1.2

  25. Configuration framework • At the end of GroupVPN initialization: • Each node of a private virtual cluster gets a DHCP address on virtual tap interface • A barebones cluster • Additional configuration required depending on middleware • Which node is the Condor negotiator? Hadoop front-end? Which nodes are in the MPI ring? • Key frameworks used: • IP multicast discovery over GroupVPN • Front-end queries for all IPs listening in GroupVPN • Distributed hash table • Advertise (put key,value), discover (get key)

  26. Configuring and deploying groups • Generate virtual floppies • Through GroupVPN Web interface • Deploy appliances image(s) • FutureGrid (Nimbus/Eucalyptus), EC2 • GUI or command line tools • Use APIs to copy virtual floppy to image • Submit jobs; terminate VMs when done

  27. FutureGrid example - Nimbus GroupVPN floppy image • Example using Nimbus: workspace.sh --deploy --mdUserdata /tmp/floppy-worker.zip.b64 --service https://f1r.idp.ufl.futuregrid.org:8443/wsrf/services/WorkspaceFactoryService --file /tmp/output.xml --metadata /tmp/grid-appliance.xml --deploy-mem 1000 --deploy-duration 100 --trash-at-shutdown Trash --exit-state Running --displayname grid-appliance --sshfile /home/renato/.ssh/id_dsa.pub Nimbus service endpoint Metadata – points to image on Nimbus server SSH public key to log in to instance

  28. Summary • Hands-on experience with clusters is essential for education and training • Virtualization, clouds simplify software packaging/configuration • Grid appliance allows users to easily deploy hands-on virtual clusters • FutureGrid provides resources and cloud stacks for educators to easily deploy their own virtual clusters • Goal - towards a community-based marketplace of educational appliances for TeraGrid

  29. Thank you! • More information: • http://www.futuregrid.org • http://grid-appliance.org • This document was developed with support from the National Science Foundation (NSF) under Grant No. 0910812 to Indiana University for "FutureGrid: An Experimental, High-Performance Grid Test-bed." Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the NSF

  30. Local appliance deployments • Two possibilities: • Share our “bootstrap” infrastructure, but run a separate GroupVPN • Simplest to setup • Deploy your own “bootstrap” infrastructure • More work to setup • Especially if across multiple LANs • Potential for faster connectivity

  31. PlanetLab bootstrap • Shared virtual network bootstrap • Runs 24/7 on 100s of machines on the public Internet • Connect machines across multiple domains, behind NATs

  32. PlanetLab bootstrap: approach • Create GroupVPN and GroupAppliance on the Grid appliance Web site • Download configuration floppy • Point users to the interface; allow users you trust into the group • Trusted users can download configuration floppies and boot up appliances

  33. Private bootstrap: General approach • Good choice for single-domain pools • Create GroupVPN and GroupAppliance on the Grid appliance Web site • Deploy a small IPOP/GroupVPN bootstrap P2P pool • Can be on a physical machine, or appliance • Detailed instructions at grid-appliance.org • The remaining steps are the same as for the shared bootstrap

  34. Connecting external resources • GroupVPN can run directly on a physical machine, if desired • Provides a VPN network interface • Useful for example if you already have a local Condor pool • Can “flock” to Archer • Also allows you to install Archer stack directly on a physical machine if you wish

  35. FutureGrid example - Eucalyptus • Example using Eucalyptus (or ec2-run-instances on Amazon EC2): euca-run-instances ami-fd4aa494 -f floppy.zip --instance-type m1.large -k keypair Image ID on Eucalyptus server GroupVPN floppy image SSH public key to log in to instance

  36. Where to go from here? • Tutorials on FutureGrid and Grid appliance Web sites for various middleware stacks • Condor, MPI, Hadoop • A community resource for educational virtual appliances • Success hinges on users effectively getting involved • If you are happy with the system, let others know! • Contribute with your own content – virtual appliance images, tutorials, etc

More Related