1 / 28

Large Scale Sky Computing Applications with Nimbus

Large Scale Sky Computing Applications with Nimbus. Pierre Riteau Université de Rennes 1, IRISA INRIA Rennes – Bretagne Atlantique Rennes, France Pierre.Riteau@irisa.fr. Introduction To Sky Computing. IaaS clouds. On demand/elastic model Pay as you go

nam
Download Presentation

Large Scale Sky Computing Applications with Nimbus

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Large Scale Sky Computing Applications with Nimbus Pierre Riteau Université de Rennes 1, IRISA INRIA Rennes – Bretagne Atlantique Rennes, France Pierre.Riteau@irisa.fr

  2. Introduction ToSkyComputing

  3. IaaSclouds • On demand/elastic model • Pay as you go • Access to virtual machines with administrator privileges • Portable execution stack • Commercial providers • Amazon EC2 => “infinite”resource pool (e.g. 10K cores) • Scientific clouds => limited number of resources • Science Clouds • FutureGrid Nimbus clouds

  4. Sky Computing • Federation of multiple IaaS clouds • Creates large scale infrastructures • Allows to run software requiring large computational power

  5. SkyComputingBenefits • Single networking context • All-to-all connectivity • Single security context • Trust between all entities • Equivalent to local cluster • Compatible with legacy code

  6. Large-ScaleSkyComputingExperiments

  7. SkyComputingToolkit • Nimbus • Resource management • Contextualization (Context Broker) • ViNe • All-to-allconnectivity • Hadoop • Task distribution • Faulttolerance • Resource dynamicity

  8. Context Broker • Service to configure a complete cluster with different roles • Supports clusters distributed on multiple clouds (e.g. Nimbus and Amazon EC2) • VMs contact the context broker to • Learn their role • Learn about other VMs in the cluster • Ex. : Hadoop master + Hadoop slaves • Hadoop slaves configured to contact the master • Hadoop master configured to know the slaves

  9. Cluster description <workspace> <name>hadoop-slaves</name> <image>fc8-i386-nimbus-blast-cluster-004</image> <quantity>16</quantity> <nicwantlogin="true">public</nic> <ctx> <provides> <identity /> <role>hadoop_slave</role> </provides> <requires> <identity /> <rolename="hadoop_master" hostname="true" pubkey="true" /> </requires> </ctx> </workspace> </cluster> <?xml version="1.0" encoding="UTF-8"?> <cluster xmlns="http://www.globus.org/2008/06/workspace/metadata/logistics"> <workspace> <name>hadoop-master</name> <image>fc8-i386-nimbus-blast-cluster-004</image> <quantity>1</quantity> <nicwantlogin="true">public</nic> <ctx> <provides> <identity /> <role>hadoop_master</role> <role>hadoop_slave</role> </provides> <requires> <identity /> <rolename="hadoop_slave" hostname="true" pubkey="true" /> <rolename="hadoop_master" hostname="true" pubkey="true" /> </requires> </ctx> </workspace>

  10. ViNe • Project of the University of Florida (M. Tsugawa et al.) • High performance virtual network • All-to-allconnectivity ViNe router ViNe router ViNe router

  11. Hadoop • Open-sourceMapReduceimplementation • Heavyindustrial use (Yahoo, Facebook…) • Efficient framework for distribution of tasks • Built-infault-tolerance • Distributed file system (HDFS)

  12. SkyComputing Architecture Distributed Application MapReduceApp Hadoop ViNe IaaS Software IaaS Software

  13. Grid’5000 Overview • Distributed over 9 sites in France • ~1500 nodes, ~5500 CPUs • Study of large scaleparallel/distributedsystems • Features • Highly reconfigurable • Environmentdeployment over bare hardware • Can deploymanydifferent Linux distributions • Evenother OS such as FreeBSD • Controlable • Monitorable (metricsaccess) • Experiments on all layers • network, OS, middleware, applications

  14. Grid’5000 Node Distribution

  15. FutureGrid: a GridTestbed • NSF-fundedexperimentaltestbed • ~5000 cores • 6 sites connected by a private network

  16. Resourcesused in SkyComputingExperiments • 3 FutureGrid sites (US) with Nimbus installations • UCSD (San Diego) • UF (Florida) • UC (Chicago) • Grid’5000 sites (France) • Lille (contains a white-listedgateway to FutureGrid) • Rennes, Sophia, Nancy, etc. • Grid’5000 isfullyisolatedfrom the Internet • One machine white-listed to accessFutureGrid • ViNe queue VR (Virtual Router) for other sites

  17. ViNeDeploymentTopology SD Rennes Grid’5000 firewall Lille UF White-listed Queue VR UC Sophia All-to-allconnectivity!

  18. Experiment scenario • Hadoop sky computing virtual cluster already running in FutureGrid (SD, UF, UC) • Launch BLAST MapReduce job • Start VMs on Grid’5000 resources • With contextualization to join the existing cluster • Automatically extend the Hadoop cluster • Number of nodes increases • TaskTracker nodes (Map/Reduce tasks execution) • DataNode nodes (HDFS storage) • Hadoop starts distributing tasks in Grid’5000 • Job completes faster!

  19. Job progresswith cluster extension

  20. Scalable Virtual Cluster Creation (1/3) • Standard Nimbus propagation: scp Nimbus Repository scp scp scp scp VM1 VM2 VM3 VM4 VMM A VMM B

  21. Scalable Virtual Cluster Creation(2/3) • Pipelined Nimbus propagation: Kastafior/TakTuk Nimbus Repository Propagate Propagate Propagate Propagate VM3 VM1 VM2 VM4 VMM A VMM B

  22. Scalable Virtual Cluster Creation(3/3) • LeverageXenCopy-on-Write (CoW) capabilities Backing file Backing file Backing file Backing file CoW Image 4 CoW Image 1 CoW Image 2 CoW Image 3 Local cache Local cache VMM A VMM B

  23. Conclusion

  24. Conclusion • SkyComputing to create large scaledistributed infrastructures • Our approach relies on • Nimbus for resource management, contextualization and fast cluster instantiation • ViNe for all-to-allconnectivity • Hadoop for dynamic cluster extension • Providesboth infrastructure and application elasticity

  25. Ongoing & Future Works • Elastic MapReduce implementationleveragingSkyComputing infrastructures (presentedat CCA ‘11) • Migration support in Nimbus • Leverage spot instances in Nimbus

  26. Acknowledgments • Tim Freeman, John Bresnahan, Kate Keahey, David LaBissoniere (Argonne/University of Chicago) • Maurício Tsugawa, Andréa Matsunaga, José Fortes (University of Florida) • Thierry Priol, Christine Morin (INRIA)

  27. Thankyou!Questions?

More Related