Is the Cloud the Panacea for Process Efficiency?  The Elastic-R Case Study


Presentation Transcript


  1. Is the Cloud the Panacea for Process Efficiency? The Elastic-R Case Study. Karim Chine, karim.chine@cloudera.co.uk

  2. Efficiency Killers, a Selective Catalog (Scientific Computing Perspective)

  3. Problem I : Scientific Computing Environments Fragmentation. Sites shown: www.sagemath.org, www.wolfram.com, http://www.r-project.org, http://root.cern.ch, www.bioconductor.org, office.microsoft.com, www.mathworks.com, www.scilab.org, www.minitab.com, www.jmp.com, www.scipy.org, www.sas.com, accelrys.com, www.taverna.org.uk, www.spss.com, www.perl.org

  4. Problem II : Hardware, OS and Applications Fragmentation. Version labels shown: 2.5.0, 2.9.1, 2.6, 2.11.0, 2.10.0, 2.4.0, 2.1, 2.6.0

  5. Problem III : Data Fragmentation / Inconsistency / Lack of Traceability

  6. Problem IV : Ad hoc Scientific Applications Life Cycle

  7. Problem V : Ad hoc Web Services Life Cycle Management

  8. Problem VI : Poor IT / Software Usability. "Give me a place to stand, and I shall move the earth with a lever."

  9. Cloud Computing and the Building Blocks of Convergence

  10. Technological Convergence: Virtualization Technologies; Java; Web Services (REST/SOAP); Infrastructure-as-a-Service WS APIs; HTML 5

  11. R, lingua franca of data analysis. From: John Fox, Aspects of the Social Organization and Trajectory of the R Project, R Journal-Feb 2009

  12. Elastic-R is a ubiquitous plug-and-play platform for scientific and statistical computing
      • Computational Components: R packages (CRAN, Bioconductor), wrapped C, C++ and Fortran code, Scilab modules, Matlab toolkits, etc. Open source or commercial.
      • Computational User Interfaces: workbench within the browser; built-in views / plugins / spreadsheets; collaborative views. Open source or commercial.
      • Computational Resources: hardware- and OS-agnostic computing engine (R, Scilab, ...); clusters, grids, private or public clouds. Free (academic grids) or pay-per-use (EC2, Azure).
      • Computational Data Storage: local, NFS, FTP, Amazon S3, Amazon EBS. Free or commercial.
      • Computational Scripts: R / Python / Groovy. On the client side: interactivity, ... On the server side: data transfer, ...
      • Computational Application Programming Interfaces: Java / SOAP / REST, stateless and stateful.
      • Generated Computational Web Services: stateful or stateless, with automatic mapping of R data objects and functions.

  13. Elastic-R portal: Access as-a-Service to Scientific Computing Environments running on centralized and standardized virtual appliances. Diagram labels: Public Clouds, Private Cloud

  14. Elastic-R on Infrastructure-as-a-Service style Cloud

  15. Anatomy of an Elastic-R machine instance on Amazon EC2. Connection labels in the diagram: RESTful WS over SSL, SOAP over SSL, Heartbeat, SSH, HTTPS

  16. Software + Services = Applications convergence. The server-side toolkit: R + spreadsheet models + virtual GUI widgets.

  17. Demo

  18. Cloud Computing and the Building Blocks of Ubiquitous Collaboration

  19. Elastic-R is a collaborative Virtual Research Environment. Users can share their machine instances, stateful remote engines, data, ...

  20. The Elastic-R portal itself is an EC2 machine instance. Any number of portals can be run on EC2 for decentralized and private collaboration. Diagram labels: Amazon Virtual Private Cloud; Subnet 1, Subnet 2, Subnet 3

  21. Software+Services = Ubiquitous Collaboration.

  22. Demo

  23. Cloud Computing and the Building Blocks of Reproducible Research

  24. A scientist can snapshot her computational environment and her data. She can archive the snapshot or share it with others.
      Elastic-R Amazon Machine Images (Elastic-R.org):
      • Elastic-R AMI 1: R 2.10 + BioC 2.5
      • Elastic-R AMI 2: R 2.9 + BioC 2.3
      • Elastic-R AMI 3: R 2.8 + BioC 2.0
      Amazon Elastic Block Stores:
      • Elastic-R EBS 1: Data Set XXX
      • Elastic-R EBS 2: Data Set YYY
      • Elastic-R EBS 3: Data Set ZZZ
      • Elastic-R EBS 4: Data Set VVV
      Combination highlighted in the diagram: Elastic-R AMI 2 (R 2.9 + BioC 2.3) with Elastic-R EBS 4 (Data Set VVV)
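
The pairing of a machine image with a data volume on this slide can be pictured as a small, shareable record. The following Python sketch is only an illustration under stated assumptions: the `EnvironmentSnapshot` class, its field names, and the JSON manifest are hypothetical and are not the Elastic-R format or any Amazon API. The example values are the labels from the slide.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class EnvironmentSnapshot:
    """Hypothetical record pairing a machine image with a data volume."""
    machine_image: str  # label of the AMI, e.g. "Elastic-R AMI 2"
    r_version: str      # R version installed on the image
    bioc_version: str   # Bioconductor version installed on the image
    data_volume: str    # label of the EBS volume, e.g. "Elastic-R EBS 4"
    data_set: str       # name of the data set stored on the volume


def to_manifest(snapshot: EnvironmentSnapshot) -> str:
    """Serialize the record so it can be archived or sent to a colleague."""
    return json.dumps(asdict(snapshot), sort_keys=True)


def from_manifest(text: str) -> EnvironmentSnapshot:
    """Rebuild the record from an archived manifest."""
    return EnvironmentSnapshot(**json.loads(text))


# The combination highlighted on the slide.
shared = EnvironmentSnapshot(
    machine_image="Elastic-R AMI 2",
    r_version="2.9",
    bioc_version="2.3",
    data_volume="Elastic-R EBS 4",
    data_set="VVV",
)
restored = from_manifest(to_manifest(shared))
```

The point of the sketch is that an analysis is identified by both halves: the software stack frozen in the image and the data frozen in the volume. A recipient who restores the same pair works in the same environment as the author.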

  26. Stateful generated Web Services delivered by snapshottable/archivable virtual appliances
      Workflow in the diagram:
      • LogOn (Login, Pwd, Options): SessionID associated with a reserved Elastic-R Engine
      • getData
      • T1, T2, T3 applied to ES: f(ES), with f = T3 o T2 o T1 and objects ESon1, ESon2, ESon3
      • Retrieve Data
      • logOff: remove ESonx; « Clean » Elastic-R Engine; put Elastic-R Engine back in the pool; kill Elastic-R Engine
      Legend:
      • T1, T2, T3: generated stateful Web Services for R functions T1, T2 & T3
      • LogOn, getData: R-SOAP methods
      • ES: ExpressionSet
      • ESon1, ESon2, ESon3: ExpressionSet object names
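
The session life cycle on this slide (log on, reserve an engine, chain T1, T2 and T3 over objects that stay on the server, retrieve the result, log off) can be mimicked with a small in-memory model. This is a minimal sketch under stated assumptions: `Engine`, `EnginePool`, and the toy functions `t1`, `t2`, `t3` are hypothetical stand-ins, lists of numbers replace ExpressionSet objects, and nothing here is the Elastic-R API or its SOAP interface.

```python
import uuid
from typing import Callable, Dict, List

Vector = List[float]


class Engine:
    """Hypothetical stand-in for a remote, stateful R engine."""

    def __init__(self) -> None:
        self.workspace: Dict[str, Vector] = {}


class EnginePool:
    """Reserves one engine per session and takes it back on log-off."""

    def __init__(self, size: int) -> None:
        self.free: List[Engine] = [Engine() for _ in range(size)]
        self.sessions: Dict[str, Engine] = {}

    def log_on(self, login: str, pwd: str) -> str:
        # Credentials are not checked in this sketch; an empty pool raises IndexError.
        session_id = uuid.uuid4().hex
        self.sessions[session_id] = self.free.pop()
        return session_id

    def get_data(self, session_id: str, name: str, values: Vector) -> None:
        # Store a named object in the engine reserved for this session.
        self.sessions[session_id].workspace[name] = list(values)

    def call(self, session_id: str, fn: Callable[[Vector], Vector],
             src: str, dst: str) -> None:
        # Stateful call: input and output are object names, not payloads.
        workspace = self.sessions[session_id].workspace
        workspace[dst] = fn(workspace[src])

    def retrieve(self, session_id: str, name: str) -> Vector:
        return list(self.sessions[session_id].workspace[name])

    def log_off(self, session_id: str) -> None:
        # Remove the session's objects and put the cleaned engine back in the pool.
        engine = self.sessions.pop(session_id)
        engine.workspace.clear()
        self.free.append(engine)


def t1(xs: Vector) -> Vector:
    return [x + 1 for x in xs]


def t2(xs: Vector) -> Vector:
    return [x * 2 for x in xs]


def t3(xs: Vector) -> Vector:
    return [x - 3 for x in xs]


def run_workflow(pool: EnginePool) -> Vector:
    """Compute f(ES) with f = T3 o T2 o T1, one stateful call per step."""
    sid = pool.log_on("user", "password")
    pool.get_data(sid, "ES", [1.0, 2.0, 3.0])
    pool.call(sid, t1, "ES", "ESon1")
    pool.call(sid, t2, "ESon1", "ESon2")
    pool.call(sid, t3, "ESon2", "ESon3")
    result = pool.retrieve(sid, "ESon3")
    pool.log_off(sid)
    return result
```

Because each call names its input and output objects, intermediate results such as `ESon1` never leave the engine; only the final `retrieve` moves data back to the client. That is the practical difference between the stateful services shown here and stateless ones, where every call would carry its full payload.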

  27. Demo

  28. Cloud Computing and the Simplification/Standardization of the Scientific Applications’ Life Cycle

  29. Users can easily create Java GUIs that use the full capabilities of a stateful, remote R engine and share them as URLs. Diagram labels: Elastic-R AJAX Workbench; Visual Graphic User Interface Builder; Upload plugin; Standalone Application Accessible From a URL; Elastic-R Java Workbench; Plugins Repository (myPlugin, myDashboard)

  30. Demo

  31. Links
      • Elastic-R Portal: www.elastic-r.org
      • Articles about the project:
        • Chine K. (2010). Open Science in the Cloud: Towards a Universal Platform for Scientific and Statistical Computing. In Handbook of Cloud Computing (Chapter 19). Springer US.
        • Karim Chine, "Learning Math and Statistics on the Cloud, Towards an EC2-Based Google Docs-like Portal for Teaching / Learning Collaboratively with R and Scilab," ICALT, pp. 752-753, 2010 10th IEEE International Conference on Advanced Learning Technologies, 2010
        • Karim Chine, "Scientific Computing Environments in the age of virtualization, toward a universal platform for the Cloud," pp. 44-48, 2009 IEEE International Workshop on Open Source Software for Scientific Computation (OSSC), 2009
        • Karim Chine, "Biocep, Towards a Federative, Collaborative, User-Centric, Grid-Enabled and Cloud-Ready Computational Open Platform," eScience, pp. 321-322, 2008 Fourth IEEE International Conference on eScience, 2008
      • LinkedIn Group: http://www.linkedin.com/groups?home=&gid=2345405

  32. Acknowledgments
      • ACS: Madi Nassiri
      • Amazon: Simone Brunozzi, Deepak Singh
      • AT&T Research Labs: Simon Urbanek
      • ATUGE: Imen Essafi, Béchir Tourki, Ilyes Gouja, Hatem Hachicha, Amine Elleuch
      • Auckland Centre for eResearch: Nick Jones
      • Banca d'Italia: Giuseppe Bruno
      • Bio-IT World: Kevin Davies
      • BNP Paribas: Ousseynou Nakoulima
      • Cambridge Healthtech Institute: Cindy Crowninshield
      • City University of New York: Mario Morales, Makram Talih
      • Columbia University: Omar Besbes
      • Dassault Systèmes: Omri Ben Ayoun, Patrick Johnson
      • Dataspora: Michael E. Driscoll
      • EDF: Alejandro Ribes
      • EBI: Alvis Brazma, Wolfgang Huber, Kimmo Kallio, Misha Kapushesky, Michael Kleen, Alberto Labarga, Philippe Rocca-Serra, Ugis Sarkans, Kirsten Williams, Eamonn Maguire
      • EPFL: Darlene Goldstein
      • ESPRIT: Farouk Kammoun, Tahar Benlakhdar
      • e-Taalim: Nadhir Douma
      • ETH Zürich: Yohan Chalabi, Diethelm Würtz, Martin Mächler
      • European Commission: Konstantinos Glinos, Enric Mitjana, Monika Kacik, Ioannis Sagias
      • FHCRC: Martin Morgan, Nianhua Li, Seth Falcon
      • Google: Olivier Bosquet
      • FVG LLC: Lisa Wood
      • Harvard University: Tim Clark, Sudeshna Das, Douglas Burke, Paolo Ciccarese
      • IBM: Jean-Louis Bernaudin, Pascal Sempe, Loic Simon, Lea A Deleris, Alex Fleischer, Alain Chabrier
      • Imperial College London: Asif Akram, Vasa Curcin, John Darlington, Brian Fuchs
      • Indiana University: Michael Grobe
      • INRIA: David Monteau, Christian Saguez, Claude Gomez, Sylvestre Ledru
      • JISC: John Wood, David Flanders
      • Johnson & Johnson - Janssen Pharmaceutica: Patrick Marichal
      • KXEN: Eric Marcade
      • Lancaster University: Robert Crouchley, Daniel Grose
      • Leibniz Universität Hannover: Kornelius Rohmeier
      • LIAMA: Baogang Hue, Kang Cai
      • Limagrain: Zivan Karaman
      • Mekentosj: Alexander Griekspoor, Matt Wood
      • Microsoft: Eric Le Marois, Tony Hey
      • Mubadala: Ghazi Ben Amor
      • Nature Publishing Group: Ian Mulvany, Steve Scott
      • NCeSS: Peter Halfpenny, Rob Procter, Marzieh Asgari-Targhi, Alex Voss, YuWei Lin, Mercedes Argüello Casteleiro, Wei Jie, Meik Poschen, Katy Middlebrough, Pascal Ekin, June Finch, Farzana Latif, Elisa Pieri, Frank O'Donnell
      • New York Java User Group: Frank D Greco
      • OeRC: Dimitrina Spencer, Matteo Turilli, David Wallom, Steven Young
      • OMII-UK: Neil Chue Hong, Steve Brewer
      • OpenAnalytics: Tobias Verbeke
      • Oracle: Dominique van Deth, Andrew Bond
      • OSS Watch: Ross Gardler
      • Platform Computing: Christopher Smith
      • Royal Society: James Wilsdon
      • San Diego Supercomputer Center: Nancy R. Wilkins-Diehr
      • Sanger Institute: Lars Jorgensen, Phil Butcher
      • Shell: Wayne W. Jones, Nigel Smith
      • Société Générale: Anis Maktouf
      • Stanford University: John Chambers, Balasubramanian Narasimhan, Gunter Walther
      • SYSTEM@TIC: Karim Azoum
      • Technische Universität Dortmund: Uwe Ligges, Bernd Bischl
      • Technoforge: Pierre-Antoine Durgeat
      • Tekiano: Samy Ben Naceur
      • Télécom-ParisTech: Isabelle Demeure, Georges Hebrail, Nesrine Gabsi
      • The Generations Network: Jim Porzak
      • Total: Yannick Perigois
      • Tunisian Ministry of Communication Technologies: Naceur Ammar, Lamia Chaffai-Sghaier, Mohamed Saïd Ouerghi, Syrine Tlili
      • Tunisian Ecole Polytechnique: Riadh Robbana
      • UC Berkeley: Noureddine El Karoui, Terry Speed
      • UC Davis: Rudy Beran, Debashis Paul, Duncan Temple Lang
      • UCL: Daniel Jeffares
      • UCLA: Ivo Dinov, Jeroen Ooms
      • UC San Diego: Anthony Gamst
      • UCSF: Tena Sakai
      • Université Catholique de Louvain: Christian Ritter
      • University of Cambridge: Ian Roberts, Robert MacInnis, Peter Murray-Rust, Jim Downing
      • University of Manchester: Carole Goble, Len Gill, Simon Peters, Richard D Pearson, Iain Buchan, John Ainsworth
      • University of Plymouth: Paul Hewson
      • University of Split: Ivica Puljak
      • UTK: Ajay Ohri
      • World Bank Group-IFC: Oualid Ammar
      • Yahoo: Laurent Mirguet, Rob Weltman
      • Independent: Charles Dallas, Romain François
