1 / 17

Karim Chine, Cloud Era Ltd 13 December 2013

e-Age 2013. Elastic-R. , T owards a universal platform for research and education in the cloud. Karim Chine, Cloud Era Ltd 13 December 2013. Outline. Introduction

tom
Download Presentation

Karim Chine, Cloud Era Ltd 13 December 2013

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. e-Age 2013 Elastic-R , Towards a universal platform for research and education in the cloud Karim Chine, Cloud Era Ltd 13 December 2013

  2. Outline • Introduction • Elastic-R: Rethinking virtual research and teaching • Demo • Conclusion

  3. Introduction • Science and the 4th paradigm • The e-Research dancing bears Experimental Science Theoretical Science Computational Science e-Science / Data-intensive Science The townspeople gather to see the wondrous sight as the massive, lumbering beast shambles and shuffles from paw to paw. The bear is really a terrible dancer, and the wonder isn't that the bear dances well but that the bear dances at all.

  4. Introduction The rise of data science Data science incorporates varying elements and builds on techniques and theories from many fields, including math, statistics, data engineering, pattern recognition and learning, advanced computing, visualization, uncertainty modeling, data warehousing, and high performance computing with the goal of extracting meaning from data and creating data products * http://en.wikipedia.org/wiki/Data_science

  5. Introduction Fragmentation and friction in the data science arena www.scipy.org www.python.org www.scilab.org • Open-source (GPL) software environment for statistical computing and graphics • Lingua franca of data analysis. • Repositories of contributed R packages related to a variety of problem domains in life sciences, social sciences, finance, econometrics, chemo metrics, etc. are growing at an exponential rate. • R is Super Glue www.sagemath.org www.wolfram.com www.mathworks.com office.microsoft.com www.spss.com www.sas.com http://root.cern.ch

  6. Introduction The Next Generation Data Science Platform Arduino/ Raspberry pi Democratizing electronics Elastic-R Democratizing data science

  7. Introduction The cloud and its capabilities • 3D printers are becoming a common place: • Creating three-dimensional solid object of virtually any shape from a digital model. • Scripting the physical world • Sharing physical reality on Facebook is now as easy as sharing your holiday pictures

  8. Elastic-R: Towards a universal platform for data science Computational Components R packages, Wrapped C,C++,Fortran code, Python modules, MatlabToolkits… Open source or commercial ComputationalResources Clusters, grids, private or public clouds Free or pay-per-use ComputationalGUIs HTML5 and Desktop Workbench Built-in views/Plugins /Collaborative views Open source or commercial Computational Storage Local, NFS, FTP, Amazon S3, EBS Computational Scripts R / Python / Matlab / Groovy ComputationalAPIs Java / SOAP / REST, Stateless and stateful GeneratedComputational Web Services Stateful or stateless, mappingof R objects/functions

  9. Elastic-R: Towards a universal platform for data science Robot submarine dives to the deepest part of the ocean controlled by a 7-mile cable as thin as single human hair

  10. Elastic-R: Towards a universal platform for data science

  11. Elastic-R: Towards a universal platform for data science Public Clouds Private Cloud

  12. Elastic-R: Design and technologies overview Elastic-R AMI 1 R 2.10 BioC 2.5 Elastic-R AMI 2 R 2.9 BioC 2.3 Elastic-R AMI 2 R 2.9 BioC 2.3 Elastic-R AMI 3 R 2.8 BioC 2.0 Elastic-R EBS 4 Data Set VVV Elastic-R Amazon Machine Images Elastic-R EBS 1 Data Set XXX Elastic-R EBS 4 Data Set VVV Elastic-R.org Eastic-R AMI 2 R 2.9 BioC 2.3 Elastic-R EBS 3 Data Set ZZZ Elastic-R EBS 2 Data Set YYY Elastic-R EBS 4 Data Set VVV Amazon Elastic Block Stores

  13. Elastic-R: The scriptability framework Command Line API Web Console SDK

  14. Elastic-R: The scriptability framework Command Line API Web Console SDK

  15. Demo • Register to Elastic-R academic and trial portal (www.elasticr.com ) • Create datascience engines using trial tokens • Work with R, Python and scientific Spreadsheets in the browser • Share Data Science Engine and Collaborate • Use The Visual and Collaborative Scientific Applications designer to create and publish to the web an interactive dashboard • Connect to the remote Data Science Engine from withing a local R session, push and pull data, execute commands and show impact on the dashboard

  16. Conclusion • Elastic -R unlocks the potential of the cloud for Data scientists and educators • With Elastic-R, the cloud becomes a cyberspace for collaborative research and sharing and an eco-system suited for open Science, open innovation and open education • Elastic-R improves dramatically the productivity of the data scientists: The entire data science factory chain, from resources acquisition to services and applications publishing, becomes under their direct control • Elastic-R provides Analytics-as-a-Service platform that can extend any existing portal or application

  17. Contact details Karim Chine karim.chine@cloudera.co.uk

More Related