09:00 - 09:30 Jupyter Notebook and Ecosystem 30', Hans Fangohr (European XFEL GmbH) • 09:30 - 10:00 Status updates from facilities • Jupyter at Synchrotron Soleil (Alain Buteau & Gwenaelle Abeille) 5' • Jupyter at CERN (Manuel Gonzalez & Jakub Wozniak) 5' • Jupyter at European Southern Observatory (Gianluca Chiozzi) 5' • Jupyter at J-PARC MLF (Kentaro Moriyama) 5’ • Jupyter at MAX IV Synchrotron (Vincent Hardion) 5’ • 10:00 - 10:30 Coffee break • 10:30 - 12:30 JupyterLab Tutorial 2h0', Saul Shanabrook (Quansight) • 12:30 - 14:00 Lunch and networking • 14:00 - 14:30 Jupyter for processing neutron event data at ESS 30', Dr. Jonathan Taylor (European Spallation Source) • 14:30 - 15:00 Jupyter at Brookhaven National Laboratory 30', Dr. Daniel B. Allan (Brookhaven National Lab) • 15:00 - 15:20 Jupyter for Accelerator Physics 20', Dr. Jonathan Edelen (RadiaSoft LLC) • 15:30 - 16:00 Coffee break • 16:00 - 17:00 Questions and answers, discussion, Hans Fangohr (European XFEL GmbH)
Introduction Jupyter notebook and ecosystem Hans Fangohr Data Analysis European X-ray Free Electron Laser (EuXFEL), Germany Professor of Computational Modelling University of Southampton, United Kingdom hans.fangohr@xfel.eu @ProfCompMod Jupyter Workshop, ICALEPCS 2019, New York, US, Saturday 5 October 2019 https://indico.desy.de/indico/event/23354
Outline • Quick introduction to Jupyter Notebook • Introduce a number of use cases (most from European XFEL) • Introduce a number of tools from Project Jupyter Format (for the day) • Informal • Ask questions to keep it informal • and to guide the presenters towards what is useful • Slides will be made available on the workshop site (speakers: please send them to Hans Fangohr)
Jupyter Notebook • Document hosted in web browser (demo) • Combines • text (markdown with LaTeX support) • Computer code (Python) • Output from code • Saved in one document • *.ipynb [IPYthonNoteBook] • Combines input and output cells • JSON format
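The "JSON format" point can be made concrete with a minimal sketch that uses only the Python standard library. It assumes the version 4 notebook layout (a top-level `cells` list in which each code cell keeps its `source` next to its stored `outputs`); the cell contents and the helper name `summarise` are invented for illustration.

```python
import json

# A minimal notebook in the version 4 layout: one markdown cell and one code
# cell whose stored output sits next to its input (contents are made up).
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# Demo\n", "Euler's identity: $e^{i\\pi} + 1 = 0$"],
        },
        {
            "cell_type": "code",
            "execution_count": 1,
            "metadata": {},
            "source": ["print(6 * 7)"],
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["42\n"]}
            ],
        },
    ],
}

# Serialise the way a *.ipynb file stores it, then read it back.
text = json.dumps(notebook, indent=1)
loaded = json.loads(text)


def summarise(nb):
    """Return (cell type, source, stored stream text) for every cell."""
    rows = []
    for cell in nb["cells"]:
        source = "".join(cell["source"])
        outputs = "".join(
            "".join(out.get("text", [])) for out in cell.get("outputs", [])
        )
        rows.append((cell["cell_type"], source, outputs))
    return rows


for cell_type, source, outputs in summarise(loaded):
    print(f"{cell_type:8} | {source!r} -> {outputs!r}")
```

Because input and output live in the same JSON document, a notebook can be shared and read without re-running any code.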
[Figure: annotated screenshot of a notebook. Labels: toolbar, notebook name, kernel name, Markdown cell, currently selected cell, code cell, code output]
History of Jupyter • IPython → IPYthon NoteBook → *.ipynb • IPython notebooks were language agnostic, as they run over open network protocols • The community began adding other languages to the notebooks, starting with Julia and R • As the project expanded away from just Python, the notebooks had to be renamed • JUlia, PYThon, and R → Jupyter • “Jupyter” is also a homage to Galileo’s notebooks which recorded the discovery of Jupiter’s moons
How does it work? • Starting jupyter-notebook starts the notebook server • Opening a notebook starts up the kernel • The kernel communicates to the notebook server via ZMQ • The notebook server shows content via the browser • JavaScript used in Browser
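As a rough illustration of what travels between the notebook server and the kernel, the sketch below builds and signs one `execute_request` message following the Jupyter messaging protocol as I understand it: four JSON dictionaries (header, parent header, metadata, content) protected by an HMAC-SHA256 signature. No ZMQ socket is opened; the list of byte frames is what would be handed to a multipart send. The key, session name and user name are placeholders, not values from the talk.

```python
import hashlib
import hmac
import json
import uuid
from datetime import datetime, timezone

DELIMITER = b"<IDS|MSG>"  # separates ZMQ routing identities from the message
PARTS = ("header", "parent_header", "metadata", "content")


def make_execute_request(code, session):
    """Build the four dictionaries of an execute_request message."""
    header = {
        "msg_id": uuid.uuid4().hex,
        "session": session,
        "username": "demo",  # placeholder user name
        "date": datetime.now(timezone.utc).isoformat(),
        "msg_type": "execute_request",
        "version": "5.3",
    }
    content = {
        "code": code,
        "silent": False,
        "store_history": True,
        "user_expressions": {},
        "allow_stdin": False,
        "stop_on_error": True,
    }
    return {"header": header, "parent_header": {}, "metadata": {}, "content": content}


def sign(frames, key):
    """HMAC-SHA256 hex digest over the serialised dictionaries, in order."""
    mac = hmac.new(key, digestmod=hashlib.sha256)
    for frame in frames:
        mac.update(frame)
    return mac.hexdigest().encode("ascii")


def serialise(msg, key):
    """Return the byte frames that form one multipart message."""
    frames = [json.dumps(msg[part]).encode("utf-8") for part in PARTS]
    return [DELIMITER, sign(frames, key)] + frames


def verify(frames, key):
    """Check the signature the way a receiver would."""
    signature, payload = frames[1], frames[2:]
    return hmac.compare_digest(signature, sign(payload, key))


key = b"placeholder-secret"  # in practice taken from the kernel's connection file
frames = serialise(make_execute_request("print(6 * 7)", "demo-session"), key)
print(len(frames), "frames, signature valid:", verify(frames, key))
```

The kernel answers with messages of the same shape, which is why any language that can speak this protocol can act as a kernel.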
Supported Kernels • 50+ languages supported. • More complete list at • https://github.com/jupyter/jupyter/wiki/Jupyter-kernels
Use case 1: data analysis in notebook • Explorative data analysis • Convenient combination of processing, results and interpretation • Complete capture of all computational steps • good record for reproducibility and re-use • FAIR data • Through export to HTML, easy to share with collaborators & supervisors • Scientists are confident drivers of this • Example on the right from SCS instrument
Sharing and exporting notebooks • Can share *.ipynb files • Can be displayed using jupyter-notebook • Code can be re-executed on collaborator's machine • Only if software is available, and data is available, and collaborators know how to start notebook • “Static sharing” through html and email (for example) • Often sufficient – does not require installation or additional skills • Effective to communicate with supervisors, line managers etc • Convert using menu “File -> Download as -> html” • Or using nbconvert: $ jupyter-nbconvert --to html 2-widgets.ipynb [NbConvertApp] Converting notebook 2-widgets.ipynb to html [NbConvertApp] Writing 382609 bytes to 2-widgets.html
Use case 2: notebooks as recipes • Pre-populate notebook with cells to carry out a particular type of data analysis • Provide a directory full of such recipes to users • Users execute cells during beamtime and later • Convenient compromise between • Static recipe (=script) • Interactive exploration • Experience • Keep code in notebook cells short and • move functionality into library (here “ToolBox”) • Archive directory of modified recipes with data
Use case 4: notebooks as a script • Use Jupyter Notebook as a script • Can execute using nbconvert to take the commands in the notebook, execute them, and save the resulting notebook. • Can create data files and plots in the process. • Use case: detector calibration pipeline • Use of nbparameterise (or papermill) to insert run parameters into notebook before execution • Automatic execution and creation of pdf • Error messages embedded in output
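To illustrate what inserting run parameters before execution means, here is a standard-library sketch of the papermill-style convention as I understand it: a cell tagged `parameters` holds defaults, and a new cell with the overriding values is placed directly after it. This is a simplified illustration, not the implementation of nbparameterise or papermill; the function name `inject_parameters` and the parameter names `run_number` and `detector` are hypothetical.

```python
import copy


def inject_parameters(nb, params):
    """Return a copy of nb with an extra code cell that overrides the defaults.

    The new cell goes right after the first cell tagged "parameters";
    if no such cell exists it is placed at the top of the notebook.
    """
    nb = copy.deepcopy(nb)  # leave the template notebook untouched
    new_cell = {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {"tags": ["injected-parameters"]},
        "outputs": [],
        "source": [f"{name} = {value!r}\n" for name, value in params.items()],
    }
    for index, cell in enumerate(nb["cells"]):
        if "parameters" in cell.get("metadata", {}).get("tags", []):
            nb["cells"].insert(index + 1, new_cell)
            return nb
    nb["cells"].insert(0, new_cell)
    return nb


# Hypothetical template: defaults in a tagged cell, analysis code below it.
template = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {"tags": ["parameters"]},
            "outputs": [],
            "source": ["run_number = 0\n", "detector = 'none'\n"],
        },
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": ["print(run_number, detector)"],
        },
    ],
}

prepared = inject_parameters(template, {"run_number": 42, "detector": "demo"})
for cell in prepared["cells"]:
    print(cell["metadata"].get("tags", []), "".join(cell["source"]).strip())
```

The prepared notebook can then be written out as JSON and executed from the command line, so each run leaves behind a notebook that records both its parameters and its results.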
Executing a notebook from the command line (nbconvert) • Convert from ipynb to html: $ jupyter-nbconvert --to html 2-widgets.ipynb • Can optionally set output name: $ jupyter-nbconvert --to html --output myout.html 2-widgets.ipynb • Can execute notebook before conversion: $ jupyter-nbconvert --execute --to html --output myout.html 2-widgets.ipynb • Execute notebook and save results as notebook: $ jupyter-nbconvert --execute --to ipynb --output myout.ipynb 2-widgets.ipynb [NbConvertApp] Converting notebook 2-widgets.ipynb to ipynb [NbConvertApp] Executing notebook with kernel: python3 [NbConvertApp] Writing 103398 bytes to myout.ipynb
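When these commands are driven from a pipeline rather than typed by hand, a thin Python wrapper is one option. The sketch below only assembles the command lines shown above (using the `--execute`, `--to` and `--output` flags from the slide) and runs them if `jupyter-nbconvert` is found on the PATH; the function names are my own.

```python
import shutil
import subprocess


def nbconvert_command(notebook, to="html", output=None, execute=False):
    """Assemble a jupyter-nbconvert command line as a list of arguments."""
    cmd = ["jupyter-nbconvert"]
    if execute:
        cmd.append("--execute")  # run all cells before converting
    cmd += ["--to", to]
    if output is not None:
        cmd += ["--output", output]
    cmd.append(notebook)
    return cmd


def run_nbconvert(notebook, **options):
    """Run the command; raises if the tool is missing or the conversion fails."""
    cmd = nbconvert_command(notebook, **options)
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(f"{cmd[0]} not found on PATH")
    return subprocess.run(cmd, check=True, capture_output=True, text=True)


# Only print the command here; call run_nbconvert(...) to actually execute it.
print(" ".join(
    nbconvert_command("2-widgets.ipynb", to="ipynb", output="myout.ipynb", execute=True)
))
```

With `check=True`, a failing conversion raises `subprocess.CalledProcessError`, which a calling pipeline can catch and log.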
nbconvert • nbconvert is a tool which can convert notebooks to various other formats such as: • HTML, LaTeX, PDF, Markdown, ReStructuredText, Python, … • Static HTML pages can be created as documentation or tutorial • nbsphinx: a Jupyter notebook provides a section in Sphinx documentation • Homepage: https://github.com/jupyter/nbconvert • Notebook as a script approach • Can change parameters inside notebook before execution: • nbparameterise or papermill
Use case 5: (remote) data analysis environment (JupyterHub) • JupyterHub makes it possible to • let users connect through browser and https • use existing authentication systems • serve notebooks on facility hardware • connect to user’s file storage • Example: JupyterHub at EuXFEL & DESY • Uses Maxwell HPC cluster • Popular with users: • no software installation & browser of choice • works locally and remotely the same
Using remote resources: JupyterHub • In principle a user can ssh to an HPC resource and use port forwarding to connect the local machine with the HPC computer running the notebook server • JupyterHub helps orchestrate and manage individual Jupyter instances for multiple users • It provides an interface allowing users to easily spawn and connect to their own Jupyter server • Different options for resource allocation • Integrated with HPC scheduler • . . .
JupyterHub – manage individual Jupyter instances for multiple users
Use case 6: blending GUI and script • JupyterWidgets provide graphical control elements in notebook • Buttons, sliders etc trigger code execution and update of plot • Useful for • Data analysis of fixed type • Data exploration of data sets • Discussion • Less powerful than, for example, Qt GUI • Popular with users due to • Being embedded in notebook • No software installation (via JupyterHub)
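A minimal sketch of the widget idea, assuming the `ipywidgets` package and its `interact` function are available in the notebook environment. The function `damped_wave`, its parameters and the slider ranges are invented for illustration, and the sketch prints a summary line instead of drawing a plot so that it has no plotting dependency.

```python
import math


def damped_wave(frequency, damping, n=200):
    """Sample exp(-damping * x) * sin(frequency * x) at n points on [0, 10]."""
    xs = [10 * i / (n - 1) for i in range(n)]
    ys = [math.exp(-damping * x) * math.sin(frequency * x) for x in xs]
    return xs, ys


def show(frequency=2.0, damping=0.3):
    """Recompute the curve and report its peak; a real notebook would plot it."""
    xs, ys = damped_wave(frequency, damping)
    print(f"frequency={frequency}, damping={damping}, peak={max(ys):.3f}")


def build_ui():
    """Call inside a notebook cell: creates one slider per argument of show()."""
    from ipywidgets import interact  # needs ipywidgets installed

    # A (min, max, step) tuple of floats is turned into a float slider.
    return interact(show, frequency=(0.5, 5.0, 0.5), damping=(0.0, 1.0, 0.1))


show()  # plain call without any GUI
```

Moving a slider re-runs `show` with the new value, which is the "buttons and sliders trigger code execution" behaviour described above; the analysis function itself stays an ordinary Python function that can also be called from a script.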
Binder project • Given a (github) repository with • Jupyter notebooks • Software requirements (Dockerfile, requirements.txt, environment.yml) • Binder service • builds a container with the required software • starts Jupyter notebook server in that container offering the notebooks • Binder project provides free pilot at • https://mybinder.org • Institutional Binder instances are being deployed Example: https://github.com/fangohr/jupyter-demo
Use case 7: documenting software library • Use notebook as chapter in documentation • Supported by sphinx html, pdf as usual • Documentation easy to create: • enter commands in notebook • output is produced automatically • updating docs means re-running notebook • Can run regression test on documentation notebooks using NoteBookVALidate (nbval) • With Binder, can make documentation executable (for example DiscretisedField Tutorials)
nbval • py.test is a popular Python testing framework • nbval is a py.test plugin which lets py.test recognise and collect Jupyter notebooks • In each notebook, each cell is a test: • The test passes if execution of the input creates the stored output • The test fails otherwise • There is a variety of configuration parameters • Home page: https://github.com/computationalmodelling/nbval
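The "each cell is a test" rule can be sketched in a few lines of standard-library Python. This is a simplified illustration of the idea, not how nbval is implemented: it runs code cells with `exec` in a shared namespace and compares only text printed to stdout with the stored stream output. The helper names and the example notebook are invented.

```python
import contextlib
import io


def run_cell(source, namespace):
    """Execute one cell's source and return whatever it printed to stdout."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, namespace)
    return buffer.getvalue()


def check_notebook(nb):
    """Return (cell index, passed) for every code cell, in execution order."""
    namespace = {}  # shared between cells, like a kernel session
    results = []
    for index, cell in enumerate(nb["cells"]):
        if cell["cell_type"] != "code":
            continue
        stored = "".join(
            "".join(out.get("text", []))
            for out in cell.get("outputs", [])
            if out.get("output_type") == "stream"
        )
        fresh = run_cell("".join(cell["source"]), namespace)
        results.append((index, fresh == stored))
    return results


def code_cell(source, stdout=""):
    """Helper to build a code cell with an optional stored stdout output."""
    outputs = []
    if stdout:
        outputs.append({"output_type": "stream", "name": "stdout", "text": [stdout]})
    return {"cell_type": "code", "metadata": {}, "source": [source], "outputs": outputs}


example = {
    "cells": [
        code_cell("x = 21"),
        code_cell("print(x * 2)", "42\n"),
        code_cell("print(x + 1)", "23\n"),  # stale stored output: now prints 22
    ]
}

for index, passed in check_notebook(example):
    print(f"Cell {index}: {'PASSED' if passed else 'FAILED'}")
```

The third cell fails because its stored output no longer matches what the code produces, which is the kind of drift a regression test on documentation notebooks is meant to catch.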
nbval example
$ py.test --verbose --nbval 2-widgets.ipynb
============================= test session starts ==============================
platform darwin -- Python 3.6.8, pytest-3.10.0, py-1.7.0, pluggy-0.8.0 -- /Users/fangohr/anaconda3/bin/python
rootdir: /Users/fangohr/Desktop/jupyter-demo, inifile:
plugins: remotedata-0.3.1, openfiles-0.3.0, doctestplus-0.1.3, arraydiff-0.2, nbval-0.9.1
collected 9 items
2-widgets::ipynb::Cell 0 PASSED [ 11%]
2-widgets::ipynb::Cell 1 PASSED [ 22%]
2-widgets::ipynb::Cell 2 PASSED [ 33%]
2-widgets::ipynb::Cell 3 PASSED [ 44%]
2-widgets::ipynb::Cell 4 PASSED [ 55%]
2-widgets::ipynb::Cell 5 PASSED [ 66%]
2-widgets::ipynb::Cell 6 PASSED [ 77%]
2-widgets::ipynb::Cell 7 PASSED [ 88%]
2-widgets::ipynb::Cell 8 PASSED [100%]
===================== 9 passed, 1 warning in 2.11 seconds ======================
Use case 8: reproducible publication • Create github repository to complement publication • Create one notebook per figure / main result • Define software environment in github repository using Binder syntax • Close to reproducible publication: • fully specified software environment • fully specified data analysis • Data access: see Andy Götz's talk, PaNOSC • Zenodo for long term preservation • Create Zenodo deposit for repository • Cite Zenodo DOI in publication • Example: https://github.com/maxalbert/paper-supplement-nanoparticle-sensing
Jupyter Lab • Next generation Jupyter notebook interface • Window manager embedded in browser • “classic notebook” in one window • Additional features in other windows • File browser • Extensions • CSV viewer
[Figure: JupyterLab screenshot. Labels: multiple tabs, multiple panels, explore files (e.g. CSV), Table of Contents extension, dark theme]
Jupyter Notebook vs. JupyterLab • JupyterLab is the newer interface to the notebooks • It provides a more flexible interface closer to a modern IDE • More on this later in the day • Flexible layouts and panes • More flexible console/text editor • Drag/Drop/Expand/Collapse cells • Themes • Extensions
nbdime – NoteBook DIff and MErge • Jupyter notebooks are rich media documents, stored as plain text JSON files • Basic diff and merge tools (such as those used by git) do not handle this format well • Small changes to text/plots result in unreadable diffs • nbdime provides multiple tools to help with “content-aware” diffing and merging for Jupyter notebook files • nbdiff: compare notebooks in a terminal-friendly way • nbmerge: three-way merge of notebooks with automatic conflict resolution • nbdiff-web: shows you a rich rendered diff of notebooks • nbmerge-web: gives you a web-based three-way merge tool for notebooks • nbshow: present a single notebook in a terminal-friendly way • Homepage: https://github.com/jupyter/nbdime
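Why a content-aware diff helps can be shown with a small standard-library sketch. It compares two notebooks cell by cell and looks only at the cell sources, so re-running a notebook (which changes execution counts and stored outputs) produces no diff at all. This illustrates the principle only and is not nbdime's algorithm; the helper names and example cells are invented.

```python
import difflib


def cell_sources(nb):
    """Source text of every cell, ignoring outputs, counts and metadata."""
    return ["".join(cell["source"]) for cell in nb["cells"]]


def source_diff(old, new):
    """List the cell-level changes between two notebooks."""
    old_src, new_src = cell_sources(old), cell_sources(new)
    matcher = difflib.SequenceMatcher(a=old_src, b=new_src, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((tag, old_src[i1:i2], new_src[j1:j2]))
    return changes


def code_cell(source, count, text):
    """Build a code cell with an execution count and one stored stdout output."""
    return {
        "cell_type": "code",
        "execution_count": count,
        "metadata": {},
        "source": [source],
        "outputs": [{"output_type": "stream", "name": "stdout", "text": [text]}],
    }


original = {"cells": [code_cell("import math", 1, ""), code_cell("print(math.pi)", 2, "3.14\n")]}
rerun = {"cells": [code_cell("import math", 7, ""), code_cell("print(math.pi)", 8, "3.141592653589793\n")]}
edited = {"cells": [code_cell("import math", 1, ""), code_cell("print(math.e)", 2, "2.71\n")]}

print("re-run only:", source_diff(original, rerun))
print("edited cell:", source_diff(original, edited))
```

A line-based diff of the raw JSON would report changes in both cases; the cell-level comparison reports only the edited cell.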
Summary – tools • Jupyter Notebook • Jupyter Lab – next generation Jupyter user interface • Jupyter Hub – serving notebooks from compute facility for multiple users • NBDIME – DIffing and MErging tools • NBVAL – VALidation tool; use each cell as a test • NBCONVERT – conversion of notebooks to other formats & execution • nbparameterise and Papermill – inject parameters into notebook files • IPyWidgets – GUI-like elements in notebook • Binder – Cloud hosted execution of notebooks from github repositories • There is much more.
Summary – use cases • Use cases • Data analysis • Provision of recipes • Notebook-as-a-script • Remote data analysis (JupyterHub) • Mixing GUI and script-driven analysis • Documentation of Software • Reproducibility • . . .
Summary • Jupyter notebook and ecosystem provides many options • Acknowledgements • Authors and co-authors of ICALEPCS contribution TUCPR02 (Tuesday 14:30) • Robert Rosca and Thomas Kluyver • OpenDreamKit Horizon 2020, European Research Infrastructures project (#676541), http://opendreamkit.org • PaNOSC: Photon and Neutron Open Science Cloud, European Union’s Horizon 2020 research and innovation programme under grant agreement No 654220 • The Gordon and Betty Moore Foundation through Grant GBMF #4856, by the Alfred P. Sloan Foundation and by the Helmsley Trust. • EPSRC's Centre for Doctoral Training in Next Generation Computational Modelling, http://ngcm.soton.ac.uk (#EP/L015382/1),