
Developing JWST Pipelines at STScI


Presentation Transcript


1. Developing JWST Pipelines at STScI
Robert Jedrzejewski

2. Who we are
• The Science Software Branch at STScI
• 16 members
• Most have an astronomy background
• 6 have PhDs
• Combined experience in group: 125 years
• Combined experience at STScI: 200 years

3. What we do
• Develop HST calibration pipelines
• STSDAS/TABLES
• PyRAF, PyFITS, STScI_Python
• HST Exposure Time Calculators
• Other smaller projects (Gemini/GOODS/Hubble Legacy Archive/GoogleSky/JWST Backplane Stability…)

4. Development Experience
• Python
• Java
• C/C++
• Fortran
• spp/cl
• IDL
• (Perl/Assembly/Tcl…)

5. Our preferred development model
• Python!
• We find we can be extremely productive writing in Python
• Speed is occasionally an issue, so we use C extensions when necessary
• Very little pipeline code requires performance optimization

6. Development style
• Use version control (Subversion)
• Use regression tests + nightly builds + web reporting tools
• Trac for problem tracking / wiki for information dissemination
• Unit/doc tests (see the sketch after this slide)
• Multiple platforms (Linux/Mac/Solaris/Windows)
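As one minimal, hypothetical illustration of the "unit/doc tests" style above: with Python's doctest module, the examples embedded in a docstring double as documentation and as a regression test. The function name is invented for the example.

```python
def subtract_bias(bias_level, pixel_value):
    """Subtract a constant bias level from a pixel value.

    >>> subtract_bias(100.0, 342.5)
    242.5
    """
    return pixel_value - bias_level

if __name__ == "__main__":
    import doctest
    doctest.testmod()   # runs every example embedded in the docstrings
```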

7. How we did HST pipelines
• Calfoc, calfos, calhrs, calwfpc, calwp2
  • First-generation pipelines, written in spp; read GEIS files
• Calstis, calnic(a/b)
  • Second generation, written in C using hstio (which wraps the IRAF imio libraries) to read multiple-extension FITS files
• Calacs
  • Borrowed much code from calstis imaging
• Calwfc3
  • Borrowed much code from calacs and calnic
• Calcos
  • Third generation, written in Python (+ C where needed)
• Later pipelines were more likely to be used by IDTs for calibrating ground-test data

8. More on HST pipelines
• Pipeline operation is data-driven
• Calibration steps as header keywords:
  • FLATCORR=PERFORM/OMIT/COMPLETE/SKIPPED
• Reference file names as header keywords:
  • FLATFILE=oref$g2342212_flt.fits
• This decouples some of the intelligence from the code
• No need to rebuild code if a step or reference file changes
• (A sketch of this pattern follows below)
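Here is a minimal sketch of that keyword-driven pattern using PyFITS (one of our packages, listed on slide 3). The apply_flat() helper, the use of a "SCI" extension, and the treatment of the reference-file path are assumptions for illustration, not real pipeline code.

```python
import pyfits

def apply_flat(hdulist, flatfile):
    # Hypothetical step body: divide the science array by the flat-field
    # image. Resolving the oref$ path prefix is omitted here.
    flat = pyfits.getdata(flatfile)
    hdulist["SCI"].data /= flat

def run_flatcorr(filename):
    """Run the flat-field step only if the header asks for it."""
    hdulist = pyfits.open(filename, mode="update")
    header = hdulist[0].header
    if header["FLATCORR"] == "PERFORM":
        apply_flat(hdulist, header["FLATFILE"])  # e.g. oref$g2342212_flt.fits
        header["FLATCORR"] = "COMPLETE"          # record that the step ran
    hdulist.close()                              # flushes updates to disk
```

Because the step switch and the reference-file name both live in the header, a change to either requires no code rebuild, which is the decoupling the slide describes.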

9. Multidrizzle
• Multidrizzle is used by the ACS and WFPC2 pipelines to combine images with small position offsets (dithered images), removing cosmic rays
• It is a Python application that can be used with ACS, STIS, WFPC2, NICMOS, and WFC3 data
• This breaks from our ‘tradition’ of having one calibration pipeline program per instrument

10. How we see the JWST Pipelines
• A series of calibration steps: Input stage → Calibration Step (+ Reference File) → Output stage
• (A minimal sketch of this chain follows below)
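The diagram translates almost directly into Python. This is a hypothetical sketch of the chaining, not real pipeline code; every name in it is illustrative.

```python
def run_pipeline(stage, steps, reference_files):
    """Feed each output stage into the next calibration step."""
    for step in steps:
        ref = reference_files[step.__name__]  # reference file for this step
        stage = step(stage, ref)              # output stage becomes next input
    return stage

# e.g. run_pipeline(raw_data, [maskcorr, flatcorr], refs)
```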

11. Early design ideas
• No need to have separate pipeline programs for each JWST instrument
  • Many calibration steps depend on the detector, and JWST instruments use detectors of the same type
  • We can use the same code, instead of having to replicate it (and maintain it) in more than one place
• Some calibration steps will probably be identical for all JWST data (e.g. the MASKCORR step, where a static mask from a reference file is applied to the DQ array of the data; sketched below)
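A MASKCORR-like step really can be small enough to share unchanged across instruments. In this sketch the array names are assumptions, not the real JWST data layout.

```python
import numpy as np

def maskcorr(dq, static_mask):
    """OR a static bad-pixel mask from a reference file into the DQ array."""
    return np.bitwise_or(dq, static_mask)   # existing flags survive; none are cleared
```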

12. Try not to make the mistakes we made with HST
• Use the same keywords for the same quantities
• Use the same file/association structure
• Use the same algorithms to do the same calibration
  • Unless a team shows that a given algorithm does not work for their instrument
  • Even then, try to keep as much code common as possible, breaking out only the code that differs
  • Sometimes it is possible to encapsulate the differences in the reference files, keeping the code the same

13. JWST Pipelines (continued…)
• Python gives us object-oriented capabilities
• ‘input_stage’ and ‘output_stage’ are objects that encapsulate information on their state and on how to calibrate themselves
  • For example, they might be NIRSpec IFU data objects, or MIRI imaging data objects
• When executing a given step, they may use their own custom method, or else defer to a method that they inherit from a more ‘generic’ datatype
  • E.g. MIRI imaging data and NIRCam imaging data may both use the flatfield() method of the JWSTImagingData class, from which they both inherit (see the sketch below)
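A sketch of that inheritance idea, assuming a simple class layout; only the JWSTImagingData name and flatfield() method come from the slide, and the rest is illustrative.

```python
import numpy as np

class JWSTImagingData:
    """Generic imaging data type holding a pixel array."""
    def __init__(self, data):
        self.data = np.asarray(data, dtype=np.float64)

    def flatfield(self, flat):
        self.data /= flat              # generic flat-field division

class MIRIImagingData(JWSTImagingData):
    pass                               # inherits the generic flatfield()

class NIRCamImagingData(JWSTImagingData):
    pass                               # inherits the generic flatfield()

class NIRSpecIFUData(JWSTImagingData):
    def flatfield(self, flat):
        # a mode that needs special handling overrides the generic method
        raise NotImplementedError("IFU flat-fielding sketch omitted")
```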

14. JWST Pipelines (continued…)
• The inheritance hierarchy encapsulates information about what is the same and what is different about JWST data types
• We can mix in behaviors from different types of object, as necessary (see the sketch below)
• But, to the extent possible, we try to keep as much as possible the same
• The people who inherit this project will thank us
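One way such behaviors might be mixed in, as a self-contained hypothetical sketch; all class names here are invented.

```python
class ImagingData:
    """Stand-in for a generic data class."""
    def flatfield(self, flat):
        pass    # body omitted in this sketch

class DitherMixin:
    """Adds dither-combination behavior without touching the main hierarchy."""
    def combine_dithers(self, exposures):
        pass    # body omitted in this sketch

class DitheredImagingData(DitherMixin, ImagingData):
    # gets flatfield() from ImagingData and combine_dithers() from the mixin
    pass
```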

15. What goes in?
• IDTs and instrument teams at STScI will figure out:
  • Which steps are needed, and their ordering
  • Which instruments/modes use the steps
  • What each step does
  • What calibration reference data are needed
  • What tests the code needs to pass

16. Facilitating the process
• Calibration data will be in a “public” repository
• This will include:
  • Code
  • Test data
  • Documentation

17. Facilitating…
• We will encourage everyone to try out our algorithms as we develop them
• And we encourage everyone to contribute their own algorithms
• We’ll keep teams synchronized through versioning and by providing different builds
  • E.g. Team A may still be testing build X when Team B needs to test the next stage of functionality in build X.1
  • When Team B is ready to test the functionality in build X.1, there may already be a build X.2 (which includes the functionality of build X.1 as well as new functionality)
• In the end, all the teams will test the same code

18. Facilitating
• How do we know that the code does the ‘right’ thing?
• Teams provide test data with test results
  • Then we know that the result is correct because it reproduces team-supplied answers
• Test results could be actual data (e.g. FITS files)
  • Pixels in pipeline-calibrated data should be identical within +/-
• Or results of analysis
  • Aperture photometry should be the same to within +/-
• (A sketch of such a tolerance check follows below)
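A sketch of what such a tolerance check might look like. The slide leaves the actual +/- unspecified, so the tolerance value below is a placeholder, not a project requirement.

```python
import numpy as np

def pixels_match(calibrated, truth, tol=1e-6):
    """Compare pipeline output against team-supplied truth data."""
    return np.allclose(calibrated, truth, atol=tol)   # tol is a placeholder
```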

19. Interfacing with other languages
• If teams develop code that does a lot of fancy processing, we can try to include it by wrapping
• Python talks to C/C++ using C extensions
  • An existing C function can be wrapped so that Python objects can be passed to C/C++, and C objects passed back to Python
• We can wrap relatively simple C functions
  • Arguments are arrays or primitive datatypes (integer/float/string…)
  • No objects as arguments
  • Structs are OK, as long as they are simple (flat)
  • Play nice with memory
• (A sketch of the calling pattern follows below)
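The slide describes wrapping via C extensions; as one illustration of the same calling pattern (arrays and primitives in, results back), here is a hypothetical sketch using ctypes instead, a different but simpler mechanism. The shared-library name and the C function are invented.

```python
import ctypes
import numpy as np

# Hypothetical C side:  void boxcar_smooth(double *data, int n, int width);
lib = ctypes.CDLL("libcalsteps.so")           # invented library name
lib.boxcar_smooth.argtypes = [ctypes.POINTER(ctypes.c_double),
                              ctypes.c_int, ctypes.c_int]
lib.boxcar_smooth.restype = None

def boxcar_smooth(arr, width):
    """Pass a contiguous float64 array and primitive ints to C."""
    arr = np.ascontiguousarray(arr, dtype=np.float64)
    ptr = arr.ctypes.data_as(ctypes.POINTER(ctypes.c_double))
    lib.boxcar_smooth(ptr, arr.size, width)   # C modifies the buffer in place
    return arr
```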

20. Wishlists
• We don’t need to feel constrained by HST
  • What are the biggest deficiencies in HST?
• Best reference files and best calibration steps can be determined by querying a service
  • No need to rely on the HST archive to find these out
• Reference files can be downloaded as needed
• Even calibration code can be updated as needed (no need to wait 6 months for the next STSDAS release)

21. Wishlists
• Tell us what you want!
  • The earlier the better
  • Some aspects of the overall architecture are still flexible
• And not just pipeline calibration code
  • We are going to need tools for data analysis, evaluation, interpretation, visualization
  • Reference file generation
