160 likes | 347 Views
New Software for Ensemble Creation in the Spitzer-Space-Telescope Operations Database. Russ Laher and John Rector 2004 ADASS XIV Conference October 24 - 27, 2004. Preface.
E N D
New Software for Ensemble Creation in the Spitzer-Space-Telescope Operations Database Russ Laher and John Rector 2004 ADASS XIV Conference October 24 - 27, 2004
Preface • About one third of the 230 Spitzer data-processing pipelines require multiple input images (e.g., calibrations, image co-adds & mosaics) • Motivation is data noise reduction and/or statistical characterization of the data • Input images are grouped for particular pipeline processing into what we call “ensembles” in the operations database
Outline • Powerpoint Presentation • Introduction • Background • Purpose of Talk • Database storage of ensembles • Ensemble-creation rules • Ensemble-creation software • Conclusions • Future Work • URL of long version of paper http://spider.ipac.caltech.edu/staff/laher/sirtf/NewEnsembleCreation.pdf • Appendices • A. On-line software tutorial • B. Spitzer ensemble-creation rules • C. S/W output, test mode • D. S/W output, normal mode
Background • Spitzer rules for ensemble creation are well documented and under version control. • Spitzer pipeline-operator Ron Beck created the first version of a script for executing the ensemble-creation rules • Rules are hard coded (and therefore hard to change) • Direct SQL is used for DB access (open/close DB connection for each access) • New database-design improvements and software have been developed for increased speed and flexibility
Purpose of this Talk • To acquaint you with SSC methodologies for creating/storing ensembles, including • Database design • “Ensemble-creation” rules • Debut our new ensemble-creation software • New database tables and schema changes • New database stored functions • Identify general concepts used in creating/storing ensembles (for application to other astronomical missions)
Hierarchy of Spitzer Observations In “cluster” mode, there may be multiple exposures per cluster of observations (clusterPosNum) At scheduling time, the “pipeline picker” assigns to each DCE a pipeline for initial processing (initPlScriptId)
Miscellaneous Considerations • Ensembles can be created in the database after the observations are scheduled (it is not necessary to have received the actual DCEs from the spacecraft) • Wouldn’t it be nice to store with each ensemble in the database information about the “rule” applied in creating it?
Database Storage of Ensembles • There are three database tables for storing information about how (instances of) ensembles are defined (which DCEs are included and how they are to be processed) • DCEs are grouped explicitly into DCE sets (via association of dceIds with an dceSetId) • The type of pipeline ensemble processing to be done is stored with the ensemble (plScriptId is assocated with ensId)
Database Storage of Ensembles (cont.) • A DCE set is stored with one or more ensembles (dceSetId is associated with ensId) • An ensemble is characterized in the database by dceSetId and plScriptId • Two or more ensembles can be associated together for processing a set of ensembles by creating a new ensemble with NULL dceSetId and two or more associations in the ensembleSets database table
DB Storage of Ensemble Rules • There are two database tables for storing ensemble-creation rules • The ensRules database table specifies how DCEs are to be grouped • The ensPlScripts database table specifies how a set of DCEs is to be processed (by one or more different pipelines)
Features of ensembleCreation.pl • Much faster performance is expected because pre-compiled database stored functions are called • Efficient architecture: only a single database connection is needed • Software complexity is encapsulated in the database stored functions • Database-table-driven specification of ensemble-creation rules makes it flexible • On-line tutorial (lists options, switches, sample command lines) • Useful, thoughtfully-organized diagnostic outputs • Test mode to verify effect of ensemble-creation rule, without actually having to create ensembles in the database • Post-mortem debugging capability via direct SQL querying of database temporary tables
Conclusions • Increased speed in creating database records for ensembles is achieved by using database stored functions • Flexibility in adding/changing ensemble-creation rules is achieved by storing the rules in the database • Several “small improvements” were implemented, as well (e.g., storing the minimum number of DCEs with the ensemble-creation rule, storing the corresponding ruleId with each ensemble in the database)
Future Upgrades • Add new option to execute selected ensemble-creation rules • Specify comma-separated list of ruleIds • Application is augmenting existing set of ensembles • Add new option to create ensembleSets from existing ensembles • Specify ruleId and ensPlScriptId • Application is linking together existing ensembles (e.g., process the data for all reqKeys in a given 12-hour PAO to flag pixels with latent images)