HMI-AIA JSOC Science Data Processing (SDP) Readiness
Art Amezcua, JSOC Software Lead
AGENDA
• JSOC-SDP Overview
• JSOC-SDP Status (H/W, S/W)
• Pipeline processing
• Database
• Level-0, Level-1 and higher levels
• AIA Visualization Center
• JSOC-SDP Maintenance and CM
• Documentation
• Staffing
• Summary
JSOC Science Data Processing (SDP) Status
• JSOC-SDP supports both HMI and AIA through Level-1, and HMI through Level-2, science data products
• JSOC-SDP infrastructure is complete
• JSOC-SDP hardware is complete; upgrades in process
  • Database systems – warm standby system online in August 2009
  • Web server – upgrade online in August 2009
• JSOC-SDP software
  • Data Record Management System and Storage Unit Management System (DRMS/SUMS) complete as of March 2009
  • JSOC-SDP archive system is fully operational
• Software components needed to support commissioning
  • Level-0 image processing for both AIA and HMI is ready and was used to support Observatory I&T
  • Level-0 HK, FDS and other metadata merge – complete as of May 2009
  • Level-1 (science observables) – will be completed during commissioning
    • HMI Doppler and LOS Magnetic – 95% complete
    • HMI Vector Field Observables – 90% complete
    • AIA Level-1.5 Images – 50% complete
JSOC-SDP Status (continued)
• Software components needed to support the science mission:
  • Production Pipeline Manager – in development, expected during commissioning
  • HMI Level-2 (Version 1 of science data products)
    • Local Helioseismology – parallel work on “rings”, “time-distance”, and “holography” proceeding with basic capability, expected to be ready during commissioning
    • Global Helioseismology – ready for testing during commissioning
    • Magnetic Field standard products – ready for testing during commissioning
    • Vector Field disambiguation – 80% complete, with a preliminary product ready by end of commissioning (requires real data to proceed)
  • Export and Catalog Browse Tools
    • Functional but needs work (http://jsoc.stanford.edu/ajax/lookdata.html)
    • Refinements will continue
  • All science products need flight data during commissioning to complete development
• AIA Visualization Center (AVC) at Lockheed Martin
  • Higher-level AIA processing and science product generation
  • Heliophysics Event Knowledgebase (HEK)
• Summary: on schedule for L – 4 and Phase E – 6 months
HMI and AIA JSOC Overview
[Overview diagram: housekeeping and telemetry flow from White Sands and the GSFC/LMSAL MOC and DDS to the Stanford JSOC-SDP (redundant data capture system with 19-day archive, HMI JSOC pipeline processing system, catalog, primary/local/offline/offsite archives, high-level data import, data export and web service, quicklook viewing) and on to the JSOC-AVC AIA analysis system at LMSAL, the science team, forecast centers, EPO, and the public.]
JSOC-SDP Data Center
• Facility
  • Located in the climate-controlled basement of the Physics and Astrophysics building at Stanford
  • Important components on UPS and building back-up power; databases auto-shutdown on power outage
  • Critical components monitored for failure (telephone, email, webpage notification of issues)
• Components
  • 3 data-capture machines (plus one at the MOC)
  • 1 data-processing cluster (512 CPU cores, 64 nodes, queuing system)
  • 1 file- and tape-server machine
  • 3 database machines
  • 2 gateway machines (to MOC, to SDP)
  • 1 web-server machine
  • 2 LMSAL real-time (housekeeping) machines
• Network: OC3 lines from the DDS; 1 Gbps ethernet connects all components; high-speed (20 Gbps) interconnect between the data-processing cluster and the file- and tape-server; 10 Gbps link between the file- and tape-server and LMSAL
JSOC-SDP Major Components
[Block diagram: OC3 lines from the DDS feed the data capture system (HMI, AIA, and spare DCS machines, each with a 13 TB disk array and 50-slot tape unit); 1 Gbps ethernet connects the DRMS & SUMS database hosts (HMIDB, HMIDB2), the web/export server, and the 512-core pipeline cluster of dual quad-core x86-64 nodes; a 20 Gbps interconnect links the cluster to the file/tape server (400 TB disk plus 150 TB per year, 2200-slot LTO-4 tape library with 12 drives); a 10 Gbps link and the MOC link for real-time housekeeping connect, through a firewall, to the LMSAL monitoring machines (HMISDP-mon, AIASDP-mon) and local science workstations; the web server faces the outside world.]
Data Capture
• Telemetry files transmitted to data-capture (DCS) machines via two OC3 lines
  • One line for AIA (data from four cameras over two virtual channels)
  • One line for HMI (data from two cameras over two virtual channels)
• Three sets of telemetry:
  • DCS machines archive telemetry files to tape, driven to LMSAL twice a week and stored in a cabinet
  • Production processes on a dedicated cluster node ingest raw telemetry from DCS disk into hmi.tlm and aia.tlm
  • A dedicated cluster node creates Level-0 data from telemetry and stores it in DRMS/SUMS as hmi.lev0 and aia.lev0
• DCS acknowledges the DDS (once per day) once the offsite tape is in place and verified and the records in hmi.tlm are created
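The ingest path above can be sketched schematically. The real ingest_lev0 is a C pipeline module, so the routing table, the route() helper, and the filename pattern below are illustrative assumptions, not the actual code.

```python
# Illustrative sketch only: pair each .tlm telemetry file with its .qac
# quality file and route it by virtual channel, as described above
# (VC02/VC05 -> hmi.tlm, VC01/VC04 -> aia.tlm). The filename pattern
# "VCnn_*.tlm" is an assumption for this sketch.
SERIES_BY_VC = {"VC01": "aia.tlm", "VC04": "aia.tlm",
                "VC02": "hmi.tlm", "VC05": "hmi.tlm"}

def route(filenames):
    """Return (tlm_file, target_series) for every verified .tlm/.qac pair."""
    qacs = {f[:-4] for f in filenames if f.endswith(".qac")}
    routed = []
    for tlm in sorted(f for f in filenames if f.endswith(".tlm")):
        base = tlm[:-4]
        vc = base.split("_")[0]
        if base in qacs and vc in SERIES_BY_VC:  # ingest only verified pairs
            routed.append((tlm, SERIES_BY_VC[vc]))
    return routed

print(route(["VC02_001.tlm", "VC02_001.qac", "VC01_007.tlm"]))
# the HMI pair is routed to hmi.tlm; the AIA file lacks its .qac
```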
Data Capture Details
[Diagram: the two OC3 lines from the DDS pass through Cisco Systems switches to the HMI, AIA, and spare DCS machines; each DCS writes to a 13 TB (19-day) disk array and a tape archive whose tapes an operator hand-carries; a 10 Gbps private link connects the DCS machines to back-end pipeline processing; the MOC link feeds the LMSAL HMI and AIA monitoring stations; all components are on UPS with building generator backup.]
Data Capture Status
• Data-capture machines online as of January 2008
• Each is capable of caching 19 days of telemetry
• Tape drive is used to generate the offsite copy stored at LMSAL
• Pipeline system is used to generate the tlm copy and Level-0 data series
• All tapes are LTO-4 (800 GB)
JSOC-SDP Pipeline Hardware
• Data-processing cluster, file- and tape-server machine, and T950 tape library [COMPLETE as of July 2008]
Data-Processing Pipeline Status
• All machines fully tested for integrity with simulated data at realistic data rates and volumes
• All machines/components under service warranties with vendors
• Database machines have been online for four years (throughout DRMS development)
• Data-processing cluster, file- and tape-server machine, T950 tape robot, and tape systems went online in July 2008
• Upgrades (new machines onsite), in service in October 2009:
  • MOC web-access machine
  • solarport (gateway to SDP)
  • ftp server
  • web server
  • two database machines
AIA High-Level-Processing Hardware
• AIA Visualization Center (AVC) [COMPLETE as of June 2009]
  • 200+ TB Apple Xsan
  • 10 Gb link to Stanford
  • 10 Gb link to the LM science network
  • Quad HD display (3840x2160 pixels)
  • 9-panel HIPerSpace datawall/cluster (7680x4800 pixels)
  • 22-node SGI XE cluster
  • DRMS/SUMS interface for the LM cache
Pipeline Software – DRMS/SUMS
• Data Series
  • Related images and metadata are stored in “data series”
  • Rows are data records (e.g., one record per time step)
  • Columns are keywords, pointers to data files (in SUMS), and pointers to other data series
• Storage Unit Management System (SUMS)
  • Image files (e.g., FITS files) are stored in SUMS
  • Uses a PostgreSQL database
  • Sole client is DRMS
• Data Record Management System (DRMS)
  • Data series minus the image files
  • Implemented as a C library that wraps a PostgreSQL database
  • Has a FORTRAN interface
  • Scientists interact directly with DRMS
• NetDRMS
  • Network of DRMS/SUMS sites that share DRMS/SUMS data
  • DRMS data shared via RemoteDRMS, which uses Slony-1 to make data logs that are ingested at the remote site
  • Data files residing in SUMS shared via RemoteSUMS, which uses scp; integrates with the VSO so that data are obtained from the least-congested NetDRMS sites
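As a mental model of the data-series layout described above (records as rows; keywords and SUMS pointers as columns), here is a toy sketch. DRMS is actually a C library over PostgreSQL; the Record and DataSeries classes below are illustrative, not the real API.

```python
from dataclasses import dataclass, field

# Toy model of a DRMS data series: a table of records, each holding
# keyword values plus a pointer (SUDIR) to the storage unit in SUMS
# where the image files live. Class and field names are hypothetical.
@dataclass
class Record:
    keywords: dict   # e.g. {"FSN": 1000, "T_OBS": "..."}
    sudir: str       # SUMS storage-unit directory containing the FITS files

@dataclass
class DataSeries:
    name: str                                 # e.g. "hmi.lev0"
    records: list = field(default_factory=list)

    def query(self, **kw):
        """Return records whose keywords match all of the given values."""
        return [r for r in self.records
                if all(r.keywords.get(k) == v for k, v in kw.items())]

series = DataSeries("hmi.lev0")
series.records.append(Record({"FSN": 1000}, "/SUM0/D12345"))
print(series.query(FSN=1000)[0].sudir)   # -> /SUM0/D12345
```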
Level-0 Processing
• Reconstructs images from telemetry; no modification of CCD pixels [COMPLETE as of August 2008]
[Diagram: ingest_lev0 processes on cluster node cl1n001 read VC02*/VC05* .tlm and .qac files (HMI) over NFS from dcs1, and VC01*/VC04* files (AIA) from dcs0, recording filename and SUDIR in hmi.tlm and aia.tlm; reconstructed images (image.fits, image_sm.fits, image.png) are stored by FSN, with Level-0 keywords and SUDIR, in hmi.lev0 and aia.lev0 in SUMS. Each HMI VC02 or VC05 .tlm file holds ~16 images; each AIA VC01 or VC04 .tlm file holds ~24.]
Level-1 Processing [COMPLETE as of July 2009]
• Ancillary-data input and processing: fetch Level-0 keywords and segments; get readout-mode corrections; read flatfield (hmi.flatfield); interpolate predicted orbit vectors (sdo.fds_orbit_vectors); interpolate spacecraft pointing vectors (sdo.lev0_asd_003)
• Image-data input and processing: fetch the Level-0 image (hmi.lev0); remove overscan rows and columns; correct for gain and offset; ID bad pixels; calculate the image center; set quality; write the result to hmi.lev1
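The image-data corrections above can be illustrated with a toy function; the geometry, offset, and gain values below are hypothetical placeholders, not HMI's real calibration.

```python
# Toy sketch of the Level-1 image-data path: remove overscan rows and
# columns, correct for gain and offset, then zero flagged bad pixels.
# All constants here are invented for illustration.
def level1_image(image, overscan=2, offset=100.0, gain=2.0):
    # remove overscan rows & columns
    cropped = [row[overscan:-overscan] for row in image[overscan:-overscan]]
    # correct for gain & offset
    corrected = [[(px - offset) * gain for px in row] for row in cropped]
    # ID "bad" pixels (negative after correction, in this toy) and zero them
    bad = [[px < 0 for px in row] for row in corrected]
    cleaned = [[0.0 if b else px for px, b in zip(row, brow)]
               for row, brow in zip(corrected, bad)]
    return cleaned, bad

raw = [[150.0] * 8 for _ in range(8)]   # 8x8 frame with 2-pixel overscan
lev1, bad = level1_image(raw)
print(len(lev1), len(lev1[0]), lev1[0][0])   # 4 4 100.0
```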
HMI Higher-Level Processing Status
• Higher-Level Science Products
  • Internal Rotation Ω(r,Θ) [estimated complete as of August 2009]
  • Internal Sound Speed cs(r,Θ) [estimated complete as of August 2009]
  • Full-Disk Velocity v(r,Θ,Φ) [estimated complete as of December 2009]
  • Sound Speed cs(r,Θ,Φ) [estimated complete as of December 2009]
  • Carrington Synoptic Velocity Maps [estimated complete as of December 2009]
  • Carrington Synoptic Speed Maps [estimated complete as of December 2009]
  • High-Resolution Velocity Maps [estimated complete as of December 2009]
  • High-Resolution Speed Maps [estimated complete as of December 2009]
  • Deep Focus Maps [estimated complete as of July 2010]
  • Far-Side Activity Maps [estimated complete as of December 2009]
  • Line-of-Sight Magnetic Field Maps [COMPLETE as of July 2009]
  • Vector field inversion and direction disambiguation [estimated complete as of March 2010]
  • Vector Magnetic Field Maps [estimated complete as of April 2010]
  • Coronal Magnetic Field Extrapolations [COMPLETE as of July 2009]
  • Coronal and Solar Wind Models [estimated complete as of April 2010]
  • Brightness Images [estimated complete as of August 2009]
AVC Processing Status
• Event Detection System
  • Base system [COMPLETE as of August 2008]
  • System test with SAO/FFT module [COMPLETE as of July 2009]
  • Full system with initial modules [December 2009]
• Inspection & Display System
  • Production hardware installation [COMPLETE as of July 2009]
  • Control software integration [COMPLETE by August 2009]
  • Display software [COMPLETE by August 2009]
• Heliophysics Event Knowledgebase
  • Registries
    • Final server installed [COMPLETE May 2009]
    • Existing database migration underway [COMPLETE August 2009]
    • Add data-order tracking to the Heliophysics Coverage Registry [COMPLETE July 2009]
  • Web Services & Clients
    • Publish API for Helioviewer and other 3rd-party clients [COMPLETE May 2009]
    • SolarSoft interface [December 2009]
• Higher-Level Products
  • Space Weather/Quick-look products definition [COMPLETE as of June 2009]
  • Evaluation of the JPEG2000 format for browse products [COMPLETE as of June 2009]
Data Distribution and Export
• Scope
  • AIA – Level-0 and Level-1 data
  • HMI – Level-0 through Level-2 data
• Web Export
  • http://jsoc.stanford.edu/ajax/lookdata.html
  • Query for the desired data, then download via the web
  • Supports several data formats (internal files, FITS files, tar files, compressed files)
  • Provides support for special processing (such as extracting regions)
  • Other developers can expand on this export method by writing JavaScript that is allowed to access our web CGI programs
  • Functional now; enhancements estimated complete as of August 2009
• NetDRMS
  • Network of DRMS sites
  • Can share DRMS data (not just data files) among sites using RemoteDRMS and RemoteSUMS
  • Scientists can request the same data from one of many sites
  • Functional now; enhancements estimated complete as of August 2009
• Virtual Solar Observatory (VSO) Integration
  • Provides a UI that allows uniform search of disparate types of data
  • Obtains metadata and data files from the NetDRMS sites experiencing the least congestion
  • Estimated complete as of December 2009
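Scripting against the web export can be sketched as below. Note that the CGI endpoint path and parameter names are invented for illustration; only the host and the series[record-set] query style come from the slides.

```python
from urllib.parse import urlencode

# Hypothetical sketch of building an export request by hand. The
# endpoint path ("/cgi-bin/export") and the parameter names ("ds",
# "format") are assumptions for illustration, not a documented API.
def export_url(series, recset, fmt="fits"):
    query = urlencode({"ds": f"{series}[{recset}]", "format": fmt})
    return f"http://jsoc.stanford.edu/cgi-bin/export?{query}"

url = export_url("hmi.lev0", "2010.05.01/1d")
print(url)
```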
Maintenance and Expansion During Mission
• Hardware
  • Each hardware component is covered under a vendor service plan; as plans expire, they are renewed
  • Planned phased replacement/upgrades throughout Phase-E
• Software
  • Lead software developers are part of the continuing team for Phase-E
• Storage
  • File server – 150 TB of disk per year
  • Tape library – filled tapes stored in the Data Center and replaced with new tapes as needed; library expansion entails a new 1300-slot cabinet when needed
• Functionality
  • Anticipate continued development of science processing and distribution tools during Phase-E
Configuration Management
• Software Version Control with CVS
  • All software controlled in CVS; development occurs in “sandboxes”
• Daily Build
  • All binaries are compiled daily from the CVS repository
• JSOC Software Release
  • Every month stable binaries are compiled in /home/jsoc; production processes must use these binaries
• Change Control Board (CCB)
  • Data Capture – July 2009
  • DRMS/SUMS – September 2009
  • Pipeline Dataflow Management System – December 2009
  • Pipeline Control Tables – December 2009
  • Level-0 Processing – July 2009
  • Level-1 Processing – September 2009
  • Science Analysis Pipeline Modules – no CCB
• Code Freeze
  • Data Capture – September 2009
Failure Recovery
• Data Capture
  • OC3-line failure – there are two lines; use the second one for both AIA and HMI
  • DCS-machine failure – use the spare in the Data Center (it assumes the failed DCS machine’s IP address); the spare has a database backup
  • To mitigate media failure, there are two tape copies of tlm data, one at Stanford and one at LMSAL, and Level-0 data are archived to tape; thus there are three copies of the images
  • Catastrophic Data Capture failure exceeding 20 days – use the spare DCS at the MOC
  • Software logging reports anomalies
• Level-0 and Level-1
  • If a dedicated cluster node fails, dedicate another
  • Software logging, email notifications, and monitoring web pages report anomalies
• Processing Hardware
  • Cluster contains redundant nodes
  • Disk systems are RAID-6 with checksums; onsite spare drives exist
  • Onsite spare tape-library robotic components
  • Data-processing software modules log errors with recovery information
• Databases
  • A secondary database machine mirrors the primary machine’s databases
  • A publicly accessible third copy of the database isolates the production system
Documentation
• Wiki (http://jsoc.stanford.edu/jsocwiki)
  • DRMS/SUMS overview and description
  • DRMS and SUMS users’ guides and developers’ guides
  • Release notes
• Doxygen
  • Manual describing DRMS API functions and modules
  • Provides a synopsis and describes input parameters, output, and return values
  • To date, ~1/2 of functions/modules have documentation
• Flow Diagrams
  • Tree diagrams illustrating connections between the various programs, data series, data tables, etc.
  • Diagrammatic view of pipeline processes links to documentation
  • Notes stages of development (A–E) and estimated completion dates
• Procedures documented
  • Database maintenance
  • DCS operations
  • Level-0 processing management
  • RemoteDRMS/SUMS installation and maintenance
• Procedure documentation in progress
  • Calibration processing (filters, flat fields, etc.)
  • Pipeline Dataflow Management System
  • Export management
  • Weekly data-product report generation
Staffing
• The JSOC-SDP is managed and executed solely at Stanford University
• Staff hours conform to a normal 5-day schedule
• The work hours of staff whose duties include dataflow management are staggered so that dataflow is monitored nearly 12 hours each day
• The AVC component of the JSOC-SDP is managed at LMSAL
Stanford Phase-E Staffing
• JSOC SDP Team (HMI/JSOC-SDP)
  • PI: Phil Scherrer
  • Program Manager: Rock Bush
  • SDP Software: Art Amezcua (Lead), Jim Aloise, Rick Bogart, Jennifer Spencer
  • SDP Hardware: Keh-Cheng Chu (Lead), Brian Roberts (Sys Admin)
  • Data Operations: Jeneen Sommers, Hao Thai
  • Admin: Romeo Durscher, Kim Ross
• Science Data Processing Teams
  • Level-1.5 Team: Jesper Schou (Lead), Sebastien Couvidat, Cristina Rabello-Soares, Richard Wachter, Yang Liu, Steve Tomczyk (HAO group lead)
  • Level-2 Science Products: Tom Duvall (GSFC), Keiji Hayashi, Todd Hoeksema, Sasha Kosovichev, Konstantin Parchevsky, Junwei Zhao, Xuepu Zhao
• Note: Many have multiple roles, not shown. Does not include the complete SU HMI science team.
LMSAL Phase-E Staffing
• JSOC AVC Team (AIA/JSOC-AVC)
  • PI: Alan Title
  • Science Lead: Karel Schrijver
  • Data Lead: Neal Hurlburt
  • HEK/AVC Software: Scott Green, Sam Freeland, David Schiff, Ralph Seguin, Ankur Somani, Ryan Timmons
  • AVC Hardware: Chris Heck (Sys Admin)
  • Data Operations: John Serafin
• Science Data Processing Teams
  • Level-1.5 Team: Paul Boerner, Rich Nightingale, Ted Tarbell, Jim Lemen, Paolo Grigis (SAO), Jean-Pierre Wuelser
  • Level-2 Science Products: Marc DeRosa (Lead), Mark Cheung, Mark Weber (SAO), Todd Hoeksema (HMI)
• Note: Many have multiple roles, not shown. Does not include the complete LM AIA science team.
Verification
• Science Data Products Accounting
  • Weekly reports provided to the SDO Project Scientist showing data coverage as the percent of each day with good data for HMI Dopplergrams, LOS Magnetograms, and Averaged Vector Magnetograms
  • Missing data categorized according to reason: Calibration, Eclipse, or Unplanned
• MRD
  • 5.4.2 – Archive: archive system complete and verified by extensive use over the last year
  • 5.4.3 – Level-1 Data Products: will be verified during commissioning
  • 5.4.4 – Science Data Acceptance: flowed from GSFC to the Data Capture System (DCS) during testing; verified DCS function and Level-0 processing
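The coverage accounting above reduces to a simple computation. This sketch (the function name and gap representation are hypothetical) shows the percent-of-day figure and the per-reason breakdown of missing data.

```python
# Toy version of the weekly coverage accounting: percent of a day with
# good data, with missing time tallied by reason (Calibration, Eclipse,
# or Unplanned). The data structures are illustrative.
def coverage(day_seconds, gaps):
    """gaps: list of (duration_seconds, reason) tuples."""
    missing = sum(d for d, _ in gaps)
    by_reason = {}
    for d, reason in gaps:
        by_reason[reason] = by_reason.get(reason, 0) + d
    pct = 100.0 * (day_seconds - missing) / day_seconds
    return pct, by_reason

pct, reasons = coverage(86400, [(3600, "Eclipse"), (720, "Unplanned")])
print(round(pct, 2), reasons)   # 95.0 {'Eclipse': 3600, 'Unplanned': 720}
```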
Summary
• The JSOC-SDP can support:
  • Archive and distribution functions – now
  • Analysis for instrument commissioning – now
  • Initial science data processing – by launch
• So let's get SDO in the sky so we can have real data!
JSOC-SDP Stages of Software Development
• Stage A – Code specification exists, but working code does not
• Stage B – Prototype code exists, but not necessarily on HMI data and not necessarily in the correct language
• Stage C – Working code exists but cannot run inside the JSOC pipeline
• Stage D – Working code capable of running in the JSOC pipeline, but undergoing final testing and not released for general use
• Stage E – Working code complete and integrated into the JSOC pipeline
The following dataflow charts show status as of FORR, with estimated months to complete to Stage E.
AVC Dataflow – Data Distribution
[Dataflow chart; latency scale: minutes / days / months.]