Computing


  1. Computing B.E. Glendenning (NRAO) ANASAC, 2003-08-25, Chicago

  2. Outline • Context • Management • PDR results • Software Overview / Architecture • Science testing plans • AIPS++ • Pipeline

  3. ALMA Context • Timeline: Interim operations ~2007.5, Regular operations 2012.0 • Computing is one of 9 Integrated Product Teams • $35M / $552M (Computing/Total) = 6.3% • FTE-y ratio = 230/1515 = 15%
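The two ratios on this slide can be rechecked with a few lines of Python. This is only an arithmetic sketch: the variable and function names are illustrative, and the inputs are the figures quoted on the slide ($M for cost, FTE-years for effort).

```python
# Figures quoted on the slide (cost in $M, effort in FTE-years).
computing_cost_musd = 35
total_cost_musd = 552
computing_fte_years = 230
total_fte_years = 1515


def share_percent(part: float, whole: float) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


cost_share = share_percent(computing_cost_musd, total_cost_musd)
effort_share = share_percent(computing_fte_years, total_fte_years)

print(f"Computing share of cost:   {cost_share:.1f}%")
print(f"Computing share of effort: {effort_share:.1f}%")
```

The effort share comes out at more than twice the cost share, which matches the slide's rounded 6.3% and 15%.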

  4. Scope • Computing: • Software development • Necessary operational computer equipment • System/network administration (operations phases) • Subsystems: proposal preparation, monitoring, dynamic scheduling, equipment control and calibration, correlator control and processing, archiving, automated pipeline processing, offline processing • Not: embedded software (hardware groups), algorithm development (Science) • Activities: management, requirements, analysis, common software, software engineering, integration and test

  5. Scope (2) • Scope: Observing preparation and support through archival access to automatically produced (pipeline) images

  6. Development Groups • 50%/50% Europe/North America • 10 Institutes, 13 Sites! Range: HIA/Victoria = 0.5FTE, NRAO/AOC = 15 • Communications difficult, more formal processes + Access to world expertise

  7. Buy vs. Build • Want to be parsimonious with development funds • Want system sufficiently “modern” that it will suffice for the construction period and some time thereafter (CORBA, XML, Java, C++, Python) • Many open tools/frameworks (ACE/TAO, omniORB, GNU tools, etc.) • After search, ALMA Common Software (ACS) code base adopted from accelerator community • Simplified CORBA framework, useful services • Astronomy-domain adopted code bases • Calibration, imaging: AIPS++ • Archive: NGAST • Atmosphere: ATM

  8. Strategies • ACS • provide a common technical way of working • Continuous scientific input through Subsystem Scientists • Synchronized 6-month releases • A common pace for the project • Check requirements completion • Yearly design/planning reviews • React to surprises • Retain some construction budget to allow for discoveries made during interim operations period

  9. Management Planning • Computing Plan (framework) and 15 subsystem agreements prepared • Computing Management Plan approved by JAO • Subsystem agreements approved by Computing IPT • Management model: • Agreement “contracts” followed by subsystem scientists (scope) and software management (schedule, cost) • Initial PDR, yearly CDRs and release planning • Synchronized 6-month releases across all subsystems • SSR requirements mapped to releases for progress tracking

  10. Management Planning (2)
      Package | Feature name | Short description | SSR requirement numbers | To be completed at release (R1.0-3.0) | Status at milestone T0 (N, P, C)
      ObsProject | CheckObsDatabase | Check observation database for conflict | 3.0R10 | R3.0 | N
      ObsProject | Local save | Save/recover programs to/from local disk in human-readable form | 3.0-R11, 3.1R3, 3.1R12 | R1.0 | N
      • Readiness reviews, acceptance tests, and delivery to interim operations in 2007 • “Data flow” items get a second development period during interim operations • review requirements after first experience with operations • Not subject to current agreements – institutional responsibility could shift
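As a rough illustration of how the feature table on this slide supports progress tracking, the sketch below stores the two example rows and counts status codes per target release. The `Feature` class and `status_by_release` helper are hypothetical names invented here, not part of any ALMA tool, and the status codes N, P and C are kept as opaque labels because the slide does not expand them.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """One row of the feature table (field names are illustrative)."""
    package: str
    name: str
    ssr_requirements: tuple  # requirement identifiers as written on the slide
    target_release: str      # "R1.0" through "R3.0"
    status: str              # status code from the slide: "N", "P" or "C"


# The two example rows shown on the slide.
features = [
    Feature("ObsProject", "CheckObsDatabase", ("3.0R10",), "R3.0", "N"),
    Feature("ObsProject", "Local save", ("3.0-R11", "3.1R3", "3.1R12"), "R1.0", "N"),
]


def status_by_release(items):
    """Count status codes per target release."""
    counts = {}
    for feature in items:
        counts.setdefault(feature.target_release, Counter())[feature.status] += 1
    return counts


for release, counts in sorted(status_by_release(features).items()):
    print(release, dict(counts))
```

Rerunning such a count at each 6-month release would show how many features planned for that release have changed status.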

  11. Preliminary Design Review • March 18-20, Tucson • Panel • R. Doxsey (CHAIR) (STScI, Head HST Mission Office) • P. Quinn (ESO, Head of Data Management and Operations Division) • N. Radziwill (NRAO, Head of GBT Software Development) • J. Richer (Cambridge, ALMA/UK Project Scientist, Science Advisory Committee) • D. Sramek (NRAO, ALMA Systems Engineering IPT Lead) • S. Wampler (NSO, Software Architect) • A. Wootten (NRAO, ALMA/North America Project Scientist)

  12. Preliminary Design Review (2) • Well prepared PDR; at or ahead of where similar projects have been at this stage • The ALMA Project needs to develop an understanding of operations, and the Computing IPT needs to fold this in to their planning and priorities. This might result in reprioritization of SSR requirements • ALMA project needs to define clearly the steps necessary for getting to, and implementing, Interim Operations and the consequent requirements for computing support

  13. Preliminary Design Review (3) • The interaction with AIPS++ requires careful management. Operational (Pipeline) requirements on AIPS++ need further development. • Testing effort seems satisfactory; management will need to follow-up to ensure subsystem unit tests are in fact carried out

  14. Software Scope • From the cradle… • Proposal Preparation • Proposal Review • Program Preparation • Dynamic Scheduling of Programs • Observation • Calibration & Imaging • Data Delivery & Archiving • Afterlife: • Archival Research & VO Compliance

  15. And it has to look easy… • “1.0-R1 The ALMA software shall offer an easy to use interface to any user and should not assume detailed knowledge of millimeter astronomy and of the ALMA hardware.” • “1.0-R4 The general user shall be offered fully supported, standard observing modes to achieve the project goals, expressed in terms of science parameters rather than technical quantities. Observing modes shall allow automatic fine tuning of observing parameters to adapt to small changes in observing conditions.” • Which means that what is simple for the user will be complex for the software developer. • Architecture should relieve developer of unnecessary complexity • Separation of functional from technical concerns • But the expert must be able to exercise full control

  16. Observatory tasks • Administration of projects • Monitoring and quality control • Scheduling of maintenance • Scheduling of personnel • Security and access control

  17. The numbers • Average/peak data rates of 6/60 Mbyte/s • Raw (uv) data ~ ⅔, image data ~ ⅓ of the total • Assumes • Baseline correlator, 10 s integration time, 5000 channels • Can trade off integration time vs. channels • Implies ~ 180 Tbyte/y to archive • Archive access rates could be ~5× higher (cf. HST) • Feedback from calibration to operations • ~ 0.5 s from observation to result (pointing, focus, phase noise) • Science data processing must keep pace (on average) with data acquisition
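The archive volume follows from the average data rate. The sketch below redoes that arithmetic under two assumptions that are mine rather than the slide's: continuous acquisition over a 365.25-day year, and decimal units (1 Tbyte = 10^6 Mbyte). The factor of 5 for archive access is the slide's rough estimate.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # about 3.16e7 s, assuming continuous operation

average_rate_mbyte_s = 6.0   # average data rate from the slide
peak_rate_mbyte_s = 60.0     # peak data rate from the slide

# Yearly archive volume, assuming decimal units (1 Tbyte = 1e6 Mbyte).
archive_tbyte_per_year = average_rate_mbyte_s * SECONDS_PER_YEAR / 1e6

# Split quoted on the slide: about 2/3 raw (uv) data, about 1/3 image data.
raw_tbyte_per_year = archive_tbyte_per_year * 2 / 3
image_tbyte_per_year = archive_tbyte_per_year / 3

# Archive access could be roughly 5 times the ingest rate (the slide's estimate).
access_rate_mbyte_s = 5 * average_rate_mbyte_s

print(f"Archive growth: {archive_tbyte_per_year:.0f} Tbyte/y")
print(f"  raw (uv):     {raw_tbyte_per_year:.0f} Tbyte/y")
print(f"  images:       {image_tbyte_per_year:.0f} Tbyte/y")
print(f"Archive access: {access_rate_mbyte_s:.0f} Mbyte/s")
```

Under these assumptions the result lands slightly above the slide's ~180 Tbyte/y; the slide does not state its exact duty cycle or rounding, so the small difference is unsurprising.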

  18. Meta-Requirements • “Standard Observing Modes” won’t be standard for a long time • e.g., OTF mosaics, phase calibration at submm λ • Instrument likely to change & grow • Atacama Compact Array (ACA) • Second-generation correlator • Could drastically increase data rate (possible even with baseline correlator, might be demanded for OTF mosaics) • Computer hardware will continue to evolve • Development spread across ≥ 2 continents & cultures

  19. What do these requirements imply for the architecture of the software? • Must facilitate development of new observing modes (learning by doing) • Must allow scaling to new hardware, higher data rates • Must enable distributed development • Modular • Standard • Encourage doing the same thing either a) in the same way everywhere; or b) only once.

  21. Functional Aspects • Executive, Archive, ACS are global: all other subsystems interact with them. • ACS: common software – a foundational role • Executive: start, stop, monitor – an oversight role • Archive: object persistence, configuration data, long-term science data – a structural support role • Instrument operations • Control, correlator, quick-look pipeline, calibration pipeline – real-time • Scheduling (near real-time) • “External” subsystems • Observation preparation and planning • Science data reduction pipeline (not necessarily “online”)

  23. Scheduling Block Execution

  24. ALMA Software User Test Plan: Status • Software test plans being developed by SSR subsystem scientists and subsystem leads. • Test plan components: • Use Cases – descriptions of operational modes and what external dependencies exist. Designed to exercise subsystem interfaces, functionality, & user interfaces. • Test Cases – Use Case subset designed to test specific functions. • Testing timeline (when tests run in relation to Releases, CDRs). • Test Definitions - specifies which test case will be run, what the test focus is, and whether the test is automated or involves users. • Test Reports (e.g., user reports, audit updates, summary). • Test Plan drafts for all subsystems to be completed by Oct 1.

  25. ALMA Software User Test Plan: Status • Software test plan guidelines • June 5, 2003: Test Plan approved by Comp mgt & leads. • June 11, 2003: Test Plan presented to SSR. (ALMA sitescape, SSR draft documents). • Use Case development: • July 9, 2003: Use Case guidelines, html templates, & examples put on the web (www.aoc.nrao.edu/~dshepher/alma/usecases) & presented to SSR. • Aug 2003: Detailed Use Cases being written for all Subsystems. • Test Plan development: • Sept 2003: Draft test plans to be completed. • Oct 1, 2003: Test plan problems identified/reconciled, resources identified • Nov 2003: First User test scheduled (for Observing Tool subsystem).

  26. ALMA Software User Test Plan Status • Use Cases written to date:
      • Observing Preparation Subsystem:
        OT.UC.SingleFieldSetup.html: Single Field, Single Line setup
        OT.UC.MultiFieldMosaicSetup.html: Multi-Field Mosaic setup
        OT.UC.SurveyFewRegionsSeveralObjects.html: Set Up to do a Survey of a Few Regions with Several Objects
        OT.UC.SpectralSurvey.html: Set Up to do a Spectral Line Survey
      • Control Subsystem:
        Control.UC.automatic.html: Automatic Operations Use Case
      • Offline Subsystem:
        Offline.UC.SnglFldReduce.html: Reduce & Image Single Field Data
        Offline.UC.MosaicReduce.html: Reduce & Image Multi-Field Mosaic
        Offline.UC.TotPower.html: Reduce, Image Auto-Correlation Data
      • Pipeline Subsystem:
        Pipeline.UC.ProcSciData.html: Process Science Data
        Pipeline.UC.SnglFld.html: Science Pipeline: Process Single Field Data
        Pipeline.UC.Mosaic.html: Science Pipeline: Process Mosaic, no short spacings
        Pipeline.QLDataProc.html: Quick-Look Pipeline: Data Processing
        Pipeline.QLCalMon.html: Quick-Look Pipeline: Monitor Calibration Data
        Pipeline.QLArrayMon.html: Quick-Look Pipeline: Monitor Array Data
      • Scheduling Subsystem:
        Sched.UC.dynamic.html: Dynamic Mode (Automatic) Operations
        Sched.UC.interactive.html: Interactive Mode (Manual) Operations

  29. AIPS++ Evaluation • AIPS++ (along with the ESO Next Generation Archive System) is a major package used by ALMA • Both to ensure at least one complete data reduction package is available to users and in implementing ALMA systems (notably the Pipeline) • AIPS++ is a very controversial package (long development period, has not received wide acceptance) • ALMA Computing has arranged several evaluations • Audit of capabilities based on documentation • AIPS++/IRAM test to test suitability for millimeter data • Benchmarking tests • Technical review of AIPS++ March 5-7 2003 • Sound technical base, management changes needed

  30. AIPS++ Audit • Explanatory – not results! • Work to be done by ALMA • These should be 0 (in ~2007) • These should be <10% of the total

  32. AIPS++ Audit Results - Summary • All: 58% (Acceptable) / 16% (Inadequate) / 16% (Unavailable) / 10% (TBD) • Critical 66% / 14% / 12% / 8% • Important 52% / 19% / 19% / 10% • Desirable 35% / 17% / 33% / 15% • 14% of requirements have had differing grades assigned by auditors
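A quick consistency check on the audit summary: the sketch below confirms that each row of percentages sums to 100 and reports how much of each priority class was graded Inadequate or Unavailable. The dictionary layout and the "shortfall" label are illustrative choices made here, not terms from the audit.

```python
GRADES = ("Acceptable", "Inadequate", "Unavailable", "TBD")

# Percentages quoted on the slide, in the order given by GRADES.
audit = {
    "All": (58, 16, 16, 10),
    "Critical": (66, 14, 12, 8),
    "Important": (52, 19, 19, 10),
    "Desirable": (35, 17, 33, 15),
}

shortfall = {}
for priority, percents in audit.items():
    # Each row should account for all requirements in that class.
    assert sum(percents) == 100, f"{priority} does not sum to 100%"
    grades = dict(zip(GRADES, percents))
    # "Shortfall" here means graded Inadequate or Unavailable (TBD excluded).
    shortfall[priority] = grades["Inadequate"] + grades["Unavailable"]

for priority, value in shortfall.items():
    print(f"{priority:<10} Inadequate + Unavailable = {value}%")
```

Read this way, the share of requirements needing work rises as priority falls, from roughly a quarter of Critical requirements to half of Desirable ones.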

  33. AIPS++/IRAM Tests • Phase 1: Can AIPS++ reduce real mm wave data? • Yes, but schedule was very extended • Partly underestimated effort, mostly priority setting • ALMA/NRAO and EVLA now directly manage AIPS++ • And for the next 12 months ALMA has complete control of priorities • Phase 2: Can new users process similar but new data? • Generally yes, but it is too hard • Phase 3: Performance (described next)

  34. AIPS++ Benchmark Status: • Four requirements related to AIPS++ performance: • 2.1.1 R4 – Performance of the Package shall be quantifiable and commensurate with data processing requirements of ALMA and scientific needs of users. Benchmarks shall be made for a fiducial set of reduction tasks on specified test data. • 2.2.2 R1.1 – GUI window updates shall be < 0.1s on same host. • 2.3.2 R4 – Package must be able to handle, efficiently & gracefully, datasets larger than main memory of host system. • 2.7.2 R3 – Display plot update speed shall not be a bottleneck. Speed shall be benchmarked and should be commensurate with comparable plotting packages. • ASAC: AIPS++ must be within factor of 2 of comparable pkgs.

  35. AIPS++ Benchmark Strategy • Finish AIPS++/IRAM Phase 3 (performance) test • Set up automated, web-accessible, performance regression tests of AIPS++ against AIPS, Gildas, and Miriad • Start simple, then extend to more complex data • Systematically work through performance problems in importance order • Resolution of some issues will require scientific input (e.g., when is an inexact polarization calculation OK) • Decide in Summer 2004 (CDR2) if AIPS++ performance issues have arisen from lack of attention or for fundamental technical reasons (“fatal flaw”)

  36. Full AIPS++/AIPS/Gildas/Miriad Comparison not possible • Different processing capabilities (polarization) and data formats • [Table: packages (GILDAS, MIRIAD, AIPS, AIPS++) against data formats (Standard FITS, ALMA-TI FITS, AIPS++ format, PdBI format, MIRIAD format, VLA Export format); cell entries not recoverable] • Compare AIPS++ with GILDAS on one dataset in ALMA-TI FITS format • Compare AIPS++ with MIRIAD & AIPS on another dataset in FITS format

  37. AIPS++/IRAM Phase 3 (ALMA-sized data, single field spectroscopic)
      Step | GILDAS/CLIC (s) | AIPS++ (s) | A/G | Comments
      Filler | 1873 | 10939 | 5.8 |
      Init (write header info) | 385 | | n/a |
      Fill model/corr data cols. | | 2140 | n/a |
      PhCor (Check Ph-corr data) | 889 | 3484 | 3.9 | (AIPS++ Glish)
      RF (Bandpass cal) | 5572 | 2298 | 0.4 |
      Phase (Phase cal) | 3164 | 1111 | 0.4 |
      Flux (Absolute flux cal) | 1900 | 2093 | 1.2 | (AIPS++ Glish)
      Amp (Amplitude cal) | 2242 | 614 | 0.3 |
      Table (Split out calib src data) | 1200 | 5150 | 4.3 |
      Image | 332 | 750 | 2.3 |
      Total | 17600 | 28600 | 1.6 |
      • Caveats: DRAFT results, bug in AIPS++ bandpass calibration requires too much memory (1.7GB AIPS++ vs. 1.0GB Gildas) • Gildas executables copied, not compiled on benchmark machine • Several AIPS++ values may still be amenable to significant improvement
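The A/G column can be recomputed from the timings and compared against the ASAC criterion quoted earlier (AIPS++ within a factor of 2 of comparable packages). The sketch below does this for the steps that have a time in both packages; the steps with an "n/a" ratio are left out, and the totals are taken as quoted on the slide. Applying the factor-of-2 limit to individual steps is an illustrative choice made here; the slide states it only as an overall criterion.

```python
# (step, GILDAS/CLIC seconds, AIPS++ seconds), as quoted on the slide.
timings = [
    ("Filler", 1873, 10939),
    ("PhCor", 889, 3484),
    ("RF", 5572, 2298),
    ("Phase", 3164, 1111),
    ("Flux", 1900, 2093),
    ("Amp", 2242, 614),
    ("Table", 1200, 5150),
    ("Image", 332, 750),
]
total_gildas_s, total_aips_s = 17600, 28600  # totals quoted on the slide

FACTOR_LIMIT = 2.0  # ASAC criterion: within a factor of 2 of comparable packages

ratios = {step: aips / gildas for step, gildas, aips in timings}
over_limit = [step for step, ratio in ratios.items() if ratio > FACTOR_LIMIT]
overall_ratio = total_aips_s / total_gildas_s

for step, ratio in ratios.items():
    flag = "  <-- exceeds factor of 2" if ratio > FACTOR_LIMIT else ""
    print(f"{step:<7} A/G = {ratio:4.1f}{flag}")
print(f"Overall A/G = {overall_ratio:.1f}")
```

On these draft numbers the overall ratio is inside the limit, while the filler, phase-correction check, table split and imaging steps are individually outside it. The recomputed Flux ratio is about 1.1 against the 1.2 on the slide, which may reflect rounding or a change in the draft figures.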

  38. AIPS++ Benchmark Status: • SSR has identified 2 initial benchmark datasets: • Pseudo GG Tau – PdBI data of 25 March. Original observation expanded to 64 antennas with GILDAS simulator & source structure converted to point source. 3 & 1 mm continuum & spectral line emission. Data in ALMA-TI FITS format (same data used during AIPS++ re-use Phase III test). • Ensure continuous comparisons in time with AIPS++ Ph III ‘re-use’ test • Compare core functions (fill, calibrate, image) on ALMA-size dataset • Exercise mm-specific processing steps • Polarized continuum data – VLA polarized continuum emission in grav lens 0957+561, 6cm continuum, 1 spectral window. Snapshot observation extended in time with AIPS++ simulator to increase run-time. Data in Std FITS format. • Exercise full polarization calibration, self-calibration, non-point source imaging (polarization processing can only be compared with MIRIAD/AIPS). Results to be published on web for each AIPS++ stable release.

  39. Calibrater Performance Improvements vs. Time

  40. Calibration Performance vs. AIPS

  41. Calibrater – Still TODO

  42. Imager Performance • Imaging performance: • Improved by factor of 1.8 for 2048 pixels. • Improved by factor of 4.4 for 4096 pixels. • AIPS++/AIPS ratio now 1.6 for 2048 pixels & 1.8 for 4096 pixels. • Now dominated by more general polarization processing in AIPS++? • This is I • Multi-polarization should be relatively faster in AIPS++, but needs to be demonstrated • [Plot: Imaging Performance Improvements; Execution Time (sec) vs. Image Size (NxN pixels)]
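The figures on this slide allow a back-of-the-envelope estimate of where the AIPS++/AIPS ratio stood before the improvements. The sketch below assumes, and this is my assumption rather than a statement from the slide, that the quoted improvement factors apply to the AIPS++ execution time only and that the AIPS time did not change.

```python
# image size (pixels per side): (AIPS++ speed-up factor, current AIPS++/AIPS ratio)
imager = {
    2048: (1.8, 1.6),
    4096: (4.4, 1.8),
}

# If AIPS++ alone sped up by `speedup`, the earlier ratio was larger by that factor.
ratio_before = {
    size: ratio_now * speedup for size, (speedup, ratio_now) in imager.items()
}

for size, before in ratio_before.items():
    now = imager[size][1]
    print(f"{size} x {size}: AIPS++/AIPS ratio was about {before:.1f}, now {now:.1f}")
```

Under that assumption both image sizes started outside the factor-of-2 criterion and are now inside it.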

  43. AIPS++ Benchmark Status: • Dataset expansion: • SSR will identify datasets in the following areas: • Spectral line, polarized emission. Multi-config dataset if possible • Multi-field interferometric mosaic • Large, simulated dataset, includes atmospheric opacity variations and phase noise • Single-dish + interferometer combination in uv plane (no SD reduction now: MIRIAD/AIPS do not process SD data, GILDAS only processes IRAM-format SD data & cannot convert to FITS). • NOTE: Glish-based GUIs will be replaced with Java GUIs once ACS/CORBA framework conversion is complete, so benchmark comparisons affecting GUI and plotting interface will be delayed until Java GUIs are ready to test.

  44. Pipeline • Three current development tracks: • Paperwork: assemble use cases, write test plans, develop heuristics “decision trees” • Implement top-level interfaces required by the rest of the system (e.g., to start a “stub” pipeline when ordered to by the scheduling subsystem) • Technology development / prototype pipeline • VLA GRB observations • Bind AIPS++ “engines” to ALMA technology (Python, CORBA (i.e., ACS)) • Gain experience for possible package-independent execution framework • To be used by at least AIPS++ • Allow pipeline computations to be performed by different packages • Possible relevance for VO
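One way to picture a package-independent execution framework with a "stub" pipeline is an abstract engine interface that each reduction package would implement. The sketch below is purely illustrative: `ReductionEngine`, `StubEngine` and `Pipeline` are hypothetical names invented here, not ALMA, ACS or AIPS++ interfaces, and the step names are borrowed from the core functions listed on the benchmark slide (fill, calibrate, image).

```python
from abc import ABC, abstractmethod


class ReductionEngine(ABC):
    """Hypothetical interface that each data reduction package would implement."""

    name = "abstract"

    @abstractmethod
    def run(self, step: str, dataset: str) -> str:
        """Run one processing step on a dataset and return a short status string."""


class StubEngine(ReductionEngine):
    """Placeholder engine: reports the request instead of processing any data."""

    name = "stub"

    def run(self, step: str, dataset: str) -> str:
        return f"{self.name}: {step} on {dataset} (not executed)"


class Pipeline:
    """Runs a fixed list of steps through whichever engine it is given."""

    def __init__(self, engine: ReductionEngine, steps):
        self.engine = engine
        self.steps = list(steps)

    def process(self, dataset: str):
        return [self.engine.run(step, dataset) for step in self.steps]


pipeline = Pipeline(StubEngine(), ["fill", "calibrate", "image"])
for line in pipeline.process("example_dataset"):
    print(line)
```

Because the pipeline only depends on the abstract interface, swapping the stub for an engine backed by a real package would not change the calling code, which is the property the slide's last bullets are after.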
