
ES Slowdown, Optimization, Testing


Presentation Transcript


  1. ES Slowdown, Optimization, Testing

2. Plan for shutdown: Timeline
• April: Focus on resolution of major outstanding issues:
  • Bulk data deployment and stable use of multiple arrays: status nominally allows effective use of multiple arrays, but it is not clear whether the high rate of execution failures (5 of 15-20) is related.
  • Notification channel issues: often result in long delays at startup and occasional missing data (wvr/tsys). The problem comes and goes.
  • Data capture issues: container crashes resulting in lost data, which limits the maximum execution time. Status: fixes are in, requires testing/verification.
  • "Handover" time has grown: systems not coming up, a combination of hardware and software. Tzu/Nick/Emilio are working on this.
• May:
  • Start acceptance testing.
  • Focus on simulation and testing improvements begins.
  • Potential missions to work on remaining issues, if needed.
  • Focus of debugging moves from issues affecting Cycle 1 to issues affecting the CSV scope of testing for future cycles.

3. Plan for shutdown: planned missions
• Bulk data: mission completed. Issues still remain. CSV is using a "stable" version which has troubles (typically about 5-7 executions a night fail for reasons that look like timeout issues).
• Notification channel mission: unfortunately marginalized by the power shutdown.
• Data capture mission: after this meeting.

4. Plan for shutdown: Obsmode Suite
• Test suite for basic, science-like executions.
• Tests SSR/SOS functionality, data recording, and a range of Cycle 1 capabilities.
• Completed late March: it is taking a while to get repeatably good datasets.
• Despite this, reasonable data is now getting to the pipeline testers (SACM group).
• This will become the SB execution regression for Cycle 1. It will be extended for Cycle 2 capabilities as they come and are verified (we already have a polarization "science-like" SB that will evolve this way).
• All reduction is intended to be done with the Pipeline. Where it cannot be, it will inform the reduction of new modes.

5. Plan for shutdown: Software basic
• Minimal set of tests to run as the weekly regression to verify functionality.
• Will likely consist of Total Power, Autocorrelation, and ACA+BL correlator runs at the same time (4-5 executions):
  • Total Power raster
  • Autocorrelation raster (to be combined with the above when dual mode works)
  • ACA+BL executions of PNT, SBR, Tsys, Bandpass, PNT, Tsys, Bandpass
• These are nominally not directly reducible by the pipeline.
• Initial set defined; need to iterate with ADC on the details.
• Initial proposal was sent to ADC and has evolved a bit.
• Toolkit being developed based on the MS side. Metrics will include things like "detectDelayJump(threshold, timescale)", "detectPlatforming(threshold)", etc. (see the sketch after this list).
• Also using scan, spw, and data size metrics to make sure everything that should be there is there.
• Check flagging fraction.
• It is assumed that this is throw-away code that will eventually be implemented as metrics in the pipeline. Discuss timeline?
• Intended that basic executions are not pipeline reducible (too much overhead for a weekly regression).
• Idea is for Computing to run, Science to provide pass/fail criteria.
• Contributions likely from CSV, DSO, and the ARCs, spanning SACM, DMG, and Pipeline-related staff.
• Deadline for design and execution blocks: April 30.
• Deadline for toolkit: TBD (in progress, will likely evolve).
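The toolkit itself is only named in the slides; as an illustration, the Python sketch below shows what metrics like detectDelayJump(threshold, timescale), detectPlatforming(threshold), and a flagging-fraction check could look like. The function bodies, input conventions (per-baseline phases versus time, a bandpass amplitude spectrum, a boolean FLAG array), and the statistics used are assumptions, not the actual toolkit code.

    import numpy as np

    def detect_delay_jump(phases, times, threshold, timescale):
        """Flag sudden phase/delay jumps: compare the median phase in adjacent
        windows of length `timescale` (same units as `times`) and report any
        window-to-window step larger than `threshold`."""
        jumps = []
        edges = np.arange(times.min(), times.max() + timescale, timescale)
        medians = []
        for lo, hi in zip(edges[:-1], edges[1:]):
            sel = (times >= lo) & (times < hi)
            if sel.any():
                medians.append((0.5 * (lo + hi), np.median(phases[sel])))
        for (ta, pa), (tb, pb) in zip(medians[:-1], medians[1:]):
            if abs(pb - pa) > threshold:
                jumps.append((ta, tb, pb - pa))
        return jumps

    def detect_platforming(bandpass_amp, threshold):
        """Flag 'platforming' in a bandpass amplitude spectrum: a step between
        the two halves of the band larger than `threshold` times the
        channel-to-channel scatter."""
        half = len(bandpass_amp) // 2
        step = abs(np.median(bandpass_amp[:half]) - np.median(bandpass_amp[half:]))
        scatter = np.median(np.abs(np.diff(bandpass_amp))) + 1e-12
        return step / scatter > threshold

    def flagging_fraction(flags):
        """Fraction of flagged visibilities, given the FLAG column as a boolean array."""
        flags = np.asarray(flags, dtype=bool)
        return float(flags.sum()) / flags.size if flags.size else 0.0

In practice these would be fed by per-scan/per-spw extracts from the MS, alongside the scan, spw, and data-size completeness checks mentioned above; the pass/fail thresholds would come from the science side.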

6. Plan for shutdown: Plan for intensive
• Designed to catch issues often present in major releases.
• Again, design tools that can eventually be put into the Pipeline when time allows.
• All of these need SSR work to make source selection automatic for the "science target" as well as for calibrators. Special execution scripts are not needed.
• Designed to use the Pipeline (initial reduction done in the Pipeline; all tool creation in progress is to be absorbed into the Pipeline eventually):
  • Frequency labels (SB created)
  • Phase transfer, phase/delay jump (mixed mode, SB created)
  • Return to phase/delay after band change (SB to be made by end of April)
  • TDM phase/delay jump and platforming detection (SB to be created by end of April, fast dumps)
  • Scan sequence stresses/latency check (SB to be created by end of April)
• Not to be reduced, at least to first order, in the Pipeline:
  • Verify execution of all CalTargets and that results are repeatable (see the repeatability sketch after this list)
  • Includes data checks on "applied online" as well as "reduced offline" targets
• The intensive suite will incorporate new capabilities as they come forward, with the goal of not introducing new tests but incorporating new features into the old tests (not a new idea…).
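The repeatability check on CalTarget results is likewise only stated as a goal; the sketch below is one hypothetical way to compare results (e.g. per-antenna delay solutions) from repeated executions against a fractional tolerance. The result structure, tolerance semantics, and function name are assumptions for illustration.

    import numpy as np

    def caltarget_repeatable(results, rel_tol=0.05):
        """Given a list of result arrays from repeated executions of the same
        CalTarget (e.g. per-antenna delays), check that every execution agrees
        with the ensemble median to within `rel_tol` (fractional).
        Returns (passed, worst_fractional_deviation)."""
        stacked = np.vstack([np.asarray(r, dtype=float) for r in results])
        reference = np.median(stacked, axis=0)
        scale = np.maximum(np.abs(reference), 1e-12)   # avoid divide-by-zero
        deviation = np.abs(stacked - reference) / scale
        worst = float(deviation.max())
        return worst <= rel_tol, worst

    # Hypothetical usage: delays (ns) per antenna from three executions
    runs = [[1.02, 0.98, 1.01], [1.00, 0.97, 1.03], [1.01, 0.99, 1.00]]
    ok, worst = caltarget_repeatable(runs, rel_tol=0.05)
    print("repeatable" if ok else "NOT repeatable", f"worst deviation {worst:.3f}")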

7. Plan for next year: SSR/SOS and unit tests
• SSR/SOS review completed Monday/Tuesday.
• High priority placed on the query interface refactor:
  • Would like to eventually migrate things into the calibrator catalog interface, but will design now so as to ease that move at a later date.
  • Target-based queries go into the target.
• High priority placed on merging observing mode functions that are on the SSR/SOS side into Control when needed, making the SSR/SOS obsmode inherit from Control, not the other way around (don't ask…).
• Development of Sessions, Observatory Calibration Scripts, and new modes will add a layer of ObservingStrategy (a hedged sketch of this layering follows this list).
• Timeline for this full refactor is ~1 year, given manpower and the need to develop some new functionality on our side.
• Development will be done in parallel branches, with the refactor worked on in one branch and separable new capabilities in another.
• ScanLists will manage the logic of execution breaks (currently the ScanList is a dumb handler).
• Unit tests will be updated as time allows.
• Development/refactor assignments: N. Phillips (SIST) observatory calibration scripts; P. Cortes (DSO) sessions and observing strategy rework; Ignacio Toledo (DSO-DA) query refactor; S. Corder (CSV) ScanList intelligence design; other assignments to other groups as possible (this item is completely dependent on the refactor and is not on the critical path for Cycle 2).
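The intended layering (SSR/SOS observing modes inheriting from Control, an ObservingStrategy layer on top, and a ScanList that owns the execution-break logic) is described only at a high level above; the Python sketch below is a hypothetical illustration of that structure. All class and method names here are invented for illustration and do not come from the actual SSR/SOS code.

    class ControlObsMode:
        """Control-side observing mode: owns the low-level execution hooks.
        In the planned refactor the SSR/SOS modes inherit from here rather
        than the other way around."""
        def execute_scan(self, scan):
            raise NotImplementedError

    class SSRObsMode(ControlObsMode):
        """SSR/SOS observing mode layered on top of Control, adding source
        selection and calibrator queries."""
        def select_calibrator(self, target, query):
            # After the query refactor, target-based queries live with the target.
            return target.query_calibrator(query)

    class ScanList:
        """Ordered list of scans; after the refactor it also holds the logic
        for where an execution may be broken (today it is a dumb container)."""
        def __init__(self, scans):
            self.scans = list(scans)
        def break_points(self, max_scans_per_execution):
            return list(range(max_scans_per_execution, len(self.scans),
                              max_scans_per_execution))

    class ObservingStrategy:
        """New layer introduced with Sessions and Observatory Calibration
        Scripts: decides how a ScanList is split across executions/sessions."""
        def __init__(self, obsmode, scanlist):
            self.obsmode = obsmode
            self.scanlist = scanlist
        def plan(self, max_scans_per_execution=20):
            return self.scanlist.break_points(max_scans_per_execution)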

8. Optimization/Coordination
• Who will do which work? During what array time?
• What is the timescale for getting performance metrics into the pipeline?
• Has CSV left anything out that would help provide a long-term viable observatory operational model (>3 years)?
• Are the divisions of the testing suites appropriate/complete?
• What level of support can be provided for the refactor/unit tests?
• What is the model for getting more coordinated and complete testing into the lower level?
• Can we test with a more realistic simulation environment? (Better testing of interactions?)
• Can we test with better scalability considerations?
