ALMA Integrated Computing Team Coordination & Planning Meeting #1 Santiago, 17-19 April 2013. ICT Group Planning: Control Rafael Hiriart ICT Control Group Lead. Status since the last coordination meeting, Santiago 2012. What we have delivered, what is pending. Features & issues.
ALMA Integrated Computing Team Coordination & Planning Meeting #1, Santiago, 17-19 April 2013 • ICT Group Planning: Control • Rafael Hiriart, ICT Control Group Lead
Outline • Status since the last coordination meeting, Santiago 2012. • What we have delivered, what is pending. • Features & issues. • Tentative schedule. • 3 month “recession” period. • After recession.
Control Group Resources • Ralph Marson • Jorge Avarias (Control/Scheduling) • The new hire who was going to start next week backed out yesterday. • => Software Engineer II position open • Rodrigo Amestica • Jesus Perez • Matias Mora • Rafael Hiriart • Open Position: Software Engineer III – General Control Developer • Open Position: Software Engineer II – Control/ObOps Developer • 5.5 FTEs currently, 8 FTEs when complete • Currently there are really 1.5 FTEs in Control and 2.5 FTEs in Correlator for development and debugging work
Control Group “Components” • Control (or Antenna, Tuning & Timing) • Includes Monitoring only up to the BACI property interface. • BL Correlator • H/W Configuration Database (a.k.a. TMCDB) • S/W side is ACS, including S/W tmcdb-explorer plug-ins. • Metadata Capture (a.k.a. DataCapturer) • QuickLook • Includes processing and visualization of TelCal results. • It does not include the Real Time Filler, except for a couple of calls to activate and deactivate it. Currently it is completely deactivated, because it was suspected of causing bulkdata problems. What is the plan in this regard? • ALMA Phasing Project
Deliveries since 2012 Coordination Meeting • Observation efficiency was identified as the highest priority. • Substantial improvements in inter-subscan interval (1.5 seconds, can be made shorter with additional improvements in the correlator). • Scan sequences allow similar timings. • Management of LS slave lasers delivered; some HW issues are being investigated by BackEnd. • FrontEnd optimizations being followed up by ADC and FrontEnd IPT. • Integration of TelCal calibration results into scan sequences pending. • Monitoring enhancements delivered (batch 1 and batch 2). • We delivered what CIPT had agreed to deliver according to “Analysis of collated…”. Pending acceptance. Progress in recent meeting between Arturo and Maurizio. • DataCapturer incremental writes under testing • An intermediate solution for memory issues is ready for Phase V in 9.1.4. • A prototype for a more general solution is well advanced. This will allow QuickLook to be improved as well. • Additional ASDM tables were populated, but more are coming (Ephemeris being the most important).
(continuation) • 4 quadrant correlator delivered, including porting the CDP Master to 64-bit Linux. • More of a “capability” than a simple feature. • Extensive refactoring of the threading structure of the correlator, • which allowed subscans to overlap in the correlator, improving efficiency. • Paved the way to achieve 100% data rate. • QuickLook summaries and “alarm” reports delivered and integrated with AQUA. • Several TMCDB improvements, including adding versioning to several tables, FLOOG and DTX offsets, and support for “global” configurations. • More requirements have been requested. • 90-degree Walsh functions were delivered, but this feature hasn’t been sufficiently tested yet. • Many bugs fixed. In general, tracking and fixing problems is taking more than 50% of our time. Not surprising given the reduced FTEs. • All Cycle 1 critical bugs (1-2 priority in Stuartt’s document) have been addressed.
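To put the 1.5 second inter-subscan interval quoted above in perspective, the fraction of wall-clock time spent integrating can be estimated with simple arithmetic. This is only a back-of-the-envelope sketch: the subscan durations below are hypothetical values chosen for illustration, and the model ignores every overhead other than the gap between subscans.

```python
def on_source_fraction(subscan_s: float, gap_s: float) -> float:
    """Fraction of time spent integrating for back-to-back subscans.

    subscan_s: integration time of one subscan, in seconds.
    gap_s: dead time between consecutive subscans, in seconds.
    """
    if subscan_s <= 0 or gap_s < 0:
        raise ValueError("subscan_s must be positive and gap_s non-negative")
    return subscan_s / (subscan_s + gap_s)


GAP_S = 1.5  # inter-subscan interval quoted in the slides

# Hypothetical subscan durations, for illustration only.
for subscan_s in (6.0, 30.0, 120.0):
    fraction = on_source_fraction(subscan_s, GAP_S)
    print(f"{subscan_s:6.1f} s subscans: {fraction:.1%} of the time integrating")
```

Short subscans are where the gap matters most, which is why further correlator improvements to shorten it are worth pursuing.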
Planned features that haven’t been delivered yet • Alarms. Configuration issues were taken over by ADC. • Some progress, but further improvements are currently on hold pending an update of the valid range limits from BackEnd (Rodrigo Brito?). • Still pending: devising a way to avoid sending some alarms during inter-subscan intervals (WCA not locked, for example). • Issue with resetting alarms when components go up or down. Components need to reset the alarm state for each BACI property one by one. • These last two issues need more discussion with ACS. • Pending correlator modes • 2x Nyquist, 3x3 and 4x4 modes. These have been de-scoped several times; they don’t seem to have a high priority relative to other correlator features.
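One possible shape for the pending alarm suppression is a filter that drops selected alarms whose timestamps fall inside a known inter-subscan interval. The sketch below is an assumption for discussion, not an ACS or Control API: the alarm names, the `SUPPRESSIBLE` set and the tuple layout are all hypothetical.

```python
from bisect import bisect_right

# Hypothetical identifiers of alarms that are expected during inter-subscan gaps.
SUPPRESSIBLE = {"WCA_NOT_LOCKED"}


def in_any_interval(t, intervals):
    """Return True if t lies in one of the half-open intervals [start, end).

    intervals must be sorted by start time and must not overlap.
    """
    starts = [start for start, _ in intervals]
    i = bisect_right(starts, t) - 1
    return i >= 0 and t < intervals[i][1]


def filter_alarms(alarms, gaps):
    """Drop suppressible alarms raised during inter-subscan gaps.

    alarms: list of (timestamp_s, alarm_name) tuples.
    gaps: sorted list of (start_s, end_s) inter-subscan intervals.
    """
    return [
        (t, name)
        for t, name in alarms
        if not (name in SUPPRESSIBLE and in_any_interval(t, gaps))
    ]


gaps = [(10.0, 11.5), (20.0, 21.5)]
alarms = [
    (10.5, "WCA_NOT_LOCKED"),      # inside a gap: suppressed
    (10.7, "POWER_SUPPLY_FAULT"),  # inside a gap, but not suppressible: kept
    (15.0, "WCA_NOT_LOCKED"),      # during a subscan: kept
]
print(filter_alarms(alarms, gaps))
```

Whether such filtering belongs in the component, in ACS, or in the alarm display is exactly the kind of question that still needs to be discussed with ACS.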
Current Planning Process • The feature delivery schedule is highly dynamic; plans change often according to updated priorities, including high priority problems. • The Control Software Coordination Group (CSCG) meeting is our main forum to discuss priorities and planning. • Currently we have representatives from Control (Rafael, Ralph and Rodrigo), ADC (Tzu and Ruben), SIST (Neil), CSV (Stuartt), and ADE (Nick). Missing is DSO, but we usually discuss planning with DSO either in the Scheduling or AQUA meeting. • CSCG is an effective and useful meeting. It allows us to understand what the higher priorities for the project are and adjust our plan accordingly. • As discussed yesterday, we'll include Manabu. • Requirement management process discussed yesterday. • Ticket priorities and changes in the development plan will be discussed in this meeting.
Important Features – Control • Improve Correlator Calibrations Management • Fix bug on Scan Sequences, but beyond this, it is necessary to investigate under what conditions we need to calibrate the correlator (e.g. what the necessary input power range is, etc.) • Fast Switching • For the LS, the ball is currently in BackEnd's court, but after this is done, we may need to follow up with adjustments in the software. • In the FrontEnd, algorithmic optimizations are being investigated; we will also need to parallelize some operations. • Observing Mode level software is complete. • Artificial Source • Current plans? • Scan Sequence II • Involves integrating management of the TelCal calibrations inside the Scan Sequences. • Needed for VLBI as well.
(continuation) • Fast Scanning • Lissajous patterns in the antenna motion. • Nutator Observing • Device driver delivered, includes only monitor/control points • A small issue with scaling factors for some monitor/control points, which at the time of development were unclear. Clarification request rejected (!???). • More issues with the meaning of dwell and transition periods. This is relevant for writing the observing mode. Clarification request also rejected. • In many cases, significant details necessary to write software have been specified in Operations Manuals (e.g., Backend sub-device initialization order, LS algorithms). This time we are asking for the Nutator Operations Manual. • Feature development pending on acceptance. • Will require binning in the ACA correlator, not in the BL correlator. Is this still true? • Dynamic Sub-arrays • Frequency Switching • Get rid of CASA dependencies • Request from CASA, to avoid having to maintain a build for ALMA online SW.
(continuation) • Sideband separation in autocorrelations by frequency stepping. • Not sure if this is still a requirement. • Blanking. • DRX crossbar control. • Flag by Antenna shadowing. • Incorporate Doppler correction on ALMA-OT validation of frequency tuning. • Extend IF frequency from 4-8 GHz to 4-9 GHz to avoid (artificially) un-tunable gaps. • We requested that the FrontEnd specifications be updated. FrontEnd said no. Is this still a requirement? • Introduce sub-reflector movements in delay calculations. • Improvements in Array and Antenna OMC plugins. • Documentation (e.g., delay tracking / fringe rotation).
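For the Fast Scanning item above, a Lissajous pattern is just two sinusoidal offsets with different frequencies, one per axis. The following is a minimal sketch of generating such offsets; the amplitudes, frequencies, units and function names are hypothetical and do not reflect how the Control software commands the antenna.

```python
import math


def lissajous(t, amp_az, amp_el, freq_az, freq_el, phase=math.pi / 2):
    """Return the (az, el) offset at time t for a Lissajous pattern.

    Offsets come out in the same units as the amplitudes; frequencies are in Hz.
    The phase is applied to the azimuth axis only.
    """
    az = amp_az * math.sin(2 * math.pi * freq_az * t + phase)
    el = amp_el * math.sin(2 * math.pi * freq_el * t)
    return az, el


def sample_pattern(duration_s, step_s, **pattern):
    """Sample the pattern every step_s seconds over duration_s seconds."""
    n = int(round(duration_s / step_s))
    return [lissajous(i * step_s, **pattern) for i in range(n + 1)]


# Hypothetical pattern: 60 x 30 arcsec box, 3:2 frequency ratio.
points = sample_pattern(
    10.0, 0.1, amp_az=60.0, amp_el=30.0, freq_az=0.3, freq_el=0.2
)
print(len(points), "offsets, first:", points[0])
```

A real implementation would also have to respect antenna velocity and acceleration limits, which this sketch does not model.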
Monitoring Enhancements • Many features delivered, but not yet accepted. • Path forward discussed yesterday. • An important feature that was delivered is the ability to archive all monitor points. • Missing types were implemented. • Monitoring configuration is now only a matter of adjusting TMCDB records. • “Composite” properties are being separated in the database. • Archive on-change (a pure ACS feature) is key to reducing monitoring rates.
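The reason archive on-change reduces monitoring rates can be illustrated with a small filter that stores a sample only when the value has moved by more than a threshold, or when too much time has passed since the last stored sample. This is a generic sketch, not the ACS implementation: the class name, the `min_delta` deadband and the `max_interval_s` heartbeat are assumptions made for the example.

```python
class OnChangeFilter:
    """Decide whether a monitor sample should be archived."""

    def __init__(self, min_delta, max_interval_s):
        self.min_delta = min_delta            # hypothetical deadband
        self.max_interval_s = max_interval_s  # hypothetical heartbeat period
        self._last = None                     # (time, value) last archived

    def should_archive(self, t, value):
        if self._last is None:
            archive = True  # always keep the first sample
        else:
            last_t, last_value = self._last
            changed = abs(value - last_value) >= self.min_delta
            stale = (t - last_t) >= self.max_interval_s
            archive = changed or stale
        if archive:
            self._last = (t, value)
        return archive


def filter_samples(samples, min_delta, max_interval_s):
    """Return only the (time, value) samples that would be archived."""
    f = OnChangeFilter(min_delta, max_interval_s)
    return [(t, v) for t, v in samples if f.should_archive(t, v)]


# A slowly drifting temperature sampled once per second (made-up data).
samples = [(0.0, 20.0), (1.0, 20.01), (2.0, 20.02), (3.0, 20.5), (4.0, 20.5)]
print(filter_samples(samples, min_delta=0.1, max_interval_s=60.0))
```

For a monitor point that rarely changes, almost every sample is dropped, while the heartbeat still guarantees a periodic record.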
BL Correlator • Subarrays • It will involve work both in Control and in Correlator • Design is being discussed. The main problem is how to synchronize the timing so that there are no conflicts on the quadrants' CAN bus. • Execution in the first subarray will be optimal. A subsequent execution in a second subarray will be sub-optimal, as it will need to be scheduled in such a way as to avoid already used CAN “slots”. • Full Correlator Data Rate (60 MB/s) • Rodrigo working on this. Interrupted to support bulkdata issues investigation. • Manage more than 16 configuration and calibration slots. • J working on this. Interrupted to debug several issues. • Correlator Pending Modes • FAST mode (1 ms auto-correlations) already implemented; it needs verification. • Correlator Multi-Resolution Modes • LO offsetting sideband “separation” • Correlator hardware timestamping • 3x3 bit quantization correction in FDM
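The subarray scheduling constraint described above (the first subarray gets its preferred CAN slots, later ones must work around them) can be pictured with a toy slot table. This is a deliberately simplified model for discussion only: the number of slots, the notion of a "preferred" contiguous block and all names are hypothetical, and the actual design is still being discussed.

```python
class SlotTable:
    """Toy model of a shared bus divided into a fixed number of time slots."""

    def __init__(self, n_slots):
        self.owner = [None] * n_slots  # owner[i] is the subarray using slot i

    def allocate(self, subarray, n_needed, preferred_start=0):
        """Reserve n_needed slots for a subarray and return their indices.

        The preferred contiguous block is used when it is entirely free
        (the "optimal" case); otherwise the first free slots are taken.
        """
        n = len(self.owner)
        if n_needed > n:
            raise ValueError("more slots requested than exist")
        preferred = [(preferred_start + i) % n for i in range(n_needed)]
        if all(self.owner[s] is None for s in preferred):
            chosen = preferred
        else:
            free = [s for s in range(n) if self.owner[s] is None]
            if len(free) < n_needed:
                raise RuntimeError("not enough free slots")
            chosen = free[:n_needed]
        for s in chosen:
            self.owner[s] = subarray
        return chosen

    def release(self, subarray):
        """Free every slot held by the given subarray."""
        self.owner = [None if o == subarray else o for o in self.owner]


table = SlotTable(8)
print("A:", table.allocate("A", 3))  # preferred block is free
print("B:", table.allocate("B", 3))  # must avoid the slots taken by A
```

The point of the model is only that the second allocation is constrained by the first, which is what makes a subsequent subarray execution sub-optimal.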
H/W Configuration Database • Improvements in utilities to better track Antenna status (high priority, included in “recession” period features). • Complete Antenna CAI map. • ACA specific pad delays. • Track antenna metrology mode. • Complete history of BaseElements and Assemblies. • MonitoringUpdate tool. Mass update of monitoring properties. • Access to Antenna pad position history (needed for pipeline). • Extend versioning to other tables. • Assembly tables • There are other tables in the S/W side (Component, BACI property), but this is outside our scope. Is ACS considering this? • Global configuration delivered, but without support for the S/W side. Is this an ACS requirement? • Do we need to integrate the TMCDB with the Antenna Status Dashboard?
DataCapturer • Full incremental writes well advanced. • The ASDM will be written to a relational database as the observation is performed. • Discussed with DSO, ARCHIVE, OBOPS, OFFLINE. Everybody thinks it’s a good idea. • It will make DataCapturer more robust (no longer holding everything in memory, transaction support, etc.), and it will allow QuickLook and other applications to query metadata without having to wait for the observation to finish (e.g. some of the ERMA functionality could be integrated). • New tables.
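The incremental-write idea above can be sketched with any relational database. The example below uses Python's built-in `sqlite3` module purely for illustration; the table layout, column names and function names are hypothetical and are not the ASDM schema or the DataCapturer design. Each subscan is committed in its own transaction, so a reader can query the rows written so far while the observation is still running.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open the database and create the (hypothetical) subscan table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS subscan (
               exec_block     TEXT    NOT NULL,
               scan_number    INTEGER NOT NULL,
               subscan_number INTEGER NOT NULL,
               start_s        REAL    NOT NULL,
               end_s          REAL    NOT NULL,
               PRIMARY KEY (exec_block, scan_number, subscan_number)
           )"""
    )
    conn.commit()
    return conn


def write_subscan(conn, exec_block, scan, subscan, start_s, end_s):
    """Write one subscan row in its own transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "INSERT INTO subscan VALUES (?, ?, ?, ?, ?)",
            (exec_block, scan, subscan, start_s, end_s),
        )


def subscans_so_far(conn, exec_block):
    """Count the subscans already stored for an execution block."""
    row = conn.execute(
        "SELECT COUNT(*) FROM subscan WHERE exec_block = ?", (exec_block,)
    ).fetchone()
    return row[0]


conn = open_store()
write_subscan(conn, "eb-demo", 1, 1, 0.0, 6.0)
write_subscan(conn, "eb-demo", 1, 2, 7.5, 13.5)
print(subscans_so_far(conn, "eb-demo"), "subscans stored so far")
```

Nothing has to be held in memory until the end of the observation, and a failed write rolls back without corrupting what was already stored.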
QuickLook • QuickLook functionalities: • Plot TelCal results, aggregating and computing simple statistics from calibration tables. • Construct AQUA summaries. • Manage the RTF. • Currently deactivated. We should probably move this out of here. • It’s a model application for a three-tier system (data, logic, presentation). • DataCapturer incremental writes enable this. Most of the application code can be replaced by queries to the relational database (using GROUP BY, etc.) • QuickLook can now have “memory”. It can go back and plot past results. This is not currently supported. • We can get rid of a complicated interaction between Control, DataCapturer, TelCal and QuickLook. In fact, the plugins can simply be notified of new results at the end of a subscan and query the database themselves. We could get rid of obscure freezes. • External libraries that could further simplify and enhance QuickLook capabilities are being evaluated (the R statistical package, for instance).
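As an illustration of replacing application code with database queries, the sketch below computes per-antenna statistics with a single GROUP BY. It uses Python's built-in `sqlite3` module; the `cal_result` table, its columns and the sample values are invented for the example and do not correspond to actual TelCal or ASDM tables.

```python
import sqlite3


def demo_db():
    """Build an in-memory database with made-up calibration results."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cal_result (antenna TEXT, scan_number INTEGER, tsys_k REAL)"
    )
    rows = [
        ("DV01", 1, 60.0),
        ("DV01", 2, 64.0),
        ("DA41", 1, 70.0),
        ("DA41", 2, 74.0),
        ("DA41", 3, 72.0),
    ]
    with conn:
        conn.executemany("INSERT INTO cal_result VALUES (?, ?, ?)", rows)
    return conn


def summarize(conn, first_scan=1):
    """Per-antenna count, mean, min and max from first_scan onwards."""
    return conn.execute(
        """SELECT antenna, COUNT(*), AVG(tsys_k), MIN(tsys_k), MAX(tsys_k)
           FROM cal_result
           WHERE scan_number >= ?
           GROUP BY antenna
           ORDER BY antenna""",
        (first_scan,),
    ).fetchall()


conn = demo_db()
for antenna, n, mean, lo, hi in summarize(conn):
    print(f"{antenna}: n={n} mean={mean:.1f} K range=[{lo:.1f}, {hi:.1f}] K")
```

Because the results stay in the database, the same query with a different `first_scan` gives the plugins the "memory" mentioned above without any extra bookkeeping.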
ALMA Phasing Project (VLBI) • Quite detailed design and planning documents are being completed for the CDR (~100 pages, last time I looked). • New ASDM tables included in the Appendix. • Development will begin after the CDR. We estimate 1.5 FTE of development effort. • Richard Hills sent around a memo describing the consequences of the IF residual fringe in the phasing. We are studying possible alternatives. • Richard also asked who will assume the “system engineering” aspects of the project for the ALMA side. Who should we talk to about these issues? • A complete test environment has been put together in Charlottesville. This is a full STE, with additional node machines. It will allow us to integrate all the components involved in the phasing loop (Correlator, TelCal and Control), and test against the hardware being developed by the correlator H/W team (2 antenna correlator rack).
“Recession” Period Features • Startup enhancements (Tzu is organizing the requirements, from Emilio’s email). • We should concentrate on automating the startup procedure as much as possible, and on improving the way real problems are reported to the operators. • Merge and test DataCapturer Pointing table incremental write from 9.1.4. • TMCDB improvements to facilitate tracking Antenna status. • Array and Master should be improved to avoid FSR. • Destroy array should always work. • Reinitialize Antennas should be extended to Central LO, AOS Timing, etc. • Add the capability to update the Master Antenna list (I believe this was part of Tzu's Antenna plug-and-play project).
Testing Improvements • Scalability tests for DataCapturer. • Define more granular tests for OSF regression tests. • Correlator simulator improvements. • Total Power processor tests. • Integrate correlator in Control's main integration test.
After recession schedule • Ralph: Calibration ID management; Fast Scanning or Nutator (July); Fast switching (as needed); Artificial Source (October); Scan Sequence II (Jan. 2014) • Rodrigo: 100% data rates (~1-2 month development left); LO offsetting sideband separation (July); Pending modes (September); Multi-resolution modes (Jan. 2014)
(continuation) • J: Manage more than 16 configurations (~1 month left); Subarrays (October) • Matias: Correlator simulator (improvements during recession period); APP Subarrays