DAQ Online Software Migration “December Status”. Chris Lester (and John Hill at short notice!). I would like to thank those at CERN, Birmingham, Oxford and Cambridge who have provided valuable assistance throughout the migration process.
Reminder of what this is all about
• SctRodDaq 3.xx (current release) uses ATLAS online-00-19-01.
• To talk to DCS with PVSSII-v3 we need online-00-21-0x or better.
  • Is needed by SCT DCS people.
  • Is also needed to avoid SCT being “left behind”.
• SctRodDaq 4.xx (head/development) began with online-00-21-02 but has since moved completely to online-00-22-00, in tandem with the testbeam.
Christopher Lester
Initial Migration strategy
• HEAD was frozen
• SctRodDaq_4.xx (online21) developed in RED branch
• SctRodDaq_3.xx (online19) developed in BLUE branch
• Finally all changes from RED and BLUE merged into HEAD
Result:
• HEAD is SctRodDaq_4.xx
• BLUE branch exists for SctRodDaq_3.xx support
Migration since last SCT week
• SctRodDaq_4.xx/HEAD updated to use online-00-22-00 in place of online-00-21-02.
• On 11th November, there was a global merge from 3.xx to 4.xx of all 3.xx developments made up to that point (mostly feature extensions from Oxford).
• Since 11th November, most development has been in 4.xx branch, with ports BACK to 3.xx only where features were needed at Oxford.
Other development since last SCT week.
• At last SCT week, we reported on problems with the DAQ <-> DCS communication.
• All these problems turned out to be caused by online-00-21-02 and were cured when we upgraded to online-00-22-00.
• DAQ <-> DCS communications now seem OK and were tested at SR1 shortly after last SCT week.
Other development since last SCT week.
• Overhaul of exception handling in 4.xx branch.
  • For a long time there was a desperate need for an overhaul of the exception handling system in all branches.
  • Error conditions arise on one machine, but exception needs to be caught on another, or be seen by user.
  • Information (e.g. the cause of the exception) was being lost at more than one step between the point at which the error happened and the point at which that problem was finally seen by a user.
  • Complete overhaul was undertaken in 4.xx branch, and appears to be a success.
  • Changes have not been ported to 3.xx branch to provide carrot for migration at Oxford.
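As an illustration of the problem described on this slide (the cause of an error being lost between the machine where it happens and the user who finally sees it), here is a minimal Python sketch. It is not the SctRodDaq implementation, which these slides do not show. The names `RemoteError`, `pack_exception`, `unpack_exception`, `configure_module` and the origin label "crate-controller" are all hypothetical. The idea is simply to serialise the whole chain of causes before the error crosses a process boundary, and to rebuild it on the other side.

```python
import json


class RemoteError(Exception):
    """Local stand-in for a failure reported by another process (hypothetical)."""

    def __init__(self, message, origin, causes):
        super().__init__(message)
        self.origin = origin    # where the error happened
        self.causes = causes    # underlying causes, outermost first


def pack_exception(exc, origin):
    """Serialise an exception and its chain of explicit causes to JSON text."""
    chain = []
    current = exc
    while current is not None:
        chain.append(f"{type(current).__name__}: {current}")
        current = current.__cause__   # follow "raise ... from ..." links
    return json.dumps({"origin": origin, "chain": chain})


def unpack_exception(payload):
    """Rebuild an exception on the receiving side without dropping the causes."""
    data = json.loads(payload)
    chain = data["chain"]
    return RemoteError(chain[0], data["origin"], chain[1:])


def configure_module():
    """Hypothetical failing operation with a low-level cause."""
    try:
        raise OSError("config register readback mismatch")
    except OSError as err:
        raise RuntimeError("module configuration failed") from err


# Sending side: catch the error and turn it into text that can be shipped.
try:
    configure_module()
except RuntimeError as err:
    wire = pack_exception(err, origin="crate-controller")

# Receiving side: the user still sees both the symptom and its cause.
remote = unpack_exception(wire)
print(remote, "| from:", remote.origin, "| caused by:", remote.causes)
```

The design choice in this sketch is that every hop forwards the full list of causes rather than only the outermost message, which is the step where information would otherwise be lost.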
Adoption at SR1
• Have already had talk from Dave Robinson
• SR1 is now 100% behind 4.xx branch, and have abandoned use of 3.xx
• SR1 main source of current testing of 4.xx outside Cambridge
• SR1 currently running with 3-4 modules, and do not see any major problems!
• SR1 do see minor problems:
  • Inconvenience: SctGui needs to be restarted after every Load+Config+Start+Stop+Unconfig+Unload cycle.
    • Workaround exists (D.R.?)
  • Configuration IS server goes to sleep after above cycle. Is probable cause of above problem but is not really understood.
• SR1 did see other minor problems which have now been fixed. E.g.:
  • Failure of run number to increment now traced to error in database. Thanks Mihai!
Adoption at SR1
• Main message:
  • SctRodDaq_4.xx / HEAD works!
• But:
  • Awaiting stress test with more than 3-4 modules when Barrel Sector becomes available in early January 2005.
Adoption at Oxford
• For the purposes of migrating from 3.xx to 4.xx, Oxford were assigned a computer (“station 2”) that had been provisionally acquired for use on barrel 6.
• Alan Barr installed full 4.xx online-00-22-00 set up on this machine, and only then discovered that: “this machine is a piece of junk”
• Turned out that 255MHz pentium, 256Mb RAM not sufficient to even start the offline software, let alone run any scans. Alan “managed to press the start button”, though!
Adoption at Oxford
• Nevertheless:
  • There has been significant change of mood at Oxford: they are now very receptive to the benefits of migration.
  • It is expected that further progress will be made before the end of the year, with a possible change-over from 3.xx to 4.xx once barrel 3 is dispatched.
Non migration related news / comments:
• Oxford now using 3.xx with largest number of modules ever (384 at last count).
• Oxford are beginning to see bottlenecks in SctRodDaq that have been known about for some time but are only now becoming limiting and thus a priority to solve! Examples:
  • Rate of raw data transfer limits time taken to do many scans.
  • In some cases system grinding to a halt under vast number of MRS messages being passed between processes.
  • BOC scans now significantly slower than ordinary scans.
• Will need to solve these problems in 4.xx, so need to monitor progress at Oxford carefully.
New scans since last SCT week
• New RX Threshold scan was implemented in 4.xx and ported to 3.xx (see picture next page).
  • Finds minimum RX Threshold at which module config register can be read without error.
  • Finds maximum RX Threshold at which module config register can be read without error.
  • Sets optimum RX Threshold half way between the two above.
• Heavily in use at Oxford.
• Awaiting graphical display of threshold values from D.Robinson (in GUI).
Rx Threshold Scan: Example
[Figure: example scan plot with the Min, Best and Max RX Threshold values marked]
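The three steps of the RX Threshold scan (find the minimum working threshold, find the maximum working threshold, set the optimum half way between them) can be sketched in a few lines of Python. This is an illustrative sketch only, not the SctRodDaq code. The function `rx_threshold_scan`, the callback `read_ok` and the simulated `fake_module` with its 60 to 180 working window are assumptions made up for the example, and the 0 to 255 scan range is likewise assumed.

```python
from typing import Callable, Iterable, Optional, Tuple


def rx_threshold_scan(
    read_ok: Callable[[int], bool],
    thresholds: Iterable[int],
) -> Optional[Tuple[int, int, int]]:
    """Return (min, best, max) working RX threshold, or None if none works.

    read_ok(t) stands in for "the module config register can be read
    without error at RX threshold t" (hypothetical callback).
    """
    working = [t for t in thresholds if read_ok(t)]
    if not working:
        return None
    lowest, highest = min(working), max(working)
    best = (lowest + highest) // 2   # half way between the two extremes
    return lowest, best, highest


def fake_module(threshold: int) -> bool:
    """Simulated module: readback succeeds only inside an assumed window."""
    return 60 <= threshold <= 180


result = rx_threshold_scan(fake_module, range(0, 256))
print(result)   # (60, 120, 180) for the simulated window above
```

Using integer division keeps the chosen value on the same integer grid as the scanned thresholds. Whether the real scan rounds up or down is not stated in the slides.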
Other new scans on the way
• Mark space ratio
• TX Dela
Left to be done.
• DAQ <-> DCS communication
  • ddc_ct OK
  • ddc_dt dies
  • ddc_mt dies
Remember this!
• Code suitable for testing on modules exists! (Should probably make an actual release 4.00 to help external users)
Status of migration at: http://www.hep.phy.cam.ac.uk/daq-bin/wiki.cgi/OmniMigration … however …
Status of SctRodDaq
Timescale and Testing
• Time taken so far ~ 2 to 3 months.
  • Most effort in first 6 weeks
• Why the slow down?
  • Major changes require serious testing
  • Difficult since Oxford now in serious macro assembly (main cause of slowing down)
  • Few people seem desperate to test our code!
Testing cont …
• Cambridge:
  • On three modules, 4.xx can do all that 3.01 can. Full characterisation sequence etc.
• H8:
  • Limited testing of 4.xx on modules in testbeam (StrobeDelay etc.)
  • Have some idea of way forward with TB integration
• Oxford:
  • Only small amount of DDC testing.
Status of SctRodDaq
[Figure: branch diagram showing Release 2.xx (online 19), Release 3.xx (online 19) and Release 4.xx ? (online 21), with the labels “Expert tree surgery” and “Harness Testing and Macro Assembly”]
Notes:
• 3.xx and 4.xx in danger of diverging
  • Both in terms of code and in testing.
• Can make limited promise to merge further updates from 3.xx to 4.xx
  • but want to abandon 3.xx in the end!
• We hope there should be very few (if any) changes for the user.
• Need willing testers!