This status report provides an overview of the Tev IPMs, their control from ACNET, the operational status of the vertical and horizontal systems, occasional data synchronization problems, and ongoing debugging efforts to address sync issues.
Status of Tev IPMs September 3, 2008 Andreas Jansson
Status (short version) • The Tev IPMs can be controlled from and deliver data to ACNET in much the same way as the flying wires • The vertical system has been operational for a long time. • Old version of the firmware/software. • Occasional data synchronization problem (not a problem for manual measurements) • The horizontal system is currently at FCC to help debug sync issues. IPM status
ACNET control • For some time now, the Tevatron IPMs have been state driven in much the same way as the flying wires. • Selected output is logged when data is ready.
Tom’s ACNET interface
PSPEC settings
Datalogged averaged data PSPEC 5 June 19 During store
Sync issues • Spontaneous front end resets (early on) • Revert to default mode and loss of synchronization. • Due to Single Event Upsets (or just noise on slow control channel?) • Fixed with new front end firmware version plus Labview software changes • Trigger/Counter synchronization (affects both systems) • Links synchronized, but bunch data not in the correct time-bins. • Due to “missing” AA markers • Fixed by having both boards treat missing markers in same way. • Intra-board synchronization (affects both systems) • Loss of SERDES lock or illegal character on serial link. • Due to “unexplained” PCI accesses • Effect minimized by re-sync logic. Errors during acquisition flagged. • Inter-board synchronization (affects mainly horizontal) • Data from two buffer boards not aligned (varying from measurement to measurement) affecting bunch profiles and background subtraction. • Was hard to debug in the presence of other sync issues… • Still working on this one. Expect new firmware in about a week.
Status (long version)
Example: Synchronized
Example: Unsynchronized Easy to detect and reject by eye for manual data-taking!
One manifestation of sync issues Histogram of proton counter in first buffer data frame. Expected a single cluster of three bins around zero. Saw a wider cluster + scatter + unexpected peaks
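The expectation stated on this slide (a single cluster of three bins around zero) lends itself to an automated version of the by-eye rejection. The following is a minimal sketch, assuming the proton counter value from the first buffer data frame is available as one signed integer per acquisition. The function names and the default window of ±1 are illustrative assumptions and are not taken from the IPM software.

```python
from collections import Counter

def is_synchronized(first_frame_counter, half_width=1):
    """Return True if the counter lies in the expected cluster around zero.

    half_width=1 corresponds to three bins (-1, 0, +1); this window is an
    assumption based on the slide, not a documented specification.
    """
    return abs(first_frame_counter) <= half_width

def summarize(counters, half_width=1):
    """Histogram the counters and return (histogram, rejected fraction)."""
    hist = Counter(counters)
    if not counters:
        return hist, 0.0
    rejected = sum(1 for c in counters if not is_synchronized(c, half_width))
    return hist, rejected / len(counters)
```

For example, `summarize([0, 1, -1, 0, 7, 0, -12, 1])` puts six values inside the window and rejects two, so the returned fraction is 0.25.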
IPM timing [Diagram labels: Timing board, Front-end boards, Buffer boards, LabView, master, TVBS, APTVBS, slave]
System Debugging • FE board data streams have embedded timing and status information. • Due to limitations in the memory bandwidth, most error checking is done in buffer board firmware • A debugging mode that writes raw data from some links is available to debug the front-end boards. • Firmware quickly grew to become resource limited, making debugging iterations slow. • Painfully slow “place and routes” • A reduced version was required to fit ChipScope
Known buffer board issues • The buffer board was designed according to “best practices” for high speed signals • No hardware design issues found to date. • However, we discovered relatively early on that PCI access during an acquisition would sometimes glitch one or more of the high-speed data links. • Appears to be an FPGA-internal problem • Work-around: re-order setup sequence to limit access to board, use interrupts instead of polling. • Seemed to work fine.
Board layout
Sync saga cont’d • Even without PCI access during acquisitions, loss of synchronization still occurred relatively frequently, particularly for injection measurements (long wait time) • Improved the initial synchronization logic • Added firmware logic to allow links to resynchronize dynamically if needed, while waiting for a trigger (In the original implementation, the entire chain was synchronized once by a timing card reset just before an acquisition)
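The dynamic resynchronization described above can be pictured as a small per-link state machine. What follows is a software sketch only, not the firmware: the class, the `lock_ok` and `acquiring` flags, and the one-tick resync are all hypothetical simplifications. It assumes that a lost lock while waiting for a trigger is recovered silently, while a lost lock during an acquisition is flagged as an error (as the "Sync issues" slide says).

```python
from enum import Enum, auto

class LinkState(Enum):
    LOCKED = auto()
    RESYNC = auto()

class Link:
    """Toy model of one serial link (hypothetical, for illustration)."""

    def __init__(self):
        self.state = LinkState.RESYNC
        self.error_during_acquisition = False

    def step(self, lock_ok, acquiring):
        """Advance one tick.

        lock_ok:   True if the link reports lock and no illegal character.
        acquiring: True once a trigger has started an acquisition.
        """
        if lock_ok:
            self.state = LinkState.LOCKED
        else:
            # Lost lock: try to resynchronize. While waiting for a trigger
            # this is harmless; during an acquisition the data are suspect,
            # so the acquisition is flagged.
            self.state = LinkState.RESYNC
            if acquiring:
                self.error_during_acquisition = True
        return self.state

def run(events):
    """Feed a sequence of (lock_ok, acquiring) pairs through a new Link."""
    link = Link()
    for lock_ok, acquiring in events:
        link.step(lock_ok, acquiring)
    return link
```

In this model a glitch during the long wait before an injection measurement leaves no trace once the link relocks, whereas the same glitch after the trigger sets `error_during_acquisition`.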
Sync saga cont’d • Early resync logic worked in test stand, but not in the Tevatron systems. • At this point, FY08 omnibus budget came out, furloughs started and we lost Kwame (our FE board engineer) to sunny Florida. • Rick (Buffer board engineer) took over debugging of both boards, and we moved the test stand to FCC (closer, and a real TVBS is available there; a simulated one was used before).
Sync saga cont’d • Found that AA markers are occasionally missing, and that the FE board and buffer board treated this unspecified case differently. • Fixed with buffer firmware modification. • Also discovered “unexplained” PCI accesses during acquisition? • Don’t know where they come from, or how long they have been there (system patch?). • Tried to have board ignore PCI requests during acquisition, but this caused total PC lock-up. • With improved re-sync logic, error rate in test stand is now a few per mil.
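To illustrate why two boards that handle a missing marker differently end up with bunch data in the wrong time-bins, here is a toy model. The slides do not say how either board actually treated the missing marker, so the two policies below (`"flywheel"` and `"wait"`) are purely hypothetical, as are the function names and the four-tick period.

```python
def bin_indices(markers, period, policy):
    """Return the time-bin index assigned to each tick.

    markers: list of bools, True where a marker arrives on that tick.
    period:  nominal number of ticks between markers.
    policy:  'flywheel' restarts the bin count after `period` ticks even
             if the marker is absent; 'wait' keeps counting until a real
             marker arrives. Both policies are assumptions.
    """
    bins = []
    count = 0
    for marker in markers:
        if marker or (policy == "flywheel" and count >= period):
            count = 0
        bins.append(count)
        count += 1
    return bins

def mismatched_ticks(a, b):
    """Indices of the ticks where two boards disagree on the bin."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]

# Three nominal turns of four ticks; the second marker is missing.
markers = [True, False, False, False,
           False, False, False, False,
           True, False, False, False]
fe_board = bin_indices(markers, 4, "flywheel")
buffer_board = bin_indices(markers, 4, "wait")
```

With these inputs the two boards disagree on ticks 4 to 7, the turn whose marker was lost, and agree again once a real marker arrives. Giving both boards the same policy, whichever it is, removes the disagreement, which is the essence of the fix described above.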
Status • Intra-board sync fixed to the extent possible (residual rate is at the per mil level). • Last couple of “full firmware versions” tested OK in test stand but crashed the horizontal IPM host PC. • Hardware difference? • Moved horizontal systems to FCC for easier debugging. • Likely related to new inter-board sync logic that was not yet fully tested (In test stand only timeouts were observed).
Summary • Found and fixed the sync issue between the front-end & buffer board • Traced remaining sync problem to occasional non-Labview PCI access • With new (intra-board) re-sync logic, error rate is a few per mil. Further reduction would require new board layout. • Investigating occasional timeout errors • Likely related to inter-board sync logic that was put in some time ago but never fully tested. • Seems benign in test stand, but causes the horizontal system to crash.
Forecast • With the intra-board sync issues fixed, work is proceeding on the inter-board sync logic • Expect to have a new firmware version ready for testing in about a week.