Far Detector Data Quality Andy Blake Cambridge University
Far Detector Data Quality
• The atmospheric analysis group has studied the Far Detector data in great detail.
  – Large data set (August 2003 – February 2005).
  – Time spent at the Soudan mine!
  – Performed a complete physics analysis.
• Tools developed to examine data at many levels.
  – Raw data, cosmic muons etc.
  – Readout pathologies, hardware failures etc.
• Cuts developed to select good and bad data.
  – But not in widespread use!
• Next iteration of atmospheric analysis underway.
  – Need detailed data quality checks for this analysis.
  – Opportunity to develop tools for wider use!
Far Detector Data Quality
Far Detector data quality issues roughly divide in two:
GLOBAL DATA QUALITY: Physics Data, DCS Systems, Veto Shield
• Coil On
• HV On
• No Holes!
• ? Problems that we don’t know about yet!
CHANNEL-BY-CHANNEL QUALITY: Bad Channels
• Hot/Cold
• Busy
• Badly calibrated
• ?
Far Detector Data Quality
[Diagram: Data Quality Tools, connected to: Online Monitoring, Database, Shift Logbooks, DCS Info, Raw Data, Cosmic Muons, Calibration Info, etc.]
Far Detector Data Quality
Global Data Quality:
(1) Select good physics runs.
  → Determine which runs were intended as physics data.
  → Identify causes of bad data (e.g. LI leaks etc.).
  → Produce list of good physics runs.
(2) Identify when detector was in a bad state (e.g. HV trips, coil trips, veto shield holes etc.).
  → (DCS) database + tools to access it.
Channel-by-Channel Quality:
(3) Compile record of bad detector components (e.g. hot/cold/busy chips, bad readout, bad calibration etc.).
  → (HARDWARE) database + tools to access it.
  → Information from raw data – pass through offline framework.
Far Detector Run Selection
[Diagram labels: Soudan; DBU Database; Run Summary; SELECTION (Cambridge); SELECTION (Fermilab); DATA QUALITY CHECKS]
Far Detector Run Selection
• Far Detector run selection procedure:
  – Cambridge: entry in database; run types 2, 769; ~5000 snarls.
  – Fermilab: run types 2, 769, 17153; 60 seconds; 1 snarl.
• Number of subruns during period August 1st 2003 → July 31st 2005: [table of counts not reproduced in this transcript]
• This represents quite a large discrepancy!
Run Selection Issues (1) Modified Runs
• Approximately 1500 subruns have the “modified” bit set.
• This corresponds to several weeks of live time (the Far Detector DAQ is typically left with the “modified” flag set for long periods of time).
• Typical “modified” run comments*:
  – Testing the E4 trigger.
  – Removed Pulser Boxes from LI config.
  – Flashing LI at 500Hz.
  – Switched HV cards.
  – “Restarting DAQ after recovery”.
  – “Standard physics data”.
• These runs are generally okay for use in physics analyses – use other data quality cuts to reject any bad data.
(N.B. In contrast, “test” runs typically correspond to changes in readout components, so it’s probably not wise to use these in physics analyses.)
Run Selection Issues (2) The Database (DBUSUBRUNSUMMARY Table)
• ~1200 runs have no entry at all in the database!
  – Most are bad runs (e.g. the DAQ crashed before the run started), but a significant number do correspond to good physics data.
  – The gaps occur before 2005, so no beam data is affected.
• ~100 subruns can be recovered by searching for gaps.
  – Gap in middle of run: subrun 0 1 2 GAP 4 5 6 7
  – Gap at end of run: subrun 0 1 2 3 4 5 6 GAP
• ~300 subruns are on the Fermilab list but not in the database.
• ~20 subruns are under 10 seconds according to the database, but are actually several hours long!
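The gap search described above can be sketched in a few lines of Python. This is only an illustration, not the tool used for the analysis: the function name `find_subrun_gaps` and the `last_expected` argument are hypothetical. A gap in the middle of a run is visible from the recorded subrun numbers alone, whereas a gap at the end of a run can only be found if the last subrun is known from some other source (for example the raw data files), which is assumed here.

```python
def find_subrun_gaps(recorded, last_expected=None):
    """Return the subrun numbers missing from `recorded`.

    recorded      : subrun numbers that do have a database entry
    last_expected : highest subrun known from another source (hypothetical
                    input); needed to detect a gap at the end of a run
    """
    present = set(recorded)
    if not present and last_expected is None:
        return []
    upper = max(present) if present else last_expected
    if last_expected is not None:
        upper = max(upper, last_expected)
    # Subruns are assumed to be numbered from 0, as on the slide.
    return [s for s in range(upper + 1) if s not in present]


# Gap in middle of run: subrun 0 1 2 GAP 4 5 6 7
print(find_subrun_gaps([0, 1, 2, 4, 5, 6, 7]))                   # [3]
# Gap at end of run: subrun 0 1 2 3 4 5 6 GAP
print(find_subrun_gaps([0, 1, 2, 3, 4, 5, 6], last_expected=7))  # [7]
```

Without `last_expected`, the second example returns an empty list, which is why end-of-run gaps need the extra information.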
Run Selection Issues (3) Short Runs
• Cambridge list has ~600 subruns with < 5 minutes data.
  (Mostly from March 2004: dynode threshold scans + timing system tests. I think that we forgot to set a snarl threshold for this month!)
• Fermilab list has ~100 subruns with < 5 minutes data.
• Generally suspicious of short runs:
  – Total number of snarls or timeframes is sometimes a “round” number.
  – Some runs have unusually high or low snarl rates.
  – Some runs have been used to test new software or components.
  – All test/modified data before May 2004 is just labelled as “normal data”.
Run Selection Summary
• Most of the differences between the Cambridge and Fermilab run lists can easily be explained, but there are still some discrepancies.
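One simple way to track down the remaining discrepancies is to compare the two run lists as sets of (run, subrun) pairs. The sketch below assumes each list is available as a sequence of such pairs; the function name and the example run numbers are made up for illustration.

```python
def compare_run_lists(cambridge, fermilab):
    """Split two lists of (run, subrun) pairs into common and exclusive parts."""
    cam, fnal = set(cambridge), set(fermilab)
    return {
        "both": sorted(cam & fnal),
        "cambridge_only": sorted(cam - fnal),
        "fermilab_only": sorted(fnal - cam),
    }


# Hypothetical example lists, not real run numbers.
cambridge_list = [(1000, 0), (1000, 1), (1001, 0)]
fermilab_list = [(1000, 0), (1001, 0), (1002, 0)]

diff = compare_run_lists(cambridge_list, fermilab_list)
print(diff["cambridge_only"])  # [(1000, 1)]
print(diff["fermilab_only"])   # [(1002, 0)]
```

Each exclusive entry can then be checked by hand against the database and the logbooks.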
New Data Selection
New Cambridge data selection:
• Select runs using database: run types 2, 769, 17153; >5 minutes data. Fill in gaps in the database!
• Select “good” physics events → Physics Analysis.
• Select “clean” cosmic muons → Timing Calibration.
• Good data!
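As a rough illustration of the first step, the database selection amounts to a filter on run type and duration. The record layout below (`SubrunSummary` and its field names) is an assumption made for this sketch and does not reflect the actual DBUSUBRUNSUMMARY columns; only the run types and the 5 minute threshold come from the slide.

```python
from dataclasses import dataclass

PHYSICS_RUN_TYPES = {2, 769, 17153}  # run types quoted on the slide
MIN_DURATION_S = 5 * 60              # ">5 minutes data"


@dataclass
class SubrunSummary:
    """Hypothetical summary record for one subrun."""
    run: int
    subrun: int
    run_type: int
    duration_s: float


def select_subruns(summaries):
    """Keep subruns with an accepted run type and more than 5 minutes of data."""
    return [
        s for s in summaries
        if s.run_type in PHYSICS_RUN_TYPES and s.duration_s > MIN_DURATION_S
    ]


# Made-up records for illustration only.
records = [
    SubrunSummary(1000, 0, 2, 3600.0),      # kept
    SubrunSummary(1000, 1, 2, 120.0),       # too short
    SubrunSummary(1001, 0, 5, 3600.0),      # run type not accepted
    SubrunSummary(1002, 0, 17153, 7200.0),  # kept
]
print([(s.run, s.subrun) for s in select_subruns(records)])  # [(1000, 0), (1002, 0)]
```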
Physics Data
Select “Good” Physics Events:
• Data should contain events with:
  – correct trigger bit(s) set.
  – 10 < digits < 1000.
  – LI channels < 500.
  – dead chips < 20.
  – dead crates = 0.
  – event rate < 75 Hz.
• Purpose of the cuts: select physics events; remove high voltage trips + data with incomplete ROP mask; remove anomalously high rates.
• These cuts have the following effects:
  – Remove runs where HV is down or readout is incomplete.
  – Remove “normal data” which isn’t actually physics data!
  – Remove runs with anomalous events or rates.
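The cuts listed above translate directly into a single predicate. The sketch below assumes the relevant quantities have already been extracted into a small record; the class `DataSample` and its field names are hypothetical, and only the thresholds are taken from the slide.

```python
from dataclasses import dataclass


@dataclass
class DataSample:
    """Hypothetical summary quantities for an event and its run conditions."""
    trigger_ok: bool      # correct trigger bit(s) set
    n_digits: int         # number of digits in the event
    n_li_channels: int    # number of channels flagged as light injection
    n_dead_chips: int
    n_dead_crates: int
    event_rate_hz: float  # event rate of the surrounding data


def passes_physics_cuts(s):
    """Apply the 'good physics event' cuts with the thresholds from the slide."""
    return (
        s.trigger_ok
        and 10 < s.n_digits < 1000
        and s.n_li_channels < 500
        and s.n_dead_chips < 20
        and s.n_dead_crates == 0
        and s.event_rate_hz < 75.0
    )


good = DataSample(True, 120, 0, 3, 0, 0.5)
hv_trip = DataSample(True, 120, 0, 400, 0, 0.5)  # many dead chips
print(passes_physics_cuts(good), passes_physics_cuts(hv_trip))  # True False
```

Keeping the thresholds in one place makes it easy to see which cut rejects a given run when investigating a failure.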
Physics Data
[Plot, August 1st 2003 → January 31st 2005. Annotations: “singles < 50 Hz”; “require less than 20 dead chips”.]
Physics Data
[Plot, August 1st 2003 → January 31st 2005. Annotation: “require all 32 half-crates to be working!”]
Physics Data
[Plot, August 1st 2003 → January 31st 2005. Annotations: “singles > 2500 Hz”; “300 Hz Light Injection! Is this bad?”]
Physics Data
[Plot: Raw Snarls, August 1st 2003 → January 31st 2005.]
Physics Data
[Plot: Filtered Snarls, August 1st 2003 → January 31st 2005.]
Physics Data
[Plot: Filtered Snarls, August 1st 2003 → January 31st 2005. Annotations: “remove these runs!”; “require event rate less than 75 Hz”.]
Cosmic Muons
Select “Clean” Cosmic Muons:
• Data should contain events with:
  – Hits in >10 planes.
  – Hits in >3 planes in each view.
  – Satisfies straight line fit (rms <1 cm).
    [typically these events occur every ~3 seconds]
  – Cosmic muon rate < 1 Hz.
• Purpose of the cuts: select cosmic muons; remove anomalously high rates.
• These cuts have the following effects:
  – Select clean events for timing calibration.
  – Remove runs with anomalous events or rates.
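A minimal sketch of these muon cuts is given below. It assumes each hit is available as a `(plane, z_cm, position_cm)` tuple per view and uses an ordinary least-squares line fit; the slide does not say how the straight line fit is actually performed, so the fit method, the function names and the hit layout are all assumptions. Only the thresholds (>10 planes, >3 planes per view, rms < 1 cm, rate < 1 Hz) come from the slide.

```python
import math


def line_fit_rms(z, x):
    """RMS residual (same units as x) of a least-squares line x = a + b*z."""
    n = len(z)
    mz, mx = sum(z) / n, sum(x) / n
    szz = sum((zi - mz) ** 2 for zi in z)
    if szz == 0:
        return float("inf")  # all hits at the same z: no line can be fitted
    b = sum((zi - mz) * (xi - mx) for zi, xi in zip(z, x)) / szz
    a = mx - b * mz
    return math.sqrt(sum((xi - (a + b * zi)) ** 2 for zi, xi in zip(z, x)) / n)


def is_clean_muon(hits_u, hits_v, max_rms_cm=1.0):
    """hits_u, hits_v: lists of (plane, z_cm, position_cm), one list per view."""
    planes_u = {h[0] for h in hits_u}
    planes_v = {h[0] for h in hits_v}
    if len(planes_u | planes_v) <= 10:            # hits in >10 planes
        return False
    if len(planes_u) <= 3 or len(planes_v) <= 3:  # >3 planes in each view
        return False
    for hits in (hits_u, hits_v):                 # straight line fit, rms < 1 cm
        if line_fit_rms([h[1] for h in hits], [h[2] for h in hits]) >= max_rms_cm:
            return False
    return True


def muon_rate_ok(n_muons, live_time_s, max_rate_hz=1.0):
    """Run-level cut: cosmic muon rate below 1 Hz."""
    return live_time_s > 0 and n_muons / live_time_s < max_rate_hz
```

With the typical rate quoted on the slide (one clean muon every ~3 seconds), an hour of data gives about 1200 muons, which passes `muon_rate_ok(1200, 3600)`.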
Cosmic Muons
[Plot, August 1st 2003 → January 31st 2005.]
Cosmic Muons
[Plot, August 1st 2003 → January 31st 2005. Annotations: “remove these runs!”; “require muon rate less than 1 Hz”.]
Feedback from Analysis
Please tell me if you find any bad runs!
Aside: VARC Errors
• Far Detector readout errors are recorded in “VarcErrorInTfBlocks”.
• Two types of readout error are reported:
  – “ETC” errors (e.g. error in ETC readout, overflow of FIFO).
  – “Sparsifier” errors (e.g. corruption in FIFO, overflow of VME buffer).
• Data corruption or buffer overflows could corrupt or truncate physics events – need to be careful!
VARC Errors
[Diagram of the VA Readout Card. Labels: signal, control, digital signal; VMM and ETC (repeated); Sparsifier; VME Buffers; VME READOUT PROCESSOR.]
• ETC data stored in FIFOs.
• VMM data stored in VME buffers.
VARC Errors
[Diagram labels: FIFOs, SPARSIFIER, BUFFER, VME TRANSFER]
SPARSIFIER: ~250 kHz (VA chips)
• Singles: ~15 kHz (detector), ~10 kHz (shield)
• Light Injection: ~30 x Rate; LI@300Hz: ~10 kHz
• Total: ~35 kHz
VME TRANSFER: ~320 kHz (VA channels)
• Singles: ~15 kHz (detector), ~10 kHz (shield)
• Light Injection: ~300 x Rate; LI@300Hz: ~100 kHz
• Total: ~125 kHz
Far Detector data rates should be well within readout capabilities.
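The totals on this slide can be checked with a short calculation. The sketch below uses only the numbers quoted above; the function name is hypothetical. Note that the slide rounds the light injection contribution up (30 x 300 Hz = 9 kHz is quoted as ~10 kHz, and 300 x 300 Hz = 90 kHz as ~100 kHz), so the unrounded totals come out slightly below the quoted ~35 kHz and ~125 kHz.

```python
def total_rate_khz(li_rate_hz, li_multiplier, detector_khz=15.0, shield_khz=10.0):
    """Singles (detector + shield) plus light injection contribution, in kHz."""
    li_khz = li_multiplier * li_rate_hz / 1000.0
    return detector_khz + shield_khz + li_khz


SPARSIFIER_CAPACITY_KHZ = 250.0  # ~250 kHz (VA chips)
VME_CAPACITY_KHZ = 320.0         # ~320 kHz (VA channels)

sparsifier_total = total_rate_khz(300, 30)  # 15 + 10 + 9  = 34.0 kHz
vme_total = total_rate_khz(300, 300)        # 15 + 10 + 90 = 115.0 kHz

print(sparsifier_total, sparsifier_total < SPARSIFIER_CAPACITY_KHZ)  # 34.0 True
print(vme_total, vme_total < VME_CAPACITY_KHZ)                       # 115.0 True
```

Either way, both totals are well under the quoted capacities, which is the point the slide makes.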
VARC Errors
[Plot, August 1st 2003 → April 30th 2004. Annotations: “25% DEAD TIME”; “100% DEAD TIME”.]
VARC Errors
[Plot, August 2003.]
• BIT 0 – “ETC FIFO has overflowed”
• BIT 4 – “VME buffer has overflowed”
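Decoding such an error word is a matter of testing individual bits. The sketch below knows only the two bit meanings given on this slide; how the error word is actually stored in “VarcErrorInTfBlocks” is not described here, so the integer representation and the function name are assumptions.

```python
KNOWN_ERROR_BITS = {
    0: "ETC FIFO has overflowed",
    4: "VME buffer has overflowed",
}


def decode_error_word(word):
    """Return a description for every bit set in a non-negative error word."""
    if word < 0:
        raise ValueError("error word must be non-negative")
    messages = []
    bit = 0
    while word >> bit:
        if (word >> bit) & 1:
            # Bits other than 0 and 4 are not documented on the slide.
            messages.append(KNOWN_ERROR_BITS.get(bit, f"bit {bit} (meaning not given here)"))
        bit += 1
    return messages


print(decode_error_word(0b10001))
# ['ETC FIFO has overflowed', 'VME buffer has overflowed']
```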
VARC Errors
[Plot, August 1st 2003 → April 30th 2004. Annotations: “All 259 ETCs in detector!”; “All 46 VARCs in detector!”; “Something went horribly wrong!”]
VARC Errors
[Figure: error bit maps from a sample of data.] The error bits look really crazy!
VARC Errors
[Plots: August 1st 2003 → January 31st 2005; Sep 26th 2003 → Apr 1st 2004; August 1st 2003 → January 31st 2005.]
VARC Errors
???
Summary
• Making progress towards a complete run selection.
  – Have analysed data from August 1st 2003 → January 31st 2005.
• Have uncovered some new issues.
  – The database has a large number of gaps.
  – The appearance and disappearance of VARC errors.
• Things that need doing.
  – HV/coil status database + access tools.
  – Hardware database + access tools.