
Run Coordinator Report on behalf of everybody involved in Pit Operation



Presentation Transcript


  1. Run Coordinator Report on behalf of everybody involved in Pit Operation • First, 2010 is a major achievement! THANKS EVERYBODY!

  2. Seasonal Vacation!? • A lot of work in a relatively short Winter Stop! Not just a pit stop for a tire exchange but rather an engine overhaul… Was it just a near miss to disaster? • NO! Far from it, but we are not out of the woods for 2011

  3. 2010 Challenges – Extreme Conditions • Operational objectives in retrospect: • Explore the LHCb physics potential • Explore and tune detector, trigger and readout performance • The June MD decision to go to nominal intensity and THEN increase the number of bunches was very beneficial, but left a lot of uncertainty in the luminosity (evolution) per bunch • 80% of design luminosity reached with 344 colliding bunches instead of 2622… • [Plot: average number of visible interactions per crossing vs LHCb design specs, July–October] • Faced with preparations without knowledge of the ultimate parameters • We cannot formulate running conditions and operate this way next year
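The pile-up situation above follows from the standard relation μ = L·σ_vis/(n_b·f_rev). A minimal sketch, assuming a visible pp cross-section of ~60 mb (my assumption, not a number from the slides):

```python
F_REV = 11245.0          # LHC revolution frequency [Hz]
SIGMA_VIS = 60e-27       # assumed visible pp cross-section [cm^2] (~60 mb)

def mu_per_crossing(lumi, n_bunches):
    """Average visible interactions per crossing: mu = L * sigma_vis / (n_b * f_rev)."""
    return lumi * SIGMA_VIS / (n_bunches * F_REV)

# 80% of design luminosity, but spread over 344 instead of 2622 colliding bunches
mu_2010 = mu_per_crossing(1.6e32, 344)    # ~2.5, well above design
mu_design = mu_per_crossing(2.0e32, 2622) # ~0.4, the LHCb design value
print(mu_2010, mu_design)
```

With these assumptions the 2010 value lands near the μ ~ 2.5 quoted on the trigger slide, illustrating why 80% of design luminosity with 13% of the design bunches was an extreme condition.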

  4. 2010 Challenges – Commissioning • 89 physics fills • Very limited daytime to properly commission and tune the trigger with non-CERN-based experts/developers, and a continued increase of pile-up/bunches

  5. Global Operational Performance • Main source of operational difficulty: changing 'surprise' conditions rather than extreme conditions • EFF Upgrade! • CMS: 43.2 pb-1 / 47.0 pb-1 = 91.2% • 84% usable by any analysis • >92% for muons only • ATLAS: 45.0 pb-1 / 48.2 pb-1 = 93.6% • LHCb: 93 – 98% efficiency • Except a few one-off problems and "shock-m" • Most luminosity was delivered with the largest geometrical reduction factor • We only got 42 pb-1 delivered out of the promised 50 pb-1

  6. Luminosity Discrepancy • Systematic luminosity difference between IP1/5 and IP8 – not understood • Geometrical factor • July – August: LHCb 2 x 270 μrad, 8-9% reduction compared to ATLAS/CMS with 0 μrad • B up + α_ext: LHCb 2 x (270 – 100) μrad, 3% compared to ATLAS/CMS with 200 μrad • B down + α_ext: LHCb 2 x (270 + 100) μrad, 9% compared to ATLAS/CMS with 200 μrad • Normalization – work starting up to normalize via ALICE • β* / waist effect? Observations of strange geometrical effects during scans • [Plot: luminosity ratio, B down, July–October] • Will not be an issue in 2011 as soon as we reach our maximum total luminosity • 2x10^32 – 5x10^32 cm-2s-1
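The crossing-angle reduction quoted above can be checked against the usual geometric factor F = 1/√(1 + (θ_c·σ_z/2σ*)²). A sketch under assumed beam parameters (σ_z ≈ 7.5 cm and σ* ≈ 48 μm are my guesses, not numbers from the slides):

```python
import math

def geometric_factor(full_crossing_angle, sigma_z, sigma_transverse):
    """Luminosity reduction from a crossing angle:
    F = 1 / sqrt(1 + (theta_c * sigma_z / (2 * sigma_t))^2), theta_c = full angle."""
    phi = full_crossing_angle * sigma_z / (2.0 * sigma_transverse)
    return 1.0 / math.sqrt(1.0 + phi * phi)

# LHCb July-August: 2 x 270 urad, with assumed sigma_z ~ 7.5 cm, sigma* ~ 48 um
reduction = 1.0 - geometric_factor(2 * 270e-6, 0.075, 48e-6)
print(reduction)   # in the ballpark of the 8-9% quoted above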

  7. Trigger Compromise • We received individual requests, complaints, and praise for our struggle • It wasn't always easy; many PPG-OPG evening and weekend email exchanges to understand and find the best solutions • [Plot annotations: HLT rate (t0) ~2.5 kHz; luminosity (t0) ~1.5x10^32; mu (t0) ~2.5; trigger deadtime (t0) ~1%; L0 rate (t0) ~350 kHz; TCK changes; integrated luminosity per configuration: 2.2 pb-1, 19.1 pb-1, 12.7 pb-1]

  8. End of Fill Procedure – Beam Dump Handshake • Lost almost 0.8 pb-1 in total between the beam dump warning and the actual dump! • Modification: • Movable Device Allowed flag will become "TRUE" also in BEAM DUMP mode • Dump handshake remains the same • But we no longer "protect" the VELO by dumping the beam if the VELO is not in the garage position when the LHC intends to dump the beam… • May still retract the VELO, but more room for flexibility in software • INJECTION and ADJUST logic obviously remains the same • [Plot: DT(WARNING → READY) [min] for LHCb; 29 handshakes in total out of 58 fills; sum of difference Total – LHCb = 100 min; luckily most fills were lost! ;-)]

  9. Normalized Fill Efficiencies • All fills normalized to 1 pb-1 • [Plot: fill efficiencies, July–October; annotations: 94%; 233 bunches; high luminosity with high mu; commissioning trigger with nominal bunches; SD DAQ problem during 1.5 h fill; EFF upgrade; Detector Safety System; commercial hardware fault]

  10. Event Filter Farm 'Real-Time' Upgrade • 50 subfarms of 19 nodes • 100 servers x 4 farm nodes • Configuration/Start Run went from >20 min to 6 min with a custom-made NFS • Installed and commissioned in three days, 5-8 October; fully ready for fill 1408 • 50 subfarms with two new servers (= 2 x 4 farm nodes) installed in each • 19 farm nodes/subfarm • 4 low-end with 8 trigger tasks, 7 middle-end with 12 tasks, 8 high-end with 20 tasks • In total 950 farm nodes with triple the farm capacity • Another 100 x 4 farm nodes during the winter stop
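The node bookkeeping above is easy to cross-check; a quick sketch of the arithmetic (the trigger-task total is derived from the per-class counts on the slide, not stated there):

```python
SUBFARMS = 50

# per-subfarm node mix quoted on the slide
low, mid, high = 4, 7, 8                   # low-, middle-, high-end nodes
nodes_per_subfarm = low + mid + high       # 19 nodes, as stated
tasks_per_subfarm = low * 8 + mid * 12 + high * 20   # trigger tasks

total_nodes = SUBFARMS * nodes_per_subfarm
total_tasks = SUBFARMS * tasks_per_subfarm
print(total_nodes, total_tasks)   # 950 farm nodes, 13800 trigger tasks
```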

  11. Operational Difficulties 2010 • Main sources of operational inefficiency in short: • Changing conditions rather than extreme conditions • Lack of real knowledge about the luminosity evolution • Shifter experience and instructions (not the fault of the shifters!) • Operational parameters and system limits • Trigger rate • CPU consumption • Trigger optimized for mu ~ 1.6 and 350 bunches AND THE OLD FARM (luckily…) • Event size • One bottleneck was hiding another (HLT CPU → L0 bandwidth) • Detector stability • HV trips • HPD disabling • Wrong configuration • Desynchronization • OPC servers • … • Diagnostics tools, diagnostics tools, diagnostics tools!

  12. System Performance – School Example • Impressive! • Many things to analyze, understand and tune • In particular with the complete farm • We "lose" some nodes during running • I.e. for some reason ODIN suddenly stops receiving their event requests • [Plots, fill 1453: TCK change; trigger livetime; luminosity; event request rate; system latency; lost nodes O(%); available farm nodes; destination search time]

  13. Global Operation Observations – Readout 2011 • 1 MHz L0 readout • Only proven on "paper" up to now (partially and momentarily with an idle system) • A loaded system behaves VERY differently, as already observed in 2010 • L0 bandwidth per TELL1/UKL1: work in progress → test • Readout network and storage bandwidth: should be OK, but recabling and an additional switch • CPU capacity: extensive testing with the 2011 trigger • Challenging and work-intensive for the next 6 months • "Trigger boundaries" reached within 6 months • Load balancing: monitoring and diagnostics → System Performance Overview panel • We need time to test all of this extensively! • Running at the limits • Rate and event size (= potential deadtime) influenced by beam orbit variations with displaced beams, background (e.g. beam-gas @ vacuum), de-bunching, … • Careful about running at the margin

  14. Global Operation Observations – Farm Management and Controls • Configuration speed consolidation • More dynamic farm control needed → majority logic in the FSM on CONFIGURE and START RUN to go to READY/RUNNING • 10-20% of nodes is sufficient to start; prepare the rest on the fly, similar to the recovery mechanism • On CONFIGURE, de-centralize the FSM logic to allow nodes to continue from state OFFLINE → READY independently of the state of the other nodes • The Reconstruction and Monitoring Farm is not needed to (start to) take data, only to take GOOD data → if in trouble, get them going once data taking has already started • Monitoring of incomplete events by counters (e.g. in the Node Status panel) • Farm system performance overview • Farm log messages: global limit per message for the entire farm and not per node… • More (proper) use of Message Levels
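The majority logic proposed above amounts to a simple quorum rule. A minimal sketch — the function name and the 15% threshold are illustrative, only the 10-20% range comes from the slide:

```python
def farm_ready(node_states, quorum=0.15):
    """Declare the farm READY once a quorum of nodes (10-20% per the slide;
    15% chosen here) has reached READY; the rest catch up on the fly."""
    ready = sum(1 for s in node_states if s == "READY")
    return ready >= quorum * len(node_states)

# 20 of 100 nodes ready is already enough to start data taking
print(farm_ready(["READY"] * 20 + ["OFFLINE"] * 80))
```

The design choice is that a partially configured farm can take (less) data immediately, rather than blocking the run on the slowest node.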

  15. Global Operational Observations – Trigger • Operational (functional) diagnostics • Performance monitoring • We really don't have many knobs; in the HLT we have to do more in a shorter time than before • Support for detector performance monitoring • All sub-detector scans with beam • Some should be performed regularly (every n pb-1) • Devise a proper scheme for each, and for combinations • Permanent trigger configurations (with downscaling) • Express needs in terms of integrated luminosity • Regular scans must be supported by automated recipes • Work and testing during the shutdown for the recommissioning in March is high priority! • Data quality • Too good in 2010? • Ad-hoc treatment of trips and other problems • "Should" become an issue in 2011… • Need to be attentive, have the tools, and improve feedback • Watch experimental conditions and detector effects closely • RMS – Radiation Monitoring System should become important in 2011 • Proposal for a new back-end readout to replace the VME scaler • TFC HUGIN (throttle-OR) is a very flexible high-speed multi-channel board!

  16. Approach to Running Conditions 2011 • Note on the "decision" about μ and L: • μ has mainly hard limits – rate x event size, CPU time, reconstruction etc. • L has soft limits – detector stability • Unknown domain of detector operation and unknown domain of accelerator operation • Optimize d/dμ Σ hᵢ · ε_OP^1/2 · [s/b^1/2] = 0 for physics output • where hᵢ is the importance factor for a specific physics analysis • Operational stability ε_OP = ε_DAQ + ε_dead-time > 95% • Of course we should also be able to store and process events in reasonable time • Ageing – no problem in 2010 • Not necessarily a problem if we assume a linear relation with particle flux and we collect more usable luminosity in a shorter time • LHCb lifetime is integrated luminosity, not years → focus on understanding the ageing mechanism and prognosis • Technical ambition 2011 • 2011: luminosity increase 2-3x (in 2010: 500x between July and November) • Operationally aim for μ ~ 2.0 – 2.5 • Total luminosity 2 – 5 x 10^32 cm-2s-1; 3 – 4 x 10^32 cm-2s-1 realistic, my feeling from last year • Main consequences • Careful when running at the limit of capacity • Manpower to monitor and follow up on experiment conditions and detector effects • Regular scans to understand ageing/detector effects and the associated luminosity penalty • Pre-prepared extreme and liberal alternative trigger configurations allowing for flexibility

  17. Luminosity Leveling by Collision Offset • Luminosity leveling applied several times during 2010 • First time on July 17 and July 18 • In the steps between trigger configurations • Followed bunch behaviour with VELO/BLS and saw no sign of problems • Two beam stability tests done • 152 bunches x 1x10^11 @ 150 ns up to more than 1 sigma • 100 bunches x 0.9x10^11 @ 50 ns up to 6 sigma • Tests with several hundred bunches and high intensity not done • Last but most important consequence: luminosity leveling is crucial to run LHCb at optimum luminosity in 2011
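Leveling by collision offset exploits the Gaussian fall-off of the overlap integral with beam separation. A sketch of the required offset for a target luminosity fraction — the formula is the standard one for two equal round Gaussian beams, and the σ* ≈ 50 μm in the example is an assumed value, not from the slides:

```python
import math

def leveling_offset(target_fraction, sigma):
    """Beam separation d giving L(d)/L(0) = target_fraction for two round
    Gaussian beams of transverse size sigma: L(d) = L(0) * exp(-d^2 / (4 sigma^2))."""
    return 2.0 * sigma * math.sqrt(math.log(1.0 / target_fraction))

# e.g. halving the luminosity with an assumed sigma* ~ 50 um
d_half = leveling_offset(0.5, 50e-6)
print(d_half)   # ~83 um separation
```

In practice the offset is applied in small steps (as in the July fills above) while watching bunch stability.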

  18. L0 Rate Variation • L0 rate sensitive to many effects • Collision offset → orbit variations of 20% - 25% of the beam sigma → up to 10% in rate • Background such as beam-gas • Luminosity control: communication, application and information latency • [Plots: L0 rate vs mu; luminosity reduction vs sigma; L0 rate vs sigma]

  19. Beam-Gas and Vacuum • No visible effect of any vacuum increase in LSS8 during 2010 • Sensitivity at L0 trigger • Expect a rate of potentially visible (one track in the cavern) beam-gas in LHCb at normal pressure of 1x10^11/1.6x10^14 * 20% * 11.245 kHz = 1.4 Hz/bunch • L0 selection efficiency 3.1% → 16 Hz @ 368 bunches • Also, increased probability to accept a single MB event when accompanied by beam-gas • For MB events with no pileup, L0 selection 3.8% → 6% • With high pileup the effect is less visible → estimated O(10 Hz) • At nominal pressure a few 10 Hz of beam-gas at 368 bunches • Even increasing the pressure by 100x is no worry • Increasing the vacuum pressure locally will only have a partial effect • BUT it adds to the particle flux (detector stability and occupancy)
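The beam-gas rate estimate above is a straight product of the quoted factors; reproducing the arithmetic as a sketch:

```python
F_REV = 11245.0                    # LHC revolution frequency [Hz] (11.245 kHz)

# factors quoted on the slide
bunch_population = 1e11            # protons per bunch
normal_pressure_scale = 1.6e14     # slide's normal-pressure normalization factor
visible_fraction = 0.20            # one track in the cavern

rate_per_bunch = bunch_population / normal_pressure_scale * visible_fraction * F_REV
l0_rate = rate_per_bunch * 0.031 * 368   # 3.1% L0 selection, 368 bunches
print(rate_per_bunch, l0_rate)           # ~1.4 Hz/bunch, ~16 Hz
```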

  20. L0 Rate Impact on Deadtime • Pure L0 rate limited by the "L0 derandomizer" readout scheme • 1 clock cycle to put an event in • 36 clock cycles to read an event out: (36 x 25 ns)^-1 = 1.111… MHz • 16 deep • Common specs emulated by ODIN to regulate L0 • Upper water mark 16 events, lower water mark 15 events → however, the write/read controllers are more complicated • Exceptions to the global specification: • OTIS chip of the OT – proper emulation in ODIN throughout 2010 • Beetle of the VELO and ST – work in progress • Consequence of no Beetle emulation → upper water mark 8 events, lower water mark 3 events • [Diagram: from L0 pipeline on L0 accept → write/read controller → to TELL1/UKL1]

  21. Derandomizer & L0 Rate & Filling Schemes • Deadtime effect of running at high rate with few bunches • Deadtime is worse with fewer bunches!! • [Plots: physics trigger deadtime for 50 ns / 600 colliding bunches, 50 ns / 800, 75 ns / 670, and 25 ns / 2440]

  22. Injection – LHCb a Sitting Duck • Injection losses from un-captured beam • Already difficult in 2010 with 0.3% • Expected to get worse in 2011 with up to 1% • Culmination on October 30 with 8-bunch injections • A shot blew a fuse in the CALO HV distribution! • 30% BCM levels agree with 30% BLM levels • We almost became the show stopper • Immediate actions • LHC: SPS 800 MHz cavity problem and SPS scraping • LHCb: disable the 40 ms logic during the injection phase and raise thresholds (2x-3x) • Done in a few hours • LHC: investigate using shifted Abort Gap Cleaning during injection • Check timing and origin of splashes with the Beam Loss Scintillators → improved the situation significantly and took us through the year • [Diagram: Beam 2 from SPS (TI8)]

  23. Injection – Actions 2011 • Switch off/lower HV AND LV of sensitive detectors in LHCb during injection • Complicated since we need to configure and run LHCb WELL BEFORE the next data taking • Requires quite a lot of work on DAQ and CONTROL • Check timing and origin of splashes with the Beam Loss Scintillators and BCM • Injection quality information from BLS+BCM fed back to the LHC on each injection • Shielding being investigated together with the machine • Blind the BCM during the injection shot using an Injection Pulse on a direct fibre from the RF • No relaxed attitude… • [Diagram: SPS satellites, LHC uncaptured beam]

  24. Injection Schemes – Just an Idea • For 75ns • ~100    (8) + 4 x (24) • ~200    (8) + 8 x (24) • ~300    (8) + 12 x (24) • ~400    (8) + 8 x (48) • ~500    (8) + 8 x (48)  + 4 x (24) • ~600    (8) + 12 x (48) • ~700    (8) + 8 x (72) + 4 x (24) • ~800    (8) + 8 x (72) + 4 x (48) • ~900    (8) + 12 x (72) • For 50 ns it will be similar progression to max 1400b (!!), maybe something like:  • ~100    (12) + 8 x (12) • ~200    (12) + 16 x (12) • ~300    (12) + 8 x (36) • ~400    (12) + 12 x (36) • ~5/600 (12) + 8 x (72) • ~700    (12) + 8 x (72) + 4 x (36) • ~800    (12) + 12 x (72) • ~9/1000 (12) + 8 x (108) + 4 x (36) • ~1200    (12) + 12 x (108) • ~1400   (12) + 12 x (108) + 4 x (36)
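The step totals above are a pilot batch plus repeated train injections; a sketch of the bookkeeping (the function name is illustrative):

```python
def total_bunches(pilot, trains):
    """Total injected bunches: an initial pilot batch plus
    (count x batch_size) train injections."""
    return pilot + sum(count * size for count, size in trains)

# 75 ns examples from the list above
print(total_bunches(8, [(4, 24)]))             # "~100" step -> 104 bunches
print(total_bunches(8, [(8, 72), (4, 48)]))    # "~800" step -> 776 bunches

# 50 ns maximum from the list: (12) + 12 x (108) + 4 x (36)
print(total_bunches(12, [(12, 108), (4, 36)])) # "~1400" step -> 1452 bunches
```

The "~N" labels are rounded targets; the exact totals from the batch arithmetic land close to but not exactly on them.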

  25. LHCb Re-commissioning Plan 2011 – PRELIMINARY • Luminosity ramp: back up to 300 bunches in 50-bunch steps with 75 ns • → 3 weeks, 2-3 days per step

  26. Bunch Ramp Up • From Mike Lamont: • 2 to 3 weeks re-commissioning • Virgin set-up followed by full validation (loss maps, asynchronous dumps etc.) • 2011 – back up to 300 in 50-bunch steps • Would imagine starting with 75 ns • In 2010, around 4 days (minimum) per 50-bunch step • 50 – 100 – 150 – 200 – 250 – 300 • Around 3 weeks to get back to 300 bunches • 100-bunch steps thereafter • 400 – 500 – 600 – 700 – 800 – 900 • 3 weeks minimum • Ultimate parameters for 2011 (Qb: 1.6E11 x εN: 2E-6 x Nb: 1400)

  27. Annual Shift Summary • Summary includes 2008 – 2009 – 2010 because the individual function counters were not reset • Total: 7660 shifts, equivalent to 13.4 months of running • [Charts: number of shifts per function; number of equivalent months]

  28. Annual Shift Summary • Each author (507) should have contributed 15.1 shift slots in this period • Total number of shifters: 297 → each shifter contributed 25.8 shift slots • [Chart: number of shifters compared to authors, per function]
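The per-person figures above follow directly from the totals on the previous slide; a quick check:

```python
total_shifts = 7660          # 2008-2010 shift slots (previous slide)
authors, shifters = 507, 297

per_author = total_shifts / authors      # expected contribution per author
per_shifter = total_shifts / shifters    # actual contribution per active shifter
print(round(per_author, 1), round(per_shifter, 1))   # 15.1 25.8
```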

  29. Annual Shift Summary • [Charts: number of shifts – pit shifts, offline shifts, piquet shifts]

  30. Rise or Sugar • [Chart: normalized shift contribution]

  31. Shifts 2011 • Current shift situation (number of shifters we have had): • Shift Leader: 53 • Data Manager: 109 • Production: 36 • Data Quality: 58 • How many are still active, and how many are available in 2011? → Poll • Answer my mail about availability for 2011 if you are already a shifter • Answer my call for shifters at the beginning of next year • Refresher courses and trainings in February – March • HV training (sensitization) and VELO closure • Improve training of SL and DM together with the sub-detectors • Shifter online running instructions, help and troubleshooting

  32. Conclusion • A huge thanks to everybody who baby-sat, operated and nursed LHCb! • I don't think we can repeat this enough! • Stop meeting and reporting and go back to the office to take care of our New Year promises • Since I only got 20 minutes for this talk, I'll stop the conclusion here! • MERRY CHRISTMAS, A HAPPY END OF 2010 AND A HAPPY START TO 2011 • 1 fb-1

  33. Spare Slides….

  34. Workshops on Operation 2010 • 2010 Running (Autopsy) Postmortem Workshop scope • Collect (recall!) flaws and drawbacks from 2010 operation • Hopefully with some associated solutions • If not: what is needed, and how do we address it? • Work and improvements during the shutdown • Planning and manpower • Needs for re-commissioning and special runs in 2011 • Magnet OFF data preferably at 3.5 TeV • Etc. • Main worries for 2011 • Sub-detector guesstimates of luminosity tolerance • Manpower for next year • Will not summarize the 2010 operational performance and the whole workshop here (obviously…) • A veeery long to-do list – just the main points • http://indico.cern.ch/conferenceDisplay.py?confId=113227 • Revisit the situation end of January – beginning of February • Also reported yesterday on all aspects of operation with beam to the LHC in the LPC meeting • Andreas reported on the desiderata for 2011 • Input to the LHC workshops in Evian in December and in Chamonix • http://indico.cern.ch/conferenceDisplay.py?confId=111076 • See Andreas' talk next

  35. Detector Operation 2011 • Purely in terms of operation, all depends on detector stability • Operating at 50 ns • Experiment conditions • Beam-beam effects from bunch behaviour • Background (electron cloud + IBS) • VELO foil temperature and HV trips • Displacing beams (up to several sigmas) • Spill-over/signal pileup • Spill-over effects in all detectors but the RICH • Event size at L0 • Reconstruction performance → the short 50 ns run (1 fill @ 100 bunches) allowed addressing these only partially • 75 ns as long as possible and beneficial

  36. Luminosity • Two online sources, with several cross-checks • LHCb detector • Beam Loss Scintillator • Independent from the LHCb DAQ • Auto-calibrated with the LHCb detector while running • Very reliable and versatile • A combination is sent to the LHC as the delivered luminosity • Applications • Injection quality • Background with high time resolution • Beam-gas rate monitoring and veto in the trigger • Luminosity • Debunched beam • Upgrade of the BLS during the shutdown • Faster PMT (quartz) + cable, no spillover, and additional scintillators

  37. Longitudinal Scan • Shifting the timing of beam 2 by +/- 1 ns • [Diagram: X vs (IP, t=0), t ~ +dT/2; Z ~30 mm (20 mm with 90 mrad), ~10 mm (5 mm with 25 mrad), ~200 mm]

  38. Longitudinal Scan • Several questions about the results: • Does it indicate something fundamental? • T0 good for the VELO (Z ~ 0) • Bad transversal optimization? • Lumi region z-size? • Should have done a mini-scan afterwards • Repeat next year! • [Plots: specific luminosity and luminosity vs z; ratio ~9% – did we lose optimization?; nominal physics; lumi region z-size decreases strongly for z<0?]

  39. VDM Scan – Lumi Region Movements • [Plots: horizontal 2-beam 6-sigma scan and vertical 2-beam 6-sigma scan; 5 μm effect from an XY-rotation of 13 μrad; ~90 μm (100 μm with 90 μrad)?; ~40 μm (30 μm with 25 μrad); 1200 μm (6 sigma @ 170 μrad)] • Courtesy C. Barschel
