


1. Data Intensive Computing in CMS Experiment
• DataGrid meeting, CSC
• V. Karimäki (HIP)
• Otaniemi, 28 August 2000

2. Outline of the Talk
• LHC computing challenge
• Hardware challenge
• CMS software
• Database management system
• Regional Centres
• DataGrid WP 8 in CMS
• Summary

3. Challenge: Collision rates

4. Challenges: Event complexity
• The signal event is obscured by ~20 overlapping, uninteresting collisions in the same bunch crossing
• Track reconstruction time at a luminosity of 10^34 is several times that at 10^33
• Processing time does not scale from previous generations of experiments

5. Challenges: Geographical dispersion
• Geographical dispersion of people and resources: 1800 physicists, 150 institutes, 32 countries
• Complexity: the detector and the LHC environment
• Scale: petabytes per year of data
• Major challenges associated with:
  • Coordinated use of distributed computing resources
  • Remote software development and physics analysis
  • Communication and collaboration at a distance
• R&D: new forms of distributed systems

6. Challenges: Data rates
• Online system: a multi-level trigger filters out background and reduces the data volume
• 40 MHz (40 TB/sec) → level 1 trigger: special hardware
• 75 kHz (75 GB/sec) → level 2 trigger: embedded processors
• 5 kHz (5 GB/sec) → level 3 trigger: PCs
• 100 Hz (100 MB/sec) → data recording & offline analysis
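Multiplying each accepted rate by the event size gives the bandwidth figures quoted in parentheses; a minimal sketch of that arithmetic, using the ~1 MB event size quoted later on slide 15:

```python
# Trigger-chain arithmetic for the rates quoted on the slide.
# The 1 MB event size is taken from slide 15 ("Event is ~1 MByte in size").
EVENT_SIZE_MB = 1.0

stages = [  # (stage, accepted event rate in Hz)
    ("detector output / level 1 input", 40e6),
    ("after level 1 (special hardware)", 75e3),
    ("after level 2 (embedded processors)", 5e3),
    ("after level 3 (PCs), to recording", 100.0),
]

for name, rate_hz in stages:
    bandwidth_mb_s = rate_hz * EVENT_SIZE_MB
    print(f"{name:38s} {rate_hz:12,.0f} Hz  ~{bandwidth_mb_s:12,.0f} MB/s")

# Overall rejection factor from collision rate to recorded rate:
print(f"total rejection factor: {stages[0][1] / stages[-1][1]:.0e}")
```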

7. PetaByte Mass Storage
• Each silo has 6,000 slots, each of which can hold a 50 GB cartridge ==> theoretical capacity: 1.2 PetaBytes
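As a quick check of that figure: 6,000 slots of 50 GB is 300 TB per silo, so the quoted 1.2 PB corresponds to four such silos; the silo count below is inferred from the total, not stated on the slide.

```python
# Tape-silo capacity check; the silo count is inferred from the quoted 1.2 PB.
SLOTS_PER_SILO = 6_000
CARTRIDGE_GB = 50
ASSUMED_SILOS = 4  # assumption: chosen so the total matches the quoted 1.2 PB

per_silo_tb = SLOTS_PER_SILO * CARTRIDGE_GB / 1_000
total_pb = ASSUMED_SILOS * per_silo_tb / 1_000
print(f"per silo: {per_silo_tb:.0f} TB, total: {total_pb:.1f} PB")
```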

8. The new Supercomputer? From http://now.cs.berkeley.edu (The Berkeley NOW project)

9. Event Parallel Processing System
• About 250 PCs with 500 Pentium processors are currently installed for offline physics data processing
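Event parallelism works because each collision event can be processed independently of every other, so a farm simply shares the events out among its processors. A minimal, generic sketch of the idea (plain Python, not the actual CMS farm software):

```python
# Toy illustration of event-parallel processing: events are independent,
# so they can be farmed out to worker processes with no communication.
from multiprocessing import Pool

def reconstruct(event_id: int) -> float:
    # Stand-in for real reconstruction; returns a dummy "result" per event.
    return sum((event_id * k) % 97 for k in range(1_000)) / 1_000.0

if __name__ == "__main__":
    events = range(10_000)           # event IDs to process
    with Pool(processes=8) as pool:  # one worker per (assumed) CPU core
        results = pool.map(reconstruct, events, chunksize=100)
    print(f"processed {len(results)} events")
```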

10. Cost Evolution: CMS 1996 versus 1999 Technology Tracking Team
• Compare the CMS 1996 estimates to the 1999 Technology Tracking Team projections for 2005:
• CPU: unit cost will be close to the early prediction
• Disk: will be more expensive (by ~2x) than the early prediction
• Tape: currently zero to 10% annual cost decrease (potential problem)

11. Data Challenge plans in CMS
• Dec 2000: Level 1 trigger TDR
  • First large-scale productions for trigger studies
• Dec 2001: DAQ TDR
  • Continue High Level Trigger studies; production at Tier 0 and Tier 1s
• Dec 2002: Software and Computing TDR
  • First large-scale Data Challenge (5%)
  • Use the full chain from online farms to production in Tier 0, 1, 2 centers
• Dec 2003: Physics TDR
  • Test physics performance; need to produce large amounts of data
  • Verify technology choices by performing distributed analysis
• Dec 2004: Second large-scale Data Challenge (20%)
  • Final test of scalability of the fully distributed CMS computing system

12. Hardware: CMS computing cost to 2006 inclusive
• ~40 MCHF: ~5 Regional Centres, each ~20% of the central systems
• ~40 MCHF: central systems at CERN
• ~40 MCHF (?): universities, Tier 2 centres, MC production, etc.
• ~120 MCHF total, consistent with the canonical 1/3 : 2/3 rule
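The 1/3 : 2/3 rule is the convention that roughly one third of the computing is provided centrally at CERN and two thirds outside; a one-line check against the figures above:

```python
# Check that the quoted cost split matches the canonical 1/3 : 2/3 rule.
cern_central = 40   # MCHF, central systems at CERN
regional = 40       # MCHF, ~5 Regional Centres
universities = 40   # MCHF, universities / Tier 2 centres / MC (the "(?)" estimate)

total = cern_central + regional + universities
print(f"total: {total} MCHF, CERN share: {cern_central / total:.2f}")  # ~0.33
```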

13. Computing tasks - Software
• Off-line computing:
• Detector simulation: OSCAR
• Physics simulation: CMKIN
• Calibration
• Event reconstruction and analysis: ORCA
• Event visualisation: IGUANA

14. CMS Software Milestones
• We are well on schedule!

15. Worldwide Computing Plan
• Units: 1 TIPS = 25,000 SpecInt95; a PC (1999) = 15 SpecInt95
• Online system: bunch crossings every 25 nsec, ~100 triggers per second, each event ~1 MByte; ~PBytes/sec off the detector, ~100 MBytes/sec into the offline farm
• Tier 0 + 1: Offline Farm at the CERN Computer Center, > 20 TIPS; linked to Tier 1 at ~622 Mbits/sec or by air freight
• Tier 1: Regional Centers (UK, France, Italy, Fermilab ~4 TIPS); ~2.4 Gbits/sec links
• Tier 2: Tier 2 Centers of ~1 TIPS each, with HPSS mass storage; ~622 Mbits/sec links
• Tier 3: institute servers (~0.25 TIPS); physicists work on analysis "channels", each institute has ~10 physicists working on one or more channels, and data for these channels should be cached by the institute server
• Tier 4: physicists' workstations, connected to the physics data cache at 100 - 1000 Mbits/sec
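With the units given on the slide, the tier capacities translate directly into equivalent numbers of 1999-vintage PCs; a small sketch of that conversion (the "> 20 TIPS" for CERN is treated as exactly 20 here):

```python
# Convert the tier capacities quoted on the slide into 1999-PC equivalents.
SPECINT95_PER_TIPS = 25_000
SPECINT95_PER_PC_1999 = 15

tiers_tips = {                 # capacities as quoted on the slide
    "Tier 0+1 (CERN)": 20.0,   # "> 20 TIPS", taken as 20
    "Tier 1 (Fermilab)": 4.0,
    "Tier 2 centre": 1.0,
    "Tier 3 institute": 0.25,
}

for name, tips in tiers_tips.items():
    pcs = tips * SPECINT95_PER_TIPS / SPECINT95_PER_PC_1999
    print(f"{name:20s} {tips:6.2f} TIPS ~= {pcs:9,.0f} PCs (1999)")
```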

16. Computing at Regional Centers: model circa 2005
• Tier 0: CERN/CMS, 350k SI95, 350 TB disk, tape robot; N x 622 Mb/s links to the Tier 1 centers
• Tier 1: e.g. FNAL/BNL, 70k SI95, 70 TB disk, tape robot; 622 Mb/s links
• Tier 2 Center: 20k SI95, 20 TB disk, tape robot
• Tier 3: university workgroups (Univ WG 1, Univ WG 2, ... Univ WG N)

17. Regional Centre Architecture (example by I. Gaines)
• Data import: network from CERN, network from Tier 2 & simulation centers, tapes
• Data export: to CERN, Tier 2 centers, local institutes, tapes, desktops
• Storage: tape mass storage & disk servers, database servers
• Production reconstruction (Raw/Sim → ESD): scheduled, predictable; experiment/physics groups
• Production analysis (ESD → AOD, AOD → DPD): scheduled; physics groups
• Individual analysis (AOD → DPD and plots): chaotic; individual physicists
• Support services: physics software development, R&D systems and testbeds, info servers, code servers, web servers, telepresence servers, training, consulting, help desk
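The processing steps above define a chain of successively smaller data tiers (Raw/Sim → ESD → AOD → DPD). The per-event sizes below are assumptions for illustration only, not figures from the slide, chosen just to show how the stored volume shrinks at each step:

```python
# Illustrative data-reduction chain for one year of data taking.
# Event count and per-tier sizes are assumed round numbers, NOT slide figures.
events_per_year = 1e9          # assumption: ~10^9 recorded events per year

tier_sizes_bytes = {           # assumed per-event sizes, for illustration
    "RAW": 1_000_000,          # ~1 MB, matching the slide-15 event size
    "ESD": 100_000,
    "AOD": 10_000,
    "DPD": 1_000,
}

for tier, size in tier_sizes_bytes.items():
    total_tb = events_per_year * size / 1e12
    print(f"{tier}: ~{total_tb:,.0f} TB per year")
```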

18. CMS production 2000 - Grid WP 8
• CMSIM MC production: HEPEVT ntuples → signal and minimum-bias (MB) Zebra files with HITS
• ORCA ooHit Formatter: formats the hits and imports the catalog into the Objectivity database
• ORCA Digitization: merges signal and MB → Objectivity database
• ORCA production: HLT algorithms produce new reconstructed objects → Objectivity database, catalog import
• HLT group databases; mirrored databases (US, Russia, Italy, ...)
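As a rough picture of how a chain like this hangs together (purely illustrative; the step names follow the slide, and none of this is the actual CMSIM/ORCA or WP 8 tooling), each stage consumes the previous stage's output:

```python
# Illustrative sketch of the production chain named on the slide.
# The function names follow the slide; none of this is real CMSIM/ORCA code.

def cmsim(dataset: str) -> str:
    """MC production: HEPEVT ntuples -> Zebra files with HITS."""
    return dataset + ".hits.zebra"

def oohit_formatter(dataset: str) -> str:
    """ORCA ooHit formatter: import hits into the Objectivity database."""
    return dataset + ".oohits"

def digitization(dataset: str) -> str:
    """ORCA digitization: merge signal with minimum-bias events."""
    return dataset + ".digis"

def hlt_reconstruction(dataset: str) -> str:
    """HLT algorithms: produce new reconstructed objects."""
    return dataset + ".reco"

def run_chain(dataset: str) -> str:
    artifact = dataset
    for step in (cmsim, oohit_formatter, digitization, hlt_reconstruction):
        artifact = step(artifact)
        print(f"{step.__name__}: produced {artifact}")
    return artifact

run_chain("signal_sample")
```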

19. Summary
• Challenges: high rates, large data sets, complexity, worldwide dispersion, cost
• Solutions: event parallelism, commodity components, computing modelling, distributed computing, the OO paradigm, an OO database
• Planning: CMS is on schedule with its various milestones
• DataGrid WP 8: production of a large number of events in fall 2000
