The Evolution of HEP software

12 September 2013, NEC2013/Varna. René Brun/CERN*. In this talk I present the views of somebody involved in some aspects of scientific computing as seen from a major lab in HEP.

Presentation Transcript


  1. The Evolution of HEP software. 12 September 2013, NEC2013/Varna. René Brun/CERN*

  2. Plan • In this talk I present the views of somebody involved in some aspects of scientific computing as seen from a major lab in HEP. • Having been involved in the design and implementation of many systems, my views are necessarily biased by my path in several experiments and the development of some general tools. • I plan to describe the creation and evolution of the main systems that have shaped the current HEP software, with some views for the near future. R.Brun : Evolution of HEP software

  3. Machines • From mainframes to: • Clusters • Walls of cores • GRIDs & Clouds

  4. Machine Units (bits) • Word sizes: 16, 32, 36, 48, 56, 60, 64 • Machines: PDP 11, Univac, CDC, Nord50, BESM6 and many others • A strong push to develop portable, machine-independent I/O systems, with even more combinations of exponent/mantissa size or byte ordering
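
The idea of a machine-independent I/O layer can be illustrated with a minimal sketch. It assumes one fixed external format (big-endian, IEEE 754 doubles), which is a modern simplification: the machines listed on the slide used many other floating-point layouts. The function names are hypothetical and only illustrate the principle.

```python
import struct

def write_portable(values):
    """Encode floats in one fixed external format, whatever the host is."""
    # '>' selects big-endian byte order and standard sizes (4-byte I, 8-byte d).
    header = struct.pack(">I", len(values))
    body = struct.pack(f">{len(values)}d", *values)
    return header + body

def read_portable(buf):
    """Decode a buffer produced by write_portable on any host."""
    (count,) = struct.unpack_from(">I", buf, 0)
    return list(struct.unpack_from(f">{count}d", buf, 4))

data = [1.0, -2.5, 3.25]
buf = write_portable(data)
restored = read_portable(buf)
```

The writer and the reader agree only on the external format, never on the native word size or byte order of either machine.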

  5. User machine interface

  6. General Software in 1973 • Software for bubble chambers: Thresh, Grind, Hydra • Histogram tool: SUMX from Berkeley • Simulation with EGS3 (SLAC), MCNP (Oak Ridge) • Small Fortran IV programs (1000 LOC, 50 kbytes) • Punched cards, line printers, pen plotters (GD3) • Small archive libraries (cernlib), lib.a

  7. Software in 1974 • First “Large Electronic Experiments” • Data Handling Division == Track Chambers • Well organized software in TC with HYDRA, Thresh, Grind; anarchy elsewhere • HBOOK: from 3 routines to 100, from 3 users to many • First software group in DD

  8. GEANT1 in 1975 • Very basic framework to drive a simulation program: reading data cards with FFREAD, step actions with GUSTEP and GUNEXT, applying the magnetic field with GUFLD. • Output (Hist/Digits) was user defined • Histograms with HBOOK • About 2,000 LOC

  9. ZBOOK in 1975 • Extraction of the HBOOK memory manager into an independent package. • Creation of banks and data structures anywhere in common blocks • Machine-independent I/O, sequential and random • About 5,000 LOC

  10. GEANT2 in 1976 • Extension of GEANT1 with more physics (e-showers based on a subset of EGS, multiple scattering, decays, energy loss) • Kinematics, hits/digits data structures in ZBOOK • Used by several SPS experiments (NA3, NA4, NA10, Omega) • About 10,000 LOC

  11. Problems with GEANT2 • Very successful small framework. • However, the detector description was user written and defined via “if” statements at tracking time. • This was becoming a hard task for large and constantly evolving detectors (the case with NA4 and C.Rubbia) • There were many attempts to describe a detector geometry via data cards (a bit like XML), but the main problem was the poor and inefficient detector description in memory.

  12. GEANT3 in 1980 • A data structure (ZBOOK tree) describing complex geometries was introduced, then gradually the geometry routines computing distances, etc. • This was a huge step forward, implemented first in OPAL, then L3 and ALEPH. • Full electromagnetic showers (first based on EGS, then own developments)

  13. Systems in 1980 • End user analysis software: 10 KLOC • Experiment software: 100 KLOC • Libraries (HBOOK, Naglib, cernlib): 500 KLOC • OS & Fortran: 1000 KLOC • Hardware: RAM 1 MB, tapes, CDC, IBM, Vax780

  14. GEANT3 with ZEBRA • ZEBRA was very rapidly implemented in 1983. • We introduced ZEBRA in GEANT3 in 1984. • From 1984 to 1993 we introduced plenty of new features in GEANT3: extensions of the geometry, hadronic models with Tatina, Gheisha and Fluka, graphics tools. • In 1998, GEANT3 was interfaced with ROOT via the VMC (Virtual Monte Carlo) • GEANT3 has been used, and is still in use, by many experiments.

  15. PAW • First minimal version in 1984 • Attempt to merge with GEP (DESY) in 1985, but we took only the idea of ntuples for storage and analysis. GEP was written in PL1. • Package growing until 1994 with more and more functions. Column-wise ntuples in 1990. • Users liked it, mainly once the system was frozen in 1994.

  16. Vectorization attempts • During the years 1985 to 1990 a big effort was invested in vectorizing GEANT3 (work in collaboration with Florida State University) on CRAY/YMP, CYBER205, ETA10. • The minor gains obtained did not justify the big manpower investment. GEANT3 transport was still essentially sequential and we had a big overhead with vector creation and gather/scatter. • However, this experience and failure was very important for us, and many of its lessons were useful for the design of GEANT5 many years later.

  17. Parallelism in the 80s & early 90s • Many attempts (all failing) with parallel architectures • Transputers and OCCAM • MPP (CM2, CM5, ELXI, ...) with OpenMP-like software • Too many GLOBAL variables/structures with Fortran common blocks. • RISC architectures or emulators perceived as a cheaper solution in the early 90s. • Then MPPs died with the advent of the Pentium Pro (1994) and farms of PCs or workstations.

  18. 1992: CHEP Annecy • Web, web, web, web... • Attempts to replace/upgrade ZEBRA to support/use F90 modules and structures, but parsing and analysing modules was thought to be too difficult. • With ZEBRA the bank description was within the bank itself (just a few bits). A bank was typically a few integers followed by a dynamic array of floats/doubles. • We did not realize at the time that parsing user data structures was going to be a big challenge!!
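
The notion of a bank that carries its own description can be sketched as follows. This is a hypothetical layout chosen for illustration (two counts in a header, then the integers, then the floats); it is not the actual ZEBRA bank format, and the function names are assumptions.

```python
import struct

def pack_bank(ints, floats):
    """Serialize a bank: a small header describing the content, then the data."""
    # Hypothetical header: number of integers, number of 32-bit floats.
    header = struct.pack(">II", len(ints), len(floats))
    body = struct.pack(f">{len(ints)}i{len(floats)}f", *ints, *floats)
    return header + body

def unpack_bank(buf):
    """Read a bank back using only the description stored in the bank itself."""
    n_int, n_flt = struct.unpack_from(">II", buf, 0)
    fields = struct.unpack_from(f">{n_int}i{n_flt}f", buf, 8)
    return list(fields[:n_int]), list(fields[n_int:])

bank = pack_bank([7, 42], [0.5, 1.5, -2.0])
bank_ints, bank_floats = unpack_bank(bank)
```

A reader needs no external schema for such a bank. Arbitrary user-defined structures with nested types and pointers do not fit this simple pattern, which is the parsing challenge the slide refers to.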

  19. Consequences • In 1993/1994 performance was no longer the main problem. • Our field was invaded by computer scientists. • Program design, object-oriented programming and the move to more sexy languages were becoming a priority. • The “goal” was thought less important than the “how” • This situation deteriorated even more with the death of the SSC.

  20. 1993: Warning Danger • 3 “clans” in my group • 1/3 pro F90 • 1/3 pro C++ • 1/3 pro commercial products (any language) for graphics, User Interfaces, I/O and data bases • My proposal to continue with PAW, develop ZOO (ZEBRA Object-Oriented) and GEANT3 geometry in C++ is not accepted. • Evolution vs Revolution

  21. 1995: roads for ROOT • The official line was with GEANT4 and Objectivity, not much room left for success with an alternative product when you are alone. • The best tactic had to be a mixture of sociology, technicalities and very hard work. • Strong support from PAW and GEANT3 users • Strong support from HP (workstations + manpower) • In November we were ready for a first ROOT show • Java is announced (problem?)

  22. 1998: work & smile • RUN II projects at FNAL • Data Analysis and Visualization • Data Formats and storage • ROOT competing with HistoScope, JAS, LHC++ • CHEP98 (September) Chicago • ROOT selected by FNAL, followed by RHIC • Vital decision for ROOT • But official support at CERN only in 2002

  23. ROOT evolution • No time to discuss the creation/evolution of the 110 ROOT shared libs/packages. • ROOT has gradually evolved from a data storage, analysis and visualization system to a more general software environment, totally replacing what was known before as CERNLIB. • This has been possible thanks to MANY contributors from experiments, labs or people working in other fields. • ROOT6, coming soon, includes a new interpreter, CLING, and supports all the C++11 features

  24. Input/Output: Major Steps • User written streamers filling TBuffer • member-wise streaming for STL collections<T*> • streamers generated by rootcint • TreeCache • automatic streamers from dictionary with StreamerInfos in self-describing files • parallel merge • member-wise streaming for TClonesArray

  25. GEANT4 Evolution • GEANT4 is an important software tool for current experiments, with more and more physics improvements and validation procedures. • However, the GEANT4 transport system is not suitable for parallel architectures. Too many changes are required. • GEANT5: keep the Geant4 physics and add a radically new transport system.

  26. Tools & Libs • Geant4+5, geant4, geant3, geant1, geant2 • bos, minuit, hbook • Root 1,2,3,4,5,6 • paw • zbook, zebra, hydra

  27. Systems today • End user analysis software: 0.1 MLOC • Experiment software: 4 MLOC • Frameworks like ROOT, Geant4: 5 MLOC • OS & compilers: 20 MLOC • Networks: 10 Gbit/s • Disks: 10 PB • RAM: 16 GB • CLOUDS, GRIDS • Hardware: clusters of multi-core machines, 10000x8

  28. Systems in 2025? • End user analysis software: 0.2 MLOC • Experiment software: 10 MLOC • Frameworks like ROOT, Geant5: 10 MLOC • OS & compilers: 40 MLOC • Networks: 100 Gbit/s, 10 Tbit/s • Disks: 1000 PB • RAM: 10 TB • CLOUDS on demand, GRIDS • Hardware: multi-level parallel machines, 10000x1000x1000

  29. BUT !!!!! • It looks like the amount of money devoted to computing is not going to increase with the same slope as it did in the past few years. • Moore’s law does not apply anymore for one single processor. • However, Moore’s law still looks OK when looking at the amount of computing delivered per $ or € when REALLY using parallel architectures. • Using these architectures is going to be a big challenge, but we do not have the choice!!!!

  30. Software and Hardware • GRIDs/Clouds are inherently parallel. However, because the hardware has been relatively cheap, GRIDs have pushed towards job-level parallelism at the expense of parallelism within one job. • It is not clear today what will be the winning hardware systems: supercomputers? walls of cores with accelerators? zillions of ARM-like systems? • Our software must be upgraded keeping in mind all these possible solutions. A big challenge!

  31. Expected Directions • Parallelism: Today we do not exploit well the existing hardware (0.6 instructions/cycle on average) because our code was designed “sequential”. Important gains foreseen (10?), e.g. in detector simulation. • Automatic Data Caches: Many improvements are required to speed up and simplify skimming procedures and data analysis.

  32. Data caches • More effort is required to simplify the analysis of large data sets (typically ROOT Trees). • When zillions of files are distributed in Tier1/2 sites, automatic, transparent, high-performance, safe caches are becoming mandatory on Tier2/3 sites or even laptops. • This must be taken into account in the dilemma: sending jobs to data or vice versa. • This will require changes in ROOT itself and in the various data handling or parallel file systems.

  33. Parallelism: key points • Minimize the sequential/synchronization parts (Amdahl's law): very difficult • Run the same code (processes) on all cores to optimize the memory use (code and read-only data sharing) • Job-level is better than event-level parallelism for offline systems. • Use the good old principle of data locality to minimize the cache misses. • Exploit the vector capabilities but be careful with the new/delete/gather/scatter problem • Reorganize your code to reduce tails
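
The first point, Amdahl's law, can be made concrete with a short sketch. For a job in which a fraction p of the work is parallelizable and the rest is sequential, the speedup on n cores is 1 / ((1 - p) + p / n). The function name and the example numbers below are illustrative assumptions, not figures from the talk.

```python
def amdahl_speedup(parallel_fraction, n_cores):
    """Ideal speedup when only part of the work can run in parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if n_cores < 1:
        raise ValueError("n_cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_cores)

# Illustrative case: 95% parallel code on 8 cores, then on a huge machine.
speedup_8 = amdahl_speedup(0.95, 8)
speedup_huge = amdahl_speedup(0.95, 10**9)
```

With 5% of sequential code the speedup stays below 20 however many cores are added, which is why the sequential and synchronization parts dominate the design effort.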

  34. Data Structures & parallelism • C++ pointers are specific to a process • Copying the structure implies a relocation of all pointers • I/O is a nightmare • Update of the structure from a different thread implies a lock/mutex • [Figure: event, vertices and tracks linked by pointers]

  35. Data Structures & Locality • Sparse data structures defeat the system memory caches • For example: group the cross-sections for all processes per material instead of all materials per process • Group object elements/collections such that the storage matches the traversal processes
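
The cross-section example can be sketched as a flat table in which all processes of one material sit next to each other. This is a layout illustration only: Python will not show the cache effect, and the sizes, names and toy cross-section values are assumptions made for the sketch.

```python
from array import array

N_MATERIALS, N_PROCESSES = 3, 4

def toy_xsec(material, process):
    """Made-up cross-section value, only there to fill the table."""
    return 10.0 * material + process

def build_per_material(xsec):
    """Flatten xsec[material][process] so one material's values are contiguous."""
    flat = array("d")
    for material in range(N_MATERIALS):
        for process in range(N_PROCESSES):
            flat.append(xsec(material, process))
    return flat

def total_xsec(flat, material):
    """Sum over all processes of one material: a single contiguous slice."""
    start = material * N_PROCESSES
    return sum(flat[start:start + N_PROCESSES])

table = build_per_material(toy_xsec)
total_for_material_1 = total_xsec(table, 1)
```

A particle sitting in one material needs all process cross-sections for that material at each step, so this ordering matches the traversal. The opposite ordering would stride through the table with a step of N_MATERIALS.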

  36. Create Vectors & exploit Locality • By making vectors, you optimize the instruction cache (gain >2) and data cache (gain >2) • By making vectors, you can use the built-in pipeline instructions of existing processors (gain >2) • But there is no point in making vectors if your algorithm is still sequential or badly designed for parallelism, e.g.: • Too many thread synchronization points (Amdahl) • Vector gather/scatter

  37. Conventional Transport • Each particle is tracked step by step through hundreds of volumes • When all hits for all tracks are in memory, summable digits are computed • [Figure: tracks T1 to T4 stepping through the detector volumes] LPCC workshop Rene Brun

  38. Analogy with car traffic

  39. New Transport Scheme • All particles in the same volume type are transported in parallel. • Particles entering new volumes or generated are accumulated in the volume basket. • Events for which all hits are available are digitized in parallel • [Figure: tracks T1 to T4 grouped into baskets per volume]
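
The basket idea can be sketched in a few lines. This is a toy model under stated assumptions: particles are plain dictionaries, the geometry is a hypothetical two-volume chain, and the inner loop stands in for what would be a vectorized kernel. It is not the GEANT5 implementation.

```python
from collections import defaultdict

def fill_baskets(particles):
    """Group particles by the volume they are currently in."""
    baskets = defaultdict(list)
    for particle in particles:
        baskets[particle["volume"]].append(particle)
    return baskets

def transport_step(baskets, next_volume):
    """Process each basket as one batch and refill the baskets of the next volumes."""
    new_baskets = defaultdict(list)
    for volume, basket in baskets.items():
        target = next_volume(volume)  # same geometry lookup for the whole basket
        if target is None:            # None: these particles leave the detector
            continue
        for particle in basket:       # stand-in for a vectorized loop
            new_baskets[target].append({**particle, "volume": target})
    return new_baskets

# Hypothetical geometry: tracker, then calorimeter, then outside.
chain = {"tracker": "calorimeter", "calorimeter": None}
particles = [
    {"id": 0, "volume": "tracker"},
    {"id": 1, "volume": "calorimeter"},
    {"id": 2, "volume": "tracker"},
]
baskets = fill_baskets(particles)
after_step = transport_step(baskets, chain.get)
```

The point of the scheme is that geometry and material data for one volume are fetched once per basket rather than once per particle, which is what makes locality and vectorization possible.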

  40. Towards Parallel Software • A long way to go!! • There is no point in just making your code thread-safe. Use of parallel architectures requires a deep rethinking of the algorithms and dataflow. • One such project is GEANT4+5, launched 2 years ago. We are starting to have very nice results. But there is still a long way to go to adapt (or write radically new software) for the emerging parallel systems.

  41. A global effort • Software development is nowadays a world-wide effort with people scattered in many labs developing simulation, production or analysis code. • It remains a very interesting area for new people not scared by big challenges. • I had the fantastic opportunity to work for many decades in the development of many general tools in close cooperation with many people to whom I am very grateful.
