The Evolution of HEP software
12 September 2013, NEC2013/Varna
René Brun/CERN*
Plan
• In this talk I present the views of somebody involved in some aspects of scientific computing as seen from a major lab in HEP.
• Having been involved in the design and implementation of many systems, my views are necessarily biased by my path through several experiments and the development of some general tools.
• I plan to describe the creation and evolution of the main systems that have shaped the current HEP software, with some views for the near future.
R.Brun : Evolution of HEP software
Machines
From mainframes to clusters, walls of cores, GRIDs & Clouds
Machine Units (bits)
Word sizes: 16, 32, 36, 48, 56, 60, 64
Machines shown: pdp 11, univac, cdc, nord50, besm6 and many others
A strong push to develop portable, machine-independent I/O systems, with even more combinations of exponent/mantissa size or byte ordering.
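The idea of machine-independent I/O noted above can be illustrated in modern terms. The sketch below is only an illustration, not the format of any package named in this talk: it fixes the byte order and the field sizes in the file format itself, so that the host machine no longer matters. The function names `write_record` and `read_record` are hypothetical, and the sketch assumes IEEE 754 doubles, which the machines of the 1970s did not share (hence the extra exponent/mantissa conversions mentioned above).

```python
import struct


def write_record(values):
    """Serialize floats as a 4-byte count followed by 8-byte doubles."""
    # ">" forces big-endian order and standard sizes on every host CPU.
    count = struct.pack(">I", len(values))
    return count + struct.pack(f">{len(values)}d", *values)


def read_record(payload):
    """Inverse of write_record; independent of the host byte order."""
    (count,) = struct.unpack_from(">I", payload, 0)
    return list(struct.unpack_from(f">{count}d", payload, 4))


record = write_record([1.5, -2.25, 1.0e10])
restored = read_record(record)
```

Because the layout is part of the format rather than of the machine, the same bytes can be written on one architecture and read back on another.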
User machine interface
General Software in 1973
• Software for bubble chambers: Thresh, Grind, Hydra
• Histogram tool: SUMX from Berkeley
• Simulation with EGS3 (SLAC), MCNP (Oak Ridge)
• Small Fortran IV programs (1000 LOC, 50 kbytes)
• Punched cards, line printers, pen plotters (GD3)
• Small archive libraries (cernlib), lib.a
Software in 1974
• First “Large Electronic Experiments”
• Data Handling Division == Track Chambers
• Well-organized software in TC with HYDRA, Thresh, Grind; anarchy elsewhere
• HBOOK: from 3 routines to 100, from 3 users to many
• First software group in DD
GEANT1 in 1975
• Very basic framework to drive a simulation program: reading data cards with FFREAD, step actions with GUSTEP and GUNEXT, applying the magnetic field with GUFLD.
• Output (Hist/Digits) was user defined
• Histograms with HBOOK
• About 2,000 LOC
ZBOOK in 1975
• Extraction of the HBOOK memory manager into an independent package.
• Creation of banks and data structures anywhere in common blocks
• Machine-independent I/O, sequential and random
• About 5,000 LOC
GEANT2 in 1976
• Extension of GEANT1 with more physics (e-showers based on a subset of EGS, multiple scattering, decays, energy loss)
• Kinematics, hits/digits data structures in ZBOOK
• Used by several SPS experiments (NA3, NA4, NA10, Omega)
• About 10,000 LOC
Problems with GEANT2
• Very successful small framework.
• However, the detector description was user written and defined via “if” statements at tracking time.
• This was becoming a hard task for large and constantly evolving detectors (the case with NA4 and C. Rubbia)
• There were many attempts to describe a detector geometry via data cards (a bit like XML), but the main problem was the poor and inefficient detector description in memory.
GEANT3 in 1980
• A data structure (ZBOOK tree) describing complex geometries was introduced, then gradually the geometry routines computing distances, etc.
• This was a huge step forward, first implemented in OPAL, then L3 and ALEPH.
• Full electromagnetic showers (first based on EGS, then own developments)
Systems in 1980
• End-user analysis software: 10 KLOC
• Experiment software: 100 KLOC
• Libraries (HBOOK, Naglib, cernlib): 500 KLOC
• OS & Fortran: 1000 KLOC
• Hardware: CDC, IBM, Vax780; RAM 1 MB; tapes
GEANT3 with ZEBRA
• ZEBRA was very rapidly implemented in 1983.
• We introduced ZEBRA in GEANT3 in 1984.
• From 1984 to 1993 we introduced plenty of new features in GEANT3: extensions of the geometry, hadronic models with Tatina, Gheisha and Fluka, graphics tools.
• In 1998, GEANT3 was interfaced with ROOT via the VMC (Virtual Monte Carlo)
• GEANT3 has been used, and is still in use, by many experiments.
PAW
• First minimal version in 1984
• Attempt to merge with GEP (DESY) in 1985; we took from it the idea of ntuples for storage and analysis. GEP was written in PL1.
• Package growing until 1994 with more and more functions. Column-wise ntuples in 1990.
• Users liked it, mainly once the system was frozen in 1994.
Vectorization attempts
• From 1985 to 1990 a big effort was invested in vectorizing GEANT3 (work in collaboration with Florida State University) on CRAY/YMP, CYBER205, ETA10.
• The minor gains obtained did not justify the big manpower investment. GEANT3 transport was still essentially sequential and we had a big overhead from vector creation and gather/scatter.
• However, this experience and failure were very important for us and provided many lessons useful for the design of GEANT5 many years later.
Parallelism in the 80s & early 90s
• Many attempts (all failing) with parallel architectures
• Transputers and OCCAM
• MPP (CM2, CM5, ELXI, ...) with OpenMP-like software
• Too many GLOBAL variables/structures with Fortran common blocks.
• RISC architectures or emulators perceived as a cheaper solution in the early 90s.
• Then MPPs died with the advent of the Pentium Pro (1994) and farms of PCs or workstations.
1992: CHEP Annecy
• Web, web, web, web…………
• Attempts to replace/upgrade ZEBRA to support/use F90 modules and structures, but parsing and analyzing modules was thought to be too difficult.
• With ZEBRA the bank description was within the bank itself (just a few bits). A bank was typically a few integers followed by a dynamic array of floats/doubles.
• We did not realize at the time that parsing user data structures was going to be a big challenge!!
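The notion of a bank that carries its own description, "a few integers followed by a dynamic array of floats", can be pictured with a toy layout. This is a minimal sketch under stated assumptions and not the actual ZEBRA bank format: the two-word descriptor, the 32-bit field sizes and the names `pack_bank` and `unpack_bank` are all hypothetical.

```python
import struct


def pack_bank(ints, floats):
    """Pack a toy bank: a 2-word descriptor, then integers, then floats."""
    # The descriptor (hypothetical layout) records how many of each follow.
    header = struct.pack(">II", len(ints), len(floats))
    body = struct.pack(f">{len(ints)}i{len(floats)}f", *ints, *floats)
    return header + body


def unpack_bank(payload):
    """Decode a bank using only the descriptor stored in the bank itself."""
    n_ints, n_floats = struct.unpack_from(">II", payload, 0)
    fields = struct.unpack_from(f">{n_ints}i{n_floats}f", payload, 8)
    return list(fields[:n_ints]), list(fields[n_ints:])


bank = pack_bank([7, 42], [0.5, 1.25, -3.0])
ints, floats = unpack_bank(bank)
```

The reader needs no external schema here, which is what makes such a bank easy to write and read. Arbitrary user-defined structures offer no such built-in descriptor, which hints at why parsing them turned out to be the hard part.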
Consequences
• In 1993/1994 performance was no longer the main problem.
• Our field was invaded by computer scientists.
• Program design, object-oriented programming and a move to more sexy languages were becoming a priority.
• The “goal” was thought less important than the “how”
• This situation deteriorated even more with the death of the SSC.
1993: Warning Danger
• 3 “clans” in my group
• 1/3 pro F90
• 1/3 pro C++
• 1/3 pro commercial products (any language) for graphics, user interfaces, I/O and databases
• My proposal to continue with PAW, develop ZOO (ZEBRA Object-Oriented) and GEANT3 geometry in C++ was not accepted.
• Evolution vs Revolution
1995: roads for ROOT
• The official line was with GEANT4 and Objectivity; there was not much room left for success with an alternative product when you are alone.
• The best tactic had to be a mixture of sociology, technicalities and very hard work.
• Strong support from PAW and GEANT3 users
• Strong support from HP (workstations + manpower)
• In November we were ready for a first ROOT show
• Java is announced (problem?)
1998: work & smile
• RUN II projects at FNAL
• Data Analysis and Visualization
• Data Formats and storage
• ROOT competing with HistoScope, JAS, LHC++
• CHEP98 (September) Chicago
• ROOT selected by FNAL, followed by RHIC
• Vital decision for ROOT
• But official support at CERN only in 2002
ROOT evolution
• No time to discuss the creation/evolution of the 110 ROOT shared libs/packages.
• ROOT has gradually evolved from a data storage, analysis and visualization system to a more general software environment, totally replacing what was known before as CERNLIB.
• This has been possible thanks to MANY contributors from experiments, labs or people working in other fields.
• ROOT6, coming soon, includes a new interpreter, CLING, and supports all the C++11 features
Input/Output: Major Steps
• User written streamers filling TBuffer
• member-wise streaming for STL collections<T*>
• streamers generated by rootcint
• TreeCache
• automatic streamers from dictionary with StreamerInfos in self-describing files
• parallel merge
• member-wise streaming for TClonesArray
GEANT4 Evolution
• GEANT4 is an important software tool for current experiments, with more and more physics improvements and validation procedures.
• However, the GEANT4 transport system is no longer suitable for parallel architectures. Too many changes are required.
• GEANT5: keep the Geant4 physics and add a radically new transport system.
Tools & Libs
Geant4+5, geant4, geant3, geant1, geant2, bos, minuit, hbook, Root 1,2,3,4,5,6, paw, zbook, zebra, hydra
Systems today
• End-user analysis software: 0.1 MLOC
• Experiment software: 4 MLOC
• Frameworks like ROOT, Geant4: 5 MLOC
• OS & compilers: 20 MLOC
• Networks: 10 Gbit/s
• Disks: 10 PB
• RAM: 16 GB
• CLOUDS, GRIDS
• Hardware: clusters of multi-core machines, 10000x8
Systems in 2025?
• End-user analysis software: 0.2 MLOC
• Experiment software: 10 MLOC
• Frameworks like ROOT, Geant5: 10 MLOC
• OS & compilers: 40 MLOC
• Networks: 100 Gbit/s, 10 Tbit/s
• Disks: 1000 PB
• RAM: 10 TB
• CLOUDS on demand, GRIDS
• Hardware: multi-level parallel machines, 10000x1000x1000
BUT !!!!!
• It looks like the amount of money devoted to computing is not going to increase with the same slope as it did in the past few years.
• Moore’s law no longer applies to a single processor.
• However, Moore’s law still looks OK when looking at the amount of computing delivered per $ or € when REALLY using parallel architectures.
• Using these architectures is going to be a big challenge, but we do not have a choice!!!!
Software and Hardware
• GRIDs/Clouds are inherently parallel. However, because the hardware has been relatively cheap, GRIDs have pushed towards job-level parallelism at the expense of parallelism within one job.
• It is not clear today what the winning hardware systems will be: supercomputers? walls of cores with accelerators? zillions of ARM-like systems? ...
• Our software must be upgraded keeping in mind all these possible solutions. A big challenge!
Expected Directions
• Parallelism: Today we do not exploit the existing hardware well (0.6 instructions/cycle on average) because our code was designed to be “sequential”. Important gains are foreseen (a factor of 10?), e.g. in detector simulation.
• Automatic Data Caches: Many improvements are required to speed up and simplify skimming procedures and data analysis.
Data caches
• More effort is required to simplify the analysis of large data sets (typically ROOT Trees).
• When zillions of files are distributed in Tiers 1/2, automatic, transparent, high-performance, safe caches are becoming mandatory on Tiers 2/3 or even laptops.
• This must be taken into account in the dilemma: sending jobs to data or vice versa.
• This will require changes in ROOT itself and in the various data handling or parallel file systems.
Parallelism: key points
• Minimize the sequential/synchronization parts (Amdahl's law): very difficult.
• Run the same code (processes) on all cores to optimize the memory use (code and read-only data sharing).
• Job-level is better than event-level parallelism for offline systems.
• Use the good old principle of data locality to minimize the cache misses.
• Exploit the vector capabilities, but be careful with the new/delete/gather/scatter problem.
• Reorganize your code to reduce tails.
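The first key point refers to Amdahl's law: if a fraction p of the work can be spread over n cores and the rest stays sequential, the ideal speedup is 1 / ((1 - p) + p / n). A minimal sketch follows; the function name and the 95% figure are illustrative assumptions, not numbers from the talk.

```python
def amdahl_speedup(parallel_fraction, n_cores):
    """Ideal speedup when only `parallel_fraction` of the work scales."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_cores)


# Illustrative case: 95% of the run time is parallelizable.
for cores in (8, 64, 1024):
    print(cores, round(amdahl_speedup(0.95, cores), 1))
```

With 5% of sequential work the speedup can never exceed 1 / 0.05 = 20, however many cores are added, which is why shrinking the sequential and synchronization parts matters more than the raw core count.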
Data Structures & parallelism
[Figure labels: event, vertices, tracks]
• C++ pointers are specific to a process
• Copying the structure implies a relocation of all pointers
• I/O is a nightmare
• Update of the structure from a different thread implies a lock/mutex
Data Structures & Locality
• Sparse data structures defeat the system memory caches
• For example: group the cross-sections for all processes per material instead of all materials per process
• Group object elements/collections such that the storage matches the traversal processes
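The cross-section example above amounts to choosing which index varies fastest in memory. The sketch below stores a toy table material by material, so that everything needed for a step inside one material is a single contiguous slice. The table sizes and the `sigma` function are hypothetical stand-ins, and Python itself does not expose cache behaviour; the sketch only shows the index arithmetic that a C++ or Fortran implementation would rely on.

```python
from array import array

N_MATERIALS, N_PROCESSES = 3, 4  # toy sizes, not real physics tables


def sigma(material, process):
    """Hypothetical stand-in for a real cross-section calculation."""
    return 10.0 * material + process


# Material-major layout: the N_PROCESSES values of one material are adjacent.
table = array("d", (sigma(m, p)
                    for m in range(N_MATERIALS)
                    for p in range(N_PROCESSES)))


def cross_sections_in(material):
    """Return all process cross-sections of one material as one slice."""
    start = material * N_PROCESSES
    return table[start:start + N_PROCESSES]


def total_cross_section(material):
    """Sum over processes; touches only one contiguous block of the table."""
    return sum(cross_sections_in(material))
```

With the opposite, process-major layout the same sum would read one value from each of N_PROCESSES separate blocks, which is the kind of sparse access the slide warns against.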
Create Vectors & exploit Locality
• By making vectors, you optimize the instruction cache (gain >2) and data cache (gain >2)
• By making vectors, you can use the built-in pipeline instructions of existing processors (gain >2)
• But there is no point in making vectors if your algorithm is still sequential or badly designed for parallelism, e.g.:
• Too many thread synchronization points (Amdahl)
• Vectors gather/scatter
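For readers unfamiliar with the gather/scatter overhead mentioned above, here is a minimal sketch of the pattern. The names and the "one unit of energy loss per step" rule are illustrative assumptions: live tracks are first copied into a contiguous vector, processed, then copied back to their original slots.

```python
def gather(values, indices):
    """Copy the selected elements into a new contiguous list."""
    return [values[i] for i in indices]


def scatter(values, indices, updates):
    """Write the processed elements back to their original slots."""
    for i, updated in zip(indices, updates):
        values[i] = updated


energies = [5.0, 0.0, 3.0, 0.0, 8.0]  # 0.0 marks a stopped track
alive = [i for i, e in enumerate(energies) if e > 0.0]

packed = gather(energies, alive)      # extra copy before the useful work
packed = [e - 1.0 for e in packed]    # the actual "vector" operation
scatter(energies, alive, packed)      # extra copy after the useful work
```

The two copies bracket the useful work, so when the vector operation itself is cheap, the gather and scatter steps can cost as much as the computation they were meant to speed up.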
Conventional Transport
[Figure: tracks T1, T2, T3, T4 stepping through the detector]
• Each particle is tracked step by step through hundreds of volumes
• When all hits for all tracks are in memory, summable digits are computed
LPCC workshop Rene Brun
Analogy with car traffic
New Transport Scheme
[Figure: tracks T1, T2, T3, T4 grouped by volume]
• All particles in the same volume type are transported in parallel.
• Particles entering new volumes or generated are accumulated in the volume basket.
• Events for which all hits are available are digitized in parallel
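The basket idea described above can be sketched with a toy model. This is not the GEANT5 implementation: the geometry (four volumes in a row), the "physics" (one unit of energy lost per step) and the scheduling policy (fullest basket first) are all assumptions made for illustration, and the inner loop is sequential here where a real system would run it in vector or multi-threaded mode.

```python
from collections import defaultdict

N_VOLUMES = 4  # toy geometry: volumes 0..3 in a row (hypothetical)


def transport_basket(volume, particles):
    """Advance every particle of one basket by one step.

    A particle is (event_id, energy). Each step leaves a hit, costs one
    energy unit, and moves the survivor into the next volume.
    """
    hits, moved = [], []
    for event_id, energy in particles:
        hits.append((event_id, volume))
        if energy > 1 and volume + 1 < N_VOLUMES:
            moved.append((volume + 1, (event_id, energy - 1)))
    return hits, moved


def run(primaries):
    """primaries: list of (volume, (event_id, energy)) tuples."""
    baskets = defaultdict(list)
    for volume, particle in primaries:
        baskets[volume].append(particle)
    hits_per_event = defaultdict(list)
    while baskets:
        # Assumed policy: always process the fullest basket first.
        volume = max(baskets, key=lambda v: len(baskets[v]))
        hits, moved = transport_basket(volume, baskets.pop(volume))
        for event_id, hit_volume in hits:
            hits_per_event[event_id].append(hit_volume)
        for new_volume, particle in moved:
            baskets[new_volume].append(particle)  # accumulate per volume
    return {event: sorted(vols) for event, vols in hits_per_event.items()}


result = run([(0, (1, 3)), (0, (2, 2)), (2, (1, 5))])
```

Note that particles from different events share a basket, so hits have to be sorted back per event before digitization. That bookkeeping is the price paid for handing the transport loop larger, more uniform batches of work.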
Towards Parallel Software
• A long way to go!!
• There is no point in just making your code thread-safe. Use of parallel architectures requires a deep rethinking of the algorithms and dataflow.
• One such project is GEANT4+5, launched 2 years ago. We are starting to have very nice results. But there is still a long way to go to adapt software (or write radically new software) for the emerging parallel systems.
A global effort
• Software development is nowadays a world-wide effort with people scattered in many labs developing simulation, production or analysis code.
• It remains a very interesting area for new people not scared by big challenges.
• I had the fantastic opportunity to work for many decades in the development of many general tools in close cooperation with many people to whom I am very grateful.