Parallelization of Monte Carlo simulations of spallation experiments

Mitja Majerle¹*, Josef Hampl¹
*) Electronic address: majerle@ujf.cas.cz
1) Nuclear Physics Institute of the Academy of Sciences of the Czech Republic, Prague

Accelerator Driven Systems

ADTT (Accelerator Driven Transmutation Technologies, or ADS, Accelerator Driven Systems) are considered a promising option for future nuclear energy production and nuclear waste incineration. An ADS produces a large number of neutrons in the spallation process and introduces them into a sub-critical reactor assembly, where they produce energy by fission of fissionable material. The extra neutrons are used to produce fuel from 232Th and/or to transmute long-lived nuclear waste into stable isotopes, or into short-lived isotopes which decay to stable ones. ADS are so far in the early experimental and design stage, and a lot of research in this field has to be done before the construction of the first full-scale ADS unit. Details of the spallation reaction and its description in simulation codes, as well as the cross-sections of a wide range of nuclear reactions, are being studied worldwide.

Spallation experiments at the Joint Institute for Nuclear Research in Dubna

History: Experiments with simplified ADS setups were performed on accelerators at the JINR (Joint Institute for Nuclear Research, Dubna, Russia). The targets were lead, tungsten, and uranium cylinders (in some experiments surrounded with a paraffin moderator or a natural uranium blanket). The physical aim of the experiments was to study the nuclear processes in the target, to investigate the transmutation process in radioactive samples, to measure the heat production, etc. The neutronics of the experimental setups is studied with several types of small detectors placed around and inside the setup. Two examples of this work are the setups Energy plus Transmutation and Gamma-MD.

Energy plus Transmutation

Energy plus Transmutation is an international project that studies neutron production and radioactive isotope transmutation in a system of a massive lead target and a natural uranium blanket. The lead target is a cylinder half a meter long and 8 cm in diameter. It is surrounded with a natural uranium blanket (the total weight of the uranium is 206 kg) in the form of small rods coated with aluminum. The setup is placed in a biological shielding consisting of a wooden box filled with granulated polyethylene; the inner walls of the shielding are coated with cadmium. This setup was irradiated in several experiments with protons and deuterons in the energy range 0.7-2.52 GeV.

Gamma-MD

The Gamma series of experiments is focused on the production of neutrons in the spallation process and on their moderation/transport in the neutron moderator. The first such experiment (Gamma-2) was performed on a 20 cm long lead target surrounded with a 6 cm thick paraffin moderator and irradiated with 660 MeV protons. An extended lead target (diameter 8 cm, length 60 cm) is used in the Gamma-MD setup; the moderator is a block of graphite (1.1 × 1.1 × 0.6 m³). So far this setup has been irradiated with 2.33 GeV deuterons, and more irradiations are planned.

ANITA neutron facility

ANITA (Atmospheric-like Neutrons from thIck TArget) is a neutron facility for accelerated testing of electronics for neutron-induced single-event effects (SEE). SEE are one of the major reliability concerns in semiconductor electronic components and systems.
The facility utilizes an accelerated proton beam (0.18 GeV) guided onto a tungsten target. The neutron beam, originating from nuclear reactions induced by the protons in tungsten, is formed geometrically by a collimator aperture in a frontal wall that separates the neutron production cave from the user area. The features of the facility include a high neutron flux, fidelity of the neutron spectrum with regard to the natural environment, a flexible size of the beam spot, user flux control, a low dose rate from γ-rays, a low thermal neutron flux, a spacious user area, and on-line neutron dosimetry. The facility, installed at The Svedberg Laboratory, Uppsala, Sweden, is frequently used by leading manufacturers of semiconductor electronic components and systems. Further information about the facility and the laboratory can be found at www.tsl.uu.se.

Calculation principles

An important part of the preparation and analysis of every experiment is the simulation of the setup. The described systems are implemented in Monte Carlo codes; nuclear interactions and the subsequent particle transport are simulated, and several quantities are sampled, such as the number of produced secondary particles and the fluxes at different places around the setup. Two different codes are used, MCNPX and FLUKA, and the time to simulate one event is typically 10⁻⁴-0.1 s for the described setups, depending on the complexity and on the calculation models and libraries used. The sampled phase space is also very small and in most cases no variance reduction techniques can be applied, so statistically reliable results start at ca. 10⁷ simulated events. It was therefore essential to introduce parallelization to get results in a reasonable time (a rough estimate of the required computing time is sketched at the end of this text).

Parallelization of simulations

MCNPX
In MCNPX version 2.4.0, parallelization through PVM (Parallel Virtual Machine) was implemented. The PVM version of MCNPX 2.4.0 ran successfully on our two computing servers (each with two cores, Xeon 3 GHz); on weekends it was also used on a Beowulf-like cluster connecting the personal computers from our offices. The release of MCNPX 2.5.0 implemented MPI (Message Passing Interface) and deprecated PVM; the MPICH2 flavor of MPI ran on our Xeon servers and on the Beowulf-like cluster. In 2007 we bought two computing servers, each with four 64-bit cores (Opteron 280, 1.8 GHz), and moved to the OpenMPI flavor of MPI (MPICH2 did not work there), which was easier and more reliable to use. Since the end of 2008 we have had eight computing servers with eight cores each (Xeon E5420, 2.5 GHz), i.e. 64 cores in total; OpenMPI with the latest beta version of MCNPX runs on them.

FLUKA
The FLUKA code does not come with any implementation of parallelization, but it is designed to be used easily in PBS (Portable Batch System) environments: independent runs are started with different random seeds, and the tools to combine their results are available with the source code (a minimal illustration of this scheme is given at the end of this text). We started to use FLUKA in parallel on CESNET MetaCentrum computers in the PBSPro environment. On our eight servers we installed the TORQUE PBS environment, a free alternative to PBSPro.

The setup length is ca. 0.5 m. Particle fluxes are sampled on the cm scale (the size of the used detectors) with an energy binning of 1 MeV in the range from 0 to a few GeV.

TESTS
• MPI TESTS
  • OpenMPI on 8 machines with 8 cores each was tested
  • MCNPX is the only program that uses MPI
  • Results:
    • The effectiveness of the cluster drops with the number of cores used (communication between the cores); an illustrative efficiency model is sketched at the end of this text.
    • The computational power of our computers has increased by a factor of 25 since 2004, which is higher than the Moore's-law prediction of 5.6 (≈ 2^(5/2), i.e. doubling every two years over five years).

• COMPUTING RESOURCES AT THE NPI SPECTROSCOPY DEPARTMENT
  • 2004:
    • 2 two-core servers = 4 × Xeon 3 GHz
      • MCNPX with PVM, later MPICH2
    • Beowulf-like cluster from office PCs
      • MCNPX with PVM, later MPICH2
  • 2007:
    • 2 four-core servers = 8 × Opteron 280 1.8 GHz
      • 64-bit MCNPX with OpenMPI
      • FLUKA
    • Access to METACENTRUM computing facilities
  • 2008:
    • 8 eight-core servers = 64 × Intel Xeon E5420 2.5 GHz
      • 64-bit MCNPX with OpenMPI
      • FLUKA with TORQUE PBS

• Comparison of MCNPX calculation performance (SPEED in kSPECINT2k):

  YEAR  COMPUTER               SPEED
  2004  2 two-core servers         4
        Beowulf-like cluster       4
  2007  2 four-core servers       11
        METACENTRUM               80
  2009  8 eight-core servers     100

• BEOWULF-LIKE CLUSTER
  • During weekends, office PCs were used for calculations
  • Up to 6 Pentium 4 computers were available
  • The same OS as on the servers was installed on one computer
  • The other computers booted through Etherboot
  • The shared user file system was mounted through NFS (Network File System); all programs compiled on the Xeon servers work (same OS and architecture)
  • The speed was roughly the same as that of four Xeon processors:
    • the total CPU power is higher, but …
    • losses because of load balancing
    • slower I/O buses
  • Inspiration and help for the cluster came from the documentation of the "ECHELON BEOWULF CLUSTER"

• PROCESSOR TESTS
  • 12 different cores available from METACENTRUM and NPI were tested
  • The tests are aimed at core (not disk/bus) performance:
    • POVRAY rendering
    • movie encoding with MENCODER
    • FLUKA simulation of a physical system
    • MCNPX simulation of a physical system
  • Results:
    • From 2004 to 2009 the core performance increased by a factor of ~2, not by the 5.6 expected from Moore's law (2× every two years).
    • The first three tests are consistent:
      • POVRAY, MPLAYER and FLUKA are compiled with the GCC and G77 compilers
    • The last test (MCNPX) gives different values:
      • MCNPX is compiled with Intel compilers
      • Intel compilers are tuned for Intel processors
      • MCNPX compiled with Portland Group compilers is ~2 times slower than MCNPX compiled with Intel compilers
  • 1 kSPECINT2k ~ Xeon 3.06 GHz

• METACENTRUM (http://meta.cesnet.cz)
  • 8 cores for 2 hours in the queue "short"
  • 32 cores for 24 hours in the queue "normal"
  • 40 cores for 1 month in the queue "long", but rarely free
  • The first experiences with the PBS environment were very valuable:
    • running FLUKA in the PBS environment
    • the TORQUE PBS at NPI was installed based on these experiences
  • Tests of the different processors available on METACENTRUM computers:
    • the processors for the 8 servers bought at NPI in 2008 were chosen according to these tests
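
A rough illustration of why the simulations had to be parallelized, using only the numbers quoted above (ca. 10⁷ events at up to 0.1 s per event) and the core counts of the NPI servers; ideal scaling with no communication losses is assumed:

```python
# Rough estimate of the CPU time needed for one statistically reliable run.
# The event numbers are taken from the text above; the core counts correspond
# to the 2004, 2007 and 2008 configurations, and ideal scaling is assumed.

events = 1e7            # ca. 10^7 simulated events for reliable statistics
time_per_event = 0.1    # s/event, upper end of the quoted 10^-4 - 0.1 s range

serial_hours = events * time_per_event / 3600.0
print(f"  1 core  : {serial_hours:7.0f} h (~{serial_hours / 24:.0f} days)")

for cores in (4, 8, 64):
    print(f"{cores:3d} cores : {serial_hours / cores:7.1f} h (ideal scaling)")
```

Even in this idealized picture a single core needs on the order of weeks for the slowest setups, while 64 cores bring the run time down to a few hours.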
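
The FLUKA scheme of independent runs can be illustrated with the following minimal sketch. It is not the official FLUKA tooling: the input template, the SEED placeholder and the (mean, error) result format are assumptions made only to show the idea of distinct seeds per run and a combination of the results:

```python
# Minimal sketch of the "independent runs" scheme used with FLUKA under PBS.
# NOT the official FLUKA tooling: the input template, the SEED placeholder and
# the (mean, error) result format are assumptions made for illustration only.
import math

def prepare_inputs(template_path, n_runs, first_seed=1234567):
    """Write n_runs copies of a template input, each with a distinct seed."""
    template = open(template_path).read()
    for i in range(n_runs):
        with open(f"run_{i:03d}.inp", "w") as f:
            # any scheme that guarantees distinct seeds per run is sufficient
            f.write(template.replace("SEED", str(first_seed + i)))

def merge(results):
    """Combine per-run (mean, statistical error) pairs of one scored quantity,
    assuming every run simulated the same number of primaries."""
    n = len(results)
    mean = sum(m for m, _ in results) / n
    err = math.sqrt(sum(e * e for _, e in results)) / n   # ~ e / sqrt(n)
    return mean, err

if __name__ == "__main__":
    print(merge([(1.02, 0.05), (0.98, 0.04), (1.00, 0.05)]))
```

In practice these steps are performed by the tools distributed with the FLUKA source code; the sketch only shows why independent runs with different seeds can be combined without any communication between the jobs.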
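
The drop of cluster effectiveness with the number of cores can be illustrated with a toy model in which each additional core adds a fixed communication cost; the run time and the overhead value are invented for the illustration and are not measurements of our MCNPX/OpenMPI cluster:

```python
# Toy model of the parallel efficiency: ideal work sharing plus a fixed
# communication cost per core. All numbers are illustrative assumptions,
# not measurements of the actual MCNPX/OpenMPI cluster.

T1 = 1.0e6       # serial run time in s (e.g. 10^7 events at 0.1 s/event)
t_comm = 200.0   # assumed communication/bookkeeping overhead per core, in s

for cores in (1, 2, 4, 8, 16, 32, 64):
    t_par = T1 / cores + t_comm * cores
    speedup = T1 / t_par
    print(f"{cores:3d} cores: speedup {speedup:5.1f}, efficiency {speedup / cores:4.0%}")
```

With such a model the efficiency stays close to 100 % for a few cores and falls off noticeably towards 64 cores, which is the qualitative behaviour observed in the MPI tests above.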