Interface HFBTHO/HFODD and Comments on Parallelization
UTK-ORNL DFT group
HFBTHO
• Cylindrical HO basis
• Axial symmetry and time-reversal symmetry

HFODD
• Cartesian HO basis
• Symmetry unrestricted

Interface HFBTHO/HFODD
• Principle
  • Unitary transformation, cylindrical to Cartesian
  • Phase transformation
  • Tweak HFODD to restart from HFB matrix elements instead of density fields on the Gauss-Hermite mesh
• New HFODD and MPI_HFODD versions with HFBTHO as a module called (upon request) in the initial stage
  • Automatic restart of HFODD
  • I/O required: HFB matrix + basis quantum numbers written to/read from disk
• Open issues:
  • Too much memory required for large (N ≥ 18 shells) deformed bases
  • More tests needed for odd nuclei
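As a reminder of what the unitary transformation step involves, the block below gives the standard textbook form of a change of single-particle basis applied to the HFB matrix. It is not taken from the slides: the symbols $C$ (overlap matrix between cylindrical states $|\alpha\rangle$ and Cartesian states $|k\rangle$), $h$, $\Delta$ and $\lambda$ are notation assumed here, and the relative phase conventions of the two codes are not addressed.

```latex
% Assumed notation: C_{\alpha k} = \langle \alpha | k \rangle, with C unitary
\begin{align}
  h^{\rm cart}      &= C^{\dagger}\, h^{\rm cyl}\, C, \\
  \Delta^{\rm cart} &= C^{\dagger}\, \Delta^{\rm cyl}\, C^{*}, \\
  \mathcal{H}^{\rm cart} &= W^{\dagger}\, \mathcal{H}^{\rm cyl}\, W,
  \qquad
  W = \begin{pmatrix} C & 0 \\ 0 & C^{*} \end{pmatrix},
  \qquad
  \mathcal{H} = \begin{pmatrix} h - \lambda & \Delta \\ -\Delta^{*} & -h^{*} + \lambda \end{pmatrix}.
\end{align}
```

Because $W$ is unitary, the quasiparticle spectrum of $\mathcal{H}$ is the same in both bases, which gives a simple consistency check after a restart.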
MPI_HFODD
• Master-slave architecture: the master defines a list of tasks, distributes them to the available slaves (until the list is empty) and collects the results
• Compiles/runs with the Intel Fortran, GNU Fortran, Portland and PathScale compilers
• Can run on your laptop! (most modern laptops are dual- or quad-core)

Supercomputers: the number of cores increases at fixed memory, so the available memory per core is decreasing!

• 90% of CPU time is taken by only two subroutines:
  • DENSHF (calculation of fields on the Gauss-Hermite mesh)
  • DIAMAT (diagonalization of the HFB matrix)
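The master-slave scheme described above can be illustrated with a minimal Python sketch. This is not MPI_HFODD code: threads stand in for MPI ranks, and the names `run_master_worker`, `work` and `n_workers` are hypothetical, chosen only for this illustration.

```python
import queue
import threading


def run_master_worker(tasks, work, n_workers=4):
    """Hand out a fixed list of tasks to workers and collect the results.

    Hypothetical illustration: threads play the role of the slaves.
    """
    tasks = list(tasks)
    task_queue = queue.Queue()
    result_queue = queue.Queue()

    # The master defines the full task list up front.
    for index, task in enumerate(tasks):
        task_queue.put((index, task))

    def worker():
        # Each worker keeps asking for tasks until the list is empty.
        while True:
            try:
                index, task = task_queue.get_nowait()
            except queue.Empty:
                return
            result_queue.put((index, work(task)))

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    # The master collects the results in the original task order.
    results = [None] * len(tasks)
    while not result_queue.empty():
        index, value = result_queue.get()
        results[index] = value
    return results
```

For example, `run_master_worker([1, 2, 3], lambda x: x * x, n_workers=2)` squares each entry of the task list. In a real MPI code, each task would more plausibly be one full HFODD calculation, and the queue operations would be replaced by messages between the master rank and the other ranks.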
Future of HFODD
Applications on Leadership Class computers
• Good practice in programming
  • Remember: memory is expensive, CPU time is fast and cheap
  • Use Fortran 90 for dynamic memory allocation
  • Avoid vectorization and think parallelization instead
  • Example: Takagi factorization should reduce the memory needed by a factor of ~4
• Future work:
  • Diagonalization of the HFB matrix can be parallelized "relatively" simply by using threading and specialized ARPACK or ScaLAPACK routines
  • Parallelization of the density fields is trickier
  • Include HFODD and MPI_HFODD in optimization codes
  • Asynchronous Dynamic Load Balancing (ADLB, UNEDF project): dynamic stack. The list of tasks is updated on the fly based on the results
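The idea of a dynamic stack, where the list of tasks is updated on the fly based on results, can be sketched in Python as follows. This does not use the ADLB library or its API: threads again stand in for MPI ranks, and `run_dynamic`, `work` and `halve` are hypothetical names introduced for the illustration. The assumed convention is that `work(task)` returns a result together with a (possibly empty) list of new tasks.

```python
import queue
import threading

_STOP = object()  # sentinel telling a worker to shut down


def run_dynamic(initial_tasks, work, n_workers=4):
    """Process a task stack that grows while it is being worked on.

    Hypothetical illustration: `work(task)` must return a pair
    (result, new_tasks); the new tasks are pushed back on the stack.
    """
    tasks = queue.LifoQueue()  # last in, first out: a stack
    results = []
    lock = threading.Lock()

    for task in initial_tasks:
        tasks.put(task)

    def worker():
        while True:
            task = tasks.get()
            if task is _STOP:
                tasks.task_done()
                return
            try:
                result, new_tasks = work(task)
                with lock:
                    results.append(result)
                # Push follow-up tasks before marking this one as done,
                # so the stack never looks finished too early.
                for new_task in new_tasks:
                    tasks.put(new_task)
            finally:
                tasks.task_done()

    threads = [threading.Thread(target=worker, daemon=True)
               for _ in range(n_workers)]
    for thread in threads:
        thread.start()

    tasks.join()  # returns once every task, including spawned ones, is done
    for _ in threads:
        tasks.put(_STOP)
    for thread in threads:
        thread.join()
    return results


def halve(n):
    """Toy task: report n and request a follow-up task n // 2 while n > 1."""
    return n, ([n // 2] if n > 1 else [])
```

Calling `run_dynamic([8, 5], halve, n_workers=3)` starts from two tasks and ends up processing seven, since each result decides whether further work is needed. The order of the returned results depends on thread scheduling, so callers should not rely on it.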