Status report on the large-scale long-run simulation on the grid - Hybrid QM/MD simulation -
Grid Technology Research Center, AIST
Hiroshi Takemiya, Yoshio Tanaka
[Figure: friction simulation setup; prober velocity v = 0.009 Å/fs, 525 fs, 40 Å]
Goal of the experiment
• To verify the effectiveness of our programming approach for large-scale, long-run grid applications
  • Flexibility
  • Robustness
  • Efficiency
• Friction simulation
  • A nano-scale prober moves on the Si substrate
  • Requires hundreds of CPUs
  • Requires a long simulation time: over a few months
  • The number of QM regions and the number of QM atoms change dynamically
• Gridifying the application using GridRPC + MPI (a sketch follows below)
• Initial condition: 2 QM regions with 72 + 41 QM atoms; 28,598 atoms in total
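Since the slide names GridRPC + MPI as the gridification approach, a minimal sketch of how that layering typically looks is given below. It assumes the standard GridRPC C API (as implemented by AIST's Ninf-G); the remote entry point "qm/qm_calc", the argument layout, and the fixed-size bookkeeping arrays are hypothetical illustrations, not the actual simulation code.

/*
 * Sketch: an MD client drives per-region QM solvers through GridRPC.
 * Each remote "qm/qm_calc" executable is itself an MPI program
 * running on one cluster of the testbed.
 */
#include <grpc.h>

#define MAX_QM_REGIONS 16
#define N_STEPS        1000

int main(int argc, char *argv[])
{
    grpc_function_handle_t handles[MAX_QM_REGIONS];
    grpc_sessionid_t       sessions[MAX_QM_REGIONS];
    int     n_atoms[MAX_QM_REGIONS];   /* per-region QM atom counts       */
    double *coords[MAX_QM_REGIONS];    /* filled by the MD part (omitted) */
    double *forces[MAX_QM_REGIONS];    /* returned by the QM servers      */
    int n_regions = 2;                 /* may change during the run       */
    int step, r;

    if (grpc_initialize(argv[1]) != GRPC_NO_ERROR)  /* client config file */
        return 1;

    /* One function handle per QM region. */
    for (r = 0; r < n_regions; r++)
        grpc_function_handle_default(&handles[r], "qm/qm_calc");

    for (step = 0; step < N_STEPS; step++) {
        /* Launch all QM force calculations asynchronously ... */
        for (r = 0; r < n_regions; r++)
            grpc_call_async(&handles[r], &sessions[r],
                            n_atoms[r], coords[r], forces[r]);

        /* ... run the classical MD part here, then synchronize. */
        grpc_wait_all();

        /* Integrate the timestep; when QM regions split or merge,
         * destruct and re-create handles so n_regions can change. */
    }

    for (r = 0; r < n_regions; r++)
        grpc_function_handle_destruct(&handles[r]);
    grpc_finalize();
    return 0;
}

This separation is what gives the flexibility claimed above: the number of handles (and the clusters behind them) can change between timesteps without touching the MPI code inside each QM solver.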
Testbed for the Friction Simulation
• Used 11 clusters with a total of 632 CPUs across 8 organizations
• PRAGMA clusters
  • SDSC (32 CPUs), KU (8 CPUs), NCSA (8 CPUs), NCHC (8 CPUs)
  • Titech-1 (8 CPUs), AIST (8 CPUs)
• AIST Super Cluster
  • M64 (128 CPUs), F32-1 (128 CPUs + 128 CPUs)
• Japan clusters
  • U-Tokyo (128 CPUs), Tokushima-U (32 CPUs), Titech-2 (16 CPUs)
[Map: geographic distribution of the testbed sites]
Result of the Friction Simulation
• Experiment time: 52.5 days
• Longest calculation time: 22 days
• Manual restarts: 2 times
• Execution failures: 165 times
  • Succeeded in recovering from these failures (a sketch of the recovery logic follows below)
• Changed the number of CPUs used: 18 times
  • Succeeded in adjusting the number of CPUs to the number of QM regions/QM atoms
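Recovering from 165 failures without losing the run implies retry logic in the client. The following is a hedged sketch of what such per-call recovery can look like, assuming failures surface as non-GRPC_NO_ERROR return codes; the fallback host list, the entry point "qm/qm_calc", and the argument layout are hypothetical, not the actual recovery code.

/*
 * Sketch: retry a failed GridRPC call on another cluster.
 * Hypothetical host names; error detection via GridRPC return codes.
 */
#include <grpc.h>

static char *fallback_hosts[] = {
    "qm0.example.org", "qm1.example.org", "qm2.example.org"
};
static const int n_hosts = 3;

/* Call the QM solver for one region; on failure, discard the dead
 * handle, re-bind it to the next cluster, and retry the same step.
 * Returns 0 on success, -1 once every cluster has failed. */
int qm_call_with_retry(grpc_function_handle_t *h, int *next_host,
                       int n_atoms, double *coords, double *forces)
{
    int attempt;

    for (attempt = 0; attempt <= n_hosts; attempt++) {
        if (grpc_call(h, n_atoms, coords, forces) == GRPC_NO_ERROR)
            return 0;                           /* result arrived */

        /* Server or network failure: release the handle and point
         * it at another cluster before retrying. */
        grpc_function_handle_destruct(h);
        grpc_function_handle_init(h, fallback_hosts[*next_host],
                                  "qm/qm_calc");
        *next_host = (*next_host + 1) % n_hosts;
    }
    return -1;          /* exhausted the list: manual restart needed */
}

Hiding retries inside the client this way lets transient cluster failures be absorbed without restarting the whole 52.5-day experiment; only situations such logic cannot handle would need the manual restarts reported above.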
Summary and future work
• Our approach is effective for running large-scale grid applications for a long time
• More grid services are needed
  • Getting information on available resources
  • Resource reservation
  • Coordination with resource managers/schedulers
• A "cleaner" MPI is needed
  • mpich quits leaving processes and IPC resources behind
  • Using GridMPI in place of mpich