
Resolution of large symmetric eigenproblems on a world-wide grid



  1. Resolution of large symmetric eigenproblems on a world-wide grid • Laurent Choy, Serge Petiton, Mitsuhisa Sato • CNRS/LIFL • HPCS Lab., University of Tsukuba • 2nd NEGST workshop, Tokyo, May 28-29th, 2007

  2. Outline • Introduction • Distribution of the numerical method • Experiments • Experiments on world-wide grids: platforms, numerical settings • Experiments on Grid'5000: motivations, platforms, numerical settings • Results • YML • Progress of YML • YvetteML workflow of the real symmetric eigenproblem • First experiments • Conclusion

  3. Outline • Introduction • Distribution of the numerical method • Experiments • Experiments on world-wide grids: platforms, numerical settings • Experiments on Grid'5000: motivations, platforms, numerical settings • Results • YML • Progress of YML • YvetteML workflow of the real symmetric eigenproblem • First experiments • Conclusion

  4. Introduction • Huge number of nodes connected to the Internet • Clusters and NOWs of institutions, PCs of individual users • Volunteer computing • Constant availability of nodes, on-demand access • HPC and large Grid Computing are complementary • We do not target the highest performance • We target a different community of users • Why the real symmetric eigenproblem? • It requires a lot of resources on the nodes • Communications, synchronization points • A useful problem • Few similar studies for very large Grid Computing

  5. Outline • Introduction • Distribution of the numerical method • Experiments • Experiments on world-wide grids: platforms, numerical settings • Experiments on Grid'5000: motivations, platforms, numerical settings • Results • YML • Progress of YML • YvetteML workflow of the real symmetric eigenproblem • First experiments • Conclusion

  6. Distribution of the numerical method (1/2) • Real symmetric eigenproblem • Au = λu, A real symmetric • Main steps: • Lanczos tridiagonalization • T = QᵀAQ, T real symmetric tridiagonal • Data accessed by means of matrix-vector products (MVP) • Bisection and inverse iteration • Tv = λv, the eigenvalues of T are the Ritz values (approximate eigenvalues of A) • Communication-free parallelism: task-farming • Ritz eigenvector computation (u = Qv) • Accuracy tests: ||Au - λu||_2 < eps
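
To make these steps concrete, here is a minimal sequential NumPy/SciPy sketch of the same pipeline (Lanczos tridiagonalization, eigendecomposition of T, Ritz vectors, residual test). It is illustrative only: the function names are not from the talk, SciPy's eigh_tridiagonal stands in for the distributed bisection + inverse iteration, and the restarted, out-of-core, data-persistent aspects of the real solver are omitted.

```python
# Minimal sequential sketch of the eigensolver pipeline (NumPy/SciPy).
# Illustrative only: the solver described in the talk is distributed,
# restarted and out-of-core.
import numpy as np
from scipy.linalg import eigh_tridiagonal

def lanczos(A, m):
    """m-step Lanczos tridiagonalization: returns Q (n x m) and the
    diagonals (alpha, beta) of T = Q^T A Q."""
    n = A.shape[0]
    Q = np.zeros((n, m))
    alpha, beta = np.zeros(m), np.zeros(m - 1)
    q = np.random.rand(n)
    q /= np.linalg.norm(q)
    for j in range(m):
        Q[:, j] = q
        w = A @ q                                   # matrix-vector product (MVP)
        alpha[j] = q @ w
        w -= alpha[j] * q
        if j > 0:
            w -= beta[j - 1] * Q[:, j - 1]
        w -= Q[:, :j + 1] @ (Q[:, :j + 1].T @ w)    # reorthogonalization
        if j < m - 1:
            beta[j] = np.linalg.norm(w)
            q = w / beta[j]
    return Q, alpha, beta

def ritz_pairs(A, m=20, eps=1e-8):
    Q, alpha, beta = lanczos(A, m)
    # Eigenpairs of T; SciPy's direct tridiagonal solver stands in here
    # for the bisection + inverse iteration used in the talk.
    theta, V = eigh_tridiagonal(alpha, beta)
    U = Q @ V                                       # Ritz eigenvectors of A
    # Accuracy test: ||A u - lambda u||_2 < eps
    ok = [np.linalg.norm(A @ U[:, i] - theta[i] * U[:, i]) < eps
          for i in range(m)]
    return theta, U, ok
```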

  7. Distribution of the numerical method (2/2) • Reducing the memory usage • Out-of-core • Restarted scheme • Reorthogonalization • Bisection, inverse iteration • This also reduces the disk usage • Reducing the volume of communications • Data persistence (A and Q) • Reducing the number of communications • Task-farming • Other issue to be improved • Distribution of A
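
As an illustration of the communication-free task-farming used for the bisection phase, the sketch below splits an enclosing interval of the spectrum of T into disjoint sub-intervals, one per worker. The Gershgorin-based splitting and the function names are assumptions for illustration, not the scheduler actually used in this work.

```python
# Illustrative task-farming decomposition for bisection on the symmetric
# tridiagonal matrix T (diagonal alpha, off-diagonal beta).
import numpy as np

def gershgorin_interval(alpha, beta):
    """Enclosing interval [lo, hi] for all eigenvalues of the symmetric
    tridiagonal matrix with diagonal alpha and off-diagonal beta."""
    alpha = np.asarray(alpha, dtype=float)
    b = np.concatenate(([0.0], np.abs(beta), [0.0]))
    radius = b[:-1] + b[1:]            # |beta[i-1]| + |beta[i]| for row i
    return float(np.min(alpha - radius)), float(np.max(alpha + radius))

def make_bisection_tasks(alpha, beta, n_workers):
    """Split the spectrum enclosure into disjoint sub-intervals, one
    independent (communication-free) task per worker."""
    lo, hi = gershgorin_interval(alpha, beta)
    edges = np.linspace(lo, hi, n_workers + 1)
    return [(edges[i], edges[i + 1]) for i in range(n_workers)]
```

Each worker would then run bisection on its own sub-interval and inverse iteration for the eigenvalues it finds, without communicating with the other workers, which is what makes this phase so cheap in the timings reported later.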

  8. Outline • Introduction • Distribution of the numerical method • Experiments • Experiments on world-wide grids: platforms, numerical settings • Experiments on Grid'5000: motivations, platforms, numerical settings • Results • YML • Progress of YML • YvetteML workflow of the real symmetric eigenproblem • First experiments • Conclusion

  9. World-wide grid experiments: Experimental platforms, numerical settings (1/2) • Computing and network resources • University of Tsukuba • Homogeneous dedicated clusters • Dual Xeon ~3 GHz, 1 to 4 GB RAM • University of Lille 1 • Heterogeneous NOWs • Celeron 1.4 GHz to P4 3.2 GHz • 128 MB to 1 GB RAM • Shared with students • Internet

  10. World-wide grid experiments: Experimental platforms, numerical settings (2/2) • 4 platforms • OmniRPC • 2 local platforms: 29 / 58 nodes, Lille • 2 world-wide platforms • 58 nodes (29 Lille + 29 Tsukuba, dual-proc.) • 116 nodes (58 Lille + 58 Tsukuba, dual-proc.) • Matrix • N=47792 • 2.5 million non-zero elements, avg 48 nnz/row • Parameters • m=10, 15, 20, 25 • k=1, 2, 3, 4

  11. Grid'5000 experiments: Presentation, motivations • Up to 9 sites distributed across France • Dedicated PCs with a reservation policy • Fast, dedicated network • RENATER (1 Gbit/s to 10 Gbit/s) • PCs are homogeneous (with few exceptions) • Homogeneous environment (deployment strategy) • For these experiments • Orsay: up to 300 single-CPU nodes • Lille: up to 60 single-CPU nodes • Sophia (near Nice): up to 60 dual-CPU nodes • Rennes: up to 70 dual-CPU nodes

  12. Grid'5000 experiments: Platforms and numerical settings (1/2) • Step 1 • Goal: improving on the previous analysis • Platforms • 29 Orsay, single-proc. • 58 Orsay, single-proc. • 58 Lille + Sophia, dual-proc. • 116 Orsay + Sophia, dual-proc. (1 core/proc.) • 116 Orsay + Lille + Sophia, dual-proc. (1 core/proc.) • 1 process per dual-processor node • Numerical settings • Matrix: N=47792, 2.5 million non-zero elements, avg 48 nnz/row • Parameters • m=10, 15, 20, 25 • k=1, 2, 3, 4

  13. Grid'5000 experiments: Platforms and numerical settings (2/2) • Step 2 • Goal: increasing the size of the problem (in progress) • Matrix: N=430128, 193 million elements • 7 OmniRPC relay nodes, 206 CPUs, 3 sites • 11 OmniRPC relay nodes, 412 CPUs, 4 sites • Parameters: k=1, m=15

  14. Outline • Introduction • Distribution of the numerical method • Experiments • Experiments on world-wide grids: platforms, numerical settings • Experiments on Grid'5000: motivations, platforms, numerical settings • Results • YML • Progress of YML • YvetteML workflow of the real symmetric eigenproblem • First experiments • Conclusion

  15. World-wide grid experiments: Results • [Chart of wall-clock times for the four platforms: 29 single-proc. Lille; 58 single-proc. Lille; 58 nodes, single-proc. Lille + dual-proc. Tsukuba (all processors used); 116 nodes, single-proc. Lille + dual-proc. Tsukuba (all processors used)]

  16. Grid'5000 experiments – step 1: Results • [Chart of wall-clock times for the five platforms: 29 single-proc. Orsay; 58 single-proc. Orsay; 58 single-proc. Lille + dual-proc. Sophia (all processors used); 116 single-proc. Orsay + dual-proc. Sophia (all processors used); 116 single-proc. Orsay + single-proc. Lille + dual-proc. Sophia (1 processor used)]

  17. Grid'5000 experiments – step 2: Results • Details for N=430128, m=15, k=1 (wall-clock times in seconds)

                                     206 CPUs    412 CPUs
  Total wall-clock time              10962       13150
  Lanczos tridiagonalization:
    send new column of Q             22          20
    MVP                              10106       12311
    reorthogonalization              129         159
  Bisection + inverse iteration      <1          9
  Ritz eigenvectors                  9           11
  ||Au - λu|| < eps                  691         810

  • Evaluation of the wall-clock time for one MVP with the matrix A
  • In the tridiagonalization: 15 (m) x 5 (number of restarts) = 75 MVPs, i.e. 134 s per MVP (206 CPUs) and 164 s per MVP (412 CPUs)
  • In the tests of convergence: 5 (number of restarts) MVPs, i.e. 138 s per MVP (206 CPUs) and 162 s per MVP (412 CPUs)
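
The per-MVP figures on this slide follow directly from dividing the measured phase times by the stated MVP counts; a quick sanity check in Python:

```python
# Sanity check of the per-MVP times quoted above (all times in seconds)
mvps_tridiag = 15 * 5                     # m = 15 Lanczos steps x 5 restarts
print(10106 / mvps_tridiag)               # ~134.7 s per MVP on 206 CPUs
print(12311 / mvps_tridiag)               # ~164.1 s per MVP on 412 CPUs
mvps_conv = 5                             # 5 MVPs in the convergence tests (one per restart)
print(691 / mvps_conv, 810 / mvps_conv)   # ~138.2 s and 162.0 s per MVP
```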

  18. Outline • Introduction • Distribution of the numerical method • Experiments • Experiments on world-wide grids: platforms, numerical settings • Experiments on Grid'5000: motivations, platforms, numerical settings • Results • YML • Progress of YML • YvetteML workflow of the real symmetric eigenproblem • First experiments • Conclusion

  19. Progress of YML • YML 1.0.5 • Stability, error reporting • Collections of data • out-of-core • Variable lists of parameters • Parameters in/out of the Workflow • Mainly developed at the PRiSM laboratory, University of Versailles • http://yml.prism.uvsq.fr/ • Olivier Delannoy, Nahid Emad

  20. Resolution of the eigenproblem with YML • No data persistence • Future work: binary cache • Re-usability / aggregation of components

  21. Experiments with YML & OmniRPC back-end • [Table: wall-clock times in minutes for the YML + OmniRPC back-end vs. OmniRPC alone, and the resulting overhead in %] • Sources of overhead • No computation in the YvetteML workflow • Scheduler, (un)packing the parameters • Transfers of binaries
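
The overhead column is presumably the relative wall-clock difference between the YML + OmniRPC runs and the plain OmniRPC runs; a minimal sketch under that assumption (the helper name is illustrative, not from the talk):

```python
# Hypothetical helper: overhead of a YML + OmniRPC run relative to the
# corresponding plain OmniRPC run, as a percentage (assumes this is how
# the "Overhead (in %)" column was computed).
def overhead_percent(t_yml_min: float, t_omnirpc_min: float) -> float:
    return 100.0 * (t_yml_min - t_omnirpc_min) / t_omnirpc_min
```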

  22. Outline • Introduction • Distribution of the numerical method • Experiments • Experiments on world-wide grids: platforms, numerical settings • Experiments on Grid'5000: motivations, platforms, numerical settings • Results • YML • Progress of YML • YvetteML workflow of the real symmetric eigenproblem • First experiments • Conclusion

  23. Conclusion (1/3) • Reminder of the scope of this work • Large grid computing and HPC: complementary tools • Used by people who have no access to HPC • Significant computations (size of the problem) • We do not (cannot) target the highest performance • The resources are not dedicated • Slow networks, heterogeneous machines, external perturbations, etc. • Linear algebra problems are useful for many general applications • Differences with HPC and cluster computing • We must not take a "speed-up" approach to the computations • Recommendations to save resources on the nodes

  24. Conclusion (2/3) • We propose • A scalable real symmetric eigensolver for large grids • Next expected limiting factor: disk space, for much larger or very dense matrices • Before implementing the method, key choices must be made • Numerical methods and programming paradigms • Bisection (task-farming) • Restarted scheme (memory and disk) • Out-of-core (memory) • Data persistence (communication) • A new version of YML • Workflow of the eigensolver and re-usable components • In progress

  25. Conclusion (3/3) • Topics of study for the eigensolver • Improving the distribution of A • Testing more matrices • Different kinds of matrices (e.g. sparse, dense) • Larger matrices • Scheduling level: adapting the workload balancing to the heterogeneity of the platforms • Current and future work on YML • Finishing the multi back-end support • Binary cache
