
An Introduction to GridMPI

Introduction to the GridMPI project focusing on high-performance communication facilities, protocols, and benchmark evaluations for MPI applications in metropolitan and wide-area network environments.



Presentation Transcript


  1. An Introduction to GridMPI http://www.gridmpi.org/ Yutaka Ishikawa (1,2) and Motohiko Matsuda (2). (1) University of Tokyo; (2) Grid Technology Research Center, AIST (National Institute of Advanced Industrial Science and Technology). This work is partially supported by the NAREGI project. Yutaka Ishikawa, The University of Tokyo

  2. Motivation • MPI, the Message Passing Interface, has been widely used to program parallel applications. • Users want to run such applications in a Grid environment without modifying their programs. • However, the performance of existing MPI implementations does not scale in a Grid environment. [Figure: a single (monolithic) MPI application spanning computing resource sites A and B over a wide-area network]

  3. Motivation • Focus on metropolitan-area, high-bandwidth environments: 10 Gbps, up to about 500 miles (less than 10 ms one-way latency). • We have already demonstrated, using an emulated WAN environment, that the NAS Parallel Benchmark programs scale if the one-way latency is smaller than 10 ms. Motohiko Matsuda, Yutaka Ishikawa, and Tomohiro Kudoh, "Evaluation of MPI Implementations on Grid-connected Clusters using an Emulated WAN Environment," CCGRID2003, 2003. [Figure: a single (monolithic) MPI application spanning computing resource sites A and B over a wide-area network]

  4. High-Performance Communication Facilities for MPI on Long and Fat Networks. Issues: • TCP vs. MPI communication patterns • Network topology • Latency and bandwidth • Interoperability: most MPI library implementations use their own network protocol • Fault tolerance and migration: to survive a site failure • Security [Figure: MPI sites connected over the Internet]

  5. High-Performance Communication Facilities for MPI on Long and Fat Networks. Issues: • TCP vs. MPI communication patterns • Network topology • Latency and bandwidth • Interoperability: many MPI library implementations exist, and most use their own network protocol • Fault tolerance and migration: to survive a site failure • Security [Figure: four sites on the Internet, each using a different vendor's MPI library (Vendors A, B, C, and D)]

  6. GridMPI Features • MPI-2 implementation • IMPI (Interoperable MPI) protocol with extensions for the Grid: MPI-2 support, new collective protocols, and checkpointing • Integration of vendor MPIs: IBM, Solaris, Fujitsu, and MPICH2 • High-performance TCP/IP implementation on long and fat networks: pacing the transmission rate so that burst transmission is controlled according to the MPI communication pattern • Checkpointing [Figure: clusters X and Y (YAMPII, vendor MPI) interconnected via IMPI]
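The pacing idea on slide 6 can be illustrated with a small sketch. This is not GridMPI's implementation (GridMPI paces transmissions inside its TCP/IP handling); it only shows the principle of spacing out chunk transmissions so that the average rate, rather than the burst rate, matches a target. The function `paced_send`, its parameters, and the chunking scheme are all illustrative assumptions:

```python
import time

def paced_send(send_chunk, data, chunk_size, rate_bps):
    """Transmit `data` in chunks of `chunk_size` bytes, inserting gaps
    so the average transmission rate stays near `rate_bps` (bits/s).
    `send_chunk` is any callable that transmits one chunk."""
    gap = chunk_size * 8 / rate_bps  # time budget per chunk, seconds
    for off in range(0, len(data), chunk_size):
        start = time.monotonic()
        send_chunk(data[off:off + chunk_size])
        # Sleep off the remainder of this chunk's time slot, so chunks
        # are spread over the wire instead of leaving as one burst.
        elapsed = time.monotonic() - start
        if elapsed < gap:
            time.sleep(gap - elapsed)
```

Without pacing, a large MPI message leaves the host as a line-rate burst that can overflow bottleneck queues on a long, fat path; spacing the chunks keeps the instantaneous rate close to the average.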

  7. Evaluation • It is almost impossible to reproduce the behavior of communication performance in a wide-area network. • A WAN emulator, GtrcNET-1, is used to systematically examine implementations, protocols, communication algorithms, etc. GtrcNET-1: • developed at AIST (http://www.gtrc.aist.go.jp/gnet/) • injection of delay, jitter, errors, … • traffic monitoring, frame capture • four 1000Base-SX ports, one USB port for the host PC, FPGA (XC2V6000)

  8. Experimental Environment • Two clusters of 8 PCs (Node0 through Node7, Node8 through Node15), each connected to a Catalyst 3750 switch, linked through the GtrcNET-1 WAN emulator • Bandwidth: 1 Gbps • Delay: 0 ms to 10 ms • CPU: Pentium 4 / 2.4 GHz, Memory: DDR400 512 MB • NIC: Intel PRO/1000 (82547EI) • OS: Linux 2.6.9-1.6 (Fedora Core 2) • Socket buffer size: 20 MB
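The 20 MB socket buffer in this setup can be related to the bandwidth-delay product: a TCP connection needs roughly bandwidth × RTT bytes in flight to keep the link full. A small helper (illustrative, not taken from the GridMPI code) makes the arithmetic explicit:

```python
def bdp_bytes(bandwidth_bps, one_way_delay_s):
    """Bandwidth-delay product in bytes: the amount of data that must
    be in flight to keep a link of the given bandwidth busy over one
    round-trip time."""
    rtt_s = 2 * one_way_delay_s
    return bandwidth_bps * rtt_s / 8

# 1 Gbps link with 10 ms one-way delay: 2.5 MB must be in flight,
# so the 20 MB socket buffer used in the experiment has ample headroom.
print(bdp_bytes(1e9, 0.010))  # -> 2500000.0
```

A socket buffer smaller than the bandwidth-delay product would cap TCP throughput below the link rate regardless of the MPI implementation, which is why the buffer size is fixed well above 2.5 MB across all delay settings.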

  9. GridMPI vs. MPICH-G2 (1/4) [Graph: relative performance vs. one-way delay (msec) for FT (Class B) of NAS Parallel Benchmarks 3.2 on 8 x 8 processes]

  10. GridMPI vs. MPICH-G2 (2/4) [Graph: relative performance vs. one-way delay (msec) for IS (Class B) of NAS Parallel Benchmarks 3.2 on 8 x 8 processes]

  11. GridMPI vs. MPICH-G2 (3/4) [Graph: relative performance vs. one-way delay (msec) for LU (Class B) of NAS Parallel Benchmarks 3.2 on 8 x 8 processes]

  12. GridMPI vs. MPICH-G2 (4/4) [Graph: relative performance vs. one-way delay (msec) for NAS Parallel Benchmarks 3.2, Class B, on 8 x 8 processes] No parameters were tuned in GridMPI.

  13. GridMPI on an Actual Network • NAS Parallel Benchmarks run on 16 nodes: an 8-node Pentium-4 2.4 GHz cluster at Tsukuba and an 8-node Pentium-4 2.8 GHz cluster at Akihabara, each connected internally by 1G Ethernet and linked by the JGN2 network (10 Gbps bandwidth, 1.5 msec RTT, 60 km / 40 mi.) • The performance is compared with results on a single 16-node 2.4 GHz cluster and a single 16-node 2.8 GHz cluster [Graph: relative performance of the benchmarks]

  14. Demonstration • Easy installation: download the source, build it, and set up the configuration files • Easy use: compile your MPI application and run it! [Figure: the Tsukuba and Akihabara clusters (Pentium-4 2.4 GHz x 8 and 2.8 GHz x 8, each on 1G Ethernet) linked by the JGN2 network: 10 Gbps bandwidth, 1.5 msec RTT, 60 km (40 mi.)]

  15. NAREGI Software Stack (Beta Ver. 2006) [Diagram: Grid-enabled nano-applications on top of Grid PSE, Grid Visualization, Grid Programming (Grid RPC, Grid MPI), and Grid Workflow; Super Scheduler and Distributed Information Service; Grid VM; data layer (Globus, Condor, UNICORE; OGSA/WSRF); High-Performance & Secure Grid Networking]

  16. GridMPI Current Status http://www.gridmpi.org/ • GridMPI version 0.9 has been released • MPI-1.2 features are fully supported • MPI-2.0 features are supported except for MPI-IO and one-sided communication primitives • Conformance tests: MPICH Test Suite 0/142 (fails/tests), Intel Test Suite 0/493 (fails/tests) • GridMPI version 1.0, with full MPI-2.0 support, will be released this spring

  17. Concluding Remarks • GridMPI is integrated into the NAREGI package. • GridMPI is not only for production use but also serves as our research vehicle for the Grid environment, in the sense that new Grid ideas are implemented and tested in it. • We are currently studying high-performance communication mechanisms for long and fat networks: • Modifications of TCP behavior: M. Matsuda, T. Kudoh, Y. Kodama, R. Takano, and Y. Ishikawa, "TCP Adaptation for MPI on Long-and-Fat Networks," IEEE Cluster 2005, 2005. • Precise software pacing: R. Takano, T. Kudoh, Y. Kodama, M. Matsuda, H. Tezuka, and Y. Ishikawa, "Design and Evaluation of Precise Software Pacing Mechanisms for Fast Long-Distance Networks," PFLDnet2005, 2005. • Collective communication algorithms with respect to network latency and bandwidth.
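As one illustration of what "collective communication algorithms with respect to network latency and bandwidth" can mean, a broadcast may be organized hierarchically so the high-latency wide-area link is crossed only once per remote site. The sketch below is a hypothetical model, not GridMPI's actual algorithm (those are described in the cited papers); `site_of` and `send` are stand-ins for topology information and point-to-point messaging:

```python
def hierarchical_bcast(root, ranks, site_of, send):
    """Broadcast from `root` to all `ranks`, crossing the wide-area
    link once per remote site: the root sends to one leader per site,
    and each leader relays the message within its own site.
    `site_of(rank)` maps a rank to its site; `send(src, dst)` models
    one point-to-point message."""
    leaders = {}
    for r in ranks:
        leaders.setdefault(site_of(r), r)
    leaders[site_of(root)] = root  # the root leads its own site
    for site, leader in leaders.items():
        if leader != root:
            send(root, leader)  # at most one hop into each other site
    for r in ranks:
        leader = leaders[site_of(r)]
        if r != leader:
            send(leader, r)     # intra-site fan-out, low latency
```

With two 8-process sites, a naive linear broadcast from rank 0 would cross the WAN eight times, while this scheme pays the wide-area latency once and does the remaining fan-out over the local network.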

  18. BACKUP

  19. GridMPI Version 1.0 • YAMPII, developed at the University of Tokyo, is used as the core implementation • Intra-cluster communication by YAMPII (TCP/IP, SCore) • Inter-cluster communication by IMPI (TCP/IP) [Diagram of the protocol stack: MPI API; IMPI; LACT layer (collectives); Request Interface and Request Layer; RPIM interface (ssh, rsh, SCore, Globus); P2P interface over IMPI TCP/IP, vendor MPI, PMv2, MX, and O2G]

  20. GridMPI vs. Others (1/2) [Graph: relative performance vs. one-way delay (msec) for NAS Parallel Benchmarks 3.2, Class B, on 8 x 8 processes]

  21. GridMPI vs. Others (1/2) [Graph: relative performance for NAS Parallel Benchmarks 3.2, Class B, on 8 x 8 processes]

  22. GridMPI vs. Others (2/2) [Graph: relative performance for NAS Parallel Benchmarks 3.2, Class B, on 8 x 8 processes]

  23. GridMPI vs. Others [Graph: relative performance for NAS Parallel Benchmarks 3.2 on 16 x 16 processes]

  24. GridMPI vs. Others [Graph: relative performance for NAS Parallel Benchmarks 3.2]
