
SC2003 High-Performance Bandwidth Challenge Participant Trans-Pacific Grid Datafarm



Presentation Transcript


  1. SC2003, 18 Nov 2003, Phoenix, US. SC2003 High-Performance Bandwidth Challenge Participant: Trans-Pacific Grid Datafarm. Osamu Tatebe, Hirotaka Ogawa, Yuetsu Kodama, Tomohiro Kudoh, Satoshi Sekiguchi (AIST), Satoshi Matsuoka, Kento Aida (Titech), Taisuke Boku, Mitsuhisa Sato (Univ Tsukuba), Youhei Morita (KEK), Yoshinori Kitatsuji (APAN Tokyo XP), Jim Williams, John Hicks (TransPAC/Indiana Univ)

  2. Points of the challenge
  • Trans-Pacific file replication using Grid Datafarm
  • 6,000 miles, 1.5 TBytes of scientific data (Astronomy and Lattice QCD)
  • Multiple high-speed Trans-Pacific networks: APAN/TransPAC (2.4 Gbps OC-48 POS, 500 Mbps OC-12 ATM), SuperSINET (2.4 Gbps x 2, 1 Gbps available)
  • Jumbo frames
  • Aims at a stable 3.7 Gbps network flow out of 3.9 Gbps (95% efficiency), a record speed for *publicly* available Trans-Pacific links

  3. Background and key technology
  • [Disk I/O performance] Grid Datafarm: a Grid file system with high-performance data-intensive computing support
  • A world-wide virtual file system that federates the local file systems of multiple clusters
  • Provides scalable disk I/O performance for file replication over high-speed network links and for large-scale data-intensive applications
  • Trans-Pacific Grid Datafarm testbed: 5 clusters in Japan, 3 clusters in the US, and 1 cluster in Thailand, providing 70 TBytes of disk capacity and 13 GB/sec of disk I/O performance
  • Supports file replication for fault tolerance and access-concentration avoidance
  • [Efficient utilization of world-wide high-speed networks] GNET-1: a gigabit network testbed
  • Enables stable and efficient Trans-Pacific use of HighSpeed TCP through precise IFG-based rate-controlled flows

  4. Grid Datafarm (1): Gfarm file system, a world-wide virtual file system [CCGrid 2002]
  • Transparent access to dispersed file data in a Grid
  • POSIX I/O APIs, plus native Gfarm APIs for extended file-view semantics and replication
  • Maps the virtual directory tree to physical files via file system metadata
  • Automatic and transparent replica access for fault tolerance and access-concentration avoidance
  [Figure: a virtual directory tree (/grid/ggf/jp/aist/gtrc/file1 ... file4) mapped by file system metadata onto physical files and their replicas in the Gfarm file system]
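  A minimal sketch (Python) of the transparent POSIX-style access described above, assuming the Gfarm namespace is exposed through an ordinary POSIX mount point; the /gfarm mount point and file path are illustrative, not taken from the testbed:

    # Read a file from the Gfarm virtual namespace through plain POSIX I/O.
    # The mount point and path below are hypothetical; replica selection is
    # transparent to the application.
    import os

    GFARM_MOUNT = "/gfarm"                          # hypothetical POSIX mount point
    virtual_path = "grid/ggf/jp/aist/gtrc/file1"    # path in the virtual directory tree

    local_view = os.path.join(GFARM_MOUNT, virtual_path)
    with open(local_view, "rb") as f:
        header = f.read(4096)                       # ordinary POSIX open/read
    print(f"read {len(header)} bytes from {local_view}")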

  5. Grid Datafarm (2): High-performance data access and processing support [CCGrid 2002]
  • World-wide parallel and distributed processing
  • Aggregate of files = superfile
  • Data processing of a superfile = parallel and distributed data processing of its member files
  • Local file view
  • File-affinity scheduling
  [Figure: world-wide parallel and distributed processing over virtual CPUs and the Grid file system; a year of astronomical archival data (a superfile) analyzed as 365 parallel tasks]
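  As a conceptual illustration of file-affinity scheduling, the sketch below assigns each member file of a superfile to a host that already holds a replica, so computation moves to the data; the replica lookup and job launcher are hypothetical stand-ins, not Gfarm API calls:

    from collections import defaultdict

    def replica_hosts(member_file):
        # Stand-in for a gfwhere-style replica lookup; fakes a host mapping here.
        return [f"hpc{hash(member_file) % 3 + 1:02d}.hpcc.jp"]

    def run_on(host, command):
        # Stand-in for the real job launcher; just reports what would run where.
        print(f"[{host}] {command}")

    def schedule_superfile(member_files, analyze_cmd):
        assignments = defaultdict(list)
        for mf in member_files:                      # e.g. 365 daily archive files
            assignments[replica_hosts(mf)[0]].append(mf)
        for host, files in assignments.items():
            for mf in files:
                run_on(host, f"{analyze_cmd} {mf}")  # local I/O, no wide-area read
        return assignments

    schedule_superfile([f"archive/day{d:03d}.fits" for d in range(1, 366)], "analyze")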

  6. Example: Gfarm replication commands
  • gfwhere – inquiry of the replica catalog
  • gfrep – parallel third-party file replication (takes a target domain name or a target host list)

  % gfwhere gfarm:host3
  0: hpc01.hpcc.jp
  1: hpc02.hpcc.jp
  2: hpc03.hpcc.jp
  % gfrep -D apgrid.org gfarm:host3
  % gfwhere gfarm:host3
  0: hpc01.hpcc.jp gfm01.apgrid.org
  1: hpc02.hpcc.jp gfm02.apgrid.org
  2: hpc03.hpcc.jp gfm03.apgrid.org
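  These commands lend themselves to scripting; the Python sketch below replicates a list of Gfarm files to a domain and verifies placement, assuming gfrep and gfwhere behave exactly as shown on the slide (the file list and domain are illustrative):

    import subprocess

    def replicate_to_domain(gfarm_urls, domain="apgrid.org"):
        for url in gfarm_urls:
            # Parallel third-party replication into the target domain.
            subprocess.run(["gfrep", "-D", domain, url], check=True)
            # Ask the replica catalog where the copies now live.
            out = subprocess.run(["gfwhere", url], check=True,
                                 capture_output=True, text=True).stdout
            print(f"{url} replicas:\n{out}")

    replicate_to_domain(["gfarm:host3"])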

  7. Trans-Pacific Grid Datafarm testbed: network and cluster configuration
  • Trans-Pacific theoretical peak: 3.9 Gbps
  • Gfarm disk capacity: 70 TBytes; disk read/write: 13 GB/sec
  [Figure: network and cluster diagram. Links: SuperSINET (10 Gbps backbone, 2.4 Gbps paths, 1 Gbps available), APAN/TransPAC (2.4 Gbps and 500 Mbps OC-12 ATM), Maffin, Tsukuba WAN, and Abilene, running via APAN Tokyo XP, NII, Los Angeles, Chicago, and New York to SC2003 Phoenix. Clusters: Titech (147 nodes, 16 TBytes, 4 GB/sec), AIST (32 nodes, 23.3 TBytes, 2 GB/sec), Univ Tsukuba (10 nodes, 1 TByte, 300 MB/sec), KEK (7 nodes, 3.7 TBytes, 200 MB/sec), two further clusters of 16 nodes, 11.7 TBytes, 1 GB/sec each, plus SDSC, Indiana Univ, and Kasetsart Univ, Thailand]

  8. Scientific data for the Bandwidth Challenge
  • Trans-Pacific file replication of scientific data, for transparent, high-performance, and fault-tolerant access
  • Astronomical Object Survey on Grid Datafarm [HPC Challenge participant]
  • World-wide data analysis over the whole archive
  • 652 GBytes of data observed by the SUBARU telescope
  • N. Yamamoto (AIST)
  • Large configuration data from Lattice QCD
  • Three sets of hundreds of gluon field configurations on a 24^3 x 48 4-D space-time lattice (3 sets x 800 configurations x 364.5 MB = 854.3 GB)
  • Generated by the CP-PACS parallel computer at the Center for Computational Physics, Univ. of Tsukuba (300 Gflops x years of CPU time)
  • [Univ Tsukuba booth]
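  A quick arithmetic check of the volumes quoted above (assuming 1 GB = 1024 MB and 1 TB = 1024 GB, which the slide's 854.3 GB figure implies), also confirming the 1.5 TBytes total mentioned in slide 2:

    qcd_mb = 3 * 800 * 364.5            # three sets x 800 configurations x 364.5 MB
    qcd_gb = qcd_mb / 1024              # about 854.3 GB, matching the slide
    astro_gb = 652                      # SUBARU archive
    total_tb = (qcd_gb + astro_gb) / 1024
    print(f"Lattice QCD: {qcd_gb:.1f} GB, total: {total_tb:.2f} TB")   # roughly 1.5 TBytes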

  9. Packet discarding problem on long fat networks
  • [Problem] Congestion-window-based rate control cannot utilize the maximum performance, because the flow is not stable
  [Figure: a 1 Gbps flow with 200 ms RTT crossing a router onto a 200 Mbps link; traffic shown in 10-sec and 16-msec averages, with packet losses and the packet-loss zone marked]
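  To see why a long fat network is hard for window-based control, consider the bandwidth-delay product under the figure's parameters; this is an illustrative calculation, not part of the slide:

    def bdp_bytes(rate_bps, rtt_s):
        # Data that must be in flight to keep the pipe full.
        return rate_bps * rtt_s / 8

    rtt = 0.200                               # 200 ms Trans-Pacific RTT
    for gbps in (1.0, 3.9):                   # single flow / challenge aggregate
        mb = bdp_bytes(gbps * 1e9, rtt) / 2**20
        print(f"{gbps} Gbps x 200 ms -> {mb:.0f} MB in flight")
    # Roughly 24 MB and 93 MB: after a single loss, standard congestion control
    # needs many 200 ms round trips to regrow such a window, so average
    # throughput collapses.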

  10. A network testbed: GNET-1

  11. IFG-based rate control by GNET-1
  • GNET-1 provides these features:
  • Accommodation of non-stable traffic using a large (16 MB) input buffer
  • Precise traffic shaping by IFG (Inter-Frame Gap) adjustment
  [Figure: GNET-1 placed before the bottleneck shapes the flow; across successive RTTs, no packet loss occurs]
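  The effect of IFG adjustment can be sketched numerically: on Gigabit Ethernet, stretching the inter-frame gap beyond the standard 12 bytes paces every frame and fixes the flow rate without queueing or dropping packets. The back-of-the-envelope model below uses standard Ethernet framing constants and is not GNET-1's actual implementation:

    PREAMBLE = 8      # preamble + SFD, bytes
    MIN_IFG = 12      # standard minimum inter-frame gap, bytes
    LINE_RATE = 1e9   # Gigabit Ethernet, bits per second

    def ifg_for_rate(frame_bytes, target_bps):
        # Gap (bytes) so that one frame occupies exactly frame_bytes*8/target_bps
        # seconds of wire time at LINE_RATE.
        gap = frame_bytes * (LINE_RATE / target_bps - 1) - PREAMBLE
        return max(gap, MIN_IFG)

    # Shape 9000-byte jumbo frames to 900 Mbps on a 1 Gbps link:
    print(f"{ifg_for_rate(9000, 900e6):.0f}-byte IFG")    # about 992 bytes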

  12. Special thanks to • Eguchi Hisashi (Maffin), Kazunori Konishi, Jin Tanaka, Yoshitaka Hattori (APAN), Jun Matsukata (NII), Chris Robb (Abilene), Force10 Networks • PRAGMA, ApGrid, SDSC, Indiana University, Kasetsart University
