Recent Efforts on the Ninf Project and the Asia-Pacific Grid (ApGrid) • Satoshi Matsuoka, Tokyo Inst. Technology/JST, Matsu@is.titech.ac.jp • SEKIGUCHI, Satoshi, Electrotechnical Laboratory, AIST (TACC), MITI, Sekiguchi@etl.go.jp • Several slides are courtesy of Grid people
What is ApGrid? • A meeting point for all Asia-Pacific HPCN researchers doing grid-related work • A communication channel to the Global Grid Forum and other grid communities • A pool for finding international project partners • Not a single-source-funded “project”!
APAN: http://apan.net • [Network map. Regions shown: Japan, South Korea, China, Hong Kong, Thailand, Philippines, Malaysia, Singapore, Indonesia, Australia (ACSys), with connections to North America (STAR TAP), Europe, and Latin America. Labeled links: TransPAC (100 Mbps), Australia-Japan Link (1.5 Mbps Frame Relay). Legend: Exchange Point, Access Point, Current Status, Planned.]
Success 2: Tsukuba Advanced Computing Center (TACC): SC99 HPC Games w/Pittsburgh, Stuttgart, Manchester, etc. • [Figure labels: APAN, TACC]
National Backbones for Japanese Academia
[Network diagram. Sites shown: TACC, RWCP, Osaka, TIT, Waseda, APAN Tokyo, STAR TAP Chicago, vBNS, Australia, nGrid/eGrid partners. Backbones shown: IMnet, WIDE, SINET, TransPAC. Labeled link speeds range from 384Kbps to 155Mbps; TransPAC is labeled 100Mbps and the Australia link 1.5Mbps Frame Relay.]
• TACC: Tsukuba Advanced Computing Center
• Osaka: Osaka University
• RWCP: Real World Computing Partnership
• TIT: Tokyo Institute of Technology
• Waseda: Waseda University
Super SINET • Similar to Internet2 • 10 Gbps backbone interconnecting major Japanese universities • 10/2.4 Gbps link to each university • Collaboration with other national 10 Gbps backbone projects • E.g., the 10 Gbps backbone in the Tsukuba area
ApGrid Locations/Potential Partners
• Japan: AIST/TACC/ETL (National Institute of Advanced Industrial Science and Technology), Tokyo Institute of Technology, Waseda U, Osaka U, Nara Advanced Institute of S & T, HEPL (DataGrid)
• Australia: ANU, Monash U
• United States: PNNL
• Korea (KORDIC, )
• Singapore (NUS)
• Malaysia
• Thailand
• ROC, Hong Kong, Taiwan
• Other APAN members
ApGrid: motivations (1) • Establish a region-wide testbed for global computing (Grid and/or Meta) • Disseminate research activities • Provide an easy-access environment for researchers, students, vendors, etc. • Improve interoperability of existing tools • Serve as a testbed for software development and trials, to evaluate usability and to archive performance numbers • Find demonstrative applications
ApGrid: motivations (2) • Create a community that both competes and collaborates with iGrid and eGrid, for: • Making international collaborations • Supporting and collaborating with network people, e.g., APAN, IMnet, etc. • Attempting to negotiate standardization based on real experience (in the Global Grid Forum) • Also, domestic (intra-country) service • Nation-wide • Several “non-cooperative” network communities • Seeking governmental and/or industrial funding • Campus-wide • Find volunteers among our friends
ApGrid Resources • Gov. Lab. and Univ. Supercomputing centers • MITI TACC, HEPL (DataGrid), etc. • Individual Univ. Lab • TITECH • Waseda Univ. • Nara AIST • Etc.
Network configuration at TACC (AIST Supercomputing Center) • [Diagram labels:] Internet (135Mbps/2.4Gbps) • Firewall • RS6000/SP/128 (200+ GFlops) • SR8000/64 (512 GFlops) • GbE • FE x8 • Clusters
TACC Resources • High Performance Computing System • SR8000, RS/6000 SP, UE10000, etc. • Super Clusters (Alpha Ev56x40, Ev6x256, …) • High Speed Network • Gigabit campus backbone • ATM Megalink national backbone • 15 national laboratories across Japan • High Speed Internet Access • IMnet 135Mbps, STAR TAP 100Mbps via TransPAC/APAN • Highly functional Data Base • RIO DB • Visit http://www.aist.go.jp/RIODB
Hitachi SR8000/64 (sr8k) • PowerPC + PVP + HXB • 64 nodes • 512 Gflops (peak) • 449.7 Gflops (Linpack) • 512GB memory • 2D crossbar network • 1.98TB disks • R&D, parallel program development • + 8-node front end for interactive usage (e.g., Global Computing)
IBM RS/6000 SP (rssp) • Power3 SMP 2CPU, Winterhawk • 128 nodes • 205 Gflops (peak) • 149.36 Gflops (Linpack) • 256GB memory • High-speed switch • 3.3TB (user 873GB) disks • ISV applications’ platform • + 4-way/350MHz P3 x 8 nodes front end, WH-II
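The peak and Linpack figures on the two slides above can be compared as a fraction of peak. The following is only a small arithmetic sketch using the numbers quoted on the slides; the dictionary layout and function name are illustrative and do not come from the talk.

```python
# Peak and Linpack figures (Gflops) as quoted on the SR8000 and RS/6000 SP slides.
systems = {
    "SR8000/64": {"peak": 512.0, "linpack": 449.7},
    "RS/6000 SP": {"peak": 205.0, "linpack": 149.36},
}


def linpack_efficiency(peak_gflops, linpack_gflops):
    """Return Linpack performance as a fraction of theoretical peak."""
    return linpack_gflops / peak_gflops


efficiencies = {
    name: linpack_efficiency(s["peak"], s["linpack"]) for name, s in systems.items()
}

for name, eff in efficiencies.items():
    print(f"{name}: {eff:.1%} of peak")
```

With these inputs the SR8000 reaches roughly 88% of peak on Linpack and the RS/6000 SP roughly 73%.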
Bamboo Alpha Cluster • 256 Alpha EV6 500MHz, 512MB, 256 GFlops peak • Two-stage Gigabit Ethernet switch (Myrinet 2K?) • Special Compact-PCI packaging by Alta Tech. • Linux-based, commodity software (Beowulf) • $5 mil • Production cluster, operational RSN
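The 256 GFlops peak figure is consistent with a simple processors × clock × flops-per-cycle estimate. The sketch below assumes 2 floating-point operations per cycle per EV6 processor; that factor is an assumption added here, not something stated on the slide.

```python
def peak_gflops(cpus, clock_ghz, flops_per_cycle):
    """Theoretical peak = processors x clock rate (GHz) x flops issued per cycle."""
    return cpus * clock_ghz * flops_per_cycle


# 256 processors at 500 MHz; flops_per_cycle=2 is an assumed value.
bamboo_peak = peak_gflops(cpus=256, clock_ghz=0.5, flops_per_cycle=2)
print(bamboo_peak)
```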
TITECH Matsuoka Lab. Grid Clusters • “Very” commodity clusters as Grid resources and research platform • Currently 2 clusters, 192 procs total • 6 clusters, over 400 procs/400 GFlops by 1Q 2001, Gigabit linkage
The PRESTO Grid Clusters at Matsuoka Lab, TITECH for 2000
• Presto I: 64 PII-350, 256MB/node; Linux + RWC Score + our stuff; semi-production, parallel OR algorithm on the Grid
• Presto II: 64 Celeron-900, 512MB/node, multiple interconnect; Grid simulation, HP Java
• Prospero: 64 nodes x 2-proc SMP PIII-824, 640MB, 3-trunked 100Base-T (will be 192 procs RSN w/6TB disks); general-purpose cluster research, Grid simulation, app. runs (incl. MCell over the Pacific)
• Pronto: > 64 Athlon, > 1.1GHz, > 512MB DDR-DRAM, hybrid 1000/100Base-T; semi-production, 1Q2001
• Porto: 32 high-performance notebooks (600MHz Mobile Celeron); plug & play clustering
• Pinto: 16-32 node Alpha cluster; heterogeneous clustering over the Grid
• Total > 400 nodes
Grid Cluster Research at TITECH
• Grid Simulation and Performance Benchmarking
• Cluster Federation w/Grid
• Commodity High-Performance Networking, incl. OpenMP (w/RWCP)
• Fault Tolerance and Security
• Dynamic Plug&Play Cluster
• Downloadable Self-tuning Java Libs and Apps
• Applications: Operation Research/Control, Netsolve, MCell run, Resource (w/UCSD)
• Java/Jini-based Grid & Cluster computing: Migratory Code, Jini-based Cluster Grid Services, Resource Publication and Resource Discovery
• JiPANG: Jini-based Grid Portals Architecture (w/UTK)
• Performance Portability: High-Performance Portable Java DSM
• Open-ended, downloadable JIT compiler: the OpenJIT Proj (w/Fujitsu)
Bricks Grid Simulator (HPDC’99) • Consists of a simulated Global Computing Environment and a Scheduling Unit • Allows simulation, driven by the Bricks script, of various behaviors of: • resource scheduling algorithms • programming modules for scheduling • network topology of clients and servers • processing schemes for networks and servers (various queuing schemes) • Makes benchmarks of existing global scheduling components available
The Bricks Architecture • Scheduling Unit: NetworkMonitor, ServerMonitor, ResourceDB, Predictor (NetworkPredictor, ServerPredictor), Scheduler • Global Computing Environment: Client, Network, Server, Network
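To make the split between a simulated environment and a pluggable scheduling unit concrete, here is a toy sketch in Python. It is not Bricks and does not use the Bricks script: the `Server`/`Job` classes, the FCFS queue model, the two policies, and all numbers are hypothetical, chosen only to show how different scheduling algorithms can be compared on the same simulated clients, network, and servers.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    gflops: float             # compute speed (hypothetical)
    mbps: float               # client-to-server bandwidth (hypothetical)
    busy_until: float = 0.0   # time at which the FCFS queue drains


@dataclass
class Job:
    arrival: float  # submission time in seconds
    gflop: float    # amount of computation
    mbit: float     # amount of data shipped to the server


def finish_time(server, job):
    """Predicted completion: ship the data, wait for the FCFS queue, then compute."""
    data_ready = job.arrival + job.mbit / server.mbps
    start = max(data_ready, server.busy_until)
    return start + job.gflop / server.gflops


def round_robin(servers, job, index):
    """Baseline policy: ignore predictions and cycle through the servers."""
    return servers[index % len(servers)]


def earliest_finish(servers, job, index):
    """Prediction-based policy: pick the server with the earliest predicted finish."""
    return min(servers, key=lambda s: finish_time(s, job))


def make_servers():
    return [Server("fast", gflops=10.0, mbps=10.0),
            Server("slow", gflops=1.0, mbps=100.0)]


def simulate(jobs, pick):
    """Run all jobs through fresh servers and return the mean response time."""
    servers = make_servers()
    total = 0.0
    for index, job in enumerate(jobs):
        server = pick(servers, job, index)
        done = finish_time(server, job)
        server.busy_until = done
        total += done - job.arrival
    return total / len(jobs)


jobs = [Job(arrival=float(t), gflop=10.0, mbit=10.0) for t in range(4)]
rr_mean = simulate(jobs, round_robin)
ef_mean = simulate(jobs, earliest_finish)
print(f"round robin: {rr_mean:.2f}s, earliest finish: {ef_mean:.2f}s")
```

In this toy setup the prediction-based policy keeps every job on the fast server, so its mean response time is much lower than round robin's. Real evaluations would of course depend on the monitored network and server behavior.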
Applications on PRESTO Clusters: Op. Research • SCRM (Generalized Quadratic Optimization Algorithm) • Iterative execution of multiple SDP solvers w/Ninf via Master-Worker • Some problems show a 100-fold speedup on 128 procs (exec. time world record) • Other difficult OR problems also look very positive -> larger executions on Cluster Federation resources
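The iterative master-worker pattern mentioned above can be sketched as follows. This is a generic illustration only: it uses a local thread pool instead of Ninf remote calls, and a toy one-dimensional interval search instead of SCRM or an SDP solver. The names `objective`, `worker`, and `master` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def objective(x):
    """Toy objective standing in for an expensive solver call."""
    return (x - 2.0) ** 2


def worker(interval):
    """Evaluate one subproblem: score an interval by its midpoint."""
    lo, hi = interval
    mid = (lo + hi) / 2.0
    return objective(mid), interval


def master(lo, hi, workers=4, rounds=20):
    """Each round: split the interval, farm out the pieces, keep the best one."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(rounds):
            width = (hi - lo) / workers
            tasks = [(lo + i * width, lo + (i + 1) * width) for i in range(workers)]
            results = list(pool.map(worker, tasks))
            _, (lo, hi) = min(results, key=lambda r: r[0])
    return (lo + hi) / 2.0


best_x = master(0.0, 10.0)
print(best_x)
```

The master only needs the workers' results at the end of each round, which is what makes this pattern a natural fit for loosely coupled Grid resources.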
Titanium Terascale Grid Cluster • Proposal for a 10TF-scale “commodity” cluster at the TITECH computing center • 2 x 500 Itanium-class “commodity” clusters on two TITECH campuses • Interconnect via 2.4 Gbps WAN • Campus-wide usage with Grid software • Centerpiece of Grid infrastructure within the TITECH campus • ApGrid and Global Grid collaboration • 2002-3? w/restructuring of computing center
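Titanium Cluster Overview • Goal: construct as “cheaply” as possible • Semi-reliable service • Use Grid technology to federate and manage the clusters
[Diagram labels, translated from Japanese:] • Titanium cluster unit 1: 1024 processors, 100TB storage • OS/Grid cluster middleware • On-campus Grid users • High-speed wireless LAN (AP): classrooms, laboratories, etc. • Campus Gigabit LAN • ImmersaDesk • Free access for on-campus users to distributed Grid resources • To Grid infrastructure inside and outside Japan (NPACI/Alliance/IPG, J-Grid, E-Grid, etc.) • Ookayama campus • Ookayama ⇔ Nagatsuta link: 2.4G - 10Gbps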
ApGrid: Services (1) • Grid computing service • Deploy major grid software packages ready to use • Ninf v.2.0 (another talk) • Globus, Netsolve, NWS, Nimrod, Condor, Legion, etc. • MPICH/G(2), PACX-MPI, Harness, etc. • System resources • US220R x 2CPU x 4 from ETL • ORIGIN 2000/16CPU, J90/16CPU, CS6400/64 • SR8000/8 nodes, WH-II 8 nodes • Clusters (Pentium, Alpha), etc. in many places
[Architecture diagram. Labels: lapack.ApGrid.org, murata.ApGrid.org, lapack.eGrid.org • 3-DNS • Res DB (“lapack”, “dgesv”, .., ..) • 150.29.219.128 (VIP) • BIG/IP • package, routine • Selector/scheduler • hpcc.gr.jp 192.50.75.0/24 • ninf.org 150.29.218.0/23 • Ninf 2.0/Netsolve etc.]
• Simpler architecture than the Metaserver
• Limit the # of servers
• Load balancing with L4 switch technology
• Central administration of servers and DB
• Transactions
• Different VIP per package, e.g. linpack.apgrid.org
• Grouping of libraries via VIP
• VIP expands the URL to the address of the appropriate server
ASP-Like ApGrid Ninf Service • Simpler architecture than the Ninf Metaserver • Limit the # of known servers • Load balancing with L4 switch technology • Central administration of servers and DB • Transaction support • Resource access and load balancing w/VIP • Different VIP per package, e.g. linpack.apgrid.org • Grouping of libraries via VIP • VIP expands the URL to the address of the appropriate server • Ninf 2.0/Netsolve etc.
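The idea of one virtual name per package, backed by a resource database and a selector, can be sketched roughly as below. This is a hypothetical illustration, not the ApGrid implementation and not the BIG/IP or 3-DNS API: the server names, the load table, the `dgefa` entry, and the least-loaded rule are all assumptions. Only the `("lapack", "dgesv")` key and the `package.apgrid.org` naming pattern are taken from the slides.

```python
# Resource DB keyed by (package, routine); the server names are hypothetical.
RESOURCE_DB = {
    ("lapack", "dgesv"): ["server-a", "server-b"],
    ("linpack", "dgefa"): ["server-c"],
}


def vip_hostname(package, domain="apgrid.org"):
    """Build the per-package virtual host name, e.g. linpack.apgrid.org."""
    return f"{package.lower()}.{domain}"


def select_server(resource_db, load, package, routine):
    """Pick the least-loaded server that offers the requested routine (assumed policy)."""
    candidates = resource_db.get((package, routine))
    if not candidates:
        raise LookupError(f"no server registered for {package}/{routine}")
    return min(candidates, key=lambda name: load.get(name, 0))


current_load = {"server-a": 5, "server-b": 2, "server-c": 0}  # hypothetical load figures
host = vip_hostname("lapack")
chosen = select_server(RESOURCE_DB, current_load, "lapack", "dgesv")
print(host, "->", chosen)
```

Keeping the lookup table and the selection rule in one central place mirrors the "central administration of servers and DB" point above; clients only ever see the per-package name.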
ApGrid: Services (2) • Grid information service • Maintain name servers and databases • ASP-like portal service • Handling users, microeconomics • Grid security support service (plan) • PKI: Public Key Infrastructure • Certificate Authority
ApGrid Information Services / ApGrid Testbed • Resource info • Performance monitoring and archive • Would like to collaborate w/other Grid partners • [Diagram labels:] Virtual/real clients, each with NWS sensors • ApGrid nodes in Japan: ETL/TACC, RWCP, TITECH, Osaka-U, APAN Tokyo • TransPAC 100Mbps • STAR TAP Chicago • US and European partners • ApGrid: Korea, Singapore, Australia, etc.
ApGrid: Current Status • Just kicked off, and some of the resources are ready, but we still need to: • Hire people to install and maintain the regular services initially • Enroll more partners • Reserved: apgrid.org; the Web site will open shortly • Find international partners • Create a much stronger relationship with APAN activities
Summary • Some success stories • Collaboration with application scientists • International collaborations • Osaka-U/UCSD (Globus) • NetSolve/Ninf collaboration • WGCC2000, Grid Forum, metacomputing WS • Government funded several small projects • The Asia-Pacific Grid (ApGrid) • TACC is ready to provide computing resources • National, regional testbed • International collaboration efforts are a MUST!
TACC Overview • Missions • Providing world leadership in advanced computing science and technology through the development and application of computing science and engineering • Organization • Operated directly by MITI/AIST since 1981 • 2 executive, 7 technical, 2 admin + SEs • Annual budget 2,400M JPY (≈20M USD) • Incl. supercomputer rental, SE, network maintenance, electricity, etc. • Collaborative activities with partners • RWCP, Tsukuba Univ., NAL, JAERI, KEK, • HLRS, CSAR, SDSC, UTK, LANL, NIST, ETHZ, ANU...
ITBL is NOT • ApGrid nor Japan Grid nor Tokyo Grid nor Tsukuba Grid nor… • An Infrastructure-oriented project • An Application-oriented project • An Earth Simulator-related project • A successor to RWCP • A Grid project • An internationally collaborative project • A domestically collaborative project • A huge project • A Good project (at least in our opinion) • Then, what is IT? • Nobody really knows (or cares) • And thus its objective must be top secret (even to us) • Probably upgrades several supercomputer boxes (“hakomono”, public-works hardware) and network links (to placate the general contractors)