Two Types of Supercomputer developments. Yutaka Ishikawa RIKEN AICS University of Tokyo. Session 2: Deployed Ecosystems and Roadmaps for the Future. Smoky Mountains Computational Sciences and Engineering Conference. http://computing.ornl.gov/workshops/exascale14/. Supercomputers in Japan.
Supercomputers in Japan (as of Jun 2012)
• Flagship machine: K Computer (RIKEN)
• 9 universities and national laboratories operate the other machines
• HPCI (High Performance Computing Infrastructure) is formed from those machines, called leading machines
• Features: single sign-on; shared storage (distributed file system)
• Each supercomputer center has one, two, or more supercomputers.
• Each supercomputer center replaces its machines every 4.5 to 6 years.
Procurement Policies in Supercomputer Centers
• Flagship-Aligned Commercial Machine (FAC)
  • Acquiring a machine whose architecture is the same as that of the flagship machine.
• Complementary Function Leading Machine (CFL-M, CFL-D)
  • Acquiring a machine whose architecture is different from that of the flagship machine, e.g. a vector machine.
  • CFL-M: a commercial machine provided by a vendor
  • CFL-D: a new machine developed jointly by a vendor and a supercomputer center.
• Upscale Commodity Cluster Machine (UCC)
  • Acquiring a large-scale commodity cluster
• Technology Path-Forward Machine (TPF)
  • Design and development of a future advanced machine
Supercomputer Centers located at Japanese Universities
(Roadmap figure: current and planned systems per center. Entries are listed in the order they appear in the figure text.)
• 100 PF 2MW (CFL-M/TPF+UCC)
• 10+ PF (CFL-M/TPF + UCC) 1.5 MW
• Hitachi SR16000/M1 (172 TF, 22TB)
• Cloud System Hitachi BS2000 (44TF, 14TB)
• ~1PF, ~1PB/s (CFL-M) ~2MW
• NEC SX-9 + Exp5800 (31TF)
• 30+PF, 30+PB/s (CFL-D) ~5.5MW (max)
• HA-PACS (800 TF)
• -50 PF (TPF) 2MW
• (Manycore system) (700+ TF)
• Post T2K -- 30 PF (UCC + TPF) 4MW
• 100+ PF (UCC + TPC) 4MW
• T2K Todai (140 TF)
• Fujitsu FX10 (1PFlops, 150TiB, 408 TB/s), Hitachi SR16000/M1 (54.9 TF, 10.9 TiB, 5.376 TB/s)
• 50+ PF (FAC) 3MW
• Tsubame 2.0 (2.4PF, 97TB, 744 TB/s) 1.8MW
• Tsubame 2.5 (5.7 PF, 110+ TB, 1160 TB/s), 1.8MW
• Tsubame 3.0 (20~30 PF, 2~6PB/s) 1.8MW (Max 3MW)
• Tsubame 4.0 (100~200 PF, 20~40PB/s), 2.3~1.8MW (Max 3MW)
• Fujitsu M9000 (3.8TF, 1TB/s), HX600 (25.6TF, 6.6TB/s), FX1 (30.7TF, 30 TB/s)
• 50-100 Pflops (FAC + UCC)
• 100~200 PF (FAC/TPF + UCC)
• Fujitsu FX10 (90.8TF, 31.8 TB/s), CX400 (470.6TF, 55 TB/s)
• Upgrade (3.6PF) 3MW 4MW
• Cray XE6 (300TF, 92.6TB/s), GreenBlade8000 (243TF, 61.5 TB/s)
• 6-10 PF (FAC/TPF + UCC) 1.8 MW
• 100+ PF (FAC/TPF + UCC) 1.8-2.4 MW
• Cray XC30 (400TF) 600TF
• SX-8 + SX-9 (21.7 TF, 3.3 TB, 50.4 TB/s)
• 500+ TB/s (CFL-M) 1.2 MW
• 5+ PB/s (TPF) 1.8 MW
• 5-10 PF (FAC)
• Hitachi SR1600 (25TF)
• Hitachi HA8000tc/Xeon Phi (712TF, 242 TB), SR16000 (8.2TF, 6 TB)
• 100-150 PF (FAC/TPF + UCC) 3MW 2.0MW 2.6MW
• Fujitsu FX10 (270TF) + FX10 equivalent (180TF), CX400/GPGPU (766TF, 183 TB)
• 10-20 PF (UCC + TPF)
Towards the Next Flagship Machine
(Figure: the Post K Computer at RIKEN as the flagship; PostT2K at U. of Tsukuba and U. of Tokyo among the 9 universities and national laboratories; T2K at U. of Tsukuba, U. of Tokyo, and Kyoto U.)
• PostT2K is a production system operated by both Tsukuba and Tokyo
• System software and the parallel programming language in PostT2K will be employed as part of Post K's software environment
• Machine resources will be used to develop the system software stack in PostK
PostT2K: Procurement and Development
• Hardware
  • Latest CPU technology is assumed
  • Specifying: node performance, memory capacity/bandwidth, interconnect performance, file I/O performance, storage capacity
• Software
  • Specifying:
    • Operating System (Linux and McKernel)
    • Programming Languages (Fortran, C/C++, XcalableMP)
    • Communication Library (MPI-3)
    • Math Libraries
    • File System
    • Batch Job System
• McKernel
  • Light Weight Microkernel
• XcalableMP
  • Parallel Programming Language
• MPICH with Low-level Communication Facility
Linux + McKernel
• Concerns
  • Reducing memory contention
  • Reducing data movement among cores
  • Providing new memory management
  • Providing fast communication
  • Parallelizing OS functions while achieving less data movement
• New OS mechanisms and APIs are created in both revolutionary and evolutionary ways, then examined and selected
• Linux with Light Weight Micro Kernel
  • IHK (Interface for Heterogeneous Kernel)
    • Loading a kernel into cores
    • Communication between Linux and the kernel
  • McKernel
    • Customizable OS environment
    • E.g. an environment without a CPU scheduler (without timer interrupts)
(Figure: daemons and user processes above the Linux kernel and McKernel, which sit on the Interface for Heterogeneous Kernels over the cores; arrows labeled "System call to LMK" and "System call to Linux".)
• Running on both Xeon and Xeon Phi environments
• IHK and McKernel have been developed at the University of Tokyo and RIKEN with Hitachi, NEC, and Fujitsu
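The split between "System call to LMK" and "System call to Linux" can be pictured as a routing table. The following is only a toy sketch under stated assumptions: the call names, the `route_call` function, and the choice of which calls stay on the lightweight kernel are all hypothetical and are not taken from IHK or McKernel.

```c
/* Toy illustration of routing system calls between a lightweight
 * kernel (LWK) and Linux. All names here are hypothetical; the real
 * IHK/McKernel interfaces are not shown in the slides. */

enum call_id {
    CALL_MMAP,    /* memory management */
    CALL_GETPID,  /* process identity */
    CALL_OPEN,    /* file I/O */
    CALL_WRITE,   /* file I/O */
    CALL_COUNT
};

enum call_route {
    HANDLED_BY_LWK,
    DELEGATED_TO_LINUX
};

/* Assumption: performance-sensitive calls stay on the LWK, while
 * everything else is forwarded to the resident Linux kernel. */
static const enum call_route route_table[CALL_COUNT] = {
    [CALL_MMAP]   = HANDLED_BY_LWK,
    [CALL_GETPID] = HANDLED_BY_LWK,
    [CALL_OPEN]   = DELEGATED_TO_LINUX,
    [CALL_WRITE]  = DELEGATED_TO_LINUX
};

/* Unknown calls fall back to Linux, which has the full call set. */
enum call_route route_call(int id)
{
    if (id < 0 || id >= CALL_COUNT)
        return DELEGATED_TO_LINUX;
    return route_table[id];
}
```

Defaulting unknown calls to Linux keeps the lightweight kernel small in this sketch: it only needs entries for the calls it implements itself.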
PostT2K OS Environment under development
• Linux kernel + McKernel
• Several variations of McKernel are provided for applications
• The Linux kernel is resident, but a McKernel is selectively loaded for each application
(Figure: the Linux kernel is resident throughout. App A on McKernel without CPU scheduler, App B on McKernel with CPU scheduler, App C on McKernel with segmentation, and App D on Linux are each invoked and later finish.)
XcalableMP (XMP) http://www.xcalablemp.org
• What's XcalableMP (XMP for short)?
  • A PGAS programming model and language for distributed memory, proposed by the XMP Spec WG
  • The XMP Spec WG is a special interest group that designs and drafts the specification of the XcalableMP language. It is now organized under the PC Cluster Consortium, Japan. Mainly active in Japan, but open to everybody.
• Project status (as of Nov. 2013)
  • XMP Spec Version 1.2 is available at the XMP site. New features: mixed OpenMP and OpenACC, libraries for collective communications.
  • Reference implementation by U. Tsukuba and RIKEN AICS: Version 0.7 (C and Fortran90) is available for PC clusters, Cray XT, and the K computer. It is a source-to-source compiler that generates code for a runtime on top of MPI and GASNet.
• Language Features
  • Directive-based language extensions for Fortran and C for the PGAS model
  • Global-view programming with global-view distributed data structures for data parallelism
    • SPMD execution model, as in MPI
    • Pragmas for data distribution of global arrays.
    • Work mapping constructs to map work and iterations to data explicitly, with affinity.
    • Rich communication and sync directives such as "gmove" and "shadow".
    • Many concepts are inherited from HPF
  • The co-array feature of CAF is adopted as part of the language spec for local-view programming (also defined in C).
(Figure: code example. XMP provides a global view for data-parallel programs in the PGAS model.)
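The global-view features listed above (data distribution pragmas and work mapping with affinity to data) can be sketched as a small XMP C fragment. This is an illustrative sketch, not the example from the original slide: the array `a`, the template `t`, the node set `p`, and the function `fill_and_sum` are assumed names, and the directive spellings follow the XMP 1.x style, which may differ in other versions of the specification. A plain C compiler ignores the unknown pragmas and runs the code serially.

```c
#define N 16

/* Assumed setup: 4 nodes and a template of N indices distributed
 * in contiguous blocks over those nodes. */
#pragma xmp nodes p(4)
#pragma xmp template t(0:N-1)
#pragma xmp distribute t(block) onto p

/* The global array is aligned with the template, so each node
 * owns the block of elements that matches its part of t. */
static double a[N];
#pragma xmp align a[i] with t(i)

double fill_and_sum(void)
{
    double sum = 0.0;
    int i;

    /* Work mapping: each iteration runs on the node that owns t(i). */
#pragma xmp loop on t(i)
    for (i = 0; i < N; i++)
        a[i] = (double)i;

    /* Same mapping, with the partial sums combined across nodes. */
#pragma xmp loop on t(i) reduction(+:sum)
    for (i = 0; i < N; i++)
        sum += a[i];

    return sum;
}
```

The loops are written over the global index range, and the directives state where data and iterations live. This is the sense in which the program keeps a global view while still executing in SPMD style.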
Roles of PC Cluster Consortium
(Figure: the consortium handles development, maintenance, and promotion of IHK, McKernel, LLC, and XMP; it provides support to PostT2K and PostK, and receives contributions from them and from vendors.)
• The PC Cluster Consortium was established in 2001. Its original mission was to contribute to the PC cluster market through the development, maintenance, and promotion of cluster system software based on the SCore cluster system software and the Omni OpenMP compiler, developed by the Real World Computing Partnership, which was funded by the Japanese government for 10 years from 1992.
• Members: Univ. of Tsukuba, Univ. of Tokyo, Titech, AMD, Intel, Fujitsu, Hitachi, NEC, Cray, …
• Integration of other open sources, e.g., MPICH
• Distribution as open source
• Promotion
International Collaboration between DOE and MEXT
PROJECT ARRANGEMENT UNDER THE IMPLEMENTING ARRANGEMENT BETWEEN THE MINISTRY OF EDUCATION, CULTURE, SPORTS, SCIENCE AND TECHNOLOGY OF JAPAN AND THE DEPARTMENT OF ENERGY OF THE UNITED STATES OF AMERICA CONCERNING COOPERATION IN RESEARCH AND DEVELOPMENT IN ENERGY AND RELATED FIELDS CONCERNING COMPUTER SCIENCE AND SOFTWARE RELATED TO CURRENT AND FUTURE HIGH PERFORMANCE COMPUTING FOR OPEN SCIENTIFIC RESEARCH
Yoshio Kawaguchi (MEXT, Japan) and William Harrod (DOE, USA)
Purpose: Work together where it is mutually beneficial to expand the HPC ecosystem and improve system capability
• Each country will develop its own path for next-generation platforms
• Countries will collaborate where it is mutually beneficial
• Joint Activities
  • Pre-standardization interface coordination
  • Collection and publication of open data
  • Collaborative development of open source software
  • Evaluation and analysis of benchmarks and architectures
  • Standardization of mature technologies
Technical Areas of Cooperation
• Kernel System Programming Interface
• Low-level Communication Layer
• Task and Thread Management to Support Massive Concurrency
• Power Management and Optimization
• Data Staging and Input/Output (I/O) Bottlenecks
• File System and I/O Management
• Improving System and Application Resilience to Chip Failures and other Faults
• Mini-Applications for Exascale Component-Based Performance Modelling
Concluding Remarks
• Ecosystem
  • Co-development of the system software stack for a leading machine (PostT2K) and the flagship machine (PostK)
  • Beneficial to users
    • Continuity of system software and programming language from leading machines to the flagship machine
  • Contribution to the open source community
    • Shared and enhanced by the community
• Schedule
  • PostT2K: Procurement, Software Development, Operation
  • PostK: Basic Design, Design and Implementation, Manufacturing, Installation, and Tuning, Operation
• The overall theme of SMC2014 is "Integration of Computing and Data into Instruments of Science and Engineering".
• Our session is focused on "Deployed Ecosystems and Roadmaps for the Future". We will be focusing on current experiences and challenges in deploying large-scale computing capabilities, and on our plans and expectations for how future systems will be made available to our scientists and engineers.
• Consistent with this topic, we are inviting you to share your vision for how the computational ecosystem may continue to develop to serve the scientific and engineering challenges of the future.
• The three other panels in our conference will focus on "Strategic Science: Drivers of Future Innovation", "Future Architectures to Co-Design for Science", and "Math and Computer Science Challenges for Big Data, Analytics, and Scalable Applications".