
High Productivity Computing Systems Program

High Productivity Computing Systems Program. CASC HPC Technology Update. Robert Graybill, March 24, 2005. Outline: High-End Computing University Research Activities; HECURA Status; High Productivity Computing Systems Program; Phase II Update; Vendor Teams; Council on Competitiveness


Presentation Transcript


  1. High Productivity Computing Systems Program CASC HPC Technology Update Robert Graybill March 24, 2005

  2. Outline • High-End Computing University Research Activities • HECURA Status • High Productivity Computing Systems Program • Phase II Update • Vendor Teams • Council on Competitiveness • Productivity Team • Phase III Concept • Other Related Computing Technology Activities

  3. HECURA – High-End Computing University Research Activity Strategy: • Fund universities in high-end computing research targeting Nation’s key long-term needs. Implementation: • Coordinated FY04 Solicitation led by NSA (DARPA $2M Per Year) • $ 1M sent to DOE • $ 1M sent to NSF • Participating Agencies • DARPA, DOE, NSF • Status • DOE FY04/05 – Fund FASTOS • NSF FY04 – Fund Domain Specific Compilation Environment • NSF FY05 – In Process Potential High-End Computing Research Areas

  4. Outline • High End Computing University Research Activities • HECURA Status • High Productivity Computing Systems Program • Phase II Update • Vendor Teams • Council on Competitiveness • Productivity Team • Phase III Concept • Other Related Computing Technology Activities

  5. High Productivity Computing Systems Goal: • Provide a new generation of economically viable high productivity computing systems for the national security and industrial user community (2010) Impact: • Performance (time-to-solution): speedup critical national security applications by a factor of 10X to 40X • Programmability (idea-to-first-solution): reduce cost and time of developing application solutions • Portability (transparency): insulate research and operational application software from system • Robustness (reliability): apply all known techniques to protect against outside attacks, hardware faults, & programming errors HPCS Program Focus Areas Applications: • Intelligence/surveillance, reconnaissance, cryptanalysis, weapons analysis, airborne contaminant modeling and biotechnology Fill the Critical Technology and Capability Gap Today (late 80’s HPC technology)…..to…..Future (Quantum/Bio Computing)

  6. HPCS Program Phases I - III [Timeline figure, CY02 through CY10] • Phase I: Industry Concept Study (Funded Five) • Phase II: R&D (Funded Three) • Phase III: Full Scale Development (Fund up to Two) • Industry Milestones 1 - 7, with reviews labeled Concept Review, System Design Review, Technology Assessment Review, PDR, and CDR • Products: Productivity Concepts & Metrics, Experimental Productivity Framework, Productivity Framework Baseline, HPCS Intermediate Products, Early Pilot Platforms, Pilot Systems • Productivity Assessment (MIT LL, DOE, DoD, NASA, NSF) • Mission Partners • Procurement Decisions • Legend: Program Reviews, Critical Milestones, Program Procurements

  7. Phase II Program Goals • Phase II Overall Productivity Goals • Execution (sustained performance) – 1 Petaflop/sec (scalable to greater than 4 Petaflop/sec). Reference: Functional Workflow 3 • Development – 10X over today’s systems. Reference: Functional Workflows 1, 2, 4, 5 • Productivity Framework • Establish experimental baseline • Evaluate emerging vendor execution and development productivity concepts • Provide a solid reference for evaluation of vendors’ Phase III designs • Provide technical basis for Mission Partner investment in Phase III • Early adoption or phase-in of execution and development metrics by mission partners • Subsystem Performance Indicators (Vendor Generated Goals from Phase I) • 3.2 PB/sec bisection bandwidth • 64,000 GUPS (RandomAccess) • 6.5 PB/sec data streams bandwidth • 2+ PF/s Linpack HPCchallenge Documented and Validated Through Simulations, Experiments, Prototypes, and Analysis

  8. HPCS I/O Challenges • 1 Trillion files in a single file system • 32K file creates per second • 10K metadata operations per second • Needed for Checkpoint/Restart files • Streaming I/O at 30 GB/sec full duplex • Needed for data capture • Support for 30K nodes • Future file systems need low-latency communication An Envelope on HPCS Mission Partner Requirements
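To give a feel for the scale of these I/O targets, the short sketch below does some back-of-envelope arithmetic on the slide's numbers. It assumes K means 1,000, decimal gigabytes, and rates sustained without interruption; none of these conventions is stated on the slide, and the function names are illustrative only.

```python
SECONDS_PER_DAY = 86_400


def days_to_create(total_files: float, creates_per_second: float) -> float:
    """Days needed to create `total_files` at a sustained create rate."""
    return total_files / creates_per_second / SECONDS_PER_DAY


def per_node_mb_per_s(total_gb_per_s: float, nodes: int) -> float:
    """Average streaming share per node in MB/s (assumes 1 GB = 1000 MB)."""
    return total_gb_per_s * 1000 / nodes


if __name__ == "__main__":
    # 1 trillion files at 32K creates/s: roughly 362 days of continuous creates.
    print(f"{days_to_create(1e12, 32_000):.1f} days")
    # 30 GB/s spread evenly over 30K nodes: 1 MB/s per node on average.
    print(f"{per_node_mb_per_s(30, 30_000):.1f} MB/s per node")
```

Under these assumptions, filling a trillion-file namespace at the stated create rate would take about a year.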

  9. Phase II Accomplishments • Unified and mobilized broad government agency buy-in (vision, technical goals, funding and active reviewers) • Driving HPC vendor and industry users’ vision of high-end computing: “To out-compete … We must out-compute!” • Completed Program Milestones 1 - 4 • SDR – Established credible technical baseline, assessed program goals and identified challenges • Technology Assessment Review • Established “Productivity” as a key evaluation criterion rather than only ‘Performance’ through HPCS Productivity Team efforts • Released “execution time” HPCchallenge & in-the-large applications benchmarks • Completed early “Development Time” experiments • Early commercial buy-in: Parallel Matlab Announcement • FY04 HEC-URA awards completed through DOE and NSF • Developed Draft Phase III Strategy

  10. HPCS System Architectures: Cray / Sun / IBM Addressing Time-to-Solution • Experimental Codes • Large Multi-Module Codes • Porting Codes • Running Codes • Administration R&D in New Languages • Chapel (Cray) • X10 (IBM) • Fortress (Sun)

  11. HPCS Vendor Innovations: Non-Proprietary Version • “Super Sized” scaled-up HPC development environments, runtime software, file I/O and streaming I/O to support 10K to 100K processors • Intelligent continuous processing optimization (CPO) • Application optimized configurable heterogeneous computing • Workflow based productivity analysis • High bandwidth module/cabinet interconnect fabric • Capacitive proximity chip/module interconnect – Breaks bandwidth cost/performance barriers • Developed prototype high productivity languages • On track for 10X improvement in HPC productivity HPCS Disruptive Technology Will Result in Revolutionary HPC Industry Products in 2010 HPCS Technology Has Already Impacted Vendors’ 2006/2007 Products

  12. Near Term Meetings • Petascale Applications Workshop • March 22-23, Chicago – Argonne National Lab • Next HPCS Productivity Team/Task Group meeting • June 28-30, 2005, Reston, VA (General Productivity session & individual team meetings) • Second Annual Council on Competitiveness Conference - HPC: Supercharging U.S. Innovation and Competitiveness • July 13, 2005, Washington, DC • Milestone V Industry Reviews (Two days) • Week of July 25th (Sun, Cray) and August 2, 3, or 4 (IBM) • Standard Review plus special emphasis on Productivity

  13. Outline • High-End Computing University Research Activities • HECURA Status • High Productivity Computing Systems Program • Phase II Update • Vendor Teams • Council on Competitiveness • Productivity Team • Phase III Concept • Other Related Computing Technology Activities

  14. HPC Industrial Users Survey: Top-Level Findings • High Performance Computing Is Essential to Business Survival • Companies Are Realizing a Range of Financial and Business Benefits from Using HPC • Companies Are Failing to Use HPC as Aggressively as They Could Be • Business and Technical Barriers Are Inhibiting the Use of Supercomputing • Dramatically More Powerful and Easier-to-Use Computers Would Deliver Strategic, Competitive Benefits

  15. Blue-Collar Computing [Figure: Current Market for HPC versus Ideal Market for HPC] • Axes: Number of Tasks, Number of Users, Number of Applications; Amount of Computing Power, Storage, & Capability; # of Dollars • Regions: Heroes (DoD, NSF, DoE), Blue-Collar HPC, Easy Pickings • Other labels: Programmer Productivity, Competitive Necessity, Business ROI, Increased Gains in Scientific Discovery, Increased Productivity Gains in Industry and Engineering

  16. HPC ISV Phase I Survey: Early Findings – Results in July 05 • ISV packages by category: Biosciences 66; CAE 112; Chemistry 30; Climate 2; *DCC&D 1; EDA 21; Financial 7; General Science 105; *General Visualization 6; Geosciences 21; *Middleware 79; Weather 3; *Unknown 7; Grand Total 460 • So far we have identified 460 ISV packages that are supplied by 279 organizations. • Some are middleware and some may be cut as we refine the data. • Domestic/Foreign Sources will be identified • Issue is that very few of them will scale to peta-scale systems

  17. Productivity Framework [Diagram] • Benchmarks: Kernel, Compact & Full • System Parameters (Examples): BW bytes/flop (Balance), Memory latency, Memory size, …; Processor flop/cycle, Processor integer op/cycle, Bisection BW, …; Size (ft³), Power/rack, Facility operation, …; Code size, Restart time (Reliability), Code Optimization time, … • Other diagram labels: Exe Time Experiments, Exe Interface, Dev Time Experiments, Dev Interface, Common Modeling Interface, Actual System or Model, Productivity Metrics, Productivity (Utility/Cost), Work Flows, Reliability, Portability • Captures major elements that go into evaluating a system • Builds on current HPC acquisition processes

  18. Productivity Team • Sponsors: Bob Graybill (DARPA), Fred Johnson (DOE SC) • Mission Partners: NSA, NRO, DOE, HPCMO, NASA, NSF • Vendor Productivity POCs: David Mizell (Cray), Larry Votta (Sun), TBD (IBM) • Productivity Team Mgmt.: Jeremy Kepner (Lincoln), Bob Lucas (ISI) • Execution Time Models, Bob Lucas (ISI): Cray(2), Sun, IBM, CalTech, UMD, UNM, ISI(3), LANL(2), SDSC, Lincoln(2), MITRE, UMN, ORNL, Sandia • Test & Spec, Ashok Krishnamurthy (OSU): Cray(2), Sun(3), IBM(2), NSA(2), UWisc, UCB, UNM, CodeSourcery, OSU(2), ISI, NRO(2), Instrumental, Lincoln(4), MITRE • High Prod. Lang. Systems: Rusty Lusk (ANL), Hans Zima (JPL), … • Development Experiments, Vic Basili (UMD): Cray(3), Sun(5), IBM(5), ARSC, UDel, Pitt, UCSB(2), UMD(8), MissSt, ISI(3), Vanderbilt(2), Lincoln(4), LLNL, MIT(2), MITRE, NSA(2), PSC, SDSC(2) • Benchmarks, David Koester (MITRE): Cray(2), Sun(6), IBM(3), UIUC(2), UMD(3), UTK(2), UNM, ERDC, GWU, HPCMO, ISI(2), LANL(3), LBL, Lincoln(4), MITRE, UMN, NSA(2), ORNL, OSU, Sandia, SDSC(3) • Existing Code Analysis, Doug Post (LANL): Cray(2), Sun(5), IBM(6), ARL, UMD, Oregon, MissSt, DOE, HPCMO, LANL(5), ISI, Vanderbilt(2), Lincoln(4), ANL, MITRE, NASA, ORNL(2), SAIC, Sandia, NSA • Workflows, Models, Metrics, Jeremy Kepner (Lincoln): Cray(4), Sun(7), IBM(6), ARL, UMD(4), Oregon, MissSt, LANL, ISI, Lincoln(4), MITRE, UMN, NASA(2), DOE

  19. Productivity Research Teams [Productivity Framework diagram from slide 17, annotated with working groups] • Benchmark Working Group Lead: David Koester, MITRE • Test & Spec Working Group Lead: Ashok Krishnamurthy, OSU • Execution Time Working Group Lead: Bob Lucas, USC ISI • Workflows, Models & Metrics Working Group Lead: Jeremy Kepner, Lincoln • Existing Codes Working Group Lead: Doug Post, LANL • Development Time Working Group Lead: Vic Basili, UMD • High Productivity Language Systems Working Group Lead: Hans Zima, JPL Distributed Team Involving a Large Cross Section of the HPC Community

  20. General Productivity Formula [Figure: utility U versus time T curves for different user types (Researcher? Enterprise? Production? Constant)] Ψ = U(T) / C, where C = CS + CO + CM • Ψ = productivity [utility/$] • U = utility [user specified] • T = time to solution [time] • C = total cost [$] • CS = software cost [$] • CO = operation cost [$] • CM = machine cost [$] • Utility is the value the user places on getting a result at time T • Software costs include time spent by users developing their codes • Operating costs include admin time, electric and building costs • Productivity formula is tailored by each user through use of functional work flows • Developing Large Multi-Module Codes • Developing Small Codes • Running applications • Porting codes • Administration
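The definitions on this slide can be turned into a small calculation. The sketch below assumes productivity is utility divided by total cost (consistent with the slide's [utility/$] units) and that total cost is the sum of the software, operation, and machine costs. The step-shaped deadline utility is a hypothetical example of a user-specified utility function, not something taken from the presentation.

```python
from dataclasses import dataclass


@dataclass
class Costs:
    software: float   # CS: user time spent developing codes, in $
    operation: float  # CO: admin time, electricity, building, in $
    machine: float    # CM: machine cost, in $

    def total(self) -> float:
        """C = CS + CO + CM (assumed additive)."""
        return self.software + self.operation + self.machine


def productivity(utility_at_t: float, costs: Costs) -> float:
    """Productivity in utility per dollar: U(T) divided by total cost C."""
    total = costs.total()
    if total <= 0:
        raise ValueError("total cost must be positive")
    return utility_at_t / total


def deadline_utility(t_hours: float, deadline_hours: float, value: float = 1.0) -> float:
    """Hypothetical utility: full value if the result arrives by the deadline, else zero."""
    return value if t_hours <= deadline_hours else 0.0


if __name__ == "__main__":
    costs = Costs(software=20.0, operation=10.0, machine=20.0)
    u = deadline_utility(t_hours=8.0, deadline_hours=24.0, value=100.0)
    print(productivity(u, costs))  # 100 / 50 = 2.0 utility per dollar
```

Swapping in a different utility function is how the same calculation could be tailored to a researcher, an enterprise, or a production user.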

  21. Level 1 Functional Workflows Enable Time-to-Solution Analysis • (1) Writing Large Multi-Module Codes • (2) Writing Small Codes • (3) Running Codes • (4) Porting Code: Identify Differences, Change Code, Optimize • (5) Administration: HW/SW Upgrade, Security Management, Resource Management, Problem Resolution • Workflow steps shown: Formulate questions, Develop Approach, Develop Code, V&V, Production Runs, Analyze Results, Decide; Hypothesize • Mission Partners may create their own HPC usage scenarios from these basic work flow elements • Items in red represent areas with highest HPC specific interest

  22. Small Code Level 2 Work Flow Example: Markov Model - Classroom (UCSB) Data Formulate Program 1.0 / 0s 1.0 / 355s Compile Debug 1.0 / 49s .95 / 5s Test .002 / 5s .048 / 9s 1.0 / 629s Compile Optimize .266 / 5s 1.0 / 30s Run .699 / 4s .035 / 3s
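A workflow model of this kind can be used to estimate expected development time. The sketch below reads each edge label as "transition probability / time in seconds". That reading is an assumption, although the slide's label groups .95, .002, .048 and .266, .699, .035 each sum to 1.0, which is consistent with it. The transcript does not preserve the graph's edges, so the chain and its round numbers below are hypothetical, not the UCSB data.

```python
def expected_time(edges, done, sweeps=10_000, tol=1e-9):
    """Expected seconds from each state until `done` is reached.

    edges maps state -> list of (probability, seconds, next_state).
    Solves t[s] = sum(p * (seconds + t[next])) by repeated sweeps.
    Every next_state must be a key of `edges` or equal to `done`.
    """
    t = {state: 0.0 for state in edges}
    t[done] = 0.0
    for _ in range(sweeps):
        delta = 0.0
        for state, outs in edges.items():
            new = sum(p * (sec + t[nxt]) for p, sec, nxt in outs)
            delta = max(delta, abs(new - t[state]))
            t[state] = new
        if delta < tol:
            break
    return t


# Hypothetical three-state workflow: half of all tests send the
# programmer back to the editor.
EXAMPLE = {
    "program": [(1.0, 300.0, "compile")],
    "compile": [(1.0, 10.0, "test")],
    "test": [(0.5, 5.0, "done"), (0.5, 60.0, "program")],
}

if __name__ == "__main__":
    times = expected_time(EXAMPLE, done="done")
    print(round(times["program"], 3))  # 685.0 seconds for this made-up chain
```

The sweep approach converges only if the final state is reachable with nonzero probability from every state; a direct linear solve would be the usual alternative for larger models.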

  23. HPCS Benchmark Spectrum [Diagram] • 8 HPCchallenge Benchmarks: Local (DGEMM, STREAM, RandomAccess, 1D FFT) and Global (Linpack, PTRANS, RandomAccess, 1D FFT) • (~40) Micro & Kernel Benchmarks, the HPCS Spanning Set of Kernels: Discrete Math, Graph Analysis, Linear Solvers, Signal Processing, Simulation, I/O • (~10) Compact Applications: 3 Scalable Compact Apps (Pattern Matching, Graph Analysis, Signal Processing), 3 Petascale/s Simulation (Compact) Applications, Others, Classroom Experiment Codes • 9 Simulation Applications: Current (UM2000, GAMESS, OVERFLOW, LBMHD/GTC, RFCTH, HYCOM), Near-Future (NWChem, ALEGRA, CCSM) • Other diagram labels: System Bounds, Execution Bounds, Execution Indicators, Execution and Development Indicators; Existing, Emerging, and Future Applications; Intelligence, Reconnaissance, Simulation • Spectrum of benchmarks provides different views of system • HPCchallenge pushes spatial and temporal boundaries; sets performance bounds • Applications drive system issues; set legacy code performance bounds • Kernels and Compact Apps for deeper analysis of execution and development time

  24. HPCchallenge Bounds Performance HPCS Challenge Points HPCchallenge Benchmarks http://icl.cs.utk.edu/hpcc/ • HPCchallenge • Pushes spatial and temporal boundaries • Defines architecture performance bounds
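To illustrate what an access pattern with little spatial or temporal locality looks like, here is a simplified, single-process sketch in the style of the RandomAccess (GUPS) kernel. It is not the official HPCchallenge code. The shift-and-XOR random stream, the table size, and the update count of four times the table size are assumptions chosen for illustration, and the rate measured in pure Python says nothing about real hardware.

```python
import time

POLY = 0x7                 # feedback constant for the random stream (assumed)
MASK64 = (1 << 64) - 1     # keep values within 64 bits


def next_random(ran: int) -> int:
    """Advance a 64-bit shift-and-XOR pseudo-random stream by one step."""
    high_bit = ran >> 63
    ran = (ran << 1) & MASK64
    return ran ^ POLY if high_bit else ran


def random_access(table: list, n_updates: int, seed: int = 1) -> list:
    """XOR pseudo-random values into pseudo-random table slots, in place.

    len(table) must be a power of two so that masking picks a valid index.
    """
    mask = len(table) - 1
    ran = seed
    for _ in range(n_updates):
        ran = next_random(ran)
        table[ran & mask] ^= ran
    return table


def measure_gups(log2_size: int = 16) -> float:
    """Giga-updates per second for one run over a table of 2**log2_size words."""
    size = 1 << log2_size
    table = list(range(size))
    n_updates = 4 * size
    start = time.perf_counter()
    random_access(table, n_updates)
    elapsed = time.perf_counter() - start
    return n_updates / elapsed / 1e9


if __name__ == "__main__":
    print(f"{measure_gups():.6f} GUPS (pure Python, illustrative only)")
```

Because XOR is its own inverse, replaying the same update stream a second time restores the original table, which gives a simple correctness check for a sketch like this.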

  25. HPCchallenge Website: Kiviat Diagram Example - AMD Configurations • Not all TOP500 systems are created equal! • HPCS/Mission Partner Productivity Team is Providing an HPC System Analysis Framework

  26. Development Time Activities (1) Victor R. Basili - Team Lead • Created the infrastructure for conducting experimental studies in the field of high performance computing program development • Designed and conducted classroom studies: a total of 7 HPC classes were studied, and data from 15 assignments were collected and analyzed • Designed and conducted observational studies (HPC experts working on small assignments): 2 observational studies have been conducted and analyzed • Designed and conducted case studies (HPC experts working on real projects): 2 case studies conducted, 1 of which is complete • Developed a refined experimental design for experiments in 2005

  27. Development Time Activities (2) • Developed a downloadable instrumentation package; looking for expert volunteers to download and use the package • Built knowledge about how to conduct experiments in the HPC environment • Tested and evaluated data collection tools • Hackystat • Eclipse • Developed new hypotheses • Developed and analyzed list of HPCS folklore • Developed and analyzed list of common HPCS defects

  28. Measuring Development Time [Figure: study types plotted by Validity against Cost] • Real Applications: 2 case studies • Small Projects: 7 HPC classes studied (15 projects, ~100 students) • 2 observational studies • HPC Center Tutorials • Classroom Studies • New data collection tools (Hackystat, Eclipse) • Developed downloadable package • Developing a new methodology for conducting these tests • Comparing programming models and languages • Measuring: performance achieved, effort, and expertise • Workflows: steps and time spent in each step

  29. Outline • High-End Computing University Research Activities • HECURA Status • High Productivity Computing Systems Program • Phase II Update • Vendor Teams • Council on Competitiveness • Productivity Team • Phase III Concept • Other Related Computing Technology Activities

  30. HPCS Draft Phase III Program [Timeline figure, CY02 through CY11] • Phase I: Industry Concept Study (Funded Five) • Phase II: R&D (Funded Three) • Phase III: System Development & Demonstration • Industry Milestones 1 - 7, with reviews labeled Concept Review, System Design Review, Technology Assessment Review, PDR, CDR, SCR, DRR, Early Demo, and Final Demo • Software: SW Rel 1, SW Rel 2, SW Rel 3, SW Dev Unit, Deliver Units • Languages: HPLS Plan, MP Language Dev • Mission Partners: Mission Partner Dev Commitment, Mission Partner System Commitment, Mission Partner Peta-Scale Application Dev, MP Peta-Scale Procurements • Productivity Assessment (MIT LL, DOE, DoD, NASA, NSF) • Legend: Program Reviews, Critical Milestones, Program Procurements

  31. Outline • High-End Computing University Research Activities • HECURA Status • High Productivity Computing Systems Program • Phase II Update • Vendor Teams • Council on Competitiveness • Productivity Team • Phase III Concept • Other Related Computing Technology Activities

  32. Related Technologies [Diagram labels: Mission, Algorithms, Compilers/OS, Protocols, Micro Architectures, Clock Gating, Vdd Scaling] Systems That Know What They’re Doing • Intelligent Systems: Architectures for Cognitive Information Processing (ACIP) • High-End Application Responsive Computing: High Productivity Computing Systems Program (HPCS) • Mission Responsive Architectures: Polymorphous Computing Architectures Program (PCA) • Power Management: Power Aware Computing and Communications Program (PAC/C) • + HECURA + OneSAF Objective System + XPCA
