Programmability Hiroshi Nakashima Thomas Sterling
Key Challenges (1)
• Parallelism
  • Expose sufficient parallelism (multi-billion-way)
  • Manage the massive parallelism in ensemble (hierarchy)
  • Reveal rich form and granularity of parallelism
  • Efficient exploitation of fine-grained parallelism
• Distribution and resource assignment
  • Enables exploitation of separate concurrency of action
  • Need for some kind of global name space
• Locality management
  • Reduces latency of access and control
  • Exposure of object and control affinity
Key Challenges (2)
• Management of memory hierarchy
  • Transparent cache misses
  • Finite cache size and structure
  • Copy semantics and consistency(?)
• Latency hiding
  • Locality management (already mentioned)
  • Intrinsic overlap of communication with computation to mitigate impact
• Hardware idiosyncrasies
  • E.g., TLB misses
  • Non-deterministic resolution of shared resource contention
  • Branch prediction, register renaming, etc.
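The "overlap of communication with computation" bullet can be illustrated with a small double-buffering sketch. This is only an illustration under stated assumptions: `fetch_block` and `compute` are hypothetical stand-ins (the transfer is modeled with a sleep), and a single background thread plays the role of the communication engine.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_block(i, latency=0.05):
    """Hypothetical stand-in for a remote transfer; the sleep models latency."""
    time.sleep(latency)
    return list(range(i * 4, i * 4 + 4))


def compute(block):
    """Hypothetical stand-in for local work on a block that has arrived."""
    return sum(x * x for x in block)


def run_overlapped(num_blocks):
    """Start the transfer of block i+1 before computing on block i."""
    results = []
    if num_blocks <= 0:
        return results
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(fetch_block, 0)
        for i in range(num_blocks):
            block = pending.result()  # wait only for the block needed now
            if i + 1 < num_blocks:
                # Next transfer proceeds in the background during compute().
                pending = pool.submit(fetch_block, i + 1)
            results.append(compute(block))
    return results


def run_serial(num_blocks):
    """Baseline: each transfer completes before its computation starts."""
    return [compute(fetch_block(i)) for i in range(num_blocks)]
```

Both functions return the same values; the overlapped version only changes when the transfers are issued. The benefit grows with the amount of local work available to cover each transfer.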
Key Challenges (3)
• Legacy codes may not meet requirements for future Exascale systems
  • Rewrite only once, please.
• What is the paradigm or execution model for the programming model to satisfy and cooperate with remaining system components?
  • Distribution of responsibilities across system components
• Libraries
  • Code reuse
  • Decouples performance issue from logical function
  • Can adapt to your program requirements
  • Should learn about your data structure, not you about library
Key Challenges (4)
• Interoperability
  • Between cooperating concurrently executing functionality
  • Exploit existing legacy codes during transitional periods
• Minimization of performance sensitivity
• Robust guarantees of correctness of result
• Elimination of over-constraining synchronization bottlenecks
  • e.g., global barriers
  • Lightweight synchronization
• Re-empower strong scaling
• Portability
  • Different systems
  • Different scale
  • Different generations
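To make the point about over-constraining global barriers concrete, here is a minimal sketch contrasting a two-stage computation separated by a barrier with one in which each item waits only on its own predecessor. The stage functions and names are hypothetical, chosen purely for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def stage_one(x):
    """Hypothetical first stage."""
    return x + 1


def stage_two(x):
    """Hypothetical second stage; needs only its own stage-one result."""
    return x * 2


def run_with_barrier(inputs):
    """No stage-two task starts until ALL stage-one tasks have finished."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        first = list(pool.map(stage_one, inputs))  # acts as a global barrier
        return list(pool.map(stage_two, first))


def run_with_local_dependencies(inputs):
    """Each item proceeds to stage two as soon as its own stage one is done."""
    def chain(x):
        return stage_two(stage_one(x))

    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(chain, inputs))
```

The results are identical; the difference is that in the second form one slow stage-one task delays only its own successor rather than every stage-two task.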
Potential Impact on Software Component
• Need for new model of computation
• Programming model reflects user program parallelism
• Runtime system makes runtime information available for the decision chain
• Architecture and runtime minimize overhead to enable useful rich mechanisms for control, cooperation, and sharing
• Asynchrony management for out-of-order arrival of data transfers and service completion
• Guaranteed compound atomic operations for user programmed segments with efficient protection
• OS protocol to inform runtime system (bidirectional exchange)
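The bullets on out-of-order arrival and compound atomic operations can be sketched together. In this illustration, which is an assumption-laden toy rather than a proposed runtime design, completions arrive in an arbitrary order and each one updates several fields of shared state inside a single lock-protected section. `handle_completion` and the state layout are hypothetical.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def handle_completion(request_id, delay, state, lock):
    """Hypothetical service completion; the delay makes arrival order vary."""
    time.sleep(delay)
    value = request_id * 10
    with lock:
        # Compound update: all three fields change together or not at all
        # from the point of view of other threads.
        state["count"] += 1
        state["total"] += value
        state["arrival_order"].append(request_id)


def run_requests(num_requests=3):
    state = {"count": 0, "total": 0, "arrival_order": []}
    lock = threading.Lock()
    with ThreadPoolExecutor(max_workers=max(1, num_requests)) as pool:
        futures = [
            # Earlier requests are given longer delays, so completions
            # tend to arrive out of submission order.
            pool.submit(handle_completion, i, 0.01 * (num_requests - i), state, lock)
            for i in range(num_requests)
        ]
        for future in futures:
            future.result()  # re-raise any exception from a worker
    return state
```

The final `count` and `total` do not depend on the arrival order, which is the property the protected section is meant to guarantee.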
Summary of Research Directions
• Separation of logical functionality from performance attributes
• New model of computation
• Diversity of parallelism forms and sizes
• Data-directed execution
• Dynamic graph-based problems, encoding, and control
• New programming models that interoperate with old
• Dealing with memory hierarchies
• Advanced runtime systems
• Requirements for new ultra-massive architecture
• Automatic runtime tuning for heterogeneous architectures
Potential Impact on Usability, Capability, & Breadth of Community
• Enormous
• Essential
• Ease of use
• Eternal
• Everyone
4.x Programmability
Cross-cutting property of concurrency as it relates to programmability
[Figure: exposed concurrency by year, 2010 to 2019; labeled levels: 100 thousand-way, million-way, 10 million-way, 100 million-way, billion-way, and 10 billion-way parallelism]
4.x Programmability
• Technology drivers
  • Programming models and languages
  • Compiler analysis, distribution, and allocation
  • Runtime system software
  • OS
  • Architecture structure, semantics, and mechanisms
4.x Programmability
• Alternative R&D strategies
  • Models of computation
    • Message passing with multithreaded processes
    • Message-driven work-queue multithreaded
  • Programming models
    • MPI-8
    • Event-driven multithreaded with GAS
    • DSP and Declarative
  • Runtime system software
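As a rough illustration of the "message-driven work-queue multithreaded" model named above, the sketch below has worker threads pull messages from a shared queue, where handling a message may generate follow-up messages. Everything here (`run_message_driven`, the `split` handler, the message format) is a hypothetical toy, not a description of any particular runtime system.

```python
import queue
import threading


def run_message_driven(initial_messages, handlers, num_workers=4):
    """Process (kind, payload) messages until no work remains."""
    work = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def emit_result(value):
        with results_lock:
            results.append(value)

    def worker():
        while True:
            message = work.get()
            try:
                if message is None:  # shutdown sentinel
                    return
                kind, payload = message
                # A handler may return follow-up messages; they are queued
                # before this message is marked done, so join() cannot
                # return while work is still being generated.
                for follow_up in handlers[kind](payload, emit_result):
                    work.put(follow_up)
            finally:
                work.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for message in initial_messages:
        work.put(message)
    work.join()  # all messages, including follow-ups, have been handled
    for _ in threads:
        work.put(None)
    for thread in threads:
        thread.join()
    return results


def split(payload, emit_result):
    """Hypothetical handler: recursively split a range, sum small pieces."""
    lo, hi = payload
    if hi - lo <= 2:
        emit_result(sum(range(lo, hi)))
        return []
    mid = (lo + hi) // 2
    return [("split", (lo, mid)), ("split", (mid, hi))]
```

For example, `sum(run_message_driven([("split", (0, 10))], {"split": split}))` adds up the partial sums produced by the leaf messages. The order of the partial results depends on thread scheduling, so only their total is meaningful.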
4.x Programmability
• Recommended research agenda
  • Model of computation
  • Decision chain across system layers
  • Protocols between successive layers
4.x Programmability
• Crosscutting considerations
  • It is one
  • Performance
    • Major hazard for programmability
  • Reliability
    • Does the application program play a role in determining response to faults?