Advancing the “Persistent” Working Group

Advancing the “Persistent”Working Group MPI-4 Tony Skjellum tony@cis.uab.edu June 5, 2013

Persistence Goals • Cross-cutting use of “planned transfers” • Point to point • Collective • Generalized requests • Allow program to express • Locality, static use of resources • Connectivity of sends/receives • Purpose: Improve speed and scalability for regular, static communication

Concepts review • Drew analogies from MPI/RT • Point-to-point Channels vs half-channels • Slack-1 “bind a send” to “a receive” • Goal: cut the # of instructions in critical path • Reuses “Send_init” and “Recv_init” • Open issues • Multiple slack channels • Rebinding when some arguments change

Concepts review • For all non-blocking collective, we can define “persistent non-blocking collective” • The approach in MPI/RT was to have collective channels as well as point to point • Again, the goal is to allow for static resource management (memory, network, etc), algorithm selection in advance, and single-mode performance of perations

Additional – Differentiated Service • Each communicator is a separate “MPI fabric” or independently ordered overlay network that is all-to-all • Allow communicators to offer differentiated service • No tags matching • No wildcards • Guarantee of posting (F mode), S mode … • Strong or weak or no progress • Independent buffering rules and even short/long cutoffs • Degree of correctness (e.g., OK to make data errors, vs. MPI-1 compliant “all correct”) • QoS concepts could be added as well • Subset of MPI that is usable • Goal: allow each communicator fabric to operate at optimal performance and scalability and even to make error/fault choices to support FT where appropriate • Use MPI_Comm_dup and MPI_Comm_create as prototypes for generating such functions

Steps that were for March (doing in part now in June) • Revive proposals concerning the pt2pt specific calls, and general framework for collective • See if anyone else is interested in helping on committee • Get straw indication on differentiated service concept • Review status of generalized requests and left over ideas in MPI-3, see if persistent generalized requests are useful • Entertain any other ideas about “making” MPI pt2pt and collective faster by using planned transfer

May 2011 Document • We have a 12-page document from 2011 that we held back while MPI-3 was finishing • It contains basic proposals • We’d like to refine these and update the May 2011 document after this working group meeting today • We’d like to flesh out any concerns not yet fully understood • So, first a couple of slides, then we go through the document, and capture action items and ideas and next steps!

Getting back to a specific pt2pt proposal (#1) • Very short numbers of instructions to do a persistent send/receive pair • MPI_Send_init() • MPI_Recv_init() • They are not enough, because they don’t lock down all the resources (they are ‘half’ channels) • Motivation: “Darius Buntinas measurement” – 250+ instructions in shared memory…. Must be reduced

Requirements • The minimum number of instructions in critical path to launch a send for “Nth” time after “setup” • Arguments frozen after first time • All • Most (except starting address, length) • Want the “Start” to track right to “kick DMA” or “kick transfer offload engine”… 1 OS bypass / application bypass instruction • Makes regular programs finer grain in O/H & Latency • Provides its own communication space/order/matching (outside of communicator) • Use all the features you can use in an MPI_Isend; simple datatypes may be faster in practice • Easy changes to existing MPI programs that are already static/data parallel (like 2D Life)

Where we got into the weeds before • How do we create the channel (1 easy, what a bout several?) • How many messages in flight? At least 1. • How do we avoid extra signaling that makes this like stop-and-wait at the LLC layer? More than 1 to fill virtual channel. • Let’s go through the document to refresh and decide next steps…

Q&A

Advancing the “Persistent” Working Group

Advancing the “Persistent” Working Group

Presentation Transcript

The American Oystercatcher Working Group

Mobile Broadband Working Group Spectrum Policy Working Group

Working with the Persistent Chat Platform in Lync 2013

Fellowship Working Group

WORKING GROUP

Working Group Status

The HEPiX IPv6 Working Group

The HEPiX IPv6 Working Group

Rebooting the “Persistent” Working Group

Working group

The HEPiX IPv6 Working Group

The HEPiX IPv6 Working Group

Advancing the Story

ADVANCING THE ORDINARY

Working Group Reports Working Group 1

Presentation in the Working group

The Risk Management Working Group

The HEPiX IPv6 Working Group

eHealth Working Group (previously Working Group T2)

PRAGMA 25 Working Group Report Resources Working Group

The HEPiX IPv6 working group

The Campus Cyberinfrastructure Working Group