Middleware Activities from the Paradyn Project
Barton Miller
University of Wisconsin-Madison
Condor Week, May 2003
Multicast/Reduction Network

Two Complementary Activities
MRNet: A multicast/reduction infrastructure for distributed tools
• Scalable: scales to many thousands of nodes
• High throughput, low latency
TDP: A standard protocol for deploying run-time tools in a distributed environment
• Too many job/process control environments
• Too many run-time tools
• The never-ending porting task is a fundamental barrier to tool availability

MRNet Overview
• Problem: Front-end centralization leads to poor scalability
  • Large fan-out
  • Front-end processing
  • Large data volumes
• Goal: improve scalability and efficiency of group communication
[Diagram: Tool Front End above nodes d0 … dn-1 and a0 … an-1]

MRNet Overview
• The Multicast/Reduction Network is developed as part of Paradyn’s scalability initiative.
• MRNet provides scalable group communication and data aggregation.
[Diagram: Tool Front End, intermediate levels (…), nodes d0 … dn-1 and a0 … an-1]

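The idea of aggregating data inside the network rather than at the front end can be illustrated with a small sketch. This is not MRNet code: `tree_reduce`, its parameters, and the fixed fan-out are assumptions made for illustration only. It reduces the back-end values level by level, so no single node ever combines more than `fan_out` inputs.

```python
from typing import Callable, Sequence, Tuple


def tree_reduce(values: Sequence[int], fan_out: int,
                combine: Callable[[Sequence[int]], int]) -> Tuple[int, int]:
    """Reduce leaf values level by level (hypothetical helper, not MRNet's API).

    Returns the final value and the number of tree levels that were needed.
    """
    if fan_out < 2:
        raise ValueError("fan_out must be at least 2")
    level = list(values)
    if not level:
        raise ValueError("need at least one back-end value")
    rounds = 0
    while len(level) > 1:
        # Each group of up to fan_out children is combined by one parent node.
        level = [combine(level[i:i + fan_out])
                 for i in range(0, len(level), fan_out)]
        rounds += 1
    return level[0], rounds


total, depth = tree_reduce(list(range(16)), fan_out=4, combine=sum)
print(total, depth)
```

With 16 back-ends and a fan-out of 4, the front end combines 4 partial results instead of 16 raw values.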
Topologies
There are many choices for multicast/reduction topologies:
• Balanced vs. skewed trees
• Fan-out
• Co-locating communication and computation nodes vs. separate nodes
• Geographic placement
MRNet accepts a separately generated topology file with layout and interconnect.
• We are agnostic to the above choices.
• We provide a standard set of topology generators.
• It is trivial for you to provide your own.

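A topology generator can be very small. The sketch below builds a balanced tree as a parent-to-children mapping; the node names (`fe`, `int0`, `be0`) and the dictionary layout are hypothetical and are not MRNet's topology file format, which the slides do not show.

```python
from typing import Dict, List


def balanced_topology(num_backends: int, fan_out: int) -> Dict[str, List[str]]:
    """Build a balanced tree bottom-up (illustrative only).

    Back-ends are leaves; internal nodes are added until the remaining
    nodes fit under the front end ("fe").
    """
    if num_backends < 1 or fan_out < 2:
        raise ValueError("need num_backends >= 1 and fan_out >= 2")
    edges: Dict[str, List[str]] = {}
    level = [f"be{i}" for i in range(num_backends)]
    next_id = 0
    while len(level) > fan_out:
        parents = []
        for i in range(0, len(level), fan_out):
            parent = f"int{next_id}"
            next_id += 1
            edges[parent] = level[i:i + fan_out]
            parents.append(parent)
        level = parents
    edges["fe"] = level
    return edges


for parent, children in balanced_topology(16, 4).items():
    print(parent, "=>", " ".join(children))
```

A skewed tree or a placement-aware layout would only change how the groups are formed; the output shape stays the same.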
MRNet Internal Processes
[Diagram: Front-End and 16 back-ends (BE)]

MRNet Communicators
Communicators: group back-ends for communication
[Diagram: Front-End and 16 back-ends (BE)]

MRNet Interface
Tools link with libmrnet, a library that exposes the MRNet API. Abstractions include:
• Network
  • Initialize/shut down the network
  • Access network end-points
• End-points
• Communicators
• Streams
[Diagram: Front-End and 16 back-ends (BE)]

MRNet Internal Processes
Functional layers of MRNet internal processes.
[Diagram labels: Packet Batching/Unbatching, Data Decoding, Synchronization Filter, Transformation Filter (Data Transformation Operation), Data Encoding, Packet Batching/Unbatching]

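How a synchronization filter and a transformation filter might cooperate inside one internal process can be sketched as follows. The class name, the "wait for every child" policy, and the method names are assumptions for illustration; the slides name the layers but do not specify their behavior.

```python
from typing import Callable, Dict, List, Optional, Sequence


class InternalProcess:
    """Hypothetical internal node: synchronize child packets, then transform."""

    def __init__(self, children: Sequence[str],
                 transform: Callable[[List[int]], int]) -> None:
        self.children = list(children)
        self.transform = transform
        self.pending: Dict[str, int] = {}

    def receive(self, child: str, value: int) -> Optional[int]:
        """Hold values until every child has reported (assumed sync policy).

        Returns the transformed value to forward upstream, or None while
        the node is still waiting.
        """
        if child not in self.children:
            raise KeyError(child)
        self.pending[child] = value
        if len(self.pending) < len(self.children):
            return None
        batch = [self.pending[c] for c in self.children]
        self.pending.clear()
        return self.transform(batch)


node = InternalProcess(["a", "b", "c"], transform=max)
print(node.receive("a", 3), node.receive("c", 9), node.receive("b", 5))
```

Only the third call produces an upstream packet, because that is when the last child has reported.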
MRNet in Paradyn Start-up
[Chart: Smg2000 on ASCI Blue Pacific]

TDP: The Challenge
Consider remote process management environments:
• Condor, LSF, etc.
• MPI
  • Portable MPI (such as MPICH)
  • Vendor-provided MPI (such as IBM, Compaq, Sun)
• Globus
Each of these environments needs to monitor and control the state of its application processes.

Typical Process Manager
Process manager:
• Starts the remote job
• Monitors its status
• Controls the job
• Sets up file I/O
• Sets up standard I/O
[Diagram: Remote Host with Remote Process Manager (monitor/control) and Application Processes]

Typical Process Manager
The run-time tool?
• Also may want to start the process (or attach to it)
• Also needs to monitor its status
• Also may want to control the job
• Needs to communicate with its front-end
[Diagram: Remote Host with Remote Process Manager (monitor/control), Tool Dæmon Process (?), and Application Processes]

Typical Process Manager
So, who wins?
[Diagram: Remote Host with Remote Process Manager (monitor/control), Tool Dæmon Process (?), and Application Processes]

Typical Process Manager
[Diagram: Remote Host with Remote Process Manager (monitor/control), Tool Dæmon Process (?), and Application Processes; Local Host with Tool Front-End Process]

Current State of Affairs
• Each process manager starts and controls processes in its own way.
  • E.g., even within MPI: IBM POE MPI, SGI Origin MPI, and MPICH all work differently. MPI has no standard process control!
• Support exists only as special cases of a specific tool working with a specific environment
  • e.g., the TotalView debugger working with MPICH.
• The result is an m × n combination of m process managers and n tools.
Bottom line: we need a standard interface for process managers and tools to coexist: the Tool Dæmon Protocol (TDP).

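The m × n point is simple arithmetic. The sketch below compares pairwise ports with a shared-protocol approach; the m + n figure assumes each process manager and each tool implements the common interface exactly once, which is an inference and not a number given in the slides. The function names and the example counts are made up.

```python
def pairwise_ports(managers: int, tools: int) -> int:
    """Every tool is ported separately to every process manager."""
    return managers * tools


def shared_protocol_ports(managers: int, tools: int) -> int:
    """Assumption: each side implements one common interface once."""
    return managers + tools


m, n = 5, 8  # illustrative counts only
print(pairwise_ports(m, n), shared_protocol_ports(m, n))
```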
The Basic TDP Steps
• Create, but don’t start, the new application process.
• If necessary, create the tool daemon process.
• Pass basic information to the tool daemon, e.g.:
  • Application PID.
  • Front-end host/port number.
  • Standard I/O host/port number.
• Tool daemon processes the application:
  • For a debugger, read symbols.
  • For Paradyn/dyninst, parse the executable.
• Start the application process.
• Respond to changes in the application state.
• Respond to changes in the tool daemon’s state.

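The ordering of the steps above can be sketched as a small simulation. Nothing here is the TDP library: `AppProcess`, `ToolDaemon`, `launch_with_tool`, and the example host names are hypothetical, and no real processes are created. The point is only that the application stays in the created state until the daemon has been configured and has finished its own preparation.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Tuple

Address = Tuple[str, int]


class State(Enum):
    CREATED = "created"   # exists, but has not been started
    RUNNING = "running"


@dataclass
class AppProcess:
    pid: int
    state: State = State.CREATED


@dataclass
class ToolDaemon:
    info: Dict[str, object] = field(default_factory=dict)
    ready: bool = False

    def configure(self, **info: object) -> None:
        self.info.update(info)

    def process_application(self) -> None:
        # Stand-in for reading symbols or parsing the executable.
        self.ready = True


def launch_with_tool(pid: int, frontend: Address, stdio: Address):
    """Walk the basic steps in order and record them (simulation only)."""
    log: List[str] = []
    app = AppProcess(pid)
    log.append("create application (not started)")
    daemon = ToolDaemon()
    log.append("create tool daemon")
    daemon.configure(app_pid=pid, frontend=frontend, stdio=stdio)
    log.append("pass basic information")
    daemon.process_application()
    log.append("daemon processes application")
    if not daemon.ready:
        raise RuntimeError("daemon not ready; application must stay stopped")
    app.state = State.RUNNING
    log.append("start application")
    return app, daemon, log


app, daemon, log = launch_with_tool(4242, ("fe.example.org", 5000),
                                    ("fe.example.org", 5001))
print("\n".join(log))
```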
Challenge: Firewalls and Private Nets
[Diagram: Remote Host with Remote Process Manager, Tool Dæmon Process, and Application Process; Local Host with Tool Front-End Process; the connection across the Firewall is marked X]

Challenge: Firewalls and Private Nets
[Diagram: Remote Host with Remote Process Manager, Tool Dæmon Process, and Application Process; Local Host with Tool Front-End Process; a Comm Proxy at the Firewall]

Challenge: Firewalls and Private Nets
• When the tool daemon is started, pass in the host/port number of its front-end process.
• If there is a communication proxy, then:
  • The tool daemon will receive the host/port of the proxy, so the daemon connects to the proxy.
  • The proxy will connect to the tool front-end, mapping the host/port (similar to NAT).
• An application connecting to the console for standard I/O works the same way.

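The address substitution described above can be sketched without any networking. `CommProxy`, `daemon_target`, and the example hosts and ports are hypothetical names chosen for illustration; the sketch only shows which address the daemon is handed and how the proxy maps it back to the front end.

```python
from typing import Dict, Optional, Tuple

Address = Tuple[str, int]


class CommProxy:
    """Hypothetical proxy that maps its own ports to real destinations."""

    def __init__(self, host: str) -> None:
        self.host = host
        self.routes: Dict[int, Address] = {}

    def register(self, listen_port: int, destination: Address) -> Address:
        """Record a mapping and return the address to advertise instead."""
        self.routes[listen_port] = destination
        return (self.host, listen_port)

    def forward_target(self, listen_port: int) -> Address:
        """Where traffic arriving on listen_port is forwarded."""
        return self.routes[listen_port]


def daemon_target(frontend: Address, advertised: Optional[Address] = None) -> Address:
    """Address passed to the tool daemon: the proxy's if one is in use."""
    return advertised if advertised is not None else frontend


frontend = ("fe.example.org", 5000)
proxy = CommProxy("gateway.example.org")
advertised = proxy.register(7000, frontend)

print(daemon_target(frontend))              # no proxy: connect directly
print(daemon_target(frontend, advertised))  # with proxy: connect to the proxy
print(proxy.forward_target(7000))           # proxy forwards to the front end
```

The daemon's own logic does not change between the two cases; it connects to whatever host/port it is given.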
The Condor/Paradyn Scenario
[Diagram: Remote Host with Condor Starter (monitor/control), Paradyn Dæmon, and Application Processes; Local Host with Paradyn Front-End]

The Path Forward
• We have produced a prototype implementation to expose technical challenges:
  • Parador: Paradyn running under Condor
  • Ana Cortés and Miquel Senar (UAB)
• The goal is to produce a standard set of libraries for process managers and tool daemons.
• Involve a wider community in this standards effort
  • Initially: ANL (Gropp and Lusk), Etnus (Cownie and Delsignore), Compaq, Paradyn, Condor.

Tech Reports
• “MRNet: A Software-Based Multicast/Reduction Network for Scalable Tools”, Philip C. Roth, Dorian C. Arnold, and Barton P. Miller.
• “The Tool Dæmon Protocol (TDP)”, Barton Miller, Ana Cortés, Miquel A. Senar, and Miron Livny.
http://www.paradyn.org/papers/