Distributed Resource Management and Parallel Computation
Dr Michael Rudgyard, Streamline Computing Ltd
Streamline Computing Ltd
• Spin-out of Warwick (& Oxford) Universities
• Specialising in distributed (technical) computing
• Cluster and GRID computing technology
• 14 employees & growing; focussed expertise in:
  • Scientific Computing
  • Computer systems and support
• Presently 5 PhDs in HPC and Parallel Computation
• Expect growth to 20+ people in 2003
Strategy
• Establish an HPC systems integration company…
• …but re-invest profits into software
  • Exploiting IP and significant expertise
• First software product released
• Two more products in prototype stage
• Two complementary ‘businesses’
  • Both high growth
Track Record (2001 – date)
• Installations include:
  • Largest Sun HPC cluster in Europe (176 proc)
  • Largest Sun / Myrinet cluster in UK (128 proc)
  • AMD, Intel and Sun clusters at 21 UK Universities
• Commercial clients include Akzo Nobel, Fujitsu, McLaren F1, Rolls-Royce, Schlumberger, Texaco…
• Delivered a 264 proc Intel/Myrinet cluster:
  • 1.3 Tflop/s peak!
  • Forms part of the White Rose Computational Grid
Streamline and Grid Computing
• Pre-configured ‘grid’-enabled systems:
  • Clusters and farms
  • The SCore parallel environment
  • Virtual ‘desktop’ clusters
• Grid-enabled software products:
  • The Distributed Debugging Tool
  • Large-scale distributed graphics
  • Scaleable, intelligent & fault-tolerant parallel computing
‘Grid’-enabled turnkey clusters
• Choice of DRMs and schedulers:
  • (Sun) GridEngine
  • PBS / PBS-Pro
  • LSF / ClusterTools
  • Condor
  • Maui Scheduler
• Globus 2.x gatekeeper (Globus 3 ???)
• Customised access portal
The SCore parallel environment
• Developed by the Real World Computing Partnership in Japan (www.pccluster.org)
• Unique features that are unavailable in most parallel environments:
  • Low-latency, high-bandwidth MPI drivers (cf. the ping-pong sketch below)
  • Network transparency: Ethernet, Gigabit and Myrinet
  • Multi-user time-sharing (gang scheduling)
  • O/S-level checkpointing and failover
  • Integration with PBS and SGE
  • MPICH-G port
  • Cluster management functionality
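The latency and bandwidth claims of an MPI layer are the kind of thing a standard ping-pong microbenchmark measures. Below is a minimal sketch in plain MPI (nothing SCore-specific): rank 0 bounces a message off rank 1 and reports half the round-trip time and the effective bandwidth.

```c
/* Minimal MPI ping-pong microbenchmark: a sketch of how the latency and
 * bandwidth of an MPI layer (e.g. over Ethernet vs. Myrinet) can be
 * measured. Standard MPI throughout; not SCore-specific code. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define REPS 1000

int main(int argc, char **argv)
{
    int rank, nprocs, i;
    int msg_size = (argc > 1) ? atoi(argv[1]) : 1;   /* bytes */
    char *buf;
    double t0, t1;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    if (nprocs < 2) {
        if (rank == 0) fprintf(stderr, "need at least 2 processes\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    buf = malloc(msg_size);

    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();
    for (i = 0; i < REPS; i++) {
        if (rank == 0) {
            MPI_Send(buf, msg_size, MPI_BYTE, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, msg_size, MPI_BYTE, 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, msg_size, MPI_BYTE, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Send(buf, msg_size, MPI_BYTE, 0, 0, MPI_COMM_WORLD);
        }
    }
    t1 = MPI_Wtime();

    if (rank == 0) {
        double rtt = (t1 - t0) / REPS;               /* round-trip time */
        printf("size %d bytes: half-RTT %g us, bandwidth %g MB/s\n",
               msg_size, 0.5e6 * rtt, msg_size / (0.5 * rtt) / 1e6);
    }

    free(buf);
    MPI_Finalize();
    return 0;
}
```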
‘Desktop’ Clusters
• Linux workstation strategy
• Integrated software stack for HPTC (compilers, tools & libraries) – cf. UNIX workstations
• Aim to provide a GRID at point of sale:
  • Single point of administration for several machines
  • Files served from front-end
  • Resource management
  • Globus enabled
  • Portal
• A cluster with monitors!
The Distributed Debugging Tool
• A debugger for distributed parallel applications
• Launched at Supercomputing 2002
• Aim is to be the de facto HPC debugging tool
• Linux ports for GNU, Absoft, Intel and PGI
• IA64 and Solaris ports; AIX and HP-UX soon…
• Commodity pricing structure!
• Existing architecture lends itself to the GRID:
  • Thin-client GUI + XML middleware + back-end
• Expect GRID-enabled version in 2003
Distributed Graphics Software
• Aims
  • To enable very large models to be viewed and manipulated using commodity clusters
  • Visualisation on (local or remote) graphics client
• Technology
  • Sophisticated data-partitioning and parallel I/O tools
  • Compression using distributed model simplification
  • Parallel (real-time) rendering
• To be GRID-enabled within the e-Science ‘Gviz’ project
Parallel Compiler and Tools Strategy
• Aim to invest in new computing paradigms
• Developing parallel applications is far from trivial:
  • OpenMP does not marry with cluster architectures
  • MPI is too low-level (see the sketch after this list)
  • Few skills in the marketplace!
• Yet growth of MPPs is exponential…
• Most existing applications are not GRID-friendly:
  • # of processors fixed
  • No fault tolerance
  • Little interaction with DRM
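To make the "MPI is too low-level" point concrete, here is a sketch of the hand-written bookkeeping that even a trivial 1-D halo exchange demands: the programmer must derive neighbour ranks, allocate ghost cells and order messages to avoid deadlock. The array size and data are placeholders, not code from any Streamline product.

```c
/* 1-D halo exchange in raw MPI: decomposition, neighbour ranks and
 * message ordering are all the programmer's problem. */
#include <mpi.h>

#define NLOCAL 100   /* interior points per process (illustrative) */

int main(int argc, char **argv)
{
    double u[NLOCAL + 2];          /* +2 ghost cells */
    int rank, nprocs, left, right, i;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    /* Neighbour ranks; MPI_PROC_NULL turns boundary sends into no-ops. */
    left  = (rank == 0)          ? MPI_PROC_NULL : rank - 1;
    right = (rank == nprocs - 1) ? MPI_PROC_NULL : rank + 1;

    for (i = 0; i < NLOCAL + 2; i++) u[i] = rank;   /* dummy data */

    /* Exchange ghost cells with both neighbours; MPI_Sendrecv avoids
     * the deadlock a naive Send/Recv ordering could produce. */
    MPI_Sendrecv(&u[NLOCAL], 1, MPI_DOUBLE, right, 0,
                 &u[0],      1, MPI_DOUBLE, left,  0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    MPI_Sendrecv(&u[1],          1, MPI_DOUBLE, left,  1,
                 &u[NLOCAL + 1], 1, MPI_DOUBLE, right, 1,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    MPI_Finalize();
    return 0;
}
```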
DRM for Parallel Computation
• Throughput of parallel jobs is limited by:
  • Static submission model: ‘mpirun –np …’
  • Static execution model: # processors fixed
  • Scaleability: many jobs use too many processors!
  • Job starvation
• Available tools can only solve some issues:
  • Advanced reservation and back-fill (e.g. Maui)
  • Multi-user time-sharing (gang scheduling)
• The application itself must take responsibility!
Dynamic Job Submission
• Job scheduler should decide the available processor resource!
• The application then requires:
  • In-built partitioning / data management
  • Appropriate parallel I/O model
  • Hooks into the DRM (see the sketch after this list)
• DRM requires:
  • Typical memory and processor requirements
  • LOS information
  • Hooks into the application
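One concrete form such application-to-DRM hooks later took is the DRMAA C binding adopted by Grid Engine; the sketch below assumes that API and its drmaa.h header. The command path and the Grid Engine native-spec string (requesting a 4–32 slot range and a memory limit, so the scheduler decides the processor count) are illustrative placeholders, not Streamline code.

```c
/* Submitting a job through the DRMAA C binding; error handling
 * abbreviated, attribute values hypothetical. */
#include <stdio.h>
#include <drmaa.h>

int main(void)
{
    char err[DRMAA_ERROR_STRING_BUFFER];
    char job_id[DRMAA_JOBNAME_BUFFER];
    drmaa_job_template_t *jt = NULL;

    if (drmaa_init(NULL, err, sizeof(err)) != DRMAA_ERRNO_SUCCESS) {
        fprintf(stderr, "drmaa_init: %s\n", err);
        return 1;
    }

    drmaa_allocate_job_template(&jt, err, sizeof(err));
    /* The application describes itself to the DRM... */
    drmaa_set_attribute(jt, DRMAA_REMOTE_COMMAND, "/path/to/solver",
                        err, sizeof(err));
    /* ...including memory/processor requirements; this Grid Engine
     * resource string is a hypothetical example. */
    drmaa_set_attribute(jt, DRMAA_NATIVE_SPECIFICATION,
                        "-pe mpi 4-32 -l h_vmem=512M", err, sizeof(err));

    if (drmaa_run_job(job_id, sizeof(job_id), jt, err, sizeof(err))
            == DRMAA_ERRNO_SUCCESS)
        printf("submitted job %s\n", job_id);

    drmaa_delete_job_template(jt, err, sizeof(err));
    drmaa_exit(err, sizeof(err));
    return 0;
}
```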
Dynamic Parallel Execution
• Additional resources may become available, or be required by other applications, during execution…
• Ideal situation:
  • DRM informs application
  • Application dynamically re-partitions itself
• Other issues:
  • DRM requires knowledge of the application (benefit of data redistribution must outweigh cost!)
  • Frequency of dynamic scheduling
  • Message passing must have dynamic capabilities (see the sketch after this list)
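MPI-2's dynamic process management is one existing mechanism for such dynamic capabilities: a running job can spawn extra workers when the DRM grants more processors, then merge them into a single communicator before re-partitioning. A sketch only; the "worker" executable name and the count of newly granted processors are placeholders.

```c
/* Growing a running MPI job with MPI-2 dynamic process management. */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Comm workers, everyone;
    int n_extra = 4;    /* processors newly granted by the DRM (placeholder) */

    MPI_Init(&argc, &argv);

    /* Launch additional processes at run time... */
    MPI_Comm_spawn("worker", MPI_ARGV_NULL, n_extra, MPI_INFO_NULL,
                   0, MPI_COMM_WORLD, &workers, MPI_ERRCODES_IGNORE);

    /* ...and merge them into one intracommunicator so the application
     * can redistribute its data over the enlarged processor set. */
    MPI_Intercomm_merge(workers, 0, &everyone);

    /* (dynamic re-partitioning of the data would happen here) */

    MPI_Comm_free(&everyone);
    MPI_Finalize();
    return 0;
}
```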
The Intelligent Parallel Application
• Optimal scheduling requires more information:
  • How well the application scales
  • Peak and average memory requirements
  • Application performance vs. architecture
• The application ‘cookie’ concept (sketched below):
  • Application (and/or DRM) should gather information about its own capabilities
  • DRM can then limit # of available processors
  • Ideally requires hooks into the programming paradigm…
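A minimal reading of the cookie concept, with the file name and record format invented purely for illustration: each run appends its processor count, elapsed time and memory high-water mark to a file that a DRM-side tool could consult before choosing the next processor allocation.

```c
/* Self-profiling application writing a scheduling 'cookie'
 * (file name and format are hypothetical). */
#include <mpi.h>
#include <stdio.h>

static void write_cookie(int nprocs, double elapsed, double peak_mem_mb)
{
    /* Append one record per run; a DRM-side tool could fit a scaling
     * curve to these records and cap the processor count accordingly. */
    FILE *f = fopen("app.cookie", "a");
    if (f) {
        fprintf(f, "nprocs=%d elapsed=%.3f peak_mem_mb=%.1f\n",
                nprocs, elapsed, peak_mem_mb);
        fclose(f);
    }
}

int main(int argc, char **argv)
{
    int rank, nprocs;
    double t0, t1;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    t0 = MPI_Wtime();
    /* ... the real computation ... */
    t1 = MPI_Wtime();

    if (rank == 0)
        write_cookie(nprocs, t1 - t0, 512.0 /* placeholder value */);

    MPI_Finalize();
    return 0;
}
```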
Fault Tolerance
• On large MPPs, processors/components will fail!
• Applications need fault tolerance:
  • Checkpointing + RAID-like redundancy (cf. SCore)
  • Dynamic repartitioning capabilities
  • Interaction with the DRM
  • Transparency from the user’s perspective
• Fault tolerance relies on many of the capabilities described above (a minimal checkpoint/restart sketch follows)…
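SCore provides checkpointing transparently at O/S level; this sketch shows the same idea done by hand at application level, so a restarted job resumes from its last saved step. Sizes, file names and the checkpoint interval are placeholders, and a real code must also ensure no messages are in flight when it checkpoints.

```c
/* Application-level checkpoint/restart: each rank periodically dumps
 * its state; on startup it resumes from a checkpoint if one exists. */
#include <mpi.h>
#include <stdio.h>

#define NSTEPS     1000
#define CHECKPOINT 100
#define NLOCAL     4096

int main(int argc, char **argv)
{
    double state[NLOCAL] = {0};
    char fname[64];
    int rank, step, start = 0;
    FILE *f;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* On restart, resume from the last checkpoint if one exists. */
    snprintf(fname, sizeof(fname), "ckpt.%04d", rank);
    if ((f = fopen(fname, "rb")) != NULL) {
        if (fread(&start, sizeof(int), 1, f) != 1 ||
            fread(state, sizeof(double), NLOCAL, f) != NLOCAL)
            start = 0;               /* corrupt checkpoint: start over */
        fclose(f);
    }

    for (step = start; step < NSTEPS; step++) {
        /* ... one step of the real computation ... */

        if ((step + 1) % CHECKPOINT == 0) {
            /* All ranks checkpoint at the same step so the saved
             * global state is consistent. */
            MPI_Barrier(MPI_COMM_WORLD);
            f = fopen(fname, "wb");
            if (f) {
                int next = step + 1;
                fwrite(&next, sizeof(int), 1, f);
                fwrite(state, sizeof(double), NLOCAL, f);
                fclose(f);
            }
        }
    }

    MPI_Finalize();
    return 0;
}
```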
Conclusions
• Commitment to near-term GRID objectives:
  • Turn-key clusters, farms and storage installations
  • Ongoing development of ‘GRID-enabled’ tools
  • Driven by existing commercial opportunities…
• ‘Blue-sky’ project for next-generation applications:
  • Exploits existing IP and advanced prototype
  • Expect moderate income from focussed exploitation
• Strategic positioning: existing paradigms will ultimately be a barrier to the success of (V-)MPP computers / clusters!