Communication Pattern Based Node Selection for Shared Networks
Srikanth Goteti, Interactive Data Corp
Jaspal Subhlok, University of Houston
AMS Symposium 2003
Resource Selection for Network/Grid Applications
[Figure: an application with components Model, Data, GUI, Sim 1, Pre and Stream, to be mapped onto a network]
Where on the network does the application get the best performance?
Current Approaches to Node Selection
• Measure and model network properties, such as available bandwidth and CPU loads (with tools like NWS)
• Find the “best” nodes for execution based on network status
• But application performance predicted from measured network status may not be accurate:
  • it depends on application characteristics
  • measurements must be translated, e.g., unused bandwidth versus expected throughput
  • data may be stale, as frequent measurements are expensive
Performance Skeleton
A performance skeleton is a synthetic, short-running program whose execution characteristics mirror those of the application it represents.
An application and its skeleton have similar:
• communication pattern
• synchronization pattern
• CPU usage
• memory usage
Goal: the performance of a skeleton is directly related to the performance of the application under any condition
• e.g., a skeleton executes in 0.1% of the time the application takes to execute on any part of a shared network
Node Selection with Performance Skeletons
[Figure: application components Model, Data, GUI, Sim 1, Pre and Stream]
• Construct a skeleton for the application of interest
• Select candidate node sets based on network status
• Execute the skeleton on them
• Select the node set with the best skeleton performance to schedule the actual application
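The selection steps on this slide can be sketched as a small loop. This is a minimal illustration, not the authors' implementation: `pick_best_node_set` and the `run_skeleton` callback are hypothetical names, and the callback is assumed to launch the skeleton on a node set and return its measured runtime in seconds.

```python
from typing import Callable, Dict, Iterable, Sequence, Tuple

NodeSet = Tuple[str, ...]


def pick_best_node_set(
    candidates: Iterable[Sequence[str]],
    run_skeleton: Callable[[NodeSet], float],
) -> Tuple[NodeSet, Dict[NodeSet, float]]:
    """Run the skeleton once on every candidate and return the fastest one.

    run_skeleton is a hypothetical launcher: it executes the skeleton on the
    given nodes and returns the measured runtime in seconds.
    """
    timings: Dict[NodeSet, float] = {}
    for nodes in candidates:
        node_set = tuple(nodes)
        timings[node_set] = run_skeleton(node_set)
    if not timings:
        raise ValueError("at least one candidate node set is required")
    # The node set with the shortest skeleton runtime is used for the real run.
    best = min(timings, key=timings.get)
    return best, timings
```

Because the skeleton is short running, measuring it on a handful of candidates is cheap compared with one run of the full application.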
Node Selection Procedure
• Construct a performance skeleton
  • mostly by hand in this paper; automation is the subject of ongoing work
• Select candidate node sets
  • identify the communication graph of the application
  • typically a chain, ring or all-all structure
  • obtain the available bandwidth between nodes with NWS (Network Weather Service) and build a graph
  • select nodes to “maximize the minimum available bandwidth” between pairs of communicating nodes
  • these are the best possible node sets based on application structure and network status
• Execute the skeleton on each candidate node set
• Select the node set with the best skeleton performance and map one process to each node
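The "maximize the minimum available bandwidth" step can be illustrated with a brute-force search. This is a sketch under stated assumptions, not the algorithm used in the paper: the function names are hypothetical, the bandwidth table is assumed symmetric and keyed by unordered node pairs, and exhaustive enumeration is only practical for small clusters.

```python
from itertools import permutations
from typing import Dict, FrozenSet, List, Sequence, Tuple

Bandwidth = Dict[FrozenSet[str], float]  # available bandwidth per node pair
Edge = Tuple[int, int]                   # pair of communicating process ranks


def pattern_edges(kind: str, n: int) -> List[Edge]:
    """Edges of the application's communication graph over ranks 0..n-1."""
    if kind == "chain":
        return [(i, i + 1) for i in range(n - 1)]
    if kind == "ring":
        return [(i, (i + 1) % n) for i in range(n)]
    if kind == "all-all":
        return [(i, j) for i in range(n) for j in range(i + 1, n)]
    raise ValueError(f"unknown pattern: {kind}")


def bottleneck(mapping: Sequence[str], edges: List[Edge], bw: Bandwidth) -> float:
    """Minimum available bandwidth over all communicating process pairs."""
    return min(bw[frozenset((mapping[i], mapping[j]))] for i, j in edges)


def candidate_node_sets(
    nodes: Sequence[str], edges: List[Edge], n_procs: int, bw: Bandwidth, top: int = 3
) -> List[Tuple[float, Tuple[str, ...]]]:
    """Return the `top` rank-to-node mappings with the largest bottleneck."""
    scored = [
        (bottleneck(mapping, edges, bw), mapping)
        for mapping in permutations(nodes, n_procs)
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top]
```

Symmetric mappings (for example a chain and its reverse) score identically, so a real selector would likely deduplicate them before running the skeleton on each candidate.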
Communication Structure of NAS Benchmarks
[Figure: communication graphs on four nodes (0 to 3) for the CG, BT, IS, LU, MG, SP and EP benchmarks]
Validation Experiments
The best nodes to execute the benchmarks are selected by each of the following methods:
• skeleton based: the full framework discussed
• all to all: based on maximizing the minimum available bandwidth between all pairs of selected nodes on the network graph
• random
The performance of the application on the nodes selected by each of these procedures is then compared on a busy network.
• Experiments are repeated a large number of times to get statistically meaningful results
Experimental Framework
• Linux cluster of 10 dual-CPU 1.7 GHz Pentium nodes connected by 100 Mbps links and a crossbar switch
• Experiments with the Class B NAS MPI benchmark suite
• Class W NAS benchmarks (average runtime ~1.5 seconds on our cluster) used as skeletons for the Class B benchmarks
• Available bandwidth between nodes is varied with Linux iproute2 for the duration of the experiments as follows:
  • the path between a pair of nodes is “shared” by S streams
  • i.e., available bandwidth is set to 1/S of peak
  • one stream is randomly added to or removed from the cluster every 30 seconds
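The traffic model on this slide can be simulated in a few lines. This is a sketch only: it assumes S = 1 means a path carries no competing traffic, it caps S at a hypothetical `max_streams`, and it leaves out the iproute2 commands that would enforce the limits on a real cluster. All names are illustrative.

```python
import random
from itertools import combinations
from typing import Dict, Sequence, Tuple

Path = Tuple[str, str]
STEP_SECONDS = 30  # interval between traffic changes, from the slide


def initial_paths(nodes: Sequence[str]) -> Dict[Path, int]:
    """Every node pair starts with S = 1, i.e. the full peak bandwidth."""
    return {pair: 1 for pair in combinations(nodes, 2)}


def available_bandwidth(streams: int, peak: float = 100.0) -> float:
    """A path shared by S streams gets 1/S of peak (same units as `peak`)."""
    if streams < 1:
        raise ValueError("a path is shared by at least one stream")
    return peak / streams


def step(streams_per_path: Dict[Path, int], rng: random.Random, max_streams: int = 4) -> Path:
    """Add or remove one stream on one randomly chosen path, in place.

    max_streams is an assumed cap; the slides do not state one.
    """
    path = rng.choice(sorted(streams_per_path))
    current = streams_per_path[path]
    if current <= 1:
        delta = 1            # nothing to remove, so a stream is added
    elif current >= max_streams:
        delta = -1           # at the cap, so a stream is removed
    else:
        delta = rng.choice((-1, 1))
    streams_per_path[path] = current + delta
    return path
```

Calling `step` once per `STEP_SECONDS` and feeding `available_bandwidth` into a rate limiter would reproduce the changing load that the node selection methods have to cope with.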
Performance Results: slowdown due to network traffic
[Figure: CG communication graph on nodes 0 to 3]
• Skeleton-based selection has an average slowdown of 20%, versus 40% for random and 27% for all to all
• Significant variation across benchmarks; the most benefit is for CG, which is communication heavy and uses only 3 links
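The slides do not define slowdown. The sketch below assumes it is the percentage increase in runtime relative to a run on an unloaded network. The function names are hypothetical and no measured data is used.

```python
def slowdown_percent(loaded_seconds: float, unloaded_seconds: float) -> float:
    """Percentage increase in runtime caused by network traffic (assumed definition)."""
    if unloaded_seconds <= 0:
        raise ValueError("unloaded runtime must be positive")
    return 100.0 * (loaded_seconds - unloaded_seconds) / unloaded_seconds


def relative_runtime(slowdown_a: float, slowdown_b: float) -> float:
    """Runtime under method A divided by runtime under method B,
    for one job with the same unloaded runtime under both methods."""
    return (100.0 + slowdown_a) / (100.0 + slowdown_b)
```

Under that assumption, a single job with exactly the reported average slowdowns would take roughly 14% less time with skeleton-based selection than with random selection (`relative_runtime(20, 40)`), and roughly 5.5% less than with all-to-all selection (`relative_runtime(20, 27)`).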
Conclusions
• Performance skeletons have a role in resource management for grids
  • they remove limitations of NWS-type systems (the "what you measure versus what you get" problem)
• A lot more experimentation is needed to establish and validate the concepts
• Automatic construction of performance skeletons is a major open challenge
• Skeletons may have other uses, such as a fast way of estimating the performance of an application
  • e.g., on a slow simulated future system