Vishal Sharda, Ashima Gupta • Comparing Topology-Based Collective Communication Algorithms
Work progress • No up-to-date open-source solution is available to build upon for fully mapping the network • Nomad, Cheops, Nagios, ENV, Argus, etc. are outdated • Use of a third-party service to determine the topology • Java applet to map the SERC network
Status of current study • Nearly all research on this problem so far considers tightly coupled systems. • Thus, most of the algorithms assume point-to-point connectivity. • So these algorithms have to be modified to suit a network of heterogeneous workstations.
All-to-all broadcast • Also known as multinode broadcast • Generalization of one-to-all broadcast in which all the processors simultaneously initiate a broadcast. • Different processors may send out different messages.
Existing Algorithms • Direct exchange • Circular all-to-all • E1 algorithm • Liquid schedule • Algorithms for specific topologies such as star and mesh.
Direct Exchange • Simplest approach • Assumes point-to-point connectivity • Each node simultaneously sends data to every other node • Involves a lot of congestion
Circular all-to-all • Let p be the number of nodes • For each node i, at step k in {1..p}, node i sends to (i+k) mod p and receives from (i-k+p) mod p
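The send and receive partners in the circular scheme can be sketched in a few lines of Java. This is only an illustration of the index arithmetic on the slide; the class name, method names, and the node count p = 4 are assumptions made for the example. Since (i+p) mod p = i, step k = p pairs each node with itself, so the loop below stops at p-1.

```java
final class CircularAllToAll {
    /** Destination of node i at step k among p nodes. */
    static int sendTo(int i, int k, int p) {
        return (i + k) % p;
    }

    /** Source that node i receives from at step k; adding p keeps the operand non-negative. */
    static int recvFrom(int i, int k, int p) {
        return (i - k + p) % p;
    }

    public static void main(String[] args) {
        int p = 4; // hypothetical node count, chosen only for the example
        for (int k = 1; k < p; k++) {
            for (int i = 0; i < p; i++) {
                System.out.printf("step %d: node %d sends to %d, receives from %d%n",
                        k, i, sendTo(i, k, p), recvFrom(i, k, p));
            }
        }
    }
}
```

At every step the two formulas are inverses of each other: the node that i sends to receives from i in that same step, so each step is a permutation with one send and one receive per node.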
E1 algorithm • One node receives messages from all other nodes (becomes an expert) • Experts are formed by recursively doubling existing experts.
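One way to read the doubling step is sketched below: once node 0 is an expert, every existing expert informs one non-expert per round, so the number of experts doubles until all p nodes hold every message. This is an interpretation of the bullets above, not the E1 algorithm as specified in the cited paper; the class name, the choice of node 0 as the first expert, and the pairing rule (expert e informs node e + experts) are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

final class ExpertDoubling {
    /**
     * Returns one list of {sender, receiver} pairs per round.
     * Nodes 0..experts-1 are experts at the start of each round.
     */
    static List<List<int[]>> schedule(int p) {
        List<List<int[]>> rounds = new ArrayList<>();
        int experts = 1; // assumption: node 0 has already gathered all messages
        while (experts < p) {
            List<int[]> round = new ArrayList<>();
            // Each expert e passes the full message set to node e + experts, if it exists.
            for (int e = 0; e < experts && e + experts < p; e++) {
                round.add(new int[] {e, e + experts});
            }
            rounds.add(round);
            experts = Math.min(p, experts * 2);
        }
        return rounds;
    }

    public static void main(String[] args) {
        List<List<int[]>> rounds = schedule(8); // hypothetical node count
        for (int r = 0; r < rounds.size(); r++) {
            for (int[] pair : rounds.get(r)) {
                System.out.printf("round %d: expert %d informs node %d%n", r + 1, pair[0], pair[1]);
            }
        }
    }
}
```

Under this reading the doubling phase takes ceil(log2 p) rounds, which is the usual argument for recursive doubling.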
Liquid schedule algorithm • The traffic is the set of all collective exchanges. • A simultaneous sub-traffic is a part of the traffic involving non-congesting transfers. • Identify the bottleneck links in the network. • A liquid schedule is one in which all the bottleneck links are utilized in every sub-traffic.
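The definitions above can be made concrete with a small sketch. Each transfer is modelled as the set of link IDs it occupies; two transfers congest if they share a link. The code greedily groups transfers into non-congesting sub-traffics and then checks the liquid condition. The greedy grouping is a stand-in chosen for brevity, not the construction from the cited paper, and it is not guaranteed to find a liquid schedule even when one exists. Treating the most heavily used links as the bottlenecks assumes equal-sized transfers; all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class LiquidScheduleSketch {
    /** Links carrying the largest number of transfers (assumes equal-sized transfers). */
    static Set<Integer> bottlenecks(List<Set<Integer>> transfers) {
        Map<Integer, Integer> load = new HashMap<>();
        for (Set<Integer> t : transfers) {
            for (int link : t) {
                load.merge(link, 1, Integer::sum);
            }
        }
        int max = 0;
        for (int v : load.values()) {
            max = Math.max(max, v);
        }
        Set<Integer> result = new HashSet<>();
        for (Map.Entry<Integer, Integer> e : load.entrySet()) {
            if (e.getValue() == max) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    /** Places each transfer in the first sub-traffic whose links it does not share. */
    static List<List<Set<Integer>>> greedyPartition(List<Set<Integer>> transfers) {
        List<List<Set<Integer>>> subTraffics = new ArrayList<>();
        List<Set<Integer>> usedLinks = new ArrayList<>();
        for (Set<Integer> t : transfers) {
            int slot = 0;
            while (slot < subTraffics.size() && !Collections.disjoint(usedLinks.get(slot), t)) {
                slot++;
            }
            if (slot == subTraffics.size()) {
                subTraffics.add(new ArrayList<>());
                usedLinks.add(new HashSet<>());
            }
            subTraffics.get(slot).add(t);
            usedLinks.get(slot).addAll(t);
        }
        return subTraffics;
    }

    /** True if every sub-traffic uses every bottleneck link. */
    static boolean isLiquid(List<List<Set<Integer>>> subTraffics, Set<Integer> bottlenecks) {
        for (List<Set<Integer>> sub : subTraffics) {
            Set<Integer> used = new HashSet<>();
            for (Set<Integer> t : sub) {
                used.addAll(t);
            }
            if (!used.containsAll(bottlenecks)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Hypothetical traffic over links 1, 2, 3.
        List<Set<Integer>> transfers = List.of(Set.of(1, 2), Set.of(2, 3), Set.of(1), Set.of(3));
        List<List<Set<Integer>>> schedule = greedyPartition(transfers);
        System.out.println("sub-traffics: " + schedule.size());
        System.out.println("liquid: " + isLiquid(schedule, bottlenecks(transfers)));
    }
}
```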
Need for Simulation • Forcing a packet to follow a particular route requires bypassing router decisions • This needs DLL operations. • The focus is on comparing the algorithms • Incorporate a simulation-based study, as in standard experiments.
Basis of Simulation • Execute and compare the algorithms for the network specified at the interface. • Some algorithms will take the bandwidth of the links into account. • Depending on the input, an algorithm may or may not show good results.
Implementation Approach • Network represented as a weighted undirected graph with weights inversely proportional to bandwidth. • Simulating n processes on different nodes in a network with Java threads. • First, each node computes the shortest path to all other nodes using a single-source shortest-path algorithm.
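A minimal sketch of the shortest-path step follows, assuming Dijkstra's algorithm as the single-source method (the slides do not name one) and an edge weight of exactly 1/bandwidth. The symmetric bandwidth matrix, in which 0 means "no link", and the class name are assumptions made for the example.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

final class ShortestPaths {
    /**
     * Dijkstra's algorithm on an undirected graph given as a symmetric bandwidth matrix.
     * bandwidth[u][v] <= 0 means no link; the edge weight is 1 / bandwidth[u][v].
     */
    static double[] dijkstra(double[][] bandwidth, int source) {
        int n = bandwidth.length;
        double[] dist = new double[n];
        Arrays.fill(dist, Double.POSITIVE_INFINITY);
        dist[source] = 0.0;

        // Queue entries are {distance, node}, ordered by distance.
        PriorityQueue<double[]> queue = new PriorityQueue<>((a, b) -> Double.compare(a[0], b[0]));
        queue.add(new double[] {0.0, source});

        while (!queue.isEmpty()) {
            double[] top = queue.poll();
            int u = (int) top[1];
            if (top[0] > dist[u]) {
                continue; // stale entry: a shorter path to u was already settled
            }
            for (int v = 0; v < n; v++) {
                if (bandwidth[u][v] <= 0) {
                    continue;
                }
                double candidate = dist[u] + 1.0 / bandwidth[u][v];
                if (candidate < dist[v]) {
                    dist[v] = candidate;
                    queue.add(new double[] {candidate, v});
                }
            }
        }
        return dist;
    }

    public static void main(String[] args) {
        // Hypothetical 3-node network: fast links 0-1 and 1-2, slow direct link 0-2.
        double[][] bandwidth = {
            {0, 10, 2},
            {10, 0, 10},
            {2, 10, 0}
        };
        System.out.println(Arrays.toString(dijkstra(bandwidth, 0)));
    }
}
```

In the example the two-hop route from node 0 to node 2 over the fast links costs 0.1 + 0.1, which beats the direct slow link at 0.5, so inverse-bandwidth weights steer traffic onto higher-capacity paths.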
Handling Collision • All-to-all broadcast involves a lot of collisions. • Several approaches are possible, such as partitioning into subnets or choosing an alternate link. • Our approach is to stick to the shortest link and, if a collision is detected, wait for a random time and sense again.
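The wait-and-sense-again policy can be sketched with Java threads, in keeping with the simulation approach. Here a shared link is modelled as a busy flag: a sender that finds the link occupied backs off for a random interval and senses again. This is a simplified model under stated assumptions, not the authors' simulator; the class name, the 1 ms transmission time, and the backoff bound are all hypothetical.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

final class SharedLink {
    private final AtomicBoolean busy = new AtomicBoolean(false);
    final AtomicInteger delivered = new AtomicInteger();
    final AtomicInteger retries = new AtomicInteger();

    /** Senses the link; on contention, waits a random time (1..maxBackoffMillis ms) and senses again. */
    void send(int maxBackoffMillis) throws InterruptedException {
        while (!busy.compareAndSet(false, true)) {
            retries.incrementAndGet();
            Thread.sleep(ThreadLocalRandom.current().nextInt(1, maxBackoffMillis + 1));
        }
        try {
            Thread.sleep(1); // assumed transmission time
            delivered.incrementAndGet();
        } finally {
            busy.set(false); // release the link for the next sender
        }
    }

    /** Starts one thread per sender on a single link and returns the number of delivered messages. */
    static int simulate(int senders, int maxBackoffMillis) {
        SharedLink link = new SharedLink();
        Thread[] threads = new Thread[senders];
        for (int i = 0; i < senders; i++) {
            threads[i] = new Thread(() -> {
                try {
                    link.send(maxBackoffMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for senders", e);
            }
        }
        return link.delivered.get();
    }

    public static void main(String[] args) {
        System.out.println("delivered: " + simulate(8, 5));
    }
}
```

Random backoff keeps contending senders from retrying in lockstep, at the cost of some idle link time; the retries counter gives a rough measure of how much contention a given algorithm generates on a link.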
References • Papers: • "ECO: Efficient Collective Operations for Communication on Heterogeneous Networks", Bruce B. Lowekamp and Adam Beguelin. • "Network Topology Aware Scheduling of Collective Communications", Emin Gabrielyan and Roger D. Hersch. • "On General Results for All-to-All Broadcast", Ming-Syan Chen et al. • "Efficient All-to-All Broadcast in Star Graph Interconnection Networks", Yu-Chee Tseng et al. • Websites: • freemap.qualys.com