Non-Minimal Routing
• Non-minimal routing
  • With wormhole switching it degrades performance, while virtual cut-through (VCT) has fewer secondary effects
  • Fault tolerance is the main motivator
• Classes
  • Search-based algorithms
  • Virtual channel-based routing
  • Turn-based routing
Reading
• Section 4.7 and/or
• P.T. Gaughan, et al., “Distributed, deadlock-free routing in faulty, pipelined, direct interconnection networks,” IEEE Transactions on Computers, vol. 45, no. 6, pp. 651-665, June 1996
• A. Mejia, J. Flich, J. Duato, Sven-Arne Reinemo and Tor Skeie, “Segment Based Routing: An Efficient Fault-Tolerant Routing Algorithm for Meshes and Tori,” Proceedings of the International Parallel and Distributed Processing Symposium, April 2006
• J. Flich, A. Mejia, P. Lopez, and J. Duato, “Region-Based Routing: An Efficient Routing Mechanism to Tackle Unreliable Hardware in Networks on Chip,” Proceedings of the First International Symposium on Networks on Chip, May 2007
Backtracking Protocols
• Backtracking search + resource reservation
• Constrain the search
  • Minimal paths vs. number of misroutes
P.T. Gaughan, et al., “Distributed, deadlock-free routing in faulty, pipelined, direct interconnection networks,” IEEE Transactions on Computers, vol. 45, no. 6, pp. 651-665, June 1996
Optimization
• Sensitive to the choice of switching technique
  • Naturally suited to circuit switching and pipelined circuit switching
  • Overhead is large with store-and-forward (SAF)
• Deadlock is avoided by not blocking on busy channels
• Livelock is avoided by maintaining and using search history
  • In the header: large headers
  • In the routers: local state, headers comparable to e-cube
• Protocol variations
  • Multi-links
  • k-family
  • Exhaustive: profitable and misrouting
  • Limited misrouting
  • Multi-phase
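The backtracking idea above can be sketched in a few lines. This is a minimal illustration, not the protocol from the Gaughan et al. paper: it assumes a 2-D mesh, models the probe as a depth-first search that never waits on an unusable link, and uses a per-node record of the lowest misroute count seen so far as the "search history" that prevents livelock. The names `backtracking_route`, `faulty`, and `max_misroutes` are hypothetical.

```python
from typing import Dict, FrozenSet, Iterator, List, Optional, Set, Tuple

Node = Tuple[int, int]
Link = FrozenSet[Node]  # undirected link, e.g. frozenset({(0, 0), (1, 0)})


def neighbors(node: Node, width: int, height: int) -> Iterator[Node]:
    """Yield the mesh neighbours of a node."""
    x, y = node
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < width and 0 <= ny < height:
            yield (nx, ny)


def distance(a: Node, b: Node) -> int:
    """Manhattan distance between two mesh nodes."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def backtracking_route(src: Node, dst: Node, width: int, height: int,
                       faulty: Set[Link], max_misroutes: int
                       ) -> Optional[List[Node]]:
    """Search for a path from src to dst with a bounded number of misroutes."""
    path: List[Node] = [src]          # links currently reserved by the probe
    best: Dict[Node, int] = {src: 0}  # search history: fewest misroutes per node

    def probe(node: Node, misroutes: int) -> bool:
        if node == dst:
            return True
        # Try profitable (distance-reducing) hops before misrouting hops.
        candidates = sorted(neighbors(node, width, height),
                            key=lambda n: distance(n, dst))
        for nxt in candidates:
            if frozenset((node, nxt)) in faulty:
                continue  # never block on an unusable link, try another one
            cost = 0 if distance(nxt, dst) < distance(node, dst) else 1
            used = misroutes + cost
            if used > max_misroutes or best.get(nxt, used + 1) <= used:
                continue  # over budget, or already explored at least as cheaply
            best[nxt] = used
            path.append(nxt)
            if probe(nxt, used):
                return True
            path.pop()    # backtrack and release the reserved link
        return False

    return path if probe(src, 0) else None


# 3x2 mesh with the link (1,0)-(2,0) failed: one misroute is needed.
failed = {frozenset({(1, 0), (2, 0)})}
print(backtracking_route((0, 0), (2, 0), 3, 2, failed, max_misroutes=1))
```

With `max_misroutes=0` the same call returns `None`, which illustrates the trade-off between minimal paths and the number of misroutes allowed.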
Topology Agnostic Routing
• Topology-dependent vs. topology-agnostic routing
• Reliability
  • Increasingly important on-chip
• Approaches
  • Techniques based on virtual channels
    • Expensive on-chip
    • Compete with QoS schemes
  • Techniques based on turn restrictions
    • Difficult to ensure non-minimal paths
Segment Based Routing
• Topology-agnostic routing
• Restriction-based approach
  • Multiple restriction options
  • Select restrictions based on performance goals
• Source-based routing
• Routing table generation
From A. Mejia, J. Flich, J. Duato, Sven-Arne Reinemo and Tor Skeie, “Segment Based Routing: An Efficient Fault-Tolerant Routing Algorithm for Meshes and Tori,” Proceedings of the International Parallel and Distributed Processing Symposium, April 2006.
Key Idea: Segments & Subnets
• Partition the topology into subnets, and then each subnet into segments
• Goal: islands of regularity
Key Idea: Optimization
• Placement of turn restrictions in a segment
  • Placement for latency: shortest paths
  • Placement for throughput: distribute traffic
• Topology segmentation
  • Can be optimized for regular topologies
Requirements
• Avoid deadlock in a segment
• Avoid deadlock when traversing multiple segments
• Ensure routing connectivity when physical connectivity exists
• Avoid congestion in path construction
Construction of Segments
• Search for the starting segment + “regular” segments
• Unitary segments
• Add one bidirectional restriction in each segment
[Figure labels: starting node, terminal node, failed links, unitary segment, bridge segment]
Segment Types
• Starting segment
• Regular segment
• Unitary segment
[Figure labels: starting node, terminal node, failed links, unitary segment, bridge segment]
Deadlock Freedom
• One routing restriction per segment
  • No cycles in a segment
• Every cycle contains a segment
  • Hence cannot be “closed” to create deadlock
• No cycle from the start node back to itself
• Cannot create cycles across subnets
• Think of a subnet as a union of 1-D segments
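The argument above can be checked mechanically on a small case. The sketch below is an illustration rather than the segment-based routing algorithm itself: it builds a channel dependency graph from a list of links and a set of forbidden turns, then tests it for cycles with a depth-first search. The four-node ring and the helper names (`channel_dependency_graph`, `has_cycle`) are assumptions made for the example.

```python
from typing import Dict, Iterable, List, Set, Tuple

Channel = Tuple[str, str]    # directed channel (from_node, to_node)
Turn = Tuple[str, str, str]  # (previous node, turning node, next node)


def channel_dependency_graph(links: Iterable[Tuple[str, str]],
                             forbidden: Set[Turn]
                             ) -> Dict[Channel, List[Channel]]:
    """Channel (a, b) depends on (b, d) unless the turn a->b->d is forbidden."""
    links = list(links)
    channels = links + [(b, a) for a, b in links]  # both directions
    deps: Dict[Channel, List[Channel]] = {c: [] for c in channels}
    for a, b in channels:
        for c, d in channels:
            # Skip U-turns (d == a) and forbidden turns.
            if b == c and d != a and (a, b, d) not in forbidden:
                deps[(a, b)].append((c, d))
    return deps


def has_cycle(deps: Dict[Channel, List[Channel]]) -> bool:
    """Three-colour depth-first search for a cycle in the dependency graph."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {c: WHITE for c in deps}

    def visit(c: Channel) -> bool:
        colour[c] = GREY
        for n in deps[c]:
            if colour[n] == GREY:
                return True  # back edge: a cyclic dependency exists
            if colour[n] == WHITE and visit(n):
                return True
        colour[c] = BLACK
        return False

    return any(colour[c] == WHITE and visit(c) for c in deps)


# A four-node ring: one physical cycle, closed by a single segment.
ring = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]

print(has_cycle(channel_dependency_graph(ring, set())))  # unrestricted ring

# One bidirectional turn restriction placed at node C.
restriction = {("B", "C", "D"), ("D", "C", "B")}
print(has_cycle(channel_dependency_graph(ring, restriction)))
```

In this small case the unrestricted ring has a cyclic channel dependency, and a single bidirectional restriction is enough to break it in both directions of travel.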
Example
[Figure] From A. Mejia, J. Flich, J. Duato, “On The Potential of Segment Based Routing,” Proceedings of the International Conference on Parallel Processing, 2008
Routing
• Segment routing is turn-based and therefore partially adaptive
• Source routing can be layered on top of segments to balance traffic
From A. Mejia, J. Flich, J. Duato, “On The Potential of Segment Based Routing,” Proceedings of the International Conference on Parallel Processing, 2008
Performance
[Figure] From A. Mejia, J. Flich, J. Duato, Sven-Arne Reinemo and Tor Skeie, “Segment Based Routing: An Efficient Fault-Tolerant Routing Algorithm for Meshes and Tori,” Proceedings of the International Parallel and Distributed Processing Symposium, April 2006.
Region-Based Routing
• Recognize that routing decisions implicitly check for region membership
  • Think meshes
• Generalize the idea of regions
• Can naturally be adapted for fault-tolerant routing
[Figure labels: S, D]
From J. Flich, A. Mejia, P. Lopez, and J. Duato, “Region-Based Routing: An Efficient Routing Mechanism to Tackle Unreliable Hardware in Networks on Chip,” Proceedings of the First International Symposium on Networks on Chip, May 2007
Example of Regions
• Static, off-line topology characterization
• Online querying of network structure
• Built on segment-based routing
[Figure labels: {node set}, {node set}]
From J. Flich, A. Mejia, P. Lopez, and J. Duato, “Region-Based Routing: An Efficient Routing Mechanism to Tackle Unreliable Hardware in Networks on Chip,” Proceedings of the First International Symposium on Networks on Chip, May 2007
Example of Regions
• What are the characteristics of these regions?
• Note: number of regions = f(routing options)
• Note: the use of an output port depends on the input port
  • For example, check the W output port from the N input port
[Figure labels: {node set}, {node set}]
From J. Flich, A. Mejia, P. Lopez, and J. Duato, “Region-Based Routing: An Efficient Routing Mechanism to Tackle Unreliable Hardware in Networks on Chip,” Proceedings of the First International Symposium on Networks on Chip, May 2007
Approach
• Observe that table-based routing is really region-based
  • Each entry identifies a region
• Merge entries into compact region specifications at each switch
• Region construction is based on the paths
  • Any set of paths can be used, which enables fault-tolerant routing
• No virtual channels
Key Idea
• Generate paths
  • All minimal
  • First non-minimal
  • Note: using SR routing
• Record paths at each router
  • Produce region representation for each output port
  • Record input port dependencies
• Program routers
From J. Flich, A. Mejia, P. Lopez, and J. Duato, “Region-Based Routing: An Efficient Routing Mechanism to Tackle Unreliable Hardware in Networks on Chip,” Proceedings of the First International Symposium on Networks on Chip, May 2007
Creating Regions
• Coalesce routing options based on inputs and outputs
• The result represents a compact routing table
From J. Flich, A. Mejia, P. Lopez, and J. Duato, “Region-Based Routing: An Efficient Routing Mechanism to Tackle Unreliable Hardware in Networks on Chip,” Proceedings of the First International Symposium on Networks on Chip, May 2007
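To make the coalescing step concrete, here is a small sketch that packs a per-destination routing table into rectangular regions. It is not the packing procedure from the Flich et al. paper: it assumes a 2-D mesh, a single output port per destination, and a simple greedy merge (horizontal runs first, then stacking identical runs on adjacent rows), and it ignores input-port dependencies. The function name `coalesce` and the region tuple layout are hypothetical.

```python
from typing import Dict, List, Tuple

Dest = Tuple[int, int]
Region = Tuple[str, int, int, int, int]  # (out_port, x0, y0, x1, y1)


def coalesce(table: Dict[Dest, str]) -> List[Region]:
    """Greedily merge per-destination entries into rectangular regions."""
    # Step 1: horizontal runs of adjacent destinations with the same port.
    runs = []  # (port, x0, x1, y)
    for y in sorted({dy for _, dy in table}):
        xs = sorted(x for (x, dy) in table if dy == y)
        start = prev = xs[0]
        for x in xs[1:] + [None]:  # None is a sentinel that closes the last run
            if x is None or x != prev + 1 or table[(x, y)] != table[(start, y)]:
                runs.append((table[(start, y)], start, prev, y))
                start = x
            prev = x
    # Step 2: stack runs with the same port and x-extent on adjacent rows.
    regions: List[list] = []
    latest: Dict[Tuple[str, int, int], list] = {}
    for port, x0, x1, y in runs:
        key = (port, x0, x1)
        region = latest.get(key)
        if region is not None and region[4] == y - 1:
            region[4] = y  # extend the rectangle down by one row
        else:
            region = [port, x0, y, x1, y]
            regions.append(region)
            latest[key] = region
    return [tuple(r) for r in regions]


# Routing table of the router at (1, 1) in a 4x4 mesh under XY routing.
# Assumption for the example: y grows towards the south.
table = {}
for x in range(4):
    for y in range(4):
        if (x, y) == (1, 1):
            continue  # local node, no table entry
        table[(x, y)] = ("E" if x > 1 else "W" if x < 1
                         else "S" if y > 1 else "N")

for region in coalesce(table):
    print(region)
```

In this example 15 table entries collapse into four regions, one per output port, which is the sense in which a region set is a compact routing table.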
Region Construction
• Note this can be applied to any topology
• No virtual channels
• Offline optimization of latency vs. distance
[Figure labels: Segment Routing, Search, Coalesce & Packing, Region Formation, “@each node” (twice)]
From J. Flich, A. Mejia, P. Lopez, and J. Duato, “Region-Based Routing: An Efficient Routing Mechanism to Tackle Unreliable Hardware in Networks on Chip,” Proceedings of the First International Symposium on Networks on Chip, May 2007
Hardware Overheads
• Each region requires
  • Four registers that define the region
  • Mask registers to define input and output ports
  • Logic to determine routing options
• Hardware cost grows with the number of regions
• Growth as f(network_size) is much slower
From J. Flich, A. Mejia, P. Lopez, and J. Duato, “Region-Based Routing: An Efficient Routing Mechanism to Tackle Unreliable Hardware in Networks on Chip,” Proceedings of the First International Symposium on Networks on Chip, May 2007
Implementation
• Initialization of region registers and parallel evaluation of all regions
From J. Flich, A. Mejia, P. Lopez, and J. Duato, “Region-Based Routing: An Efficient Routing Mechanism to Tackle Unreliable Hardware in Networks on Chip,” Proceedings of the First International Symposium on Networks on Chip, May 2007
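A software model of the per-region check may help here. The sketch below assumes that the four region registers hold the corners of a rectangular bounding box and that the input and output masks are one-hot port bit vectors; both are assumptions for illustration, and the loop stands in for what the hardware would evaluate for all regions in parallel. `Region`, `PORTS`, and `routing_options` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

# One bit per port (assumed encoding): North, East, South, West, Local.
PORTS = {"N": 1, "E": 2, "S": 4, "W": 8, "L": 16}


@dataclass(frozen=True)
class Region:
    x_min: int     # four registers: the corners of the region
    y_min: int
    x_max: int
    y_max: int
    in_mask: int   # input ports for which this region applies
    out_mask: int  # output ports allowed for destinations in the region


def routing_options(regions: Iterable[Region], dest: Tuple[int, int],
                    in_port: str) -> int:
    """Return the bit mask of output ports allowed for this packet."""
    x, y = dest
    options = 0
    for r in regions:  # evaluated in parallel in hardware
        inside = r.x_min <= x <= r.x_max and r.y_min <= y <= r.y_max
        if inside and (r.in_mask & PORTS[in_port]):
            options |= r.out_mask
    return options


# Destinations in column 0 may use the W output when the packet
# arrived on the N, E, or local input, but not from S.
regions = [Region(0, 0, 0, 3,
                  in_mask=PORTS["N"] | PORTS["E"] | PORTS["L"],
                  out_mask=PORTS["W"])]

print(routing_options(regions, (0, 2), "N") == PORTS["W"])
print(routing_options(regions, (0, 2), "S") == 0)
```

The result is a set of candidate output ports, so a selection function is still needed when more than one bit is set.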
Microarchitecture Issues
• Routing algorithm performance is sensitive to the resource allocation schemes in the router
• Key resource management functions include
  • Routing function
  • Selection function
  • Arbitration/scheduling
• A mismatch can lead to poor performance
Resource Allocation: Selection Functions
• The selection function may be oblivious or informed
  • Common to favor minimal paths and lightly loaded links
• Examples
  • Meshes: minimum congestion, maximum flexibility, straight lines
• Unlike routing functions, the selection function must be serialized
  • The result updates the channel status, a centralized resource
[Figure labels: input VC, VC buffer, routing function, selection function, VC status/control, output VCs]
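An informed selection function of the kind described above might look like the following sketch. It assumes the routing function has already produced a list of candidate output ports, each flagged as minimal or not, and that a per-port count of free downstream buffers is available as the load estimate; `select_output` and `credits` are hypothetical names.

```python
from typing import Dict, List, Optional, Tuple


def select_output(candidates: List[Tuple[str, bool]],
                  credits: Dict[str, int]) -> Optional[str]:
    """Pick one output port from the routing function's candidates.

    candidates: (port, is_minimal) pairs produced by the routing function.
    credits:    free downstream buffers per port (more free = lighter load).
    """
    usable = [(port, minimal) for port, minimal in candidates
              if credits.get(port, 0) > 0]
    if not usable:
        return None  # nothing available this cycle
    # Prefer minimal ports first, then the most lightly loaded port.
    port, _ = max(usable, key=lambda c: (c[1], credits[c[0]]))
    return port


candidates = [("E", True), ("N", True), ("W", False)]
print(select_output(candidates, {"E": 1, "N": 3, "W": 4}))  # minimal, least loaded
print(select_output(candidates, {"E": 0, "N": 0, "W": 4}))  # falls back to a misroute
```

Because the chosen port changes the shared channel status, calls like this would have to be serialized across inputs in a real router, which is the cost the slide points out.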
Selection Functions
• Favor adaptive channels
  • Improves the probability of escape channel availability
• Time-dependent selection functions: give adaptivity a chance
• Selection functions for real-time traffic
  • Separate best-effort and guaranteed packets via VCs or virtual networks
  • Note the impact on bisection utilization
• Selection functions for cache-coherent systems?
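One way to read "give adaptivity a chance" is as a waiting threshold before a packet falls back to an escape channel. The sketch below is an interpretation under that assumption, not a scheme taken from the slides; `select_vc`, `waited_cycles`, and `patience` are hypothetical names, and `patience` must be finite so the escape path remains reachable.

```python
from typing import List, Optional


def select_vc(adaptive_free: List[int], escape_free: List[int],
              waited_cycles: int, patience: int) -> Optional[int]:
    """Choose a virtual channel id, preferring adaptive channels.

    adaptive_free / escape_free: ids of currently free VCs of each class.
    waited_cycles: how long the header has been waiting at this router.
    patience: cycles to wait for an adaptive VC before using an escape VC.
    """
    if adaptive_free:
        return adaptive_free[0]  # always favor adaptive channels
    if escape_free and waited_cycles >= patience:
        return escape_free[0]    # fall back only after waiting long enough
    return None                  # keep waiting this cycle


print(select_vc([2, 3], [0], waited_cycles=0, patience=4))
print(select_vc([], [0], waited_cycles=1, patience=4))
print(select_vc([], [0], waited_cycles=4, patience=4))
```

Holding packets off the escape channels for a few cycles keeps those channels free more often, at the price of some added latency when the adaptive channels stay busy.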
Resource Allocation: Arbitration
• Tradeoffs: channel bandwidth vs. message sizes and types
• Mix of buffering strategies across message types
• All three strategies must be co-designed for a tuned system
[Figure labels: arbitration, message size, flow control]
Routing, Selection & Arbitration
• Input-driven vs. output-driven scheduling
  • Output-driven scheduling requires replication of routers amongst inputs
• Lessons from the microprocessor world
  • Impact of complexity, workloads, and concurrency
• Impact of…
  • Symmetry of the topology
  • Locality of traffic
  • Packet size
[Figure labels: locality, uniformity; deterministic routing; adaptive routing; irregular, hot spot]
Characterization of Techniques
• Deadlock freedom achieved by
  • Path-based techniques
    • Restrict paths
  • Buffer-based techniques
    • Structured buffer pools
  • Channel-based techniques
    • #VCs independent of the network
Summary
• The best routing algorithm is driven by multiple considerations
  • Hot spots
  • Deterministic vs. adaptive
  • Packet sizes
  • Compatible micro-architecture
  • Locality of traffic
  • Uniform vs. non-uniform traffic
  • Power envelope
  • On-chip vs. off-chip
  • Symmetric vs. asymmetric topology