On-Chip Networks and Testing
Basic Network Architectures
• Shared Medium: Buses
• Direct Networks (more follows)
• Indirect Interconnection Networks: a network of switches connecting all the nodes, e.g. Omega networks (not discussed further here)
• Hybrid: Multiple and Hierarchical Buses, FPGAs (e.g. Virtex II)
Buses
• All communicating devices time-share a common transmission medium.
• Broadcast is possible because every bus device can listen to all communication.
• Bus arbitration is needed to resolve simultaneous requests for the bus.
• Arbitration can be centralized or distributed.
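Centralized arbitration can be illustrated with a small sketch. The slides do not name an arbitration policy; round-robin is assumed here purely for illustration, and the function name round_robin_arbiter is hypothetical.

```python
def round_robin_arbiter(requests, last_granted):
    """Return the index of the master granted the bus, or None if no one asks.

    requests:     list of booleans, one per bus master (True = requesting).
    last_granted: index of the master that held the bus most recently.
    Assumption: round-robin policy; the slides do not specify one.
    """
    n = len(requests)
    # Start searching just after the previous winner so that every
    # requesting master is eventually served.
    for offset in range(1, n + 1):
        candidate = (last_granted + offset) % n
        if requests[candidate]:
            return candidate
    return None
```

With three masters where 0 and 2 request simultaneously and master 0 held the bus last, the grant goes to master 2; on the next round it returns to master 0.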
Multiple Buses
Hierarchical Bus
Bus Performance Issues
• Arbitration overhead must be minimized.
• Response time of slow slaves: the solution is a split-transaction protocol.
  • The master releases the bus after issuing its request.
  • The slave must then acquire the bus to return its response.
• To improve bus utilization and reduce the latency caused by simultaneous requests, the bus can partition large transfers into smaller packets.
On-Chip Bus Performance Issues
• Scalability
  • Speed: every new device adds to the load and thus slows the bus. The problem shows up beyond about 10 bus masters.
  • Latency also grows with the number of devices.
• Energy efficiency: every data transfer is broadcast to all devices and hence consumes more energy as the number of devices grows.
Direct Interconnection Networks
• Also called point-to-point networks.
• Overcome the scalability problem of buses.
• Each node connects directly to a limited number of neighboring nodes.
• Each node includes a network interface, called a router, that handles communication and connects directly to the routers of neighboring nodes.
• Total bandwidth increases with the number of nodes.
A Generic Direct Network
[Figure panels: a generic system with a direct interconnection network; generic node architecture]
• Communication latency is a critical performance parameter. It is the sum of:
  • Start-up latency
  • Network latency
  • Blocking time
Some Direct Network Topologies
(From: Ni & McKinley, IEEE Computer, Feb 1993)
[Figure panels: (1) Hypercube, (2) Torus, (3) Mesh]
Many other topologies and variants appear in the literature. They are evaluated on the following parameters:
• Node degree (number of edges per node)
• Diameter (greatest shortest-path distance between any two nodes)
• Bisection width (number of links cut when the network is sliced into two equal halves)
• Latency (time to reach other nodes)
• Bandwidth (data rate)
• Symmetry (the network looks the same from every node)
• Homogeneity (all nodes are the same)
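The first three parameters have simple closed forms for the three topologies shown. The sketch below uses the standard textbook formulas, not anything given in the slides. It assumes bidirectional links counted once each, a square k x k mesh or torus with even k of at least 4, and reports the degree of an interior node for the mesh. All function names are illustrative.

```python
def hypercube_metrics(n):
    """n-dimensional binary hypercube with 2**n nodes."""
    return {
        "nodes": 2 ** n,
        "degree": n,                      # one neighbor per dimension
        "diameter": n,                    # flip every address bit once
        "bisection_width": 2 ** (n - 1),  # links across one dimension
    }

def mesh_2d_metrics(k):
    """k x k mesh without wraparound links (assumes even k >= 4)."""
    return {
        "nodes": k * k,
        "degree": 4,                      # interior nodes; edge nodes have fewer
        "diameter": 2 * (k - 1),          # corner to opposite corner
        "bisection_width": k,             # one link per row crosses the cut
    }

def torus_2d_metrics(k):
    """k x k torus with wraparound links (assumes even k >= 4)."""
    return {
        "nodes": k * k,
        "degree": 4,
        "diameter": 2 * (k // 2),         # wraparound halves each dimension
        "bisection_width": 2 * k,         # each row crosses the cut twice
    }
```

For 16 nodes (k = 4 or n = 4), the mesh has diameter 6 and bisection width 4, while the torus and the hypercube both have diameter 4 and bisection width 8. A folded torus has the same connectivity as a torus and differs only in physical layout.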
Routing in Direct Networks
Many ways to classify:
• Source routing: the source node selects the path to the destination, and the path stays fixed. Each packet carries the complete path information.
• Distributed routing: each router determines whether a packet is local or non-local, and applies a routing algorithm if it is non-local. The routing algorithm must be fast and easy to implement.
• Deterministic vs. adaptive:
  • Deterministic: the path is determined entirely by the source S and destination D.
  • Adaptive: the path can change dynamically.
• A routing algorithm is minimal if it always selects a shortest path in the network.
• Non-minimal routing must avoid deadlock.
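As one concrete instance of a deterministic, minimal routing algorithm, dimension-order (XY) routing on a 2D mesh can be sketched as follows. The slides do not name this algorithm; it is chosen here only as a well-known example, and the function name xy_route and the (x, y) coordinate convention are assumptions.

```python
def xy_route(src, dst):
    """Return the nodes visited from src to dst (inclusive) on a 2D mesh.

    src, dst: (x, y) tuples of integer mesh coordinates.
    The packet first travels along X until the column matches,
    then along Y. The path depends only on src and dst, so the
    routing is deterministic, and it is a shortest path, so minimal.
    """
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path
```

For example, xy_route((0, 0), (2, 1)) visits (0, 0), (1, 0), (2, 0), (2, 1): three hops, equal to the Manhattan distance between the two nodes.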
Flow Control in Direct Networks
The network consists of many channels and buffers. Flow control determines how channels and buffers are allocated to a packet as it travels through the network. If a resource (channel or buffer) required by a packet is held by another packet, flow control determines whether the packet is dropped, blocked in place, buffered, or rerouted. A good flow control algorithm aims to avoid network congestion.
Switching in Direct Networks
• Switching: the mechanism that removes a packet from an input channel and places it on an output channel.
• Four general techniques:
  • Store-and-forward (packet switching): the entire packet is stored in a packet buffer at each intermediate node, then forwarded to a selected neighbor.
  • Circuit switching: a complete circuit from S to D is built and torn down for each packet transfer.
  • Virtual cut-through: the packet is buffered at an intermediate node only if the next required channel is busy; otherwise it is forwarded directly without buffering.
  • Wormhole routing: similar to cut-through, but the packet is broken into flits (flow control digits) that are normally transmitted bit-parallel between routers. The header flit governs the route. As it advances along the route, the remaining flits follow in a pipelined fashion. If the header flit is blocked at an intermediate node, the trailing flits remain in flit buffers along the route. Once a channel is acquired by a packet, it is reserved for that packet and released when the tail flit has been transmitted.
Comparison of Switching Techniques
(From: Ni and McKinley, Computer, Feb 1993)
[Figure panels: (1) Store-and-Forward, (2) Circuit Switching, (3) Wormhole]
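The difference between the three techniques can be made concrete with a first-order, contention-free latency model. The formulas below are the commonly used textbook approximations, not values read from the slide's figure. They ignore start-up latency and blocking time, and all parameter names are assumptions.

```python
def store_and_forward_latency(packet_bits, bandwidth, hops):
    """Whole packet is received at every node before moving on.

    bandwidth is in bits per time unit; hops is the number of
    intermediate links D, so the packet is transmitted D + 1 times.
    """
    return (packet_bits / bandwidth) * (hops + 1)

def circuit_switching_latency(packet_bits, control_bits, bandwidth, hops):
    """A short control packet sets up the circuit hop by hop,
    then the data streams through without per-hop buffering."""
    return (control_bits / bandwidth) * hops + packet_bits / bandwidth

def wormhole_latency(packet_bits, flit_bits, bandwidth, hops):
    """Only the header flit pays the per-hop cost; the rest of the
    packet follows in a pipeline."""
    return (flit_bits / bandwidth) * hops + packet_bits / bandwidth
```

With a 1024-bit packet, 32-bit flits, a 32-bit control packet, 32 bits per cycle of channel bandwidth, and 4 hops, store-and-forward takes 160 cycles, while circuit switching and wormhole routing each take 36. In this model, distance matters much less for the latter two when the packet is long relative to a flit.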
An Example Deadlock in Wormhole Routing
Adaptive Double Y-channel Routing for 2D Mesh (From: Ni and McKinley, Computer, Feb 1993)
Why Direct Networks (NoC) for On-Chip Networks?
• Dally and Towles (DAC 2001) identify the following benefits of replacing global wiring with direct networks:
  • Structure: global wires are structured so as to optimize and control their electrical properties; cross-talk is minimized and becomes predictable.
  • Performance:
    • Aggressive signaling techniques become possible, reducing power and increasing speed.
    • Sharing wiring among many flows makes wire use more efficient (typical wire activity is 10%).
  • Modularity: direct networks provide a standard interface for plug-and-play designs. A standard interface also facilitates reusability and interoperability of modules (nodes).
An Example Design (From Dally and Towles)
• 12 mm x 12 mm chip
• 0.1 micron technology (0.5 micron wire pitch)
• 16 tiles of 3 mm x 3 mm each (CPUs, DSPs, I/O controllers, memory subsystems, etc.)
• All communication goes through the network logic
• 2D folded torus topology
• Network logic occupies a small amount of area between tiles (6.6%) and consumes the top two metal layers
• It provides a reliable datagram interface to each tile.
NoC vs. Inter-chip Networks
• Wires and pins are more abundant in NoCs.
• Buffer space is less abundant.
• Based on the above, Dally and Towles identify three future research areas:
  • What topologies are best matched to abundant wiring resources?
  • What flow control methods reduce buffer count and overhead?
  • What circuits best exploit structured wiring?
Topology
• In the example design each tile side can accommodate 6000 wires, so up to 24000 "pins" (wires) can cross the four edges of a tile. Compare with 1000 pins for inter-chip routers, which are limited by pin count.
• The design uses 300-bit flits (compared to 8- or 16-bit flits for inter-chip routers).
• The folded torus has twice the bisection width of a mesh.
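The wire counts follow from the tile size and wire pitch quoted for the example design (3 mm tiles, 0.5 micron pitch). A minimal sketch of the arithmetic, assuming one wire per pitch on a single metal layer (an assumption; the slides do not state how the 6000 figure is derived):

```python
TILE_EDGE_NM = 3_000_000   # 3 mm tile edge, from the example design
WIRE_PITCH_NM = 500        # 0.5 micron wire pitch, from the example design
EDGES_PER_TILE = 4

def wires_per_edge(edge_nm=TILE_EDGE_NM, pitch_nm=WIRE_PITCH_NM):
    # Assumption: one wire per pitch on a single metal layer.
    return edge_nm // pitch_nm

def wires_per_tile():
    # Wires crossing all four edges of one tile.
    return wires_per_edge() * EDGES_PER_TILE
```

Under these assumptions wires_per_edge() gives 6000 and wires_per_tile() gives 24000, matching the figures on the slide.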
Flow Control
• The example design uses 10K bits of buffer space in each input controller, and thus does not particularly economize on buffer space. Large buffering was dictated by the requirement, for performance reasons, of not dropping packets on collision.
Circuits Used to Exploit Structured Wiring
• Pulsed low-swing drivers and receivers:
  • Low power
  • Reduced latency
  • Increased repeater spacing
• Circuits can be used to send multiple bits per clock period on one wire. In the example, one could send 2 to 20 bits per clock depending on the clock rate (2 GHz down to 200 MHz).
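The 2-to-20 range is consistent with a fixed signaling rate per wire divided by the clock rate. The sketch below assumes a rate of 4 Gb/s per wire. That value is inferred from the two endpoints quoted on the slide (2 bits at 2 GHz, 20 bits at 200 MHz), not stated on it, and the names are illustrative.

```python
# Assumed per-wire signaling rate, inferred from the slide's endpoints:
# 2 bits x 2 GHz = 20 bits x 200 MHz = 4 Gb/s.
WIRE_RATE_BPS = 4_000_000_000

def bits_per_clock(clock_hz, wire_rate_bps=WIRE_RATE_BPS):
    """Number of bits one wire can carry in a single clock period."""
    return wire_rate_bps // clock_hz
```

At a 2 GHz clock this gives 2 bits per clock period, and at 200 MHz it gives 20.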