Networks-on-Chip. Ben Abdallah Abderazek The University of Aizu, Graduate school of Computer Science and Eng, Adaptive Systems Laboratory, E-mail: benab@u-aizu.ac.jp. 03/01/2010.
Networks-on-Chip Ben Abdallah Abderazek The University of Aizu, Graduate school of Computer Science and Eng, Adaptive Systems Laboratory, E-mail: benab@u-aizu.ac.jp Hong Kong University of Science and Technology, March 2010 03/01/2010
Part II
• NoC topologies
• Switching strategies
• Routing algorithms
• Flow control schemes
• Clocking schemes
• QoS
• Basic Building Blocks
• Status and Open Problems
NoC Routing Algorithms
• Responsible for correctly and efficiently routing packets or circuits from source to destination
• They must prevent deadlock, livelock, and starvation
• The choice of a routing algorithm depends on:
  • Minimizing the power required for routing
  • Minimizing logic and routing tables
Routing Algorithm Classifications
Three different criteria:
• Where the routing decisions are made
  • Source routing
  • Distributed routing
• How a path is defined
  • Static (deterministic)
  • Adaptive
• The path length
  • Minimal
  • Nonminimal
Routing schemes: static, dynamic, distributed, source routing, minimal, and non-minimal routing
Adaptive Systems Laboratory, Univ. of Aizu
NoC Routing Table
• The routing table determines, for each PE, the route by which it sends packets to other PEs.
• The routing table directly influences traffic in the NoC.
• Two methods can be distinguished:
  • Static routing
  • Dynamic (adaptive) routing
Static Routing
• The routing table is constant.
• The route is embedded in the packet header, and the routers simply forward the packet in the direction indicated by the header.
• The routers are passive in their addressing of packets (simple routers).
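The header-driven forwarding described above can be sketched in a few lines of Python. This is only an illustration: the 2D-mesh coordinates, the N/S/E/W direction encoding, and the names `MOVES`, `forward`, and `deliver` are assumptions made for the example and do not come from the slides.

```python
from collections import deque

# Hypothetical direction encoding for a 2D mesh (an assumption for this sketch).
MOVES = {"N": (0, 1), "S": (0, -1), "E": (1, 0), "W": (-1, 0)}


def forward(position, header):
    """One passive router step: consume the next direction from the header."""
    dx, dy = MOVES[header.popleft()]
    return (position[0] + dx, position[1] + dy)


def deliver(source, route):
    """Follow a precomputed route and return every router position visited."""
    header = deque(route)      # the whole route travels in the packet header
    position = source
    visited = [position]
    while header:
        position = forward(position, header)
        visited.append(position)
    return visited
```

For example, `deliver((0, 0), "EEN")` walks two hops east and one hop north. The routers never consult a table of their own, which is what keeps them simple.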
Dynamic Routing
• The routing table can change dynamically during operation
• Logically, a route is changed when it becomes slow due to other traffic
• Packets may arrive out of order
• Usually requires more virtual channels
• Two systems can be identified in this method:
  • Route-altering decisions are made in the routers (smart routers)
  • Route-altering decisions are made in a dedicated central unit that receives traffic information from all the routers and can decide to change the routing table
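As a rough illustration of the "smart router" case, the sketch below picks, among the output ports that move a packet closer to its destination, the one with the shortest local queue. The slides do not prescribe this policy: minimal adaptive routing on a 2D mesh, the port names, and the `queue_len` dictionary are all assumptions of the example, and deadlock avoidance is deliberately left out.

```python
def productive_ports(current, dest):
    """Ports that reduce the distance to the destination on a 2D mesh."""
    ports = []
    if dest[0] > current[0]:
        ports.append("E")
    if dest[0] < current[0]:
        ports.append("W")
    if dest[1] > current[1]:
        ports.append("N")
    if dest[1] < current[1]:
        ports.append("S")
    return ports


def choose_port(current, dest, queue_len):
    """Pick the least congested productive port, or None on arrival."""
    candidates = productive_ports(current, dest)
    if not candidates:
        return None  # the packet is already at its destination
    # queue_len maps port name -> number of flits waiting (hypothetical metric)
    return min(candidates, key=lambda port: queue_len.get(port, 0))
```

Because two packets of the same message can see different queue lengths, they may take different paths, which is where the out-of-order arrival mentioned above comes from.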
Dynamic Routing
[Figure: adaptive routing method]
• More resources are needed to monitor the state of the network
Routing Algorithms Requirements
• The routing algorithm must ensure freedom from deadlocks
  • e.g. a cyclic dependency
  [Figure: cyclic dependency]
• The routing algorithm must ensure freedom from livelocks and starvation
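One common way to reason about the deadlock requirement is to check whether the "waits-for" relation between channels contains a cycle. The sketch below is a plain depth-first cycle check. The dictionary representation and the channel names are assumptions for illustration, not something the slides define.

```python
def has_cycle(dependencies):
    """Return True if the channel dependency graph contains a cycle.

    dependencies maps a channel to the channels it may wait on.
    """
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    color = {}

    def visit(node):
        color[node] = GREY
        for nxt in dependencies.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GREY:      # back edge: the wait chain closes on itself
                return True
            if state == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color.get(n, WHITE) == WHITE and visit(n)
               for n in list(dependencies))
```

Four channels that each wait on the next one around a ring, `{"c0": ["c1"], "c1": ["c2"], "c2": ["c3"], "c3": ["c0"]}`, form a cycle, while the same chain without the last edge does not.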
Flow control schemes: STALL/GO
• Low-overhead scheme
• Requires only two control wires
  • One going forward, signaling data availability
  • The other going backward, signaling either that buffers are full (STALL) or that buffers are free (GO)
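A minimal cycle-by-cycle sketch of the STALL/GO handshake follows. It assumes zero link delay, a receiver that drains one flit every `drain_every` cycles, and hypothetical names (`StallGoReceiver`, `send`). None of these details come from the slides.

```python
from collections import deque


class StallGoReceiver:
    """Receiver buffer that raises STALL when it is full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffer = deque()

    @property
    def stall(self):
        # Backward wire: True means STALL, False means GO.
        return len(self.buffer) >= self.capacity

    def accept(self, flit):
        self.buffer.append(flit)

    def drain(self):
        return self.buffer.popleft() if self.buffer else None


def send(flits, receiver, drain_every=2):
    """Push flits through the link; return (delivered, stalled_cycles)."""
    pending = deque(flits)
    delivered, stalled_cycles, cycle = [], 0, 0
    while pending or receiver.buffer:
        if pending:
            if receiver.stall:
                stalled_cycles += 1          # sender holds the flit
            else:
                receiver.accept(pending.popleft())
        if cycle % drain_every == drain_every - 1:
            flit = receiver.drain()
            if flit is not None:
                delivered.append(flit)
        cycle += 1
    return delivered, stalled_cycles
```

With a two-entry buffer and a receiver that drains every second cycle, the sender is held back on some cycles but no flit is lost or reordered.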
Flow control schemes: T-Error
• More aggressive scheme that can detect faults
  • by making use of a second, delayed clock at every buffer stage
• The delayed clock re-samples input data to detect any inconsistencies
  • then emits a VALID control signal
• A resynchronization stage is added between the end of the link and the receiving switch
Flow control schemes: ACK/NACK
• When flits are sent on a link, the sender keeps a local copy in a buffer
• When the sender receives an ACK, it deletes the copy of the flit from its local buffer
• When a NACK is received, the sender rewinds its output queue and starts resending flits, starting from the corrupted one
• Implemented either end-to-end or switch-to-switch
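The sender-side bookkeeping can be sketched as follows. The class name `AckNackSender`, the use of flit indices as sequence numbers, and the cumulative interpretation of ACKs are assumptions made for the example. The slides only describe the keep, delete, and rewind behaviour.

```python
class AckNackSender:
    """Keeps a copy of every flit until it is acknowledged."""

    def __init__(self, flits):
        self.flits = list(flits)
        self.next_to_send = 0     # index of the next flit to put on the link
        self.oldest_unacked = 0   # flits before this index have been dropped

    def send(self):
        """Return (index, flit) for the next transmission, or None when done."""
        if self.next_to_send >= len(self.flits):
            return None
        index = self.next_to_send
        self.next_to_send += 1
        return index, self.flits[index]

    def on_ack(self, index):
        # The local copy of flit `index` (and older ones) is no longer needed.
        self.oldest_unacked = max(self.oldest_unacked, index + 1)

    def on_nack(self, index):
        # Rewind the output queue: resend starting from the corrupted flit.
        self.next_to_send = index

    def retained(self):
        """Copies currently held in the retransmission buffer."""
        return self.flits[self.oldest_unacked:self.next_to_send]
```

The retransmission buffer must be deep enough to hold every flit that can be in flight before its ACK or NACK returns, which is the main cost of this scheme.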
Clocking schemes
• Fully synchronous
  • A single global clock is distributed to synchronize the entire chip
  • Hard to achieve in practice, due to process variations and clock skew
• Mesochronous
  • Local clocks are derived from a global clock
  • Not sensitive to clock skew
• Plesiochronous
  • Clock signals are produced locally
• Asynchronous
  • Clocks do not have to be present at all
Clocking schemes
[Figure: mesochronous clocking example with CMU, PE, NI, SW, and SYNC blocks]
• Mesochronous
  • Local clocks are derived from a global clock
  • Not sensitive to clock skew
Quality of Service (QoS)
• QoS refers to the level of commitment for packet delivery
• Three basic categories
  • Best effort (BE)
    • Only correctness and completion of communication are guaranteed
    • Usually packet switched
  • Guaranteed service (GS)
    • Makes a tangible guarantee on performance, in addition to the basic guarantees of correctness and completion of communication
    • Usually (virtual) circuit switched
  • Differentiated service
    • Prioritizes communication according to different categories
    • NoC switches employ priority-based scheduling and allocation policies
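Priority-based scheduling for differentiated service can be illustrated with a small priority queue. The class name, the convention that a lower number means higher priority, and the first-come-first-served tie-breaking are assumptions of this sketch rather than details from the slides.

```python
import heapq
from itertools import count


class PriorityScheduler:
    """Serve the highest-priority packet first; FIFO within one priority."""

    def __init__(self):
        self._heap = []
        self._order = count()   # arrival counter used to break ties

    def enqueue(self, packet, priority):
        # Lower priority value = more urgent traffic class (assumption).
        heapq.heappush(self._heap, (priority, next(self._order), packet))

    def dequeue(self):
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]
```

Strict priority like this can starve the lowest class under heavy load, so a real switch would usually combine it with some form of aging or bandwidth limit.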
Basic NoC Building Blocks
Basic NoC Building Blocks: Packet format
[Figure: message, packet, and flit formats]
Basic NoC Building Blocks: NoC Queuing Schemes
[Figure: crossbar with four input ports and four output ports (Output 0 to Output 3), illustrating the HOL blocking problem in input queuing]
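The head-of-line (HOL) blocking problem can be reproduced with a toy input-queued switch. In this sketch each queue holds destination output numbers, only the head of a queue may compete, and lower-numbered inputs win conflicts. The function name and the fixed-priority arbitration are assumptions chosen to keep the example short.

```python
from collections import deque


def input_queued_cycle(queues):
    """Serve one switch cycle; return a list of (input_port, output_port)."""
    granted_outputs = set()
    sent = []
    for port, queue in enumerate(queues):
        # Only the packet at the head of each input queue is considered.
        if queue and queue[0] not in granted_outputs:
            granted_outputs.add(queue[0])
            sent.append((port, queue.popleft()))
    return sent
```

With `queues = [deque([0, 1]), deque([0, 2])]`, the first cycle sends only input 0's packet to output 0. The packet for output 2 sits behind a blocked head on input 1 and cannot leave, even though output 2 is idle.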
Basic NoC Building Blocks: Flow Control Schemes
Stop & Go Flow Control
[Figure: transmitter and receiver connected by packet enable, packet, and stop signals; the buffer is occupied up to the stop threshold and released down to the go threshold]
Minimum Buffer Size = Flit Size × (Roverhead + Soverhead + 2 × Link delay)
• Roverhead: the time required to issue the stop signal at the receiving router
• Soverhead: the time required to stop sending flits once the stop signal is received
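The buffer-size formula can be evaluated directly. The sketch below assumes that all three time terms are expressed in cycles and that one flit crosses the link per cycle, so the result is in bits. These units and the function name are assumptions, since the slide does not state them.

```python
def min_buffer_bits(flit_bits, r_overhead, s_overhead, link_delay):
    """Minimum Stop & Go buffer size in bits.

    r_overhead, s_overhead and link_delay are in cycles (assumed unit);
    the link delay counts twice because the stop signal travels back while
    flits already on the wire are still arriving.
    """
    return flit_bits * (r_overhead + s_overhead + 2 * link_delay)
```

For instance, 32-bit flits with one cycle of overhead on each side and a two-cycle link give 32 × (1 + 1 + 4) bits, that is, room for six flits.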
Basic NoC Building Blocks: Flow Control Schemes
Credit Based (CB) Flow Control
[Figure: transmitter and receiver; when a flit is transferred the credit is decremented, and when a buffer is released the credit is incremented]
• CB makes the best use of channel buffers
• Can be implemented regardless of the link length and of the sender and receiver overheads
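The credit counter kept by the sender can be sketched as below. The class and method names are hypothetical, and one credit is assumed to correspond to one free flit slot in the receiver buffer.

```python
class CreditSender:
    """Sender side of credit-based flow control."""

    def __init__(self, receiver_slots):
        # Start with one credit per free buffer slot at the receiver.
        self.credits = receiver_slots

    def can_send(self):
        return self.credits > 0

    def send_flit(self):
        if not self.can_send():
            raise RuntimeError("no credit: receiver buffer may be full")
        self.credits -= 1    # a flit is transferred

    def credit_returned(self):
        self.credits += 1    # the receiver released a buffer slot
```

Since the sender never transmits without a credit, the receiver buffer cannot overflow whatever the link delay is. A longer link only means more credits are needed to keep the channel busy.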
Basic NoC Building Blocks: Queue and Buffer Design
[Figure: conventional shift register method; a chain of D flip-flops with shift (sh) enables between packet in and packet out, showing an intermediate empty bubble]
• The effective bandwidth of the data link layer is influenced by the traffic pattern and the queue size
• Queue buffers consume most of the area and power among all NoC building blocks
Basic NoC Building Blocks: Network Interface
• Different interfaces must be able to connect to the network
• The network uses a specific protocol, and all traffic on the network has to comply with the format of this protocol
Basic NoC Building Blocks: Network Interface
• To allow different resources to connect to the network, the network interface can be divided into
  • A resource-independent part (Network Interface)
  • A resource-dependent part (Resource Network Interface)
Basic NoC Building Blocks: Bidirectional link
[Figure: two 36-bit links, each carrying Tag and Data fields]
• 2 × 32-bit data links
• Asynchronous, credit-based flow control
• Easy floorplan routing and timing in a DSM process
Basic NoC Building Blocks: Router Design
Basic NoC Building Blocks: Router design with cross point
[Figure: router with input queues (IQ), a switch fabric with phit-wide data paths, optional output queues (OQ), and a scheduler that receives 4-bit output port requests and returns grants]
Basic NoC Building Blocks: Scheduler Design
[Figure: round-robin algorithm circuit built from two simple priority encoders (Simple PE) and a thermometer encoder (T.E.); n-bit Req input, n-bit Gnt output, log n-bit P_enc, and an anyGnt signal]
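The behaviour of a round-robin scheduler can be modelled in software as shown below. This is a behavioural sketch only, not a model of the encoder circuit in the figure. The class name and the rule that the pointer moves just past the last granted port are assumptions.

```python
class RoundRobinArbiter:
    """Grant one of n requesters per cycle in rotating priority order."""

    def __init__(self, n):
        self.n = n
        self.pointer = 0   # requester with the highest priority this cycle

    def grant(self, requests):
        """requests: sequence of n truthy/falsy flags; returns index or None."""
        for offset in range(self.n):
            i = (self.pointer + offset) % self.n
            if requests[i]:
                # The winner becomes the lowest priority for the next cycle.
                self.pointer = (i + 1) % self.n
                return i
        return None        # nobody requested (anyGnt would be low)
```

With requests `[1, 1, 0, 1]` held constant, successive grants go to ports 0, 1, 3 and then 0 again, so no requester is starved.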
Basic NoC Building Blocks: Phit Size Determination
[Figure: two PE, NI, buffers, switch chains, one with a SER/DES stage and one without]
• Without SER/DES: phit size = packet size, operation freq = fNORM
• With SER/DES: phit size = packet size / SERR, operation freq = SERR × fNORM
• The phit size is the bit width of a link and determines the switch area
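The two relations above can be checked with a small helper. Reading SERR as an integer serialization ratio, expressing the frequency in MHz, and requiring the packet size to be divisible by SERR are assumptions of this sketch.

```python
def link_parameters(packet_bits, f_norm_mhz, serr=1):
    """Return (phit_bits, operating_frequency_mhz) for a given SERR.

    serr=1 corresponds to the case without SER/DES.
    """
    if packet_bits % serr:
        raise ValueError("packet size must be a multiple of SERR")
    phit_bits = packet_bits // serr      # narrower link, smaller switch
    frequency = serr * f_norm_mhz        # link must run faster to compensate
    return phit_bits, frequency
```

A 64-bit packet at 500 MHz with SERR = 4 needs a 16-bit link running at 2000 MHz, so the bandwidth stays the same while the switch area shrinks.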
NoC Examples: Æthereal
• Developed by Philips
• Synchronous indirect network
• Wormhole (WH) switching
• Contention-free source routing based on TDM
• GT as well as BE QoS
• GT slots can be allocated statically at the initialization phase, or dynamically at runtime
• BE traffic makes use of non-reserved slots and any unused reserved slots
  • Also used to program the GT slots of the routers
• Link-to-link credit-based flow control scheme between BE buffers
  • To avoid loss of flits due to buffer overflow
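The interplay between reserved GT slots and BE traffic can be sketched with a simple slot table. This is an illustration of the TDM idea only, not Æthereal's actual implementation. The table layout, connection names, and function name are all hypothetical.

```python
def schedule_slot(slot_table, cycle, gt_queues, be_queue):
    """Decide what one output link carries in the given cycle.

    slot_table: list where each entry is a GT connection id or None.
    gt_queues:  dict mapping connection id -> list of waiting GT flits.
    be_queue:   list of waiting BE flits.
    """
    owner = slot_table[cycle % len(slot_table)]
    if owner is not None and gt_queues.get(owner):
        return ("GT", gt_queues[owner].pop(0))   # reserved slot is used
    if be_queue:
        # Non-reserved slot, or reserved slot left unused by its owner.
        return ("BE", be_queue.pop(0))
    return None                                   # link stays idle
```

Because each GT connection owns its slots, its flits never contend with other traffic, while BE flits fill whatever capacity is left over.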
NoC Examples: More…
• HERMES - Developed at the Faculdade de Informática PUCRS, Brazil
• MANGO
• Nostrum - Developed at KTH in Stockholm
• Octagon - Developed by STMicroelectronics
• QNoC - Developed at Technion in Israel
• Xpipes - Developed by the Univ. of Bologna and Stanford University
• OASIS - Developed by the Adaptive Systems Lab, UoA, Japan (Our Group)
• …
Status and Open Problems
• Power
  • Complex NI and switching/routing logic blocks are power hungry
• Latency
  • Additional delay to packetize/de-packetize data at NIs
  • Flow/congestion control and fault tolerance protocol overheads
  • Delays at the numerous switching stages encountered by packets
  • Even circuit switching has overhead (e.g. SOCBUS)
• Lack of tools and benchmarks
• Simulation speed
  • GHz clock frequencies, large network complexity, and a greater number of PEs slow down simulation
Trends
• Move towards hybrid interconnection fabrics
  • NoC-bus based
  • Custom, heterogeneous topologies
• New interconnect paradigms
  • Optical
  • Wireless
  • Carbon nanotube
NoC research community
• Academia and industry
• VLSI / CAD people
• Computer system architects
• Interconnect experts
• Asynchronous circuit experts
• Networking/Telecomm experts
Research Topics
• Speed enhancement
• New router architectures (e.g. faster arbitration)
• Use of different asynchronous protocols
• Support for varying link capacity
• Low-cost serialization/de-serialization
• "Standardization" of the network interface
• Packet construction/destruction
• Testing/verification of A-NoC
Summary
• NoC is a scalable platform for billion-transistor chips
• Several driving forces are behind it
• Many open research questions remain
• May change the way we structure and model VLSI systems