
NETWORK ON CHIP ROUTER

Final presentation, part B. Students: Itzik Ben-Shushan, Jonathan Silber. Instructor: Isaschar Walter. Spring 2006.


Presentation Transcript


  1. NETWORK ON CHIP ROUTER Final presentation, part B. Students: Itzik Ben-Shushan, Jonathan Silber. Instructor: Isaschar Walter. Spring 2006

  2. Agenda • NoC concept • NoC’s Router design architecture & implementation (Short review of part-A) • NoC design • NoC demonstration and waveforms

  3. NoC Concept

  4. Problem: a single-bus interconnect is not practical for multi-processor chips in terms of power, size and performance. Solution: a Network-on-Chip-based interconnect: fast, reliable data transfer and low power consumption

  5. Project Goals • Implement a router for NoC in VHDL, based on research by faculty members • Design and implement an interface unit between NoC routers and processing units • Design and implement a NoC-based system connected to several processing units

  6. Network Scheme (figure): a NoC of nine routers, each with an interface to a processing unit

  7. Network problems. The NoC/router must deal with these issues: • Transmitting large data (buffer size, latency) • Loss of data due to a full buffer • Data priority • Data from several inputs to one output • A blocked path prevents data transmission

  8. Data in network • Packet based data flow • Single-flit or multi-flit packets (wormhole) • Service level support (packet priority) • Virtual channel support (router feature)

  9. Packets in Wormhole architecture • Each packet is divided into several smaller segments (flits) • Each flit is several bits wide • A whole packet (figure labels): packet body, target address, command
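
A minimal sketch of how a packet might be cut into flits, assuming the flit type codes that the deck lists on its "4 msb bits of every flit" slide (Body, EP, FP). The function name `packetize` and the `(type, word)` tuple layout are illustrative choices, not taken from the project's VHDL.

```python
# Flit type codes as listed on the "4 msb bits of every flit" slide.
BODY, EP, FP = 0b01, 0b10, 0b11


def packetize(payload_words):
    """Tag each payload word with a flit type and return (type, word) pairs.

    Hypothetical helper: a single word becomes one full-packet (FP) flit;
    otherwise every word is a Body flit except the last, which is
    marked end-of-packet (EP).
    """
    if not payload_words:
        return []
    if len(payload_words) == 1:
        return [(FP, payload_words[0])]
    flits = [(BODY, word) for word in payload_words[:-1]]
    flits.append((EP, payload_words[-1]))
    return flits
```

For example, `packetize([1, 2, 3])` yields two Body flits followed by one EP flit, while `packetize([7])` yields a single FP flit.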

  10. Interface Between Bus & Router (figure): processing units sit on a local bus; the interface converts from bus to packets, translating the “local bus protocol” into the “network language” (packets in Wormhole architecture) on the way to the router

  11. Design Architecture

  12. Router 5x5 (figure): ports North, South, East and West plus the processing unit, which attaches through the Interface Module; each input port feeds the crossbar, and each output port drives data and control lines to the next router

  13. Flit structure (one flit, MSB to LSB) • Service Level: 2 bits • Type: 2 bits • Data: 12 bits, or 8 bits in the first flit of a packet • CRT: 4 bits (network size factor), first flit of a packet only

  14. 4 msb bits of every flit • Service Level (SL): Block Transfer = “00” (not yet implemented), Real-Time = “01” (not yet implemented), RD/WR = “10” (implemented as SL2), Signaling = “11” (implemented as SL1) • Data Type: Idle = “00” – no signal, Body = “01” – middle/header flit of a packet, EP = “10” – end of packet, FP = “11” – full packet (single-flit packet)
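
The header codes above can be illustrated with a short sketch. It assumes a 16-bit flit (the data width given in the Part A router summary) with the Service Level in bits 15..14 and the Type in bits 13..12, which is how the flit-structure slide reads; the exact bit positions and the helper names `pack_flit` and `unpack_flit` are assumptions, not the project's VHDL.

```python
# Codes as listed on the slide; the dictionary keys are illustrative labels.
SERVICE_LEVELS = {"block_transfer": 0b00, "real_time": 0b01,
                  "rd_wr": 0b10, "signaling": 0b11}
FLIT_TYPES = {"idle": 0b00, "body": 0b01, "ep": 0b10, "fp": 0b11}


def pack_flit(service_level, flit_type, payload):
    """Build a 16-bit flit: SL in bits 15..14, type in 13..12 (assumed layout).

    The low 12 bits carry the payload (data, plus the CRT on a first flit).
    """
    if not 0 <= service_level <= 0b11 or not 0 <= flit_type <= 0b11:
        raise ValueError("service level and type are 2-bit fields")
    if not 0 <= payload < (1 << 12):
        raise ValueError("payload must fit in 12 bits")
    return (service_level << 14) | (flit_type << 12) | payload


def unpack_flit(flit):
    """Split a 16-bit flit back into (service_level, flit_type, payload)."""
    return (flit >> 14) & 0b11, (flit >> 12) & 0b11, flit & 0x0FFF
```

Under these assumptions, an RD/WR full-packet flit carrying `0xABC` packs to `0xBABC`, and `unpack_flit` recovers the three fields.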

  15. Router design architecture guidelines • The input port serves as a register-buffer • The crossbar is designed with full connectivity (all inputs to all outputs) • The output port recognizes data waiting in an input port buffer to be delivered, and extracts it (the read command from an output port goes to one or more inputs)

  16. Routing Control & Terminology • Each Service Level (SL) has an independent path from the input port to the output port buffer • According to the Current Routing Table (CRT), data is channeled to the appropriate output port via the crossbar • In each output port, the Currently Serviced Input Port (CSIP) is determined by Round-Robin logic that arbitrates between input ports waiting for transmission • The Next Buffer’s State (NBS) determines whether the next router is ready to receive data
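
The Round-Robin selection of the CSIP can be sketched as follows. This is an illustrative model only: the class name, the five-input default (from the 5x5 router) and the per-call granularity are assumptions, and it does not model holding the grant for the duration of a multi-flit packet.

```python
class RoundRobinArbiter:
    """Pick the next requesting input, starting after the last one served."""

    def __init__(self, num_inputs=5):
        self.num_inputs = num_inputs
        # Start so that input 0 is the first candidate.
        self.last_served = num_inputs - 1

    def grant(self, requests):
        """Return the index of the granted input, or None if nobody requests.

        `requests` is a sequence of booleans, one per input port.
        """
        for offset in range(1, self.num_inputs + 1):
            candidate = (self.last_served + offset) % self.num_inputs
            if requests[candidate]:
                self.last_served = candidate
                return candidate
        return None
```

With inputs 0, 2 and 4 requesting continuously, successive calls to `grant` serve 0, 2, 4 and then 0 again, so no waiting input is starved.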

  17. Router Input Port (figure): flits arrive from the previous router in the path or from the processing unit, are switched by SL into a per-service-level FIFO with its Current Routing Table (CRT), and leave towards the crossbar as data per service level; control covers CRT read/write, buffer credits, and read lines from the output ports

  18. Crossbar example for 2 Service Levels (figure): for each of SL1 and SL2, DATA + CRT comes from the input port and DATA goes to the output port • DATA is routed according to the CRT and the CSIP

  19. Router Output Port (figure): data per service level arrives from the crossbar; per-SL Round-Robin & control logic holds the Currently Serviced Input Port (CSIP) and the Next Buffer State (NBS); flits are switched by SL towards the next router in the path or the processing unit; read lines go to the input ports and buffer credits are exchanged with the next stage

  20. Part A – Final Router Design • Modular 5x5 router, 16-bit data width • 2 Service Levels (SL1, SL2) • Receiving/transmitting multi-flit packets (not only full packets, FP)

  21. NoC’s Router top view (figure): input port, crossbar, output port

  22. Simulation conclusions (Part A – Router only) • Latency = 3 clock cycles for a flit • Throughput = 1 flit output on each clock cycle (no wasted cycles) • No added latency at the output when switching between SLs or between inputs • The router can receive and transmit multi-flit packets of different SLs, from and to different inputs and outputs, simultaneously
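
As a rough worked example of these figures: if each router adds 3 cycles to the head flit and then emits one flit per cycle, a packet's transfer time can be estimated as below. The function name and the idealizing assumptions (no contention, no stalls, link delay ignored) are not from the slides.

```python
ROUTER_LATENCY_CYCLES = 3  # per-flit router latency reported in simulation


def packet_cycles(num_flits, hops=1):
    """Estimate cycles until the last flit of a packet leaves the last router.

    Idealized: the head flit pays the router latency at every hop, and the
    remaining flits follow back to back at one flit per cycle.
    """
    if num_flits < 1 or hops < 1:
        raise ValueError("need at least one flit and one hop")
    return hops * ROUTER_LATENCY_CYCLES + (num_flits - 1)
```

Under these assumptions an 8-flit packet crossing one router takes 3 + 7 = 10 cycles, and a 4-flit packet crossing two routers takes 6 + 3 = 9 cycles.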

  23. Part B Goals • Implement the router on a Virtex-II Pro • Validate and extract design parameters (area, timing) – Done • Ramp up understanding of the PPC architecture or MicroBlaze for connecting user logic, in order to build a QNoC-based application on the Virtex-II Pro – Done • Design, implement & simulate the interface – Done • Implement a QNoC-based application on an FPGA for validation of the NoC – Done • Add virtual channels to the existing router – Not implemented

  24. NoC Implementation

  25. Router – synthesis parameters

  26. System Implementation. System components: • 2x2 NoC; each router on the network is connected to a processing unit in order to validate the NoC’s activity • PPC connected to the NoC IP through the OPB IPIF module • ChipScope ILAs for signal checking and analysis

  27. NoC based Multi-core scheme (figure): Xilinx XUP Virtex™-II Pro Development System hosting a NoC of four routers (part-A), each connected to a processing module; the PPC & peripherals attach just as another module; PC user interface through UART and ChipScope Analyzer

  28. Embedded system diagram (SOPC) (figure): PPC with IPLB and DPLB on the PLB; PLB_BRAM; PLB2OPB Bridge; OPB with OPB_UART and the Custom Peripheral, attached through the IPIF (IPIC)

  29. Custom peripheral connection • The IP CORE is reset through the IPIC, by a command issued from code • Write and read FIFOs carry data transactions between the IP CORE and the IPIC

  30. IP CORE – NoC Design (figure): a 2x2 NoC; the interface to the PPC is connected as another module, alongside the other modules connected to the NoC

  31. NoC – synthesis parameters • Tight timing parameters and over-use of resources (table callouts: chip out of resources, tight timing) • Let’s try to make the NoC a bit “slimmer” by reducing the number of SLs

  32. NoC 1SL – synthesis parameters • Good timing parameters, and the chip has the resources for it (table callout: tight timing, but within PPC limits)

  33. IPIF2NoC Interface Considerations: • WR/RDFIFO communication protocol • Buffer credit calculation • Wormhole architecture

  34. WRFIFO read protocol (implemented in VHDL text) • The WRFIFO write is done by the IPIC

  35. RDFIFO write protocol (implemented in VHDL text) • WrReq is asserted only for valid (≠0) data • The RDFIFO read is done by the IPIC

  36. IPIF RDFIFO protocol • WrReq only for existing valid data (≠0 according to the NoC/Router protocol)

  37. IPIF RDFIFO Buffer state (the router is initialized to n-1 places left on the RDFIFO) • Calculates whether the RDFIFO is not full, so that the output port can transmit

  38. Router Input Buffer State (calculates the Full state for the WRFIFO) • Calculates the input port’s NBS: if it is not full, the WRFIFO can transmit
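
The two buffer-state calculations above amount to credit-style flow control: the sender tracks how many places remain in the receiving buffer and transmits only while that count is positive. A minimal sketch, assuming a buffer of depth n initialized to n-1 free places as the RDFIFO slide states; the class and method names are hypothetical and do not come from the project's VHDL.

```python
class CreditCounter:
    """Track free places in a downstream buffer of depth `depth`."""

    def __init__(self, depth):
        # The RDFIFO slide initializes the router to n-1 places left.
        self.credits = depth - 1

    def can_send(self):
        """True while the downstream buffer is known not to be full."""
        return self.credits > 0

    def on_send(self):
        """Consume one place when a flit is transmitted."""
        if not self.can_send():
            raise RuntimeError("downstream buffer is full")
        self.credits -= 1

    def on_credit_return(self):
        """Regain one place when the receiver reads a flit out."""
        self.credits += 1
```

For a depth-4 buffer the sender may transmit three flits, must then wait, and can resume once the receiver has drained at least one flit.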

  39. Future optional Enhancements • Debug router design • Adding Virtual-Channel to Router architecture • Adding wormhole architecture to the Interface architecture

  40. NoC Implementation conclusions and summary. The NoC, based upon the router developed in part A, was simulated and synthesized, and works well: • Latency and throughput as expected from the router design • Multi-flit packets • Unexpected results when switching between ports and between different SLs, as opposed to excellent behavior in simulations • The implementation of the router designed in part A took more logic area than expected

  41. Project conclusions • Learned a lot from the NoC project about NoC and router concepts and architectures • Learned a lot about designing a whole system, rather than looking at one object at a time • Improved our design and VHDL capabilities, and gained experience with “real world” hardware • Debug tools are not very friendly for first-time users, and design references should be more readily available

  42. Design Demonstration

  43. QNoC based multi-core system (figure): Xilinx XUP Virtex™-II Pro Development System • Routers: Router 00 (address 0000), Router 01 (address 0100), Router 10 (address 0001), Router 11 (address 1010) • Processing modules: Calculator (calculating), Inverter (Buffer), XOR with ‘1’, and the interface to the PPC through opb-ipif • PPC running pre-defined code, with PLB, OPB, IPIF/IPIC, UART and on-board memory • PC user interface through UART and ChipScope Analyzer (ChipScope Control + ILAs)

  44. Calculator waveform - Full

  45. THANK YOU
