Network on a Chip: An Architecture for the Billion Transistor Era
Royal Institute of Technology, Stockholm
Jönköping University, Jönköping
University of Queensland, Brisbane
Ericsson Radio Systems, Stockholm
A. Hemani, A. Jantsch, S. Kumar, A. Postula, J. Öberg, M. Millberg, D. Lindqvist
The Problem
• Design Productivity Gap
[Figure: Moore's Law versus Progress in Design Automation]
And it is not just more gates . . .
• 1970: Functionality, Testability
• 2000: Functionality, Testability, Wire Delay, Power Management, Embedded Software, More design choices (HW, µP, DSP, FPGA, …), Signal Integrity, RF, Hybrid Chips, Packaging
Methodologies & Platforms
• Behavioural synthesis
  • Solves an insignificant problem today.
  • Will eventually replace and/or subsume RTL synthesis.
• IP/VC based design method
  • 200-400 IP/VC blocks of 100k gates required in the 0.1 micron era.
  • Interface design is too big a problem.
• Platform based design
  • A step in the right direction.
• Platforms
  • Bus based interconnect scheme will not scale.
  • FPGAs point in the right direction. Low granularity.
The Emerging Platforms & Architectures
• Algorithm on a chip
  • Hardwired computation
  • Hardwired interconnectivity
  • Centralised storage
• System on a chip
  • Programmable computation
  • Hardwired interconnectivity
  • Partially distributed storage
• Network on a chip
  • Programmable computation
  • Programmable interconnectivity
  • Fully distributed storage
Network on a chip
• Generic
• Computational resources
  • Processor cores, FPGA blocks
• Storage
  • Distributed
• I/O
• Programmable
• Interconnect
  • All resources have an address.
  • Resources are interconnected by a network of switches.
  • Resources communicate by sending addressed packets of data.
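The interconnect idea on this slide (addressed resources, a network of switches, packets carrying a destination address) can be illustrated with a minimal Python sketch. Everything named here (`Packet`, `Network`, the switch names `S0`..`S2`, the addresses `cpu0` and `mem0`) is hypothetical, and the breadth-first route search is only an assumption chosen for brevity; the slides do not specify a routing scheme.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Packet:
    dst: str        # address of the destination resource
    payload: bytes  # data carried by the packet


@dataclass
class Network:
    links: dict = field(default_factory=dict)     # switch -> neighbouring switches
    attached: dict = field(default_factory=dict)  # resource address -> its switch
    inbox: dict = field(default_factory=dict)     # resource address -> payloads received

    def connect(self, a, b):
        """Add a bidirectional link between two switches."""
        self.links.setdefault(a, set()).add(b)
        self.links.setdefault(b, set()).add(a)

    def attach(self, address, switch):
        """Give a resource an address and hook it to a switch."""
        self.links.setdefault(switch, set())
        self.attached[address] = switch
        self.inbox[address] = []

    def send(self, src, packet):
        """Deliver the packet and return the switches it traversed."""
        start, goal = self.attached[src], self.attached[packet.dst]
        previous = {start: None}
        queue = deque([start])
        while queue:  # breadth-first search for a shortest switch path
            node = queue.popleft()
            if node == goal:
                break
            for nxt in sorted(self.links[node]):
                if nxt not in previous:
                    previous[nxt] = node
                    queue.append(nxt)
        if goal not in previous:
            raise ValueError("destination unreachable")
        path, node = [], goal
        while node is not None:  # walk back from goal to start
            path.append(node)
            node = previous[node]
        path.reverse()
        self.inbox[packet.dst].append(packet.payload)
        return path


noc = Network()
noc.connect("S0", "S1")
noc.connect("S1", "S2")
noc.attach("cpu0", "S0")
noc.attach("mem0", "S2")
route = noc.send("cpu0", Packet(dst="mem0", payload=b"hello"))
```

The sender only names the destination address; the path through the switches is found by the network, which is the property the slide emphasises.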
Honeycomb Structure: a Possible NOC Topology
• Nodes of a honeycomb cell are populated with resources.
• A switch at the centre interconnects resources at the nodes.
• Switches are connected to their immediate neighbours.
• Each resource is directly connected to three switches and can reach 12 resources with a single hop.
• Connectivity is further improved by directly connecting switches to their next nearest neighbour.
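The counts on this slide (three switches per resource, 12 resources within a single hop) can be checked with a small Python sketch. The coordinate scheme is an assumption made for illustration, not taken from the slides: points of a triangular lattice in axial coordinates `(a, b)` are treated as switches when `(a - b) % 3 == 0` (the cell centres) and as resources otherwise (the cell nodes). A "single hop" is taken to mean passing through one switch.

```python
# Six unit steps on a triangular lattice in axial coordinates.
NEIGHBOUR_OFFSETS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]


def is_switch(point):
    """Cell centres (switches) are the lattice points with (a - b) divisible by 3."""
    a, b = point
    return (a - b) % 3 == 0


def neighbours(point):
    a, b = point
    return [(a + da, b + db) for da, db in NEIGHBOUR_OFFSETS]


def switches_of(resource):
    """Switches directly connected to a resource."""
    return [p for p in neighbours(resource) if is_switch(p)]


def resources_of(switch):
    """The six resources at the nodes of the cell around a switch."""
    return neighbours(switch)


def single_hop_resources(resource):
    """Other resources reachable by passing through exactly one switch."""
    reached = set()
    for switch in switches_of(resource):
        reached.update(resources_of(switch))
    reached.discard(resource)
    return reached


resource = (1, 0)
print(len(switches_of(resource)), len(single_hop_resources(resource)))
```

Each of the three cells around a resource contributes five other resources, and each of the three direct neighbours is shared by two of those cells, so 15 - 3 = 12 distinct resources remain.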
The Performance Overhead
• Pipelined interconnect
  • Wire delays will soon require pipelining wires (Berkeley).
  • In 0.1 micron, long wire delay will be 100x the gate delay.
  • Signals will need 10s of clock cycles to cross the chip.
  • Switching in NOC provides natural pipelining.
• Latency is attenuated
  • Globally asynchronous & locally synchronous design style.
  • Switching with low logic depth can be the high speed clock domain.
  • Computation with high logic depth can be the slow clock domain.
  • Latency is attenuated by the ratio of communication clock to computation clock.
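The attenuation claim can be made concrete with a rough Python sketch. The model is an assumption, not something stated on the slide: each switch hop is taken to cost a fixed number of communication-clock cycles, and the function name, parameters and example frequencies are all hypothetical.

```python
def latency_in_compute_cycles(hops, comm_clock_hz, compute_clock_hz, cycles_per_hop=1):
    """Network latency as seen by the slower computation clock domain.

    Assumes every hop costs `cycles_per_hop` cycles of the communication clock.
    """
    comm_cycles = hops * cycles_per_hop
    seconds = comm_cycles / comm_clock_hz
    return seconds * compute_clock_hz


# Hypothetical numbers: 10 hops, 1 GHz switching fabric, 250 MHz computation.
example = latency_in_compute_cycles(hops=10, comm_clock_hz=1e9, compute_clock_hz=250e6)
print(example)
```

With these assumed numbers the ten communication cycles shrink to roughly 2.5 computation cycles, that is, by the 4:1 ratio of the two clocks.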
The Area Overhead
• A study by Guerrier and Greiner shows that the area overhead will not be an issue.
• A more accurate and detailed answer will have to await further research.
Design Methodology
• How to map an application onto the NOC platform?
• Design entry: set of communicating tasks
• NOC Compiler
  • Task graph analysis
  • Scheduling policy
  • Binding of tasks to resources
  • Code generation for tasks
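As an illustration of the "binding of tasks to resources" step only, here is a minimal Python sketch. It is not the NOC compiler from the slides: it assumes a cost of traffic volume times hop count and searches all placements exhaustively, which is only feasible for tiny task graphs. The function `bind_tasks`, the task names and the three-resource line topology are hypothetical.

```python
from itertools import permutations


def bind_tasks(traffic, resources, hops):
    """Pick the placement that minimises total (packets x hop count).

    traffic: {(task_a, task_b): packets exchanged}
    hops:    {(resource_a, resource_b): hop count between them}
    """
    tasks = sorted({task for pair in traffic for task in pair})
    best_binding, best_cost = None, None
    for placement in permutations(resources, len(tasks)):
        binding = dict(zip(tasks, placement))
        cost = sum(n * hops[binding[a], binding[b]] for (a, b), n in traffic.items())
        if best_cost is None or cost < best_cost:
            best_binding, best_cost = binding, cost
    return best_binding, best_cost


# Hypothetical platform: three resources in a row, hop count = distance.
resources = ["R0", "R1", "R2"]
hops = {(a, b): abs(i - j)
        for i, a in enumerate(resources)
        for j, b in enumerate(resources)}

# Hypothetical task graph: a filter that talks heavily to a source and a sink.
traffic = {("src", "filter"): 10, ("filter", "sink"): 10, ("src", "sink"): 1}

binding, cost = bind_tasks(traffic, resources, hops)
print(binding, cost)
```

In this example the heavily communicating `filter` task ends up on the middle resource, which keeps both of its high-traffic links to a single hop.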
Summary
• Future systems on chip will be networks.
• Fixed platforms will facilitate design.
• Main open questions:
  • Network topology?
  • Network nodes?
  • NOC Compiler?