Topology Optimization for Application-Specific Networks-on-Chip Tapani Ahonen Tampere University of Technology Institute of Digital and Computer Systems P.O.Box 553, FI-33101 Tampere Phone +358 3 3115 4562 Fax +358 3 3115 3095 Email tapani.ahonen@tut.fi
Outline • Motivation • Platform Design • OIDIPUS – A Network-on-Chip Topology Design Tool • Case Study in Process Control and Monitoring • Evaluation of the Tool
Motivation for a Design Paradigm Shift • system integration to a single chip • complex interconnections • signal coupling • timing closure problems • ... • increasing mask costs • ASICs are vanishing from low-volume markets and frequently updated products • long interconnections are very costly • FPGAs fall behind • chip-level reuse: development platforms for application domains • complicated design process • programmability implies high data traffic • block-level reuse
Platform Design Flow • specify performance requirements at • task level • application level • system level • statistical execution models for temporal requirements • choose processing elements with known characteristics • map and schedule to minimize communication • optimize NoC layout • partitioning • connectivity • block placement
OIDIPUS - Network-on-Chip Topology Design Tool • layout optimization (based on communication and IP specs) • target: speed and/or power consumption • asynchronous communication • partitioning and block placement • connectivity optimization of the seed topology under constraints on • maximum number of node dimensions and width of a link • reliability • execution at an early stage of design • assumptions with abstract input information • square block layout assumed w/o aspect ratio • node location at center of a side w/o determination • constant throughput • default data activity
OIDIPUS – Design Space Exploration • exhaustive search not feasible with >> 10 hosts • simulated annealing • allows the search to escape from local minima • simulation schedule derived from an effort parameter
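The search strategy on this slide can be illustrated with a generic simulated annealing loop. This is a minimal sketch, not the OIDIPUS implementation: the function names, the geometric cooling schedule, the way `effort` scales the number of steps, and the toy traffic figures are all assumptions made for illustration.

```python
import math
import random


def simulated_annealing(initial, cost, neighbor, effort=1.0, seed=0):
    """Generic simulated annealing.

    `effort` scales the number of annealing steps; this mapping is an
    assumption, the slides only state that the schedule comes from an
    effort parameter.
    """
    rng = random.Random(seed)
    steps = max(1, int(1000 * effort))
    t_start, t_end = 1.0, 1e-3
    current, current_cost = initial, cost(initial)
    best, best_cost = current, current_cost
    for k in range(steps):
        # Geometric cooling from t_start towards t_end.
        temperature = t_start * (t_end / t_start) ** (k / steps)
        candidate = neighbor(current, rng)
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        # Improvements are always accepted; worse moves are accepted with
        # probability exp(-delta / T), which lets the search leave a
        # local minimum.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
    return best, best_cost


# Toy problem (hypothetical data): order four blocks on a line so that
# heavily communicating blocks end up close to each other.
traffic = {("a", "b"): 5.0, ("b", "c"): 3.0, ("a", "d"): 1.0}


def line_cost(order):
    pos = {name: i for i, name in enumerate(order)}
    return sum(w * abs(pos[u] - pos[v]) for (u, v), w in traffic.items())


def swap_two(order, rng):
    i, j = rng.sample(range(len(order)), 2)
    new = list(order)
    new[i], new[j] = new[j], new[i]
    return tuple(new)


best_order, best_cost = simulated_annealing(
    ("a", "c", "d", "b"), line_cost, swap_two, effort=2.0
)
```

A higher `effort` spends more steps at each temperature range, trading run time for a better chance of finding a low-cost layout.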
OIDIPUS – Partitioning the Network • goal: minimize (the cost of) global communication • main factors of power consumption • communication distance (wire length) • data activity • actual distance unknown at partitioning time • use data activity for cost calculation • (only throughputs contribute with default data activity) • actual latencies are also unknown • use latency tolerance figures for cost calculation • case study: $F(c) = \sum^{N_c} \left(\theta W_p + \frac{1}{\lambda} W_l\right)\left(P_o - \frac{P_e}{2}\right)$, where $N_c$ is the number of communication channels, $\theta$ is throughput, $\lambda$ is latency tolerance, $W$ stands for weight, and $P$ stands for partition
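The idea of scoring a partitioning from throughput and latency tolerance alone can be sketched as follows. This is a simplified stand-in, not the case-study cost function: the partition-dependent factor of the slide's formula is replaced here by a plain indicator (a channel contributes only when its endpoints lie in different partitions), and the `Channel` type, the weight names, and the example figures are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Channel:
    src: str
    dst: str
    throughput: float          # stands in for data activity
    latency_tolerance: float   # larger value = less urgent


def partition_cost(channels, partition_of, w_power=1.0, w_latency=1.0):
    """Cost of global (inter-partition) communication.

    Assumption: only channels whose endpoints are in different
    partitions contribute to the cost.
    """
    total = 0.0
    for ch in channels:
        if partition_of[ch.src] != partition_of[ch.dst]:
            total += ch.throughput * w_power + w_latency / ch.latency_tolerance
    return total


# Hypothetical channels between four blocks.
channels = [
    Channel("risc", "mem", throughput=8.0, latency_tolerance=2.0),
    Channel("dsp", "adc", throughput=4.0, latency_tolerance=4.0),
    Channel("risc", "dsp", throughput=1.0, latency_tolerance=10.0),
]

# Keeping heavily communicating blocks together is cheap ...
cost_a = partition_cost(channels, {"risc": 0, "mem": 0, "dsp": 1, "adc": 1})
# ... while splitting them apart is expensive.
cost_b = partition_cost(channels, {"risc": 0, "dsp": 0, "mem": 1, "adc": 1})
```

With these figures only the low-throughput, latency-tolerant `risc`-`dsp` channel crosses the boundary in the first partitioning, so its cost is far lower than that of the second.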
OIDIPUS – Block Placement • goal: minimize (the cost of) local communication • assumptions • always through shortest path w/o probability spec • asynchronous communication • proper repeater insertion => delay proportional to wire length • the longest link (length $L_{llp}$) on a path restricts speed • number of hops from origin to target ($H_p$) determines latency • $C_{lc} = \frac{1}{\lambda} H_p L_{llp}$ • total path length ($L_p$) dominates power consumption • $C_{pc} = \theta L_p$ • case study: $F(c) = \sum^{N_c} \left(\theta L_p W_p + \frac{1}{\lambda} H_p L_{llp} W_l\right)$, where $N_c$ is the number of communication channels, $\theta$ is throughput, $\lambda$ is latency tolerance, and $W$ stands for weight
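The two cost components on this slide can be computed directly from the link lengths along a channel's path. The sketch below assumes that the power term is scaled by channel throughput and that the per-channel costs are summed over all channels; the function and parameter names and the example numbers are hypothetical.

```python
def path_cost(link_lengths, throughput, latency_tolerance,
              w_power=1.0, w_latency=1.0):
    """Placement cost of one channel routed over the given links."""
    total_length = sum(link_lengths)   # total path length, dominates power
    hops = len(link_lengths)           # number of hops from origin to target
    longest_link = max(link_lengths)   # longest link restricts speed
    power_term = throughput * total_length * w_power
    latency_term = hops * longest_link * w_latency / latency_tolerance
    return power_term + latency_term


def placement_cost(routed_channels, w_power=1.0, w_latency=1.0):
    """Sum of per-channel costs (summation over channels is an assumption)."""
    return sum(
        path_cost(lengths, throughput, tolerance, w_power, w_latency)
        for lengths, throughput, tolerance in routed_channels
    )


# Hypothetical example: (link lengths on the path, throughput, latency tolerance)
routed = [
    ([2.0, 1.0], 4.0, 2.0),   # two hops, longest link 2.0
    ([3.0], 1.0, 6.0),        # single hop
]
total = placement_cost(routed)
```

For the first channel the power term is 4.0 × 3.0 = 12.0 and the latency term is 2 × 2.0 / 2.0 = 2.0; for the second they are 3.0 and 0.5, giving a total of 17.5.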
Case Study: Industrial Process Monitoring and Automation • remote monitoring and control • event sensing (ADCs) • pattern recognition (DSP) • voice recognition (DSP) • probing (sensors) • network connection / user interface (protocol processor, I/O devices) • reprogrammability for different product runs • reactive appliances • programmable system control (RISC) • motor / process controlling signal generation (controllers & DACs) • maintenance facilitation • event recording (memory)
Evaluating the Tool with a Single Design Goal • case study specification was used as an input to OIDIPUS • simple benchmark algorithm with human-like behavior • prioritizes channels and adds respective blocks to the topology trying to minimize the communication distance • block placement in bi-directional ring only • four different results with 2-5% higher cost than OIDIPUS topologies • placement and partitioning by a human designer • "intelligent" usage of a memory block as a passive network bridge • eliminates one link, but adds control traffic • reoptimization of the block placement with OIDIPUS resulted in a cost that was 3.2% lower in one partition and 0.3% lower in the other • partitioning matched the human design
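A benchmark of the kind described on this slide could look like the greedy heuristic below. The slides do not specify the benchmark algorithm, so this is purely illustrative: channels are visited in order of decreasing throughput, and each new block is inserted at the ring position that minimizes the traffic-weighted hop distance. All names and traffic figures are assumptions.

```python
def ring_distance(i, j, n):
    """Hop distance between positions i and j on a bi-directional ring of n nodes."""
    d = abs(i - j)
    return min(d, n - d)


def ring_cost(order, traffic):
    """Traffic-weighted hop distance over channels whose blocks are both placed."""
    pos = {block: i for i, block in enumerate(order)}
    n = len(order)
    return sum(
        w * ring_distance(pos[u], pos[v], n)
        for (u, v), w in traffic.items()
        if u in pos and v in pos
    )


def greedy_ring(traffic):
    """Place blocks on a ring, handling the highest-throughput channels first."""
    order = []
    for (u, v), _ in sorted(traffic.items(), key=lambda kv: -kv[1]):
        for block in (u, v):
            if block in order:
                continue
            # Try every insertion point and keep the cheapest ring.
            candidates = [
                order[:k] + [block] + order[k:] for k in range(len(order) + 1)
            ]
            order = min(candidates, key=lambda o: ring_cost(o, traffic))
    return order


# Hypothetical channel throughputs between five blocks.
traffic = {
    ("risc", "mem"): 8.0,
    ("dsp", "adc"): 4.0,
    ("risc", "dsp"): 2.0,
    ("mem", "dac"): 1.0,
}
ring = greedy_ring(traffic)
```

Because each block is placed once and never moved, a heuristic like this can get stuck in a locally good arrangement, which is consistent with a benchmark of this kind costing somewhat more than an annealed placement.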
Evaluating the Tool with a Single Design Goal • gradual change of the design goal demonstrated block placement variation with the objective • design using OIDIPUS from the beginning • one more link and one more network bridge • more hops per path on average • shorter overall interconnections (15%) • shorter physical path length (11%) • cost of partitioning ~32% lower, cost of block placement <2% higher • human designer reduced the cost by ~25% against an average design • OIDIPUS design had a cost approximately 33% lower than average • benefits grow with network complexity
Evaluating the Tool with a Single Design Goal • OIDIPUS was used without partitioning • 45% longer average path compared to the partitioned design • 65% more hops • results • 38% lower cost than an average implementation • <27% higher cost than the partitioned design
Conclusion • NoC layouts can be significantly optimized by exploiting application domain specific features • critical wires are effectively shortened • human designers succeed through divide-and-conquer strategy • the higher the level of freedom for an automation tool, the better the optimization results • with freedom in connectivity, small partitions should be avoided unless they are isolated from the rest of the system • This was just the beginning of the journey