
Energy- and Performance-Aware Mapping for Regular NoC Architectures

This paper explores routing and mapping solutions for Network-on-Chip architectures that minimize energy consumption and maximize performance. The routers, platform, energy model, and mapping methodology are described in detail and evaluated using several algorithms and heuristics.


Presentation Transcript


  1. Energy- and Performance-Aware Mapping for Regular NoC Architectures • Jingcao Hu and Radu Marculescu • Carnegie Mellon University

  2. Introduction • Integration levels allow system-on-chip design • Network-on-chip offers design automation and high performance • Structured network wiring • Modularity (floorplan) • Standard network interfaces • Assume regular tile-based approach • Assume task-graph architecture where #nodes = #tiles; communication paths are annotated with traffic load

  3. Introduction • This paper examines two problems: • Routing • Use standard routing policies to avoid deadlock • Encode deterministic, predefined route for each S/D pair into unique routing table at each router • Mapping • Find good policy for mapping IP cores to tiles • Objective: Minimize energy, maximize performance

  4. Routers • Buffer space • Use small registers (1-2 flits/input port) • Low area requirement, low decoding latency • Routing type • Wormhole routing • Takes advantage of small buffer space • Routing policy • Use deterministic (vs. adaptive) • Simple routing logic • In-order packet arrival • Traffic is minimal and predictable, which allows optimization by placement • Built-in deadlock avoidance

  5. Platform • n x n grid of tiles • Each router has a routing table and a 5x5 crossbar • Energy model: • Ebit = ESbit + EBbit + EWbit + ELbit • EBbit and EWbit are small compared to the other terms, so… • Ebit = ESbit + ELbit • One bit from ti to tj: • Ebit(ti, tj) = nhops x ESbit + (nhops - 1) x ELbit • Minimal routing: nhops - 1 is the Manhattan distance
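
The per-bit energy formula on this slide can be sketched in a few lines of Python. This is only an illustration: the function names, the (x, y) tile coordinates, and the numeric energy values are assumptions, not figures from the presentation.

```python
def manhattan(src, dst):
    """Manhattan distance between two tiles given as (x, y) pairs."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


def bit_energy(src, dst, es_bit, el_bit):
    """Energy to move one bit from tile src to tile dst under minimal routing.

    Follows Ebit(ti, tj) = nhops * ESbit + (nhops - 1) * ELbit, where
    nhops - 1 equals the Manhattan distance between the tiles.
    """
    links = manhattan(src, dst)   # number of links traversed
    n_hops = links + 1            # number of routers traversed
    return n_hops * es_bit + links * el_bit


# Hypothetical energy values in arbitrary units.
print(bit_energy((0, 0), (2, 1), es_bit=1.0, el_bit=0.5))
```

With these made-up values, a transfer across three links passes four routers, so the switch term dominates unless ELbit is comparatively large.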

  6. Obligatory Theory Stuff • Application Characterization Graph (APCG) • Vertices are IPs • Arcs define communication between IPs and contain information about data volume and required bandwidth • Architecture Characterization Graph (ARCG) • Vertices are tiles and are fully connected • Arcs contain routing information • Candidate minimal paths (set of links) • Energy requirement • Routes chosen according to XY routing • Problem: find mapping such that energy is minimized
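
As a rough illustration of the objective, the sketch below sums the per-bit energy over every APCG arc, weighted by the arc's data volume. The dictionary layout of the APCG, the IP names, and the energy constants are assumptions made for the example; bandwidth constraints are not modeled here.

```python
def mapping_energy(apcg, mapping, es_bit, el_bit):
    """Total communication energy of a mapping.

    apcg:    {(src_ip, dst_ip): data_volume_in_bits}  (assumed layout)
    mapping: {ip: (x, y) tile coordinate}
    """
    total = 0.0
    for (src_ip, dst_ip), volume in apcg.items():
        (x1, y1), (x2, y2) = mapping[src_ip], mapping[dst_ip]
        links = abs(x1 - x2) + abs(y1 - y2)          # Manhattan distance
        e_bit = (links + 1) * es_bit + links * el_bit
        total += volume * e_bit
    return total


# Hypothetical three-IP application on a 2 x 2 grid.
apcg = {("a", "b"): 10, ("b", "c"): 4}
near = {"a": (0, 0), "b": (0, 1), "c": (1, 1)}
far = {"a": (0, 0), "b": (1, 1), "c": (0, 1)}
print(mapping_energy(apcg, near, 1.0, 0.5), mapping_energy(apcg, far, 1.0, 0.5))
```

Placing the heavily communicating pair on adjacent tiles gives the lower total, which is the effect the mapping search exploits.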

  7. Sanity Check • TGFF used to generate series of task graphs for 3 x 3 to 13 x 13 grids • There are n! possible mappings for n tiles • Finding optimal mapping is constrained quadratic assignment problem (NP-hard) • Generate 3000 random mappings of IPs • From these, choose best energy and median energy requirement • Use simulated annealing (SA) to search for best mapping
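
The random-mapping baseline can be sketched as follows. The toy traffic pattern, the hop-count cost function, and the fixed seed are assumptions for illustration; the presentation's actual task graphs came from TGFF.

```python
import math
import random
import statistics


def random_mapping_baseline(ips, tiles, cost, samples=3000, seed=0):
    """Evaluate `samples` random IP-to-tile mappings; return (best, median) cost."""
    rng = random.Random(seed)
    costs = []
    for _ in range(samples):
        shuffled = list(tiles)
        rng.shuffle(shuffled)
        costs.append(cost(dict(zip(ips, shuffled))))
    return min(costs), statistics.median(costs)


# Hypothetical chain-shaped traffic a -> b -> c -> d, one unit per arc.
TRAFFIC = {("a", "b"): 1, ("b", "c"): 1, ("c", "d"): 1}


def hop_cost(mapping):
    """Volume-weighted Manhattan distance, used here as a stand-in for energy."""
    return sum(
        vol * (abs(mapping[s][0] - mapping[d][0]) + abs(mapping[s][1] - mapping[d][1]))
        for (s, d), vol in TRAFFIC.items()
    )


ips = ["a", "b", "c", "d"]
tiles = [(x, y) for x in range(2) for y in range(2)]
best, median = random_mapping_baseline(ips, tiles, hop_cost)
print(best, median)
print(math.factorial(9))  # number of mappings for a 3 x 3 grid
```

Random sampling works on a 2 x 2 grid because only 24 mappings exist. The factorial growth of the search space is why it degrades quickly on larger grids.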

  8. Mapping Search • Classic search tree: need state representation, operators, utility function, queuing function, and trim function (paths required by TG)

  9. Search Heuristic • Cost of a node = energy consumed by all nodes which have been mapped • Upper bound cost (UBC) = no less than minimum cost of descendant leaf nodes • i.e. From this map state, we can do AT LEAST this well • Lower bound cost (LBC) = lowest cost possible for descendant leaf nodes • i.e. From this map state, we can do AT BEST this well • Algorithm: • Unexpanded node is selected • Next unassigned IP is assigned to each open tile • PAT is computed for each child node • Trim nodes whose cost or LBC > lowest UBC that has been found • How to compute routing paths and UBC/LBC for each node? • Better routing path allocation leads to better results • Tighter UBC/LBC help trim away bad nodes but require more time to compute

  10. Routing Path Allocation • Fully adaptive: 8 turns • XY routing: 4 turns • Odd-even: 6 turns • Even column: no EN or ES turn • Odd column: no NW or SW turn • West-first: 6 turns • No NW or SW turns
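
The turn restrictions listed on this slide can be written down as a small lookup, alongside a deterministic XY route generator. In this sketch the two-letter turn labels are treated as opaque codes copied from the slide. The function names and the convention that XY routing moves along x before y are assumptions.

```python
def xy_route(src, dst):
    """Tiles visited by XY routing: travel along x first, then along y."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


def turn_allowed(policy, column, turn):
    """Check a turn label (e.g. "EN") against the slide's prohibited-turn lists."""
    if policy == "west-first":
        return turn not in {"NW", "SW"}
    if policy == "odd-even":
        banned = {"EN", "ES"} if column % 2 == 0 else {"NW", "SW"}
        return turn not in banned
    raise ValueError(f"unknown policy: {policy}")


print(xy_route((0, 0), (2, 1)))
print(turn_allowed("odd-even", 2, "EN"), turn_allowed("odd-even", 3, "EN"))
```

Both restricted policies ban two of the eight possible turns, which matches the "6 turns" counts on the slide. Odd-even simply varies which two are banned by column parity.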

  11. Routing Path Allocation • Find list of communication loads (LCL) • LCL is list of datapaths in task graph exposed by assigning an IP to a tile • Flexibility of a CL is defined as number of possible minimum paths through network (based on routing policy) • LCL is sorted from least flexible to most flexible • choose_link() returns least loaded link allowed by routing policy
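
To make "flexibility" concrete, the sketch below counts minimal paths for three policies and sorts a list of communication loads from least to most flexible. The closed-form counts are standard properties of minimal routing on a mesh, not taken from the slides. The convention that west means decreasing x and the tuple layout of each load are assumptions. Odd-even is omitted because its path count depends on column parity along the route.

```python
from math import comb


def flexibility(src, dst, policy):
    """Number of minimal paths from src to dst allowed by the routing policy."""
    dx, dy = dst[0] - src[0], dst[1] - src[1]
    if policy == "xy":
        return 1                      # deterministic: a single path
    if policy == "west-first" and dx < 0:
        return 1                      # must finish all westward hops first
    if policy in ("west-first", "fully-adaptive"):
        return comb(abs(dx) + abs(dy), abs(dx))
    raise ValueError(f"unknown policy: {policy}")


def sort_lcl(lcl, policy):
    """Order communication loads from least flexible to most flexible."""
    return sorted(lcl, key=lambda cl: flexibility(cl[0], cl[1], policy))


# Hypothetical loads, each a (source tile, destination tile) pair.
lcl = [((0, 0), (2, 2)), ((2, 1), (0, 0)), ((0, 0), (1, 1))]
print(sort_lcl(lcl, "west-first"))
```

Allocating the least flexible loads first leaves the loads with many alternative paths to route around links that are already busy.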

  12. UBC Calculation • Compute UBC for each node by greedily mapping the remaining unmapped IPs • Next unmapped IP with highest communication demand is selected • Ideal location (x, y) is calculated • IP is mapped to the open tile closest to (x, y) (Manhattan distance) • Performed until all IPs are mapped • Cost of this leaf node is UBC
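
A minimal sketch of the greedy completion step follows. The slide does not say how the ideal location is computed. This example assumes a traffic-weighted centroid of the already-mapped communication partners, falling back to (0, 0) when there are none. The function names and data layout are likewise hypothetical.

```python
def closest_open_tile(ideal, open_tiles):
    """Open tile with the smallest Manhattan distance to the ideal (x, y)."""
    return min(
        open_tiles,
        key=lambda t: (abs(t[0] - ideal[0]) + abs(t[1] - ideal[1]), t),
    )


def greedy_complete(partial, ips_by_demand, tiles, traffic):
    """Extend a partial mapping to a full one; its cost can serve as a UBC."""
    mapping = dict(partial)
    open_tiles = [t for t in tiles if t not in mapping.values()]
    for ip in ips_by_demand:          # highest communication demand first
        if ip in mapping:
            continue
        wx = wy = w = 0.0
        for (a, b), vol in traffic.items():
            other = b if a == ip else a if b == ip else None
            if other in mapping:      # assumed: weighted centroid of mapped partners
                wx += vol * mapping[other][0]
                wy += vol * mapping[other][1]
                w += vol
        ideal = (wx / w, wy / w) if w else (0.0, 0.0)
        tile = closest_open_tile(ideal, open_tiles)
        mapping[ip] = tile
        open_tiles.remove(tile)
    return mapping


tiles = [(0, 0), (1, 0), (0, 1), (1, 1)]
traffic = {("a", "b"): 5, ("b", "c"): 1}
full = greedy_complete({"a": (0, 0)}, ["a", "b", "c"], tiles, traffic)
print(full)
```

Ties between equally close tiles are broken by coordinate order here, which is an arbitrary choice made only to keep the sketch deterministic.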

  13. LBC Calculation • Cost amongst mapped IPs • Cost between mapped and unmapped IPs • Cost amongst unmapped IPs • Cost of route • Figure labels: unmapped IPs, unmapped tiles, mapped IPs

  14. Pseudocode • IPs are sorted by communication demand (in descending order) • Priority queue (PQ) sorts nodes to be branched based on cost (in ascending order)
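
The transcript does not preserve the pseudocode itself, so the following is only a sketch of a branch-and-bound search in the same spirit, not the authors' algorithm. It keeps the two properties named on the slide: IPs ordered by descending demand, and a priority queue ordered by ascending cost. The bounds are deliberately simplified assumptions. The lower bound is just the cost among already-mapped IPs, and the upper bound places remaining IPs on the first free tiles. Energy is approximated by volume-weighted Manhattan distance, and the number of IPs is assumed not to exceed the number of tiles.

```python
import heapq
from itertools import count


def energy(traffic, mapping):
    """Volume-weighted Manhattan distance over arcs whose endpoints are both mapped."""
    total = 0
    for (a, b), vol in traffic.items():
        if a in mapping and b in mapping:
            (x1, y1), (x2, y2) = mapping[a], mapping[b]
            total += vol * (abs(x1 - x2) + abs(y1 - y2))
    return total


def branch_and_bound(ips, tiles, traffic):
    demand = {ip: 0 for ip in ips}
    for (a, b), vol in traffic.items():
        demand[a] += vol
        demand[b] += vol
    order = sorted(ips, key=lambda ip: -demand[ip])   # descending demand

    def complete(mapping):
        """Simplified upper bound: fill remaining IPs onto free tiles in order."""
        full = dict(mapping)
        free = [t for t in tiles if t not in full.values()]
        for ip in order:
            if ip not in full:
                full[ip] = free.pop(0)
        return full

    best = complete({})
    best_cost = energy(traffic, best)
    tie = count()                                     # avoids comparing dicts
    pq = [(0, next(tie), {})]
    while pq:
        cost, _, mapping = heapq.heappop(pq)          # cheapest node first
        if cost >= best_cost or len(mapping) == len(order):
            continue
        ip = order[len(mapping)]
        for tile in tiles:
            if tile in mapping.values():
                continue
            child = {**mapping, ip: tile}
            child_cost = energy(traffic, child)       # simplified lower bound
            if child_cost >= best_cost:
                continue                              # trim: cannot improve
            candidate = complete(child)
            candidate_cost = energy(traffic, candidate)
            if candidate_cost < best_cost:
                best, best_cost = candidate, candidate_cost
            heapq.heappush(pq, (child_cost, next(tie), child))
    return best, best_cost


tiles = [(0, 0), (1, 0), (0, 1), (1, 1)]
traffic = {("a", "d"): 5, ("b", "c"): 5, ("a", "b"): 1}
mapping, cost = branch_and_bound(["a", "b", "c", "d"], tiles, traffic)
print(mapping, cost)
```

Because the partial cost never decreases as more IPs are placed, trimming on it is safe. Tighter bounds would prune more of the tree at the price of more computation per node.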

  15. Results

  16. Results (rel. to link bandwidth)

  17. Multimedia Application • Multimedia system (MMS): H263 encoder/decoder, MP3 encoder/decoder • 40 tasks, assigned to 16 IPs from Mentor • Use audio/video clips to derive patterns

  18. Mapping Results

  19. Results • Mapping: • 10 x 10 tiles: • EPAM took a few minutes to map • SA didn’t finish after 40 hours • MMS: • EPAM-XY can’t find a solution for <= 324 Mb/s links • EPAM-OE and -WF can find solution down to 307 Mb/s • Energy requirement: • OE = WF < XY

  20. Irregular Regions • Irregular region sizes • Divide regions into dummy IPs • Assign high weights for dummy IPs in APCG • Mapping will place these IPs in adjacent tiles

  21. Irregular Regions
