

finian

Presentation Transcript


  1. Un/DoPack: Re-Clustering of Large System-on-Chip Designs with Interconnect Variation for Low-Cost FPGAs

  2. Overview • Introduction, Goals and Motivation • Reduce channel width, lower cost, make circuits “routable” • Benchmark Circuits • Varying amount of interconnect variation • Un/DoPack CAD Tool: • Iterative channel width reduction by whitespace insertion • Results • Conclusion

  3. Overview • Introduction, Goals and Motivation • Reduce channel width, lower cost, make circuits “routable” • Benchmark Circuits • Varying amount of interconnect variation • Un/DoPack CAD Tool: • Iterative channel width reduction by whitespace insertion • Results • Conclusion

  4. Mesh-Based FPGA Architecture [Figure: two mesh arrays of logic blocks, each block labeled L] • 16 logic blocks • 4 wires per channel • 4*4=16 total horizontal tracks • 9 logic blocks • 4 wires per channel • 3*4=12 total horizontal tracks • Larger FPGAs have more “aggregate” interconnect
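The two track counts on this slide follow the same arithmetic: rows of logic blocks times wires per channel. A minimal Python sketch of that count, assuming a square array (the function name is illustrative and not from the talk):

```python
import math

def total_horizontal_tracks(num_logic_blocks: int, wires_per_channel: int) -> int:
    """Count horizontal tracks the way the slide does: rows * wires per channel."""
    rows = math.isqrt(num_logic_blocks)
    if rows * rows != num_logic_blocks:
        # Assumption of this sketch: the logic blocks form a square array.
        raise ValueError("sketch assumes a square array of logic blocks")
    return rows * wires_per_channel

print(total_horizontal_tracks(16, 4))  # 16, matching 4*4 on the slide
print(total_horizontal_tracks(9, 4))   # 12, matching 3*4 on the slide
```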

  5. Motivation: Area of FPGA Devices • MCNC Circuits Mapped onto an FPGA • Total Layout AREA = SIZE * Number • SIZE = size of layout tile • Number = number of layout tiles

  6. Motivation: Channel Width Demand • MCNC Circuits Mapped onto an FPGA • Logic Range: User buys bigger device. • Interconnect Range: User has no choice! • Devices built for worst-case channel width (fixed width) • Interconnect dominates area (>70%)

  7. Goal: Reduce Channel Width • Altera Cyclone: channel width constraint of 80 routing tracks • Constrained FPGA: channel width constraint of 60 routing tracks • Smaller area, lower cost for low-channel-width circuits • But { apex4, elliptic, frisc, ex1010, spla, pdc } are unroutable… • Can we make them routable in a Constrained FPGA?

  8. Possible Solution [Figure: two mesh arrays of logic blocks, labeled FPGA 1 and FPGA 2] • Trade-off logic utilization for channel width • User can always buy more logic… (not more wires) • Trade-off: CLB count for channel width • What about area??

  9. Features and Costs of Two FPGA Families • Sample Benchmark Circuit • 10,000 LEs • 150 Routing Tracks • No Multipliers • 100 K Memory • Sample Benchmark Circuit • 20,000 LEs • 75 Routing Tracks

  10. Overview • Introduction, Goals and Motivation • Reduce channel width, lower cost, make circuits “routable” • Benchmark Circuits • Varying amount of interconnect variation • Un/DoPack CAD Tool: • Iterative channel width reduction by whitespace insertion • Results • Conclusion

  11. GNL Circuit Benchmark Suite • Create benchmark circuits with variation • SoC <==> Randomly integrate/stitch together “IP Blocks” • IP Blocks have varied interconnect needs • Generate Netlist (GNL) • Stroobandt @ Ghent University • Synthetic benchmark generator • GNL circuits generated hierarchically • Root → # I/Os, # IP blocks • Second Level → 20 IP blocks, # LEs, Rent parameter

  12. Rent Linear Interpolation • 7 benchmark circuits • Average Rent = 0.62, Stdev Rent = 0 to 0.12 • 240/120 primary inputs/outputs
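One way to picture this suite is sketched below in Python. The slide gives only the average Rent (0.62), the stdev range (0 to 0.12), the circuit count (7), and 20 IP blocks per circuit. Everything else here is an assumption made for illustration: that the stdev steps evenly across the seven circuits, that each circuit's Rent values are evenly spaced around the average, and the function name itself. It is not the authors' generator script.

```python
from statistics import mean, pstdev

def rent_parameters(avg: float, stdev: float, n_blocks: int = 20) -> list[float]:
    """Return n_blocks evenly spaced Rent values with the given mean and population stdev.

    Requires n_blocks >= 2 so the offsets have a non-zero spread.
    """
    # Evenly spaced offsets centred on zero: -(n-1)/2, ..., +(n-1)/2.
    offsets = [i - (n_blocks - 1) / 2 for i in range(n_blocks)]
    spread = pstdev(offsets)  # stdev of the unscaled offsets
    return [avg + stdev * o / spread for o in offsets]

# Seven circuits; stepping the stdev evenly from 0 to 0.12 is an assumption.
stdevs = [0.12 * i / 6 for i in range(7)]
suite = [rent_parameters(0.62, s) for s in stdevs]

for s, rents in zip(stdevs, suite):
    print(f"stdev={s:.2f}  min Rent={min(rents):.3f}  max Rent={max(rents):.3f}")
```

Every circuit in the sketch keeps the same average Rent, so only the spread between its IP blocks changes from one circuit to the next.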

  13. Overview • Introduction, Goals and Motivation • Reduce channel width, lower cost, make circuits “routable” • Benchmark Circuits • Varying amount of interconnect variation • Un/DoPack CAD Tool: • Iterative channel width reduction by whitespace insertion • Results • Conclusion

  14. Un/DoPack Flow • Iterative non-uniform cluster depopulation tool • Step 1: Traditional SIS/VPR • Step 2: UnPack: • Congestion Calculator • Step 3: DoPack: • Incremental Re-Cluster • Step 4,5: Fast Place/Route
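The five steps can be read as a loop around the traditional flow. The Python sketch below shows only that control structure. The tools themselves are passed in as callables with hypothetical signatures, and the stopping rule (stop once the routed channel width fits the target, or after a fixed number of iterations) is an assumption, since the slide does not state one.

```python
from typing import Any, Callable

def undopack(
    netlist: Any,
    target_width: int,
    pack_place_route: Callable[[Any], tuple[Any, int]],
    unpack: Callable[[Any], Any],
    dopack: Callable[[Any, Any], Any],
    fast_place_route: Callable[[Any], tuple[Any, int]],
    max_iterations: int = 50,
) -> tuple[Any, int, int]:
    """Iterate until the routed channel width meets target_width.

    Returns (layout, channel_width, iterations_used).
    """
    # Step 1: the traditional flow gives the starting layout and channel width.
    layout, width = pack_place_route(netlist)
    iterations = 0
    while width > target_width and iterations < max_iterations:
        depop_plan = unpack(layout)                   # Step 2: congestion calculator
        clustering = dopack(netlist, depop_plan)      # Step 3: incremental re-cluster
        layout, width = fast_place_route(clustering)  # Steps 4, 5: fast place/route
        iterations += 1
    return layout, width, iterations
```

Passing the tools in as arguments keeps the sketch independent of any particular packer, placer, or router.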

  15. Un/DoPack Flow: SIS/VPR • Step 1: Traditional SIS/VPR

  16. Un/DoPack Flow: SIS/VPR • Step 1: Traditional SIS/VPR

  17. Un/DoPack Flow: SIS/VPR • Step 1: Traditional SIS/VPR

  18. Un/DoPack Flow: UnPack • Step 2: UnPack: • Congestion Calculator

  19. Un/DoPack Flow: UnPack • Step 2: UnPack • Generate Congestion Map • CLB Label = Largest CW occ in 4 adjacent channels
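The labeling rule on this slide can be sketched in Python as follows. The slide states only that a CLB's label is the largest channel-width occupancy among its four adjacent channels. The two-array layout for channel occupancies and the function name are assumptions chosen for the example.

```python
def congestion_labels(horiz, vert):
    """Label each CLB with the largest occupancy among its four adjacent channels.

    horiz[r][c]: occupancy of the horizontal channel segment above CLB (r, c);
                 M + 1 rows by M columns, so row r + 1 is the segment below.
    vert[r][c]:  occupancy of the vertical channel segment left of CLB (r, c);
                 M rows by M + 1 columns, so column c + 1 is the segment to the right.
    """
    m = len(vert)
    return [
        [max(horiz[r][c], horiz[r + 1][c], vert[r][c], vert[r][c + 1]) for c in range(m)]
        for r in range(m)
    ]

# A 2 x 2 CLB array with made-up occupancies.
horiz = [[1, 2], [3, 4], [5, 6]]
vert = [[7, 1, 2], [0, 3, 9]]
print(congestion_labels(horiz, vert))  # [[7, 4], [5, 9]]
```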

  20. Un/DoPack Flow: UnPack [Figure: M x M array] • Step 2: UnPack: • Depop Center = Largest CLB label

  21. Un/DoPack Flow: UnPack [Figure: M x M array] • Step 2: UnPack: • Option 1 Coarse Grain: • Depop Radius = M/4 • Depop Amt: 1 new row/col in array

  22. Un/DoPack Flow: UnPack [Figure: M x M array] • Step 2: UnPack: • Option 2 Fine Grain: • Depop Radius = M/4, M/5, M/6, M/8 • Depop Amt: 1 new row/col in region
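Slides 20 to 22 describe picking a depopulation center and a region around it. A minimal Python sketch under stated assumptions follows. The slides give the center rule and the radius values but not the distance metric, the rounding, or the tie-breaking, so the square neighborhood, the integer division, and the first-maximum rule below are assumptions, as are all function names.

```python
def depop_center(labels):
    """Return (row, col) of the CLB with the largest congestion label (first one on ties)."""
    m = len(labels)
    cells = ((r, c) for r in range(m) for c in range(m))
    return max(cells, key=lambda rc: labels[rc[0]][rc[1]])

def depop_region(labels, divisor=4):
    """CLBs within radius M/divisor of the depopulation center.

    divisor=4 is the coarse-grain radius M/4; the fine-grain option on the
    slide also uses 5, 6, and 8. A square neighborhood clipped to the array
    is assumed here.
    """
    m = len(labels)
    radius = m // divisor  # integer division is an assumption
    cr, cc = depop_center(labels)
    return [
        (r, c)
        for r in range(max(0, cr - radius), min(m, cr + radius + 1))
        for c in range(max(0, cc - radius), min(m, cc + radius + 1))
    ]

def sites_added_by_one_row_and_column(side):
    """Growing a side x side block to (side + 1) x (side + 1) adds 2*side + 1 CLB sites."""
    return (side + 1) ** 2 - side ** 2
```

With `side` set to M, the last function gives the whitespace added by the coarse-grain option (one new row and column in the array); with `side` set to the region's width, it gives the fine-grain amount (one new row and column in the region).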

  23. Un/DoPack Flow: DoPack • Step 3: DoPack: • Incremental Re-Cluster

  24. Un/DoPack Flow: Fast P&R • Step 4,5: Fast Place/Route

  25. Un/DoPack Flow: Fast P&R • Step 4,5: Fast Place/Route • Fast Placement • UBC Incremental Placer (under development) • VPR -fast • Fast Router • Use illegal PathFinder solution from first iterations • Unsuccessful so far • Use full routed solution • Slow but reliable

  26. Overview • Introduction, Goals and Motivation • Reduce channel width, lower cost, make circuits “routable” • Benchmark Circuits • Varying amount of interconnect variation • Un/DoPack CAD Tool: • Iterative channel width reduction by whitespace insertion • Results • Conclusion

  27. Un/DoPack: Baseline Flow • UnPack: Coarse grained congestion calculator • DoPack: iRAC replica • Fast Place: UBC Incremental Placer • Fast Route: None • FPGA Architecture: • LUT size (k) = 6 • Cluster size (N) = 16 • Inputs per cluster (I) = 51 • Wires of length (L) = 4

  28. Area of GNL Benchmarks

  29. Interconnect Variation: Impact on FPGA Architecture Design • High Variation Circuits Require Wide Channel Width

  30. Critical Path of GNL Benchmarks

  31. Un/DoPack Congestion Map [Figure: congestion maps before and after Un/DoPack]

  32. Multi-Region Un-Pack • Depopulate multiple regions at once • Depopulate each region separately • Smaller radius = M/10 • Handle overlapping regions

  33. Normalized Area

  34. Normalized Critical Path

  35. Run-Time Comparisons

  36. Conclusion • Un/DoPack: FPGA CAD flow • Find “local” congestion → depopulate → reduced interconnect demand • FPGA benchmark circuit “suite” • Stdev: Used to vary interconnect demand • Discoveries… • “Non-uniform” depopulation limits area inflation • “Interconnect variation” important for area inflation and FPGA architecture design • “Routing closure” achieved by re-clustering and incremental place & route • UNROUTABLE circuits made ROUTABLE: buy an FPGA with MORE LOGIC!!!

  37. End of Talk
