Un/DoPack: Re-Clustering of Large System-on-Chip Designs with Interconnect Variation for Low-Cost FPGAs
Overview • Introduction, Goals and Motivation • Reduce channel width, lower cost, make circuits “routable” • Benchmark Circuits • Varying amount of interconnect variation • Un/DoPack CAD Tool: • Iterative channel width reduction by whitespace insertion • Results • Conclusion
Mesh-Based FPGA Architecture • 16 logic blocks • 4 wires per channel • 4*4=16 total horizontal tracks • 9 logic blocks • 4 wires per channel • 3*4=12 total horizontal tracks • Larger FPGAs have more “aggregate” interconnect
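The track arithmetic on this slide can be sketched as a small helper. This is only an illustration that follows the slide's own counting (array side length times wires per channel); the function name is hypothetical and the square-array requirement is an assumption.

```python
import math


def total_horizontal_tracks(num_logic_blocks: int, wires_per_channel: int) -> int:
    """Aggregate horizontal tracks for a square array, counted as on the slide.

    Assumption: the array is square and one horizontal channel is counted per
    row of logic blocks, matching the slide's 4*4=16 and 3*4=12 examples.
    """
    side = math.isqrt(num_logic_blocks)
    if side * side != num_logic_blocks:
        raise ValueError("expected a square array of logic blocks")
    return side * wires_per_channel


print(total_horizontal_tracks(16, 4))  # 4 * 4 = 16
print(total_horizontal_tracks(9, 4))   # 3 * 4 = 12
```

With the channel width held fixed, the larger array ends up with more tracks in total, which is the “aggregate” interconnect point made on the slide.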
Motivation: Area of FPGA Devices • MCNC circuits mapped onto an FPGA • Total layout AREA = SIZE of layout tile * NUMBER of layout tiles
Motivation: Channel Width Demand • MCNC circuits mapped onto an FPGA • Devices built for worst-case channel width (fixed width) • Interconnect dominates area (>70%) • Logic range: user buys bigger device • Interconnect range: user has no choice!
Goal: Reduce Channel Width • Altera Cyclone • Channel width constraint of 80 routing tracks • Constrained FPGA • Channel width constraint of 60 routing tracks • Smaller area, lower cost for low-channel-width circuits • But { apex4, elliptic, frisc, ex1010, spla, pdc } are unroutable. Can we make them routable in a constrained FPGA?
Possible Solution • Trade off logic utilization for channel width • User can always buy more logic (not more wires) • Trade-off: CLB count for channel width (FPGA 1 vs. FPGA 2) • What about area?
Features and Costs of Two FPGA Families • Sample benchmark circuit: 10,000 LEs, 150 routing tracks, no multipliers, 100 K memory • Sample benchmark circuit: 20,000 LEs, 75 routing tracks
GNL Circuit Benchmark Suite • Create benchmark circuits with variation • SoC <==> Randomly integrate/stitch together “IP Blocks” • IP Blocks have varied interconnect needs • Generate Netlist (GNL) • Stroobandt @ Ghent University • Synthetic benchmark generator • GNL circuits generated hierarchically • Root: # I/Os, # IP blocks • Second level: 20 IP blocks, # LEs, Rent parameter
Rent Linear Interpolation • 7 benchmark circuits • Average Rent = 0.62, Stdev Rent = 0 to 0.12 • 240/120 primary inputs/outputs
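One way to picture the Rent interpolation is the sketch below. It is not the authors' generator: it assumes each circuit's 20 IP blocks receive evenly spaced (linearly interpolated) Rent parameters centred on the 0.62 average, scaled to hit a target standard deviation, and it assumes the seven circuits step the standard deviation evenly through the quoted range (read here as 0 to 0.12). All names are hypothetical.

```python
import statistics


def rent_values(mean: float, stdev: float, n_blocks: int = 20) -> list[float]:
    """Evenly spaced Rent parameters with the requested mean and population stdev.

    Assumption: linear spacing around the mean; the slides do not state the
    exact interpolation rule.
    """
    # Symmetric offsets in [-1, 1], so their mean is zero.
    offsets = [2 * i / (n_blocks - 1) - 1 for i in range(n_blocks)]
    unit_stdev = statistics.pstdev(offsets)
    scale = stdev / unit_stdev
    return [mean + scale * o for o in offsets]


# Seven hypothetical circuits, stdev stepped evenly from 0 to 0.12 (assumed).
for k in range(7):
    target = 0.02 * k
    values = rent_values(0.62, target)
    print(f"stdev target {target:.2f}: Rent range "
          f"{min(values):.3f} to {max(values):.3f}")
```

A standard deviation of zero gives every IP block the same Rent parameter, which corresponds to a circuit with no interconnect variation.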
Un/DoPack Flow • Iterative non-uniform cluster depopulation tool • Step 1: Traditional SIS/VPR • Step 2: UnPack: • Congestion Calculator • Step 3: DoPack: • Incremental Re-Cluster • Step 4,5: Fast Place/Route
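The iteration implied by these steps can be sketched as a simple loop. The control flow below is an assumption drawn from the step list, and every callable (`unpack`, `dopack`, `place_and_route`) is a hypothetical stand-in, not the real SIS/VPR or Un/DoPack interface; the toy stand-ins exist only so the sketch runs.

```python
from typing import Any, Callable, Tuple


def run_flow(
    state: Any,
    target_width: int,
    unpack: Callable[[Any], Any],
    dopack: Callable[[Any, Any], Any],
    place_and_route: Callable[[Any], int],
    max_iters: int = 20,
) -> Tuple[int, int, Any]:
    """Repeat UnPack -> DoPack -> place/route until the channel width target is met."""
    width = place_and_route(state)          # Step 1: initial full flow
    iterations = 0
    while width > target_width and iterations < max_iters:
        region = unpack(state)              # Step 2: locate congestion
        state = dopack(state, region)       # Step 3: incremental re-cluster
        width = place_and_route(state)      # Steps 4, 5: fast place/route
        iterations += 1
    return width, iterations, state


# Toy stand-ins: the state is just a count of inserted whitespace units, and
# each unit is pretended to lower the required channel width by 4 tracks.
toy_unpack = lambda s: 1
toy_dopack = lambda s, region: s + region
toy_route = lambda s: 80 - 4 * s

width, iterations, final_state = run_flow(0, 60, toy_unpack, toy_dopack, toy_route)
print(width, iterations, final_state)
```

The `max_iters` cap is an added safeguard for the sketch; the slides do not say how the real tool bounds its iterations.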
Un/DoPack Flow: SIS/VPR • Step 1: Traditional SIS/VPR
Un/DoPack Flow: UnPack • Step 2: UnPack: • Congestion Calculator
Un/DoPack Flow: UnPack • Step 2: UnPack • Generate congestion map • CLB label = largest channel-width (CW) occupancy in the 4 adjacent channels
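The labeling rule can be sketched as follows. The array layout is an assumption made for illustration: for an M x M array of CLBs, `horiz` holds (M+1) rows of M horizontal channel segments and `vert` holds M rows of (M+1) vertical channel segments, each entry being the number of occupied tracks. The function name is hypothetical.

```python
def clb_labels(horiz: list[list[int]], vert: list[list[int]]) -> list[list[int]]:
    """Label each CLB with the largest occupancy among its 4 adjacent channels.

    Assumed layout: CLB (r, c) touches horiz[r][c] (above), horiz[r + 1][c]
    (below), vert[r][c] (left) and vert[r][c + 1] (right).
    """
    m = len(vert)
    return [
        [
            max(horiz[r][c], horiz[r + 1][c], vert[r][c], vert[r][c + 1])
            for c in range(m)
        ]
        for r in range(m)
    ]


# 2 x 2 example with made-up occupancies.
horiz = [[1, 2], [3, 4], [5, 6]]
vert = [[7, 1, 2], [0, 9, 1]]
print(clb_labels(horiz, vert))  # [[7, 4], [9, 9]]
```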
Un/DoPack Flow: UnPack • Step 2: UnPack: • Depop center = CLB with the largest label • M x M array
Un/DoPack Flow: UnPack • Step 2: UnPack: • Option 1, coarse grain: • Depop radius = M/4 • Depop amount: 1 new row/col in array • M x M array
Un/DoPack Flow: UnPack • Step 2: UnPack: • Option 2, fine grain: • Depop radius = M/4, M/5, M/6, M/8 • Depop amount: 1 new row/col in region • M x M array
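Selecting the depopulation center and region might look like the sketch below. It assumes a square region of the given radius around the center, clipped at the array edge, and integer division for the radius; the slides give only the radius fractions, so the region shape and the function name are assumptions. How much whitespace is then inserted (one new row/column in the array or in the region) is left to the DoPack step.

```python
def depop_region(labels: list[list[int]], radius_divisor: int = 4):
    """Return (center, radius, region) for an M x M grid of CLB labels.

    center: coordinates of the CLB with the largest label (first one on ties).
    radius: M // radius_divisor, at least 1 (4 for coarse grain; 4, 5, 6 or 8
            for the fine-grain option).
    region: CLB coordinates within `radius` rows and columns of the center,
            clipped to the array (assumed square shape).
    """
    m = len(labels)
    center = max(
        ((r, c) for r in range(m) for c in range(m)),
        key=lambda rc: labels[rc[0]][rc[1]],
    )
    radius = max(1, m // radius_divisor)
    r0, c0 = center
    rows = range(max(0, r0 - radius), min(m, r0 + radius + 1))
    cols = range(max(0, c0 - radius), min(m, c0 + radius + 1))
    return center, radius, [(r, c) for r in rows for c in cols]


# 8 x 8 example with a single made-up hotspot.
labels = [[0] * 8 for _ in range(8)]
labels[2][5] = 9
center, radius, region = depop_region(labels)
print(center, radius, len(region))  # (2, 5) 2 25
```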
Un/DoPack Flow: DoPack • Step 3: DoPack: • Incremental Re-Cluster
Un/DoPack Flow: Fast P&R • Step 4,5: Fast Place/Route • Fast Placement • UBC Incremental Placer (under development) • VPR -fast • Fast Router • Use illegal PathFinder solution from first iterations • Unsuccessful so far • Use full routed solution • Slow but reliable
Un/DoPack: Baseline Flow • UnPack: Coarse-grained congestion calculator • DoPack: iRAC replica • Fast Place: UBC Incremental Placer • Fast Route: None • FPGA Architecture: • LUT size (k) = 6 • Cluster size (N) = 16 • Inputs per cluster (I) = 51 • Wire length (L) = 4
Interconnect Variation: Impact on FPGA Architecture Design • High-variation circuits require wide channel width
Un/DoPack Congestion Map • Before and after Un/DoPack
Multi-Region UnPack • Depopulate multiple regions at once • Depopulate each region separately • Smaller radius = M/10 • Handle overlapping regions
Conclusion • Un/DoPack: FPGA CAD flow • Find “local” congestion, depopulate it, and reduce interconnect demand • FPGA benchmark circuit “suite” • Stdev: used to vary interconnect demand • Discoveries… • “Non-uniform” depopulation limits area inflation • “Interconnect variation” is important for area inflation and FPGA architecture design • “Routing closure” achieved by re-clustering and incremental place & route • UNROUTABLE circuits made ROUTABLE: buy an FPGA with MORE LOGIC!