
A Combined Clustering and Placement Algorithm for FPGAs

A Combined Clustering and Placement Algorithm for FPGAs. Mark Yamashita. Contributions: new algorithm to do clustering and placement; novel approach for trading off depth for duplication control; timing model/placement incorporated into clustering; delay improves by an average of 11%.


Presentation Transcript


  1. A Combined Clustering and Placement Algorithm for FPGAs Mark Yamashita

  2. Contributions
     • New algorithm to do clustering and placement
     • Novel approach for trading off depth for duplication control
     • Timing model/placement incorporated into clustering
     • Delay improves by an average of 11%
     • Controllable trade-off between area overhead and delay improvements
     • Plan to submit to FPL '08

  3. Motivation
     • FPGAs need to be faster
       • 4x slower than ASICs
     • Limitations of existing clustering approaches:
       • No depth control during clustering, often greedy
       • Provide no means for duplication, or
       • Use duplication in excess
       • Inaccurate timing models

  4. Motivation
     • GOAL:
       • Improve critical-path delay by improving clustering
     • Approach:
       • Use placement information to form accurate timing model
       • Make better clustering decisions
       • Use duplication to reduce depth
       • Take advantage of otherwise unused logic in FPGA
       • Control amount of duplication by relaxing depth
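To make the "placement information to form accurate timing model" idea concrete, here is a minimal Python sketch of a placement-aware delay estimate. It is not the model used in the presentation: the delay constants, the Manhattan-distance wire model, and the names `connection_delay` and `critical_path_delay` are all assumptions chosen for illustration.

```python
from functools import lru_cache

# Hypothetical delay constants in arbitrary units (not taken from the slides).
LUT_DELAY = 1.0      # delay through one logic block
LOCAL_WIRE = 0.2     # delay of any connection, even inside one CLB
WIRE_PER_HOP = 0.5   # extra delay per unit of Manhattan distance


def connection_delay(src_pos, dst_pos):
    """Estimate wire delay from the CLB grid positions of driver and sink."""
    dist = abs(src_pos[0] - dst_pos[0]) + abs(src_pos[1] - dst_pos[1])
    return LOCAL_WIRE + WIRE_PER_HOP * dist


def critical_path_delay(fanins, placement):
    """Return the largest arrival time over all nodes of a combinational DAG.

    fanins:    dict node -> list of driver nodes (empty list for primary inputs)
    placement: dict node -> (x, y) CLB position
    """
    @lru_cache(maxsize=None)
    def arrival(node):
        drivers = fanins[node]
        if not drivers:
            return 0.0  # primary inputs arrive at time zero
        return LUT_DELAY + max(
            arrival(d) + connection_delay(placement[d], placement[node])
            for d in drivers
        )

    return max(arrival(node) for node in fanins)
```

With a two-level circuit whose last gate sits three CLBs away from its driver, this estimate is 3.9 units; placing that gate in the driver's CLB lowers it to 2.4, which is the kind of signal a placement-aware clustering step can act on. The recursive formulation is only suitable for small graphs.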

  5. Algorithm Overview T-VP

  6. Phase 1: Microcluster Formation

  7. Phase 1: Example

  8. Phase 1: Lawler Levitt Turner Algorithm
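The slide body did not survive in the transcript, so the following is only a simplified Python sketch of the labeling idea commonly attributed to Lawler, Levitt, and Turner, not the presentation's implementation. It assumes a unit-delay model (crossing a cluster boundary costs one level), unit-size nodes including primary inputs, and no pin constraints; the names `label_nodes` and `form_clusters` are hypothetical.

```python
def label_nodes(fanins, capacity):
    """Label each node with the number of cluster boundaries on its worst path.

    fanins:   dict node -> list of driver nodes (empty list for primary inputs)
    capacity: maximum number of nodes allowed in one cluster
    Returns (labels, ancestors), where ancestors[v] is v's transitive fan-in.
    """
    labels, ancestors = {}, {}

    def visit(v):
        if v in labels:
            return
        anc = set()
        for d in fanins[v]:
            visit(d)
            anc.add(d)
            anc |= ancestors[d]
        ancestors[v] = anc
        if not fanins[v]:
            labels[v] = 0
            return
        k = max(labels[d] for d in fanins[v])
        same_level = sum(1 for a in anc if labels[a] == k)
        # v keeps label k only if it fits in one cluster with all level-k ancestors.
        labels[v] = k if same_level + 1 <= capacity else k + 1

    for v in fanins:
        visit(v)
    return labels, ancestors


def form_clusters(fanins, labels, ancestors, outputs):
    """Build one cluster per root; a node in several clusters is duplicated."""
    clusters = {}
    pending = list(outputs)
    while pending:
        root = pending.pop()
        if root in clusters:
            continue
        members = {root} | {a for a in ancestors[root] if labels[a] == labels[root]}
        clusters[root] = members
        # Drivers outside the cluster become roots of further clusters.
        for m in members:
            pending.extend(d for d in fanins[m] if d not in members)
    return clusters
```

On a small circuit where gate `x` (driven by inputs `a` and `b`) feeds two outputs `y` and `z`, a capacity of 4 gives every node label 0 and two clusters that each contain a copy of `x`, `a`, and `b`. A capacity of 3 avoids the duplication but pushes `y` and `z` to label 1. This is the depth-versus-duplication tension that the later Phase 1 slides address.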

  9. Phase 1

  10. Phase 1: Node Duplication Reduction

  11. Phase 1: Block Usage Results

  12. Phase 1: Additional Duplication Reduction Through Depth Relaxation

  13. Algorithm Overview T-VP

  14. Phase 2: Microcluster Compaction with Orchestrator
      • Iteratively move microclusters to improve timing
      • Can fit multiple microclusters to the same CLB position, provided the aggregate of all microclusters meets CLB constraints
      • If an area constraint is given, remove duplication and fragmentation until constraint is met
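The rule that several microclusters may share a CLB position when their aggregate meets the CLB constraints can be sketched as a feasibility check. This is a hedged illustration, not the Orchestrator's actual test: the `Microcluster` type, the `fits_in_clb` function, the two constraints (block count and distinct external inputs), and the assumption that duplicate copies of the same block collapse into one when co-located are all hypothetical. Each output net is assumed to carry the name of the block that drives it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Microcluster:
    blocks: frozenset   # names of the logic blocks it contains
    inputs: frozenset   # nets it reads from outside itself
    outputs: frozenset  # nets it drives


def fits_in_clb(microclusters, max_blocks, max_inputs):
    """Check whether the aggregate of the microclusters fits one CLB.

    Assumption: copies of the same block in different microclusters are
    merged when co-located, so each block name is counted once.
    """
    blocks = set().union(*(m.blocks for m in microclusters))
    driven = set().union(*(m.outputs for m in microclusters))
    needed = set().union(*(m.inputs for m in microclusters))
    # Nets produced inside the CLB do not consume CLB input pins.
    external_inputs = needed - driven
    return len(blocks) <= max_blocks and len(external_inputs) <= max_inputs
```

For example, two microclusters that both hold a copy of block `x` need only three blocks in total when merged, and a microcluster that consumes a net driven by its neighbor in the same CLB no longer needs an input pin for that net.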

  15. Phase 2: Orchestrator Example

  16. Phase 2: Orchestrator Example

  17. Phase 2: Orchestrator Example

  18. Phase 2: Orchestrator Example

  19. Phase 2: Orchestrator Example

  20. Phase 2: Orchestrator Example

  21. Phase 2: Orchestrator Example

  22. Results: Timing

  23. Results: Area

  24. Results: Timing vs. Area

  25. Results: Timing vs. Depth

  26. Conclusions
      • Reducing depth contributes to a reduction in critical path delay
      • Node duplication, when used effectively, reduces critical path delay
      • Duplication can be used to provide a performance-area tradeoff to the designer

  27. Future Work
      • Promising Post-Placement Optimizations:
        • Retiming
          • Leverage a more significant depth reduction
        • Logic reintroduction
          • Create duplication to increase performance

  28. Contributions
      • New algorithm to do clustering and placement
      • Novel approach for trading off depth for duplication control
      • Timing model/placement incorporated into clustering
      • Delay improves by an average of 11%
      • Controllable trade-off between area overhead and delay improvements
      • Plan to submit to FPL '08

  29. Thank You
