
FPGA Technology Mapping

FPGA Technology Mapping. Dr. Philip Brisk Department of Computer Science and Engineering University of California, Riverside CS 223. Example. Theoretical Results. Minimize Number of Logic Stages Polynomial-time Minimize Total Number of LUTs (Area) NP-Complete Minimize Power Consumption


Presentation Transcript


  1. FPGA Technology Mapping Dr. Philip Brisk Department of Computer Science and Engineering University of California, Riverside CS 223

  2. Example

  3. Theoretical Results • Minimize number of logic stages (depth): polynomial-time • Minimize total number of LUTs (area): NP-complete • Minimize power consumption: NP-complete

  4. DAG Representation • Since LUTs are reconfigurable, we don’t need to worry about the logic function of each gate during mapping

  5. K-Feasible Cuts and LUT Mapping

  6. Example: Node Duplication

  7. FlowMap: An Optimal Technology Mapping Algorithm for Delay Optimization in Lookup-Table Based FPGA Designs Jason Cong and Yuzheng Ding IEEE Trans. CAD 13(1): 1-12, Jan. 1994

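  8. Cuts in a Directed Acyclic Graph (DAG) • A cut $(X, \overline{X})$ is K-feasible if at most K nodes in $X$ have an edge into $\overline{X}$, i.e., its node cut-size is at most K • This cut is 3-feasible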
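
To make the K-feasibility test concrete, here is a minimal Python sketch. It assumes the FlowMap convention that a cut $(X, \overline{X})$ is K-feasible when at most K nodes of $X$ drive a node in $\overline{X}$. The function names and the small example DAG are hypothetical, not taken from the slides.

```python
def node_cut_size(edges, x_bar):
    """Count the nodes outside x_bar that drive at least one node inside x_bar."""
    return len({u for u, v in edges if u not in x_bar and v in x_bar})


def is_k_feasible(edges, x_bar, k):
    """A cut is K-feasible when its node cut-size is at most k."""
    return node_cut_size(edges, x_bar) <= k


# Hypothetical DAG: a, b, c are primary inputs; g is driven by a and b; t by g and c.
edges = [("a", "g"), ("b", "g"), ("g", "t"), ("c", "t")]
print(is_k_feasible(edges, {"t"}, 2))       # the cut inputs are g and c
print(is_k_feasible(edges, {"g", "t"}, 2))  # the cut inputs are a, b and c
```

With K = 2 only the first cut fits in a single LUT; the second needs a 3-input LUT.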

  9. Edge Cut Size • Each edge has non-negative capacity • The edge cut size is the sum of the capacities of the forward edges that cross the cut • All edge capacities are assumed to be 1
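
The edge cut size can be sketched in a few lines of Python. The edge list, the choice to represent the cut by its sink side `x_bar`, and the optional `capacity` map are assumptions made for illustration; with no capacity map every edge counts as 1, as the slide assumes.

```python
def edge_cut_size(edges, x_bar, capacity=None):
    """Sum the capacities of the forward edges, i.e. edges from X into X-bar."""
    total = 0
    for u, v in edges:
        if u not in x_bar and v in x_bar:
            total += 1 if capacity is None else capacity[(u, v)]
    return total


# Hypothetical DAG in which node a drives both g and t.
edges = [("a", "g"), ("b", "g"), ("a", "t"), ("g", "t"), ("c", "t")]
print(edge_cut_size(edges, {"t"}))       # forward edges: a->t, g->t, c->t
print(edge_cut_size(edges, {"g", "t"}))  # forward edges: a->g, b->g, a->t, c->t
```

In the second cut, node a contributes two forward edges, so the edge cut size (4) exceeds the number of distinct driving nodes (3). This is why a node-based cut has to be transformed before an edge-based max-flow computation can be applied.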

  10. Example: Edge-cut size is 10

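  11. Volume and Height • The volume of a cut $(X, \overline{X})$ is the number of vertices in $\overline{X}$: $vol(X, \overline{X}) = |\overline{X}|$ • Given an assignment of labels to vertices, the height of a cut is the largest label in $X$: $h(X, \overline{X}) = \max\{ l(x) : x \in X \}$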

  12. Example: Volume=9, Height=2

  13. FlowMap Algorithm (Overview) • Labeling Phase • Computes a label for each node reflecting the level of the K-LUT that implements that node in a depth-optimal mapping solution • Mapping Phase • Generates the K-LUT mapping solution based on node labels computed in the first phase

  14. Subnetwork of a Node • For node t, let $N_t$ denote the subnetwork consisting of every vertex s such that there is a path from s to t

  15. Conversion to a Network $N_t$ • For node t, let $N_t$ denote the subnetwork consisting of every vertex s such that there is a path from s to t • We can ignore the logic function of each gate
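
The subnetwork of a node is its transitive fanin cone, which a reverse depth-first search collects. The sketch below assumes the DAG is stored as a hypothetical `fanins` dictionary (node to list of driving nodes) and that t itself belongs to its own subnetwork.

```python
def subnetwork(fanins, t):
    """Return every node that has a path to t, including t itself."""
    seen = {t}
    stack = [t]
    while stack:
        v = stack.pop()
        for u in fanins.get(v, ()):
            if u not in seen:
                seen.add(u)
                stack.append(u)
    return seen


# Hypothetical DAG: h is not in the fanin cone of t, so it is left out.
fanins = {"g": ["a", "b"], "h": ["b", "c"], "t": ["g", "c"]}
print(sorted(subnetwork(fanins, "t")))  # ['a', 'b', 'c', 'g', 't']
```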

  16. Intuition • Let LUT(t) represent a K-LUT that produces an output at node t • Define a K-feasible cut $(X, \overline{X})$ where • $\overline{X}$ denotes the set of nodes in LUT(t) • $X$ denotes the remaining nodes in $N_t$ • K-feasibility is ensured since LUT(t) has at most K inputs • If u has the maximum label in $X$, then in the optimal mapping, the level of LUT(t) is $l(u) + 1$, where $l(u)$ is the label of u

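  17. Minimizing the Level of LUT(t) • There may be many K-feasible cuts in $N_t$ • Lemma 1. Find the one that minimizes height: $l(t) = \min_{(X, \overline{X})\ K\text{-feasible}} \big( h(X, \overline{X}) + 1 \big)$, where $h$ is the height of the cut • Note: This definition enumerates all K-feasible cuts at t • Key contribution: This can be done in O(Km) time, where m is the number of edges in $N_t$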

  18. Example • You get the existence of the 3-feasible cut in part (c) for free. • Figuring out how to compute it is the hard part!

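  19. Lemma 2 • Let p be the maximum label among the nodes of input(t); then $p \le l(t) \le p + 1$ • Proof Strategy 1. Prove $l(t) \ge p$ 2. Prove $l(t) \le p + 1$ • Consult the paper for details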

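  20. Algorithmic Strategy • Let p be the maximum label among the nodes of input(t) • Check if there is a K-feasible cut of height $p - 1$ in $N_t$ • If so, $l(t) = p$; pack t along with the nodes in $\overline{X}$ in the second phase of the algorithm. • Otherwise, the minimum height among all K-feasible cuts in $N_t$ is $p$, and $(N_t - \{t\}, \{t\})$ is one such cut. • In that case, $l(t) = p + 1$; use a new K-LUT for t in the next phase.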

  21. How to efficiently test if $N_t$ has a K-feasible cut of height p – 1? • Let p be the maximum label among all nodes of input(t) • Equivalently, p is the maximum label of all nodes that belong to $N_t - \{t\}$ • Collapse all nodes in $N_t$ with label $\ge p$ along with t into a single sink t'; call the new network $N_t'$

  22. More Theory • Construct another network $N_t''$ from $N_t'$ • Details to follow… • $N_t''$ has a cut whose edge cut-size is no more than K if and only if the max. flow in $N_t''$ is at most K

  23. Example … …

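  24. Algorithmic Strategy (Recap) • Let p be the maximum label among the nodes of input(t) • Check if there is a K-feasible cut of height $p - 1$ in $N_t$ • If so, $l(t) = p$; pack t along with the nodes in $\overline{X}$ in the second phase of the algorithm. • Otherwise, the minimum height among all K-feasible cuts in $N_t$ is $p$, and $(N_t - \{t\}, \{t\})$ is one such cut. • In that case, $l(t) = p + 1$; use a new K-LUT for t in the next phase.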

  25. Labeling Algorithm for K-LUTs • For each node t in the DAG, taken in topological order • Let p be the max. label among all nodes of input(t) • Build networks $N_t$, $N_t'$, and $N_t''$ • Compute the maximum flow in $N_t''$ • If the maximum flow is at most K, then: $l(t) = p$ • Otherwise: $l(t) = p + 1$
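
The labeling loop can be sketched as follows. This is an illustrative reading of the steps above, not the authors' code. It assumes primary inputs have label 0, that the input network is K-bounded, and that the second network is built by splitting every uncollapsed node into an "in" and an "out" half joined by a unit-capacity edge, with all other edges uncapacitated. All names (`flowmap_labels`, `max_flow_capped`, the `"S"` and `"T"` terminals) are hypothetical. The mapping phase is not shown.

```python
from collections import defaultdict, deque

INF = float("inf")


def cone(fanins, t):
    """Nodes with a path to t, including t (the subnetwork of t)."""
    seen, stack = {t}, [t]
    while stack:
        for u in fanins.get(stack.pop(), ()):
            if u not in seen:
                seen.add(u)
                stack.append(u)
    return seen


def max_flow_capped(cap, s, t, limit):
    """Push unit augmenting paths from s to t; stop once the flow exceeds limit."""
    flow = 0
    while flow <= limit:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            break
        v = t
        while parent[v] is not None:  # push one unit along the path found
            u = parent[v]
            cap[u][v] -= 1
            cap[v][u] += 1
            v = u
        flow += 1
    return flow


def flowmap_labels(fanins, order, k):
    """Label nodes in topological order; primary inputs get label 0."""
    label = {}
    for t in order:
        ins = fanins.get(t, ())
        if not ins:
            label[t] = 0
            continue
        p = max(label[u] for u in ins)
        if p == 0:  # every fanin is a primary input, so t needs its own LUT
            label[t] = 1
            continue
        nt = cone(fanins, t)
        sunk = {v for v in nt if v == t or label[v] >= p}  # collapsed into the sink
        cap = defaultdict(lambda: defaultdict(int))
        for v in nt - sunk:
            cap[(v, "in")][(v, "out")] = 1  # each remaining node may be cut once
            if not fanins.get(v):
                cap["S"][(v, "in")] = INF  # the source feeds the primary inputs
        for v in nt:
            for u in fanins.get(v, ()):
                if u not in sunk:
                    head = "T" if v in sunk else (v, "in")
                    cap[(u, "out")][head] = INF
        label[t] = p if max_flow_capped(cap, "S", "T", k) <= k else p + 1
    return label


# Hypothetical network: g is driven by a and b; t is driven by g and c.
fanins = {"g": ["a", "b"], "t": ["g", "c"]}
order = ["a", "b", "c", "g", "t"]
print(flowmap_labels(fanins, order, 2))  # t needs a second LUT level
print(flowmap_labels(fanins, order, 3))  # g and t fit in one 3-input LUT
```

Stopping the augmentation after K + 1 paths is what keeps the per-node cost at O(Km) instead of the cost of a full max-flow computation.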

  26. Summary of Theoretical Results

  27. FlowMap Algorithm

  28. Post-processing for Area Reduction

  29. Post-processing for Area Reduction

  30. Post-processing for Area Reduction

  31. WireMap: FPGA Technology Mapping for Improved Routability and Enhanced LUT Merging S. Jang, B. Chan, K. Chung, and A. Mishchenko ACM TRETS 2(2): article #14, June, 2009

  32. And-Inverter Graph (AIG) • [Figure: example AIG built from 2-input AND nodes and inverters (INV)] • Source: https://en.wikipedia.org/wiki/And-inverter_graph

  33. Generic FPGA Technology Mapping

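  34. Cut Enumeration • Let A and B be two sets of cuts; their merge is $A \bowtie B = \{ u \cup v \mid u \in A,\ v \in B,\ |u \cup v| \le K \}$ • The set of K-feasible cuts for an AND node n with predecessor nodes $n_1$ and $n_2$ is $\Phi(n) = \{\{n\}\} \cup \big( \Phi(n_1) \bowtie \Phi(n_2) \big)$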

  35. Cut Enumeration • Process vertices in topological order to ensure that cut sets for n1 and n2 are known before computing the cut set for n • The CUT set of an AND node is computed by merging the CUT sets of its predecessors and adding the trivial cut (containing just n) while keeping only the K-feasible cuts • Remove dominated cuts • Each AIG node is a 2-input AND
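
The enumeration described above can be sketched in Python. The sketch assumes a hypothetical `fanins` dictionary giving the two predecessors of each AND node (primary inputs have no entry) and ignores inverters, which do not affect which cuts exist. A cut is treated as dominated when another cut of the same node is a proper subset of it.

```python
def enumerate_cuts(fanins, order, k):
    """Return, for every node, its set of K-feasible cuts as frozensets of leaves."""
    cuts = {}
    for n in order:  # topological order: predecessors come first
        trivial = frozenset([n])
        ins = fanins.get(n, ())
        if not ins:  # primary input: only the trivial cut
            cuts[n] = {trivial}
            continue
        n1, n2 = ins
        merged = {u | v for u in cuts[n1] for v in cuts[n2] if len(u | v) <= k}
        # Remove dominated cuts (those with a proper subset also in the set).
        merged = {c for c in merged if not any(d < c for d in merged)}
        cuts[n] = merged | {trivial}
    return cuts


# Hypothetical AIG: x = AND(a, b), y = AND(b, c), z = AND(x, y).
fanins = {"x": ("a", "b"), "y": ("b", "c"), "z": ("x", "y")}
order = ["a", "b", "c", "x", "y", "z"]
for cut in sorted(sorted(c) for c in enumerate_cuts(fanins, order, 3)["z"]):
    print(cut)
```

For K = 3 node z has five cuts: {a, b, c}, {a, b, y}, {b, c, x}, {x, y} and the trivial cut {z}. For K = 2 only {x, y} and {z} remain.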

  36. Depth-Oriented Mapping • Keep the node at each level that minimizes depth (e.g., FlowMap)

  37. Area Recovery • Depth minimization may cause area duplication • Multiple cuts cover an AIG node • Increases LUT count • Area Flow (global view): selects cuts with more shared logic • Exact Local Area (local view): minimizes area exactly at each node

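  38. Area Flow • Estimates sharing between cuts without the need to (re-)traverse them • $AF(n) = \big[ Area(n) + \sum_i AF(Leaf_i(n)) \big] / NumFanouts(n)$ • Area(n) is the area cost of the LUT that maps node n • $Leaf_i(n)$ is the ith leaf of the cut at n • NumFanouts(n) is the number of fanouts of n in the current mapping; for area flow computation it is taken to be 1 if n is not used in the current mapping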
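
A minimal sketch of area flow follows, assuming the usual recurrence AF(n) = [Area(n) + sum of AF over the leaves of the chosen cut] / NumFanouts(n), with AF = 0 at primary inputs and a fanout of 1 for nodes not used in the current mapping. The function name, the hand-written cut lists (trivial cuts excluded) and the fanout counts are hypothetical.

```python
def best_area_flow(order, cuts, num_fanouts, lut_area=1.0):
    """Pick, per node, the cut with the smallest area flow (single forward pass)."""
    af, choice = {}, {}
    for n in order:  # topological order, so leaf values are already known
        if not cuts.get(n):  # primary input: costs no LUT
            af[n] = 0.0
            continue
        fan = max(1, num_fanouts.get(n, 0))  # unused nodes count as fanout 1
        best = min(cuts[n], key=lambda c: sum(af[leaf] for leaf in c))
        choice[n] = best
        af[n] = (lut_area + sum(af[leaf] for leaf in best)) / fan
    return af, choice


# Hypothetical cut lists for x = AND(a, b), y = AND(b, c), z = AND(x, y).
cuts = {
    "x": [frozenset("ab")],
    "y": [frozenset("bc")],
    "z": [frozenset("xy"), frozenset("abc")],
}
order = ["a", "b", "c", "x", "y", "z"]
af, choice = best_area_flow(order, cuts, {"x": 1, "y": 1, "z": 1})
print(sorted(choice["z"]), af["z"])
```

Here the cut {a, b, c} wins for z because it costs one LUT, while {x, y} would also pay for the LUTs at x and y.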

  39. Local View • The exact local area of the current node is the area added to the mapping by using the current node • Recursively compute the number of LUTs in the maximum fanout-free cone (MFFC) of the current node • Use a fast local DFS traversal • [Figure: recursive calls starting from node n]
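
One common way to compute the exact local area is reference counting: dereference the node's representative cut, count the LUTs whose reference count drops to zero, then re-reference to restore the state. The sketch below follows that scheme under assumed data structures (`rep_cut` maps each LUT node to its representative cut, `refs` holds the current reference counts); it is not necessarily the exact procedure on the slide.

```python
def deref(n, rep_cut, refs):
    """Dereference n's cut; return the number of LUTs that become unused."""
    area = 1  # the LUT at n itself
    for leaf in rep_cut[n]:
        refs[leaf] -= 1
        if refs[leaf] == 0 and leaf in rep_cut:  # leaf LUT is no longer needed
            area += deref(leaf, rep_cut, refs)
    return area


def ref(n, rep_cut, refs):
    """Inverse of deref: re-reference n's cut and return the LUTs brought back."""
    area = 1
    for leaf in rep_cut[n]:
        if refs[leaf] == 0 and leaf in rep_cut:
            area += ref(leaf, rep_cut, refs)
        refs[leaf] += 1
    return area


def exact_local_area(n, rep_cut, refs):
    """Number of LUTs in the MFFC of n; refs is left unchanged."""
    area = deref(n, rep_cut, refs)
    ref(n, rep_cut, refs)
    return area


# Hypothetical mapping: outputs z and w; y is shared by both, x is used only by z.
rep_cut = {"x": {"a", "b"}, "y": {"b", "c"}, "z": {"x", "y"}, "w": {"y", "d"}}
refs = {"a": 1, "b": 2, "c": 1, "d": 1, "x": 1, "y": 2, "z": 1, "w": 1}
print(exact_local_area("z", rep_cut, refs))  # LUTs at z and x; y is shared
```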

  40. Producing a Mapped Network • Assume one K-feasible representative cut is computed for each node
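
Given one representative cut per node, a mapped network is typically produced by walking back from the primary outputs and instantiating a LUT for each node reached. The slide does not spell out the procedure, so the sketch below is an assumption; `mapped_network`, `rep_cut` and `outputs` are hypothetical names.

```python
def mapped_network(outputs, rep_cut):
    """Return {node: leaves} for every LUT needed to implement the outputs."""
    luts = {}
    stack = list(outputs)
    while stack:
        n = stack.pop()
        if n in luts or n not in rep_cut:  # already mapped, or a primary input
            continue
        luts[n] = rep_cut[n]
        stack.extend(rep_cut[n])  # the leaves of the cut must be implemented too
    return luts


# Hypothetical representative cuts: z is covered directly from the inputs,
# so the cuts chosen at x and y never become LUTs.
rep_cut = {"x": {"a", "b"}, "y": {"b", "c"}, "z": {"a", "b", "c"}}
print(sorted(mapped_network(["z"], rep_cut)))  # ['z']
```

Only nodes reachable through the chosen cuts become LUTs, which is why the area of a mapping can be much smaller than the number of nodes in the subject graph.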

  41. WireMap • Objective • Reduce the number of LUT-to-LUT connections in addition to area reduction • Rationale • Fewer nets will help the placer to generate a solution with reduced wirelength

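  42. Global View Heuristic • Area Flow (from previous slide): $AF(n) = \big[ Area(n) + \sum_i AF(Leaf_i(n)) \big] / NumFanouts(n)$, where Area(n) is the area cost of the LUT that maps node n • Edge Flow (new idea): $EF(n) = \big[ Edge(n) + \sum_i EF(Leaf_i(n)) \big] / NumFanouts(n)$, where Edge(n) is the number of fanin edges to the LUT that maps node n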

  43. Global Edge/Area Recovery Alg. • Find all cuts with min. area • Use edge flow as tiebreaker • No recursion; use the saved edge flow computed at each predecessor node
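
The tiebreaking rule can be sketched by comparing cuts on the pair (area flow, edge flow). The sketch assumes edge flow follows the same recurrence as area flow, with Edge(n) equal to the number of leaves of the chosen cut. The function name and the hand-written cut lists are hypothetical and not derived from a real AIG.

```python
def select_cuts(order, cuts, num_fanouts, lut_area=1.0):
    """Choose per node the cut with minimum area flow, using edge flow on ties."""
    af, ef, choice = {}, {}, {}
    for n in order:  # topological order: saved leaf values, no recursion
        if not cuts.get(n):  # primary input
            af[n] = ef[n] = 0.0
            continue
        fan = max(1, num_fanouts.get(n, 0))

        def cost(c):
            area = (lut_area + sum(af[leaf] for leaf in c)) / fan
            edge = (len(c) + sum(ef[leaf] for leaf in c)) / fan
            return (area, edge)  # tuples compare area first, then edges

        best = min(cuts[n], key=cost)
        choice[n] = best
        af[n], ef[n] = cost(best)
    return af, ef, choice


# Hypothetical cut lists: both cuts of z cost two LUTs, but {x, d} has fewer edges.
cuts = {
    "x": [frozenset({"a", "b"})],
    "y": [frozenset({"a", "b", "c"})],
    "z": [frozenset({"y", "d"}), frozenset({"x", "d"})],
}
order = ["a", "b", "c", "d", "x", "y", "z"]
af, ef, choice = select_cuts(order, cuts, {})
print(sorted(choice["z"]), af["z"], ef["z"])
```

A production implementation would likely compare area flows within a small tolerance instead of relying on exact floating-point equality.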

  44. Local View • The exact local area (edge count) of the current node is the area (edge count) added to the mapping by using the current node • Recursively compute the number (edge count) of LUTs in the maximum fanout-free cone (MFFC) of the current node • Use a fast local DFS traversal • [Figure: recursive calls starting from node n]

  45. Local View Algorithm • Pointer manipulation in function calls (not shown) • The edge count of a cut depends on whether the cut is the representative cut of the node in the mapping • If so, reference the node and the leaves of its representative cut • Find all cuts that minimize the exact area; use the exact edge count as a tiebreaker

  46. WireMap Algorithm • Xilinx Virtex-5 Dual Output LUT
