Temporal Logic Replication for Dynamically Reconfigurable FPGA Partitioning. Wai-Kei Mak, Dept. of Computer Science and Engineering, University of South Florida. Evangeline F.Y. Young, Dept. of Computer Science and Engineering, The Chinese University of Hong Kong.
Outline • Dynamically reconfigurable FPGA • Temporal partitioning = Conventional partitioning? • Temporal logic replication • What? • Why? • How? • Experimental results • Conclusions
Dynamically Reconfigurable FPGA • Store multiple contexts on chip. • Reuse logic blocks and wire segments dynamically. • The contexts stored can correspond to the multiple stages of a large circuit.
Temporal Circuit Partitioning • Temporal partitioning • multiple stages execute sequentially • Spatial partitioning • multiple components execute concurrently
Temporal Logic Replication • Can reduce buffering requirement. • Effectively utilize available slack logic capacity.
Temporal Constraints • For a net n = (v1, {v2, …, vp}), • require s(v1) ≤ s(vj), j = 2, …, p, if v1 is a combinational node
Temporal Constraints (Cont’d) • require s(vj) ≤ s(v1), j = 2, …, p, if v1 is a flip-flop node
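The two constraints can be checked mechanically for a candidate stage assignment. The sketch below is illustrative only: it assumes the relation in both constraints is "≤", that each net is given as a (driver, sinks) pair, and that s is a plain dictionary. The function name and data layout are hypothetical and do not come from the slides.

```python
def violated_constraints(nets, stage, flip_flops):
    """Return the (driver, sink) pairs that break the temporal constraints.

    nets:       list of (driver, [sinks]) pairs, i.e. n = (v1, {v2, ..., vp})
    stage:      dict mapping each node v to its stage index s(v)
    flip_flops: set of nodes that are flip-flops; all others are
                treated as combinational (an assumption of this sketch)
    """
    bad = []
    for driver, sinks in nets:
        for sink in sinks:
            if driver in flip_flops:
                # Flip-flop driver: require s(vj) <= s(v1).
                ok = stage[sink] <= stage[driver]
            else:
                # Combinational driver: require s(v1) <= s(vj).
                ok = stage[driver] <= stage[sink]
            if not ok:
                bad.append((driver, sink))
    return bad


# Hypothetical 4-node circuit: "a" is combinational, "f" is a flip-flop.
nets = [("a", ["b", "c"]), ("f", ["a"])]
print(violated_constraints(nets, {"a": 1, "b": 1, "c": 2, "f": 2}, {"f"}))
```

An empty result means the assignment satisfies every constraint, which is the precondition the 2-step approach relies on before any replication is attempted.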
Temporal Partitioning with Replication Problem: Partition the given circuit into a pre-defined number of stages, satisfying all temporal constraints. Objective: Minimize the buffers required between stages. Proposal: Utilize available slack logic capacity to reduce signal buffering. Solution: An effective 2-step approach.
2-Step Approach Step 1: Compute a temporal partition without replication. Step 2: Repeatedly identify the bottleneck stage and apply replication for that stage.
Advantages of 2-Step Approach • Will not replicate unnecessarily. • All temporal constraints are already satisfied when replicating.
Min-Area Min-Cut Replication Let stage i be the bottleneck stage. Min-Cut Replication • Compute a subset of nodes Ri in stage i for replication into stage i+1 to maximally reduce the communication cost at stage i. Min-Area Min-Cut Replication • Compute a minimum-size subset of nodes Ri in stage i for replication into stage i+1 to maximally reduce the communication cost at stage i.
Optimal Solution for Min-Area Min-Cut Replication Let Vi = set of nodes in stage i. Observation 1: The min-cut replication problem can be solved by computing a minimum cut (Vi-Ri, Ri) in stage i. Observation 2: The min-area min-cut replication problem can be solved by computing a minimum cut (Vi-Ri, Ri) in stage i s.t. |Ri| is minimized.
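Observation 2 asks for a minimum cut whose Ri side is as small as possible. A standard max-flow fact helps here: after a maximum flow is found, the nodes that can still reach the sink in the residual graph form the smallest sink side over all minimum cuts. The sketch below illustrates that idea with a plain Edmonds-Karp max-flow. It assumes a flow network with finite capacities and an artificial source and sink, with Ri read off as the sink side minus the sink itself. That mapping, the function name, and the input format are assumptions of this sketch, not the authors' network model.

```python
from collections import defaultdict, deque


def min_cut_smallest_sink_side(capacity, source, sink):
    """capacity: dict {(u, v): c} of finite arc capacities.

    Returns (cut_value, sink_side), where sink_side is the smallest
    sink side among all minimum source-sink cuts.
    """
    residual = defaultdict(int)
    adj = defaultdict(set)
    for (u, v), c in capacity.items():
        residual[(u, v)] += c
        adj[u].add(v)
        adj[v].add(u)  # reverse arc used by the residual graph

    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            break
        # Find the bottleneck capacity along the path.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[(parent[v], v)])
            v = parent[v]
        # Push the bottleneck amount of flow along the path.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[(u, v)] -= bottleneck
            residual[(v, u)] += bottleneck
            v = u
        flow += bottleneck

    # Backward search from the sink over arcs with spare residual capacity.
    sink_side = {sink}
    queue = deque([sink])
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in sink_side and residual[(u, v)] > 0:
                sink_side.add(u)
                queue.append(u)
    return flow, sink_side


# Chain s -> a -> b -> t with unit capacities: three minimum cuts exist,
# and the one with the smallest sink side keeps only t on that side.
value, side = min_cut_smallest_sink_side(
    {("s", "a"): 1, ("a", "b"): 1, ("b", "t"): 1}, "s", "t")
print(value, side)
```

Taking the source-reachable set instead would give the largest sink side, so the direction of the final search is what enforces the "min-area" part.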
Example • A pre-partition: [figure] • Computing a minimum cut in stage 2: [figure]
Example (Cont’d) • Computed R2 = {j}
Network Modeling • Need to ensure that cut size = buffer requirement • For a net (v1, {v2, …, vp}): [figure]
The Case of Limited Slack Logic Capacity • The solution of min-area min-cut replication suffices if the slack logic capacity is sufficiently large. • Otherwise, if |Ri| exceeds the slack, use a heuristic to reduce Ri. • Use a repeated max-flow min-cut heuristic to reduce Ri gradually, so that the cut size increases only gradually. • H. Yang, D.F. Wong, “Efficient Network Flow based Min-Cut Balanced Partitioning”, ICCAD’94.
Algorithm Input: Stage area bound A. 1. Network modeling for bottleneck stage i. 2. Compute min-cut (Vi-Ri, Ri) s.t. |Ri| is minimized. 3. If |Vi+1| + |Ri| ≤ A, stop and return Ri. 4. Collapse a node in Ri with all nodes in Vi-Ri, goto 2.
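The control flow of steps 2 to 4 can be sketched as a short loop. This is a minimal illustration under several assumptions: the test in step 3 is read as |Vi+1| + |Ri| ≤ A, every node has unit area, and the min-cut computation is supplied by the caller as `min_cut_fn`, a hypothetical callback that returns the minimum-size Ri when the given nodes are collapsed into the Vi-Ri side. The slides do not say which node of Ri to collapse, so the choice below is arbitrary.

```python
def replication_set(size_next, area_bound, min_cut_fn):
    """Shrink R_i until stage i+1 fits within the area bound A.

    size_next:  |V_{i+1}|, the current size of stage i+1
    area_bound: the stage area bound A
    min_cut_fn: hypothetical callback; min_cut_fn(collapsed) returns the
                min-area min-cut set R_i (a set of nodes) when every node
                in `collapsed` is merged with the V_i - R_i side
    """
    collapsed = set()
    while True:
        r = set(min_cut_fn(collapsed))              # step 2
        if not r or size_next + len(r) <= area_bound:
            return r                                # step 3 (or nothing left)
        # Step 4: collapse one node of R_i into the V_i - R_i side.
        # Picking the smallest label is an arbitrary, deterministic choice.
        collapsed.add(min(r))


# Toy stand-in for the real min-cut step: every candidate not yet
# collapsed is reported as part of R_i.
candidates = {"a", "b", "c"}
toy_min_cut = lambda collapsed: candidates - collapsed
print(replication_set(3, 4, toy_min_cut))
```

Because each pass forbids one more node from appearing in Ri, the loop runs at most |Vi| times, and each collapse can only keep the cut size the same or increase it.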
Conclusions • Proposed temporal logic replication to reduce the buffering requirement in DRFPGA partitioning. • Presented an effective 2-step approach. • Formulated and optimally solved the min-area min-cut replication problem. • Extended to the case of limited slack logic capacity. • In the paper, a new timing-driven temporal partitioning algorithm is introduced to compute the pre-partition.