Logic Synthesis Timing Optimization
Restructuring for Timing Optimization
Outline:
• Definitions and problem statement
• Overview of techniques (motivated by adders)
• Tree height reduction (THR)
• Generalized bypass transform (GBX)
• Generalized select transform (GST)
• Partial collapsing
Timing Optimization
Factors determining delay of circuit:
• Underlying circuit technology
  • Circuit type (e.g. domino, static CMOS)
  • Gate type
  • Gate size
• Logical structure of circuit
  • Length of computation paths
  • False paths
  • Buffering
• Parasitics
  • Wire loads
  • Layout
Problem Statement
Given:
• Initial circuit function description
• Library of primitive functions
• Performance constraints (arrival/required times)
Generate: an implementation of the circuit using the primitive functions, such that:
• performance constraints are met
• circuit area is minimized
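The arrival/required-time constraints in the problem statement can be made concrete with a small sketch. It assumes a unit gate delay and a netlist given as a Python dict listed in topological order; the function name `arrival_required_slack`, the netlist format, and the example circuit are illustrative assumptions, not part of the slides.

```python
from collections import defaultdict

def arrival_required_slack(gates, input_arrival, output_required, gate_delay=1):
    """Propagate arrival times forward and required times backward.

    gates: dict mapping each gate to its list of fanins, in topological
    order (hypothetical format chosen for this sketch).
    """
    # Forward pass: a gate's output arrives one gate delay after its latest fanin.
    arrival = dict(input_arrival)
    for node, fanins in gates.items():
        arrival[node] = max(arrival[f] for f in fanins) + gate_delay

    # Backward pass: a signal is required early enough for every gate it feeds.
    required = defaultdict(lambda: float("inf"))
    required.update(output_required)
    for node in reversed(list(gates)):
        for f in gates[node]:
            required[f] = min(required[f], required[node] - gate_delay)

    # Negative slack marks signals on paths that violate the constraint.
    slack = {n: required[n] - arrival[n] for n in arrival}
    return arrival, dict(required), slack

# Hypothetical three-gate chain: h = g(a, b), k = g(h, c), n = g(k, d).
gates = {"h": ["a", "b"], "k": ["h", "c"], "n": ["k", "d"]}
arrival, required, slack = arrival_required_slack(
    gates, {x: 0 for x in "abcd"}, {"n": 2})
print(arrival["n"], slack)
```

In this example output n arrives at time 3 against a required time of 2, so every signal on the path a/b, h, k, n has slack -1, while the side inputs c and d have slack 0 and 1. The negative-slack signals are the kind of critical region the restructuring techniques target.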
Current Design Process
Behavioral description
→ Behavior optimization (scheduling)
→ Logic and latches
→ Partitioning (retiming)
→ Logic equations
→ Logic synthesis
  • Technology independent
  • Technology mapping
→ Gate netlist
→ Timing-driven place and route
→ Layout
Inputs:
• Gate library
• Performance constraints
• Delay models
Technology Mapping for Delay
[Figure: a function tree and a buffer tree]
Overview of Solutions for Delay
• Circuit re-structuring
  • Rescheduling operations to reduce time of computation
• Implementation of function trees (technology mapping)
  • Selection of gates from library
  • Minimum delay (load-independent model, Kukimoto)
  • Minimize delay and area (Jongeneel, DAC ’00) (combines Lehman-Watanabe and Kukimoto)
• Implementation of buffer trees
  • Touati (LT-trees)
  • Singh
• Resizing
• Constant delay synthesis
Circuit Restructuring
Approaches:
Local:
• Mimic optimization techniques in adders
  • Carry lookahead (THR, tree height reduction)
  • Conditional sum (GST transformation)
  • Carry bypass (GBX transformation)
Global:
• Reduce depth of entire circuit
  • Partial collapsing
  • Boolean simplification
Restructuring Methods
Performance measured by:
• levels
• sensitizable paths
• technology-dependent delays
Level-based optimizations:
• Tree height reduction (Singh ’88)
• Partial collapsing and simplification (Touati ’91)
• Generalized select transform (Berman ’90)
Sensitizable paths:
• Generalized bypass transform (McGeer ’91)
Tree-Height Reduction (THR)
Singh ’88:
[Figure: a circuit on inputs a through g with an arrival time annotated on each node. Left: the critical region feeding output n. Right: the critical region collapsed into a single node n’, with the duplicated logic marked.]
Tree-Height Reduction
[Figure: the same circuit on inputs a through g after the collapsed critical region n’ has been restructured, with arrival times annotated on each node and the duplicated logic marked. New delay = 5.]
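The core idea behind tree height reduction can be sketched for the simplest case: a chain of identical 2-input associative, commutative gates (for example a wide AND) whose inputs arrive at different times. This is a simplification and not Singh's algorithm, which collapses a critical region of arbitrary logic and duplicates logic where needed. The function names and the unit gate delay are assumptions made for the sketch.

```python
import heapq

def chain_delay(arrivals, gate_delay=1):
    """Output arrival time of a left-to-right chain of 2-input gates."""
    t = arrivals[0]
    for a in arrivals[1:]:
        t = max(t, a) + gate_delay
    return t

def reduced_tree_delay(arrivals, gate_delay=1):
    """Re-associate the same gates greedily: always combine the two
    earliest-arriving signals, so late signals pass through few levels."""
    heap = list(arrivals)
    heapq.heapify(heap)
    while len(heap) > 1:
        x = heapq.heappop(heap)
        y = heapq.heappop(heap)
        heapq.heappush(heap, max(x, y) + gate_delay)
    return heap[0]

# Hypothetical arrival times: one late input entering at the start of the chain.
arrivals = [4, 0, 0, 0, 0, 0, 0]
print(chain_delay(arrivals), reduced_tree_delay(arrivals))
```

For these arrival times the chain produces its output at time 10, while the re-associated tree produces it at time 5 using the same number of 2-input gates. With eight inputs all arriving at time 0 the figures are 7 and 3, which is the ripple-versus-tree contrast behind the carry-lookahead analogy.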
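Generalized bypass transform (GBX)
• Make critical path false
• Speed up the circuit
• Bypass logic of critical path(s)
McGeer ’91:
[Figure: a critical path fm = f, fm+1, …, fn = g, shown before and after the transform. After the transform a multiplexer with data inputs 0 and 1 produces the new output g’. Its select input is the Boolean difference dg/df, and the stuck-at-0 (s-a-0) fault on that select input is redundant.]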
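As a hedged illustration of the bypass idea, the sketch below applies it to the carry chain of a ripple-carry adder, the example the slides use as motivation. Here f is the carry-in and g is the carry-out; the Boolean difference dg/df is 1 exactly when flipping the carry-in flips the carry-out. All function names are assumptions made for the sketch, and the Boolean difference is computed by brute force, whereas a real implementation would build it as logic (for this circuit, the AND of the per-bit propagate signals).

```python
from itertools import product

def ripple_carry(a_bits, b_bits, c_in):
    """Carry-out of a ripple-carry chain: the long path from c_in."""
    c = c_in
    for a, b in zip(a_bits, b_bits):
        c = (a & b) | ((a ^ b) & c)
    return c

def boolean_difference(a_bits, b_bits):
    """d(c_out)/d(c_in): 1 exactly when c_out depends on c_in."""
    return ripple_carry(a_bits, b_bits, 0) ^ ripple_carry(a_bits, b_bits, 1)

def bypassed_carry(a_bits, b_bits, c_in):
    """Multiplexer selected by the Boolean difference.

    c_out never decreases when c_in rises, so whenever the select is 1
    the output equals c_in and the chain can be bypassed. When the select
    is 0 the chain output does not depend on c_in, so the long path from
    c_in through the chain is never sensitized.
    """
    select = boolean_difference(a_bits, b_bits)
    return c_in if select else ripple_carry(a_bits, b_bits, c_in)

def check_bypass(n=4):
    """Exhaustively compare the bypassed carry with the original."""
    for bits in product((0, 1), repeat=2 * n + 1):
        a, b, c_in = bits[:n], bits[n:2 * n], bits[-1]
        assert bypassed_carry(a, b, c_in) == ripple_carry(a, b, c_in)
        # For this circuit the difference is the AND of the propagate bits.
        assert boolean_difference(a, b) == int(all(x ^ y for x, y in zip(a, b)))
    return True

print(check_bypass(4))
```

The check confirms functional equivalence only. The speedup comes from the timing argument in the comments: the multiplexer makes the path from c_in through the chain false.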
GBX and KMS transform
GBX gives little area increase, BUT creates an untestable fault (on the control input to the multiplexer).
KMS transform (removes false paths without increasing delay):
• fk is the last node on the false path that fans out.
• Duplicate the false path {f1, …, fk} -> {f’1, …, f’k}.
• f’j fans out to every fanout of fj except fj+1, and fj just fans out to fj+1.
• Set the f0 input to f1 to a controlling value and propagate the constant (possible because the path is false and does not fan out).
KMS results:
• The function of every node except f1, …, fk is unchanged.
• k nodes are added.
• The area added is linear in the length of the false paths; in practice the area increase is small.
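KMS
Keutzer, Malik, Saldanha ’90:
[Figure: the path fm+1, fm+2, …, fm+k feeding fm+k+1, …, fn, shown together with its duplicate f’m+1, f’m+2, …, f’m+k and a constant 0 applied at the start of the original path. Delay is not increased.]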
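Generalized select transform (GST)
Berman ’90:
Late signal feeds multiplexor
[Figure: the original circuit on inputs a through g with output out, and the transformed circuit in which two copies of the logic on b through g, one evaluated with a=0 and one with a=1, feed the 0 and 1 inputs of a multiplexer selected by the late signal a.]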
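The select transform is essentially a Shannon expansion around the late signal: both cofactors are computed without waiting for it, and the late signal only drives the multiplexer select. A minimal sketch follows, with a hypothetical helper `select_transform` and a made-up example function standing in for the circuit in the figure.

```python
from itertools import product

def select_transform(f, late_index):
    """Return a function equivalent to f in which the late input is used
    only as the select of a final 2-to-1 multiplexer."""
    def transformed(*inputs):
        with_zero = list(inputs)
        with_zero[late_index] = 0
        with_one = list(inputs)
        with_one[late_index] = 1
        f0 = f(*with_zero)  # cofactor for late input = 0, ready early
        f1 = f(*with_one)   # cofactor for late input = 1, ready early
        return f1 if inputs[late_index] else f0  # late signal picks one
    return transformed

# Hypothetical function in which the late input a sits deepest in the logic.
def original(a, b, c, d):
    return (((a & b) | c) ^ d) & 1

fast = select_transform(original, late_index=0)
assert all(fast(*v) == original(*v) for v in product((0, 1), repeat=4))
```

The cost is the duplicated logic for the two cofactors; the benefit is that the late signal passes through a single multiplexer instead of the full depth of the circuit. Applied to each block of a ripple-carry adder, this is the carry-select structure.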
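GST vs GBX
[Figure: side-by-side comparison. GBX: the original logic from inputs a, b, c through g to output h, bypassed by a multiplexer with data inputs 0 and 1 whose select is the Boolean difference dh/da. GST: two copies of the logic on b through g, evaluated with a=0 and a=1, feeding the 0 and 1 inputs of a multiplexer selected by a to produce out.]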
GST vs GBX
• Select transform appears to be more area efficient
• But Boolean difference is generally more efficiently formed in practice
• No delay/speedup advantage for either transform
• Can reuse parts of the critical paths for multiple fanouts on GST
[Figure: GST with two outputs, out1 and out2, each produced by a multiplexer selected by a, sharing the a=0 and a=1 copies of the logic on b through g.]
Technology Independent Delay Reductions
Generally THR, GBX, GST (critical path based methods) work OK,
• but are very greedy and computationally expensive
Why are technology independent delay reductions hard?
Lack of fast and accurate delay models (listed from fastest and crudest to slowest and best):
• # levels: fast but crude
• # levels + correction term (fanout, wires, …): a little better, but still crude (what coefficients to use?)
• Technology mapped: reasonable, but very slow
• Place and route: better, but extremely slow
• Silicon: best, but infeasibly slow (except for FPGAs)
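The first two delay models in the list can be sketched in a few lines. The netlist format, the function names, and especially the coefficient `alpha` are assumptions made for illustration; the slide itself points out that choosing such coefficients is an open question.

```python
def level_delay(gates, primary_inputs):
    """Model 1: delay = number of logic levels (unit delay per gate).

    gates: dict mapping each gate to its fanins, in topological order
    (hypothetical format for this sketch).
    """
    level = {x: 0 for x in primary_inputs}
    for node, fanins in gates.items():
        level[node] = 1 + max(level[f] for f in fanins)
    return level

def level_delay_with_fanout(gates, primary_inputs, alpha=0.5):
    """Model 2: each gate costs 1 + alpha * fanout.

    alpha is an arbitrary illustrative coefficient, not a calibrated value.
    """
    fanout = {}
    for fanins in gates.values():
        for f in fanins:
            fanout[f] = fanout.get(f, 0) + 1
    arrival = {x: 0.0 for x in primary_inputs}
    for node, fanins in gates.items():
        gate_cost = 1 + alpha * fanout.get(node, 0)
        arrival[node] = gate_cost + max(arrival[f] for f in fanins)
    return arrival

# Hypothetical netlist: x drives both y and z, so it carries a fanout penalty.
gates = {"x": ["a", "b"], "y": ["x", "c"], "z": ["x", "y"]}
print(level_delay(gates, "abc")["z"])
print(level_delay_with_fanout(gates, "abc")["z"])
```

Both models run in time linear in the size of the netlist, which is why they are attractive inside an optimization loop despite being crude.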
Conclusions
• Variety of methods for delay optimization
• No single technique dominates
• When applied to a ripple-carry adder, get:
  • Carry-lookahead adder (THR)
  • Carry-bypass adder (GBX)
  • Carry-select adder (GST)
• Clustering/Partial collapse
• All techniques ignore false paths when assessing the delay and critical regions
• Can use KMS transform to eliminate false paths without increasing delay (Caveat: potentially large increase in area)