Low Power Implementation of ARM1176JZF-S by Manish Kulkarni
Statement • Input • RTL of ARM1176JZF-S • Implementation process in 65nm CMOS (TSMC) • Power Intent • Library Information (to be used) • Output • Enabling RTL for Low Power design • This includes enabling clock gating • Writing Unified Power Format (UPF) from given power intent • Low Power implementation of ARM1176JZF-S for 45nm CMOS (Samsung)
Outline • Basic Implementation Flow • Synthesis • Basic Input Requirements • Decision on Power Intent and Creation of UPF • Understanding Intent and creating Power Intent Diagram • Describing this power intent in UPF • Floor Planning • Placement Optimization • Clock Tree Synthesis (CTS) • Route Optimization • LVS and DRC • Static Timing Analysis (PrimeTime) • Power Analysis (PrimeTime PX) • Formal Verification • Remaining work
Basic Requirements Tech .tcl script
Power Intent Diagram (figure: supply rails VDDRAM, VDDCORE and VDDSOC with ground VSS; isolation clamps controlled by RAMCLAMP and CPUCLAMP; level shifters (LS))
Writing UPF from Power Intent Diagram • Power Intent Diagram → UPF
Interpreting Diagram (supply rails VDDRAM, VDDCORE, VDDSOC and VSS) • create_supply_net • create_supply_port • create_power_domain
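As a rough illustration of how the rails in the diagram might map onto these three commands, here is a minimal UPF 1.0 sketch. The domain names (PD_SOC, PD_CORE, PD_RAM) and the instance names (u_core, u_rams) are hypothetical; the slides do not give them.

```tcl
# Sketch only: domain names and instance names are assumptions,
# not taken from the slides.

# Top-level (SoC) domain, plus one domain each for the CPU logic and the RAMs.
create_power_domain PD_SOC  -include_scope
create_power_domain PD_CORE -elements {u_core}   ;# hypothetical instance
create_power_domain PD_RAM  -elements {u_rams}   ;# hypothetical instance

# One supply port and one supply net per rail, declared at the top level.
foreach rail {VDDSOC VDDCORE VDDRAM VSS} {
    create_supply_port $rail
    create_supply_net  $rail -domain PD_SOC
    connect_supply_net $rail -ports $rail
}

# Make the relevant rails visible inside the other two domains.
create_supply_net VDDCORE -domain PD_CORE -reuse
create_supply_net VSS     -domain PD_CORE -reuse
create_supply_net VDDRAM  -domain PD_RAM  -reuse
create_supply_net VSS     -domain PD_RAM  -reuse

# Primary power and ground of each domain.
set_domain_supply_net PD_SOC  -primary_power_net VDDSOC  -primary_ground_net VSS
set_domain_supply_net PD_CORE -primary_power_net VDDCORE -primary_ground_net VSS
set_domain_supply_net PD_RAM  -primary_power_net VDDRAM  -primary_ground_net VSS
```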
Interpreting Diagram (isolation clamps RAMCLAMP and CPUCLAMP) • set_isolation • set_isolation_control
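A possible shape for the two isolation strategies is sketched below. It assumes power domains named PD_CORE and PD_RAM already exist (hypothetical names). The isolation supplies, the port direction, the clamp value and the active-high sense of the clamp signals are guesses read off the diagram, not facts stated in the slides.

```tcl
# Sketch only: domain names, isolation supplies, clamp values and
# signal sense are assumptions.

# CPU domain: clamp its outputs while VDDCORE is off. The isolation
# cells are powered from VDDSOC, which stays on in every power state.
set_isolation ISO_CORE -domain PD_CORE \
    -isolation_power_net VDDSOC -isolation_ground_net VSS \
    -clamp_value 0 -applies_to outputs
set_isolation_control ISO_CORE -domain PD_CORE \
    -isolation_signal CPUCLAMP -isolation_sense high -location parent

# RAM domain: clamp the signals entering the RAMs, using cells
# powered from VDDRAM.
set_isolation ISO_RAM -domain PD_RAM \
    -isolation_power_net VDDRAM -isolation_ground_net VSS \
    -clamp_value 0 -applies_to inputs
set_isolation_control ISO_RAM -domain PD_RAM \
    -isolation_signal RAMCLAMP -isolation_sense high -location self
```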
Interpreting Diagram (level shifters, LS, between voltage domains) • set_level_shifter -rule high_to_low • set_level_shifter -rule low_to_high
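The two rules could be attached to a domain roughly as follows. This sketch assumes a power domain named PD_CORE (hypothetical) whose supply can run below VDDSOC, so signals entering it are shifted down and signals leaving it are shifted up. The strategy names and the -location choices are illustrative assumptions.

```tcl
# Sketch only: domain name, strategy names and locations are assumptions.

# Signals coming from the higher-voltage side into the CPU domain.
set_level_shifter LS_CORE_IN -domain PD_CORE \
    -applies_to inputs -rule high_to_low -location self

# Signals leaving the CPU domain towards the higher-voltage side.
set_level_shifter LS_CORE_OUT -domain PD_CORE \
    -applies_to outputs -rule low_to_high -location parent
```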
Interpreting Diagram (power state table) • create_pst PST -supplies {VDDCORE VDDRAM VDDSOC} • add_pst_state PM_highV -pst PST -state {High High High} • add_pst_state PM_medV -pst PST -state {Med Med High} • add_pst_state PM_lowV -pst PST -state {Low Low High} • add_pst_state VCORE_dormant -pst PST -state {OFF High High}
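The state names used in a power state table have to be defined on the supply ports first, typically with add_port_state. The sketch below shows one way to do that. The voltage values are placeholders chosen only to make the example complete; the slides name the states but give no voltages.

```tcl
# Sketch only: all voltage values are PLACEHOLDERS, not from the slides.
add_port_state VDDCORE -state {High 1.20} -state {Med 1.10} \
                       -state {Low 1.00}  -state {OFF off}
add_port_state VDDRAM  -state {High 1.20} -state {Med 1.10} -state {Low 1.00}
add_port_state VDDSOC  -state {High 1.20}

# Legal combinations, column order: VDDCORE VDDRAM VDDSOC (as on the slide).
create_pst PST -supplies {VDDCORE VDDRAM VDDSOC}
add_pst_state PM_highV      -pst PST -state {High High High}
add_pst_state PM_medV       -pst PST -state {Med  Med  High}
add_pst_state PM_lowV       -pst PST -state {Low  Low  High}
add_pst_state VCORE_dormant -pst PST -state {OFF  High High}
```

Only VDDCORE ever takes the OFF state in this table, and VDDSOC is High in every row.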
Placement Optimization • The cells in the design are placed in the layout to meet the given timing, area and power constraints • It is an iterative process Violations: • High fan-out net violations • The constraint reports after synthesis showed 2 high-fan-out nets • CPUCLAMP (fanout: 1684) • RAMCLAMP (fanout: 48) • These caused many max_transition violations • These nets were fixed by • compile_clock_tree -high_fanout_net CPUCLAMP • compile_clock_tree -high_fanout_net RAMCLAMP
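The check-fix-recheck loop described above might look like this in an IC Compiler session. This is a sketch that assumes the design, constraints and UPF are already loaded; only the compile_clock_tree calls come from the slides.

```tcl
# Sketch only: assumes an IC Compiler session with the design already loaded.

place_opt                                          ;# placement and optimization
report_constraint -all_violators -max_transition   ;# list transition violations

# Buffer the two isolation-control nets named on the slide.
foreach net {CPUCLAMP RAMCLAMP} {
    compile_clock_tree -high_fanout_net $net
}

report_constraint -all_violators -max_transition   ;# confirm the fix
```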
CTS and ROUTE • Clock Tree Synthesis (CTS): • Routes the clock throughout the design • Inserts buffers in the tree to meet max. fan-out and max. transition constraints • The cells placed during placement optimization are not modified • Routing • All interconnect signals are routed • Buffers may be inserted to meet timing constraints • Constraints on the metal layers to be used are specified • Iterative process that takes the longest time in the flow
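In IC Compiler these two steps are commonly driven by clock_opt and route_opt. The sketch below is a minimal outline under that assumption. The layer names are hypothetical and depend on the technology file; the slides do not say which layers were allowed.

```tcl
# Sketch only: layer names M2 and M7 are hypothetical.

# Restrict signal routing to a range of metal layers.
set_ignored_layers -min_routing_layer M2 -max_routing_layer M7

clock_opt    ;# clock tree synthesis, optimization and clock routing
route_opt    ;# signal routing with post-route optimization

# Review what is still violating after routing.
report_constraint -all_violators
```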
LVS and DRC • Layout Versus Schematic (LVS) • Verifies whether the layout obtained matches the specified schematic • The connectivity of the ports and signals is verified against the schematic • Design Rule Check (DRC) • The foundry specifies manufacturing-specific design rules • Spacing between two metal tracks in the same layer, etc. • The designer has to verify that these rules are followed • Tools like Hercules (Synopsys) and Calibre (Mentor Graphics) can be used • IC Compiler also contains built-in tools to check LVS and DRC
Static Timing Analysis • Analysis performed using PrimeTime • Post-layout Verilog netlist is loaded • Extracted parasitics are loaded • Timing analysis is performed for only one power state (high voltage) Violations: • Setup and hold violations were found • Transition violations were found • These violations were fixed by adding buffers on the high fan-out nets that caused them • They can also be fixed by increasing the drive strength of cells
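A PrimeTime run covering the steps above could be scripted along these lines. Every file name, the library name and the top-module name ARM1176_TOP are hypothetical placeholders.

```tcl
# Sketch only: all file, library and module names are hypothetical.
set_app_var search_path {./libs ./results}
set_app_var link_path   {* stdcells_highV.db}   ;# library for the high-voltage state

read_verilog   arm1176_postlayout.v     ;# post-layout netlist
current_design ARM1176_TOP
link_design

read_parasitics arm1176_postlayout.spef ;# extracted parasitics
read_sdc        arm1176.sdc             ;# clocks and I/O constraints

update_timing
report_timing -delay_type max -max_paths 10        ;# setup checks
report_timing -delay_type min -max_paths 10        ;# hold checks
report_constraint -all_violators -max_transition   ;# transition violations
```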
Power Analysis • Power analysis is performed using PrimeTime PX • Probabilistic logic activity was used for power measurement, with the probability set to 0.25 • Analysis was done only for the first power state, i.e. the high-voltage mode
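A vectorless PrimeTime PX run might be set up as sketched here. The slides do not say which setting the value 0.25 was applied to, so this sketch assumes it is the default static probability; it could equally have been the default toggle rate.

```tcl
# Sketch only. Assumption: "0.25" is the default static probability.

# Normally set before the design is read and linked.
set_app_var power_enable_analysis true
set_app_var power_analysis_mode   averaged

# (The netlist, parasitics and constraints are read here, as in any
#  PrimeTime timing run.)

# Default activity for nets without annotated switching data.
set_app_var power_default_static_probability 0.25

update_power
report_power -hierarchy
```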
Formal Verification • Tool used is Formality (Synopsys) • Proves or disproves functional equivalence of two designs • In this case, functionality is verified between the pre-layout gate-level netlist (or verified (golden) RTL) and the post-layout gate-level netlist • Uses static techniques which do not require vector inputs • Uses existing Synopsys Design Compiler technology libraries
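An equivalence check between the two netlists could be scripted in Formality roughly as follows. File names, the library name and the top-module name ARM1176_TOP are hypothetical placeholders.

```tcl
# Sketch only: all file, library and module names are hypothetical.
read_db stdcells_highV.db      ;# Design Compiler technology library

# Reference design: pre-layout gate-level netlist.
read_verilog -container r -libname WORK arm1176_prelayout.v
set_top r:/WORK/ARM1176_TOP

# Implementation design: post-layout gate-level netlist.
read_verilog -container i -libname WORK arm1176_postlayout.v
set_top i:/WORK/ARM1176_TOP

match                          ;# pair up compare points
if {![verify]} {
    report_failing_points      ;# list mismatches if verification fails
}
```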
Remaining Work • The characterization was done only for the High power state • Similar characterization can be done for the other power states