ECE 551 Digital System Design & Synthesis • Lecture 08 • The Synthesis Process • Constraints and Design Rules • High-Level Synthesis Options
Pre-Synthesis Steps • Syntax Check • Makes sure your HDL code follows the syntax rules of the Standard. • Finds errors like typos, missing semicolons, “begin” without “end”, assigning to a net in a behavioral block, etc. • Only a surface-level check • Checks each module in isolation; doesn’t look at how they fit together
Pre-Synthesis Steps • Elaboration • “Elaborates” HDL statements • Unrolls FOR loops • Computes values of constant functions • Replaces parameters with their values • Substitutes macro text • Evaluates generate conditionals and loops • Checks to make sure instantiated modules are defined • Checks inter-module connections for mismatched input/output connections (i.e. module port width not the same as connected net/variable width)
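The elaboration steps listed above can be seen in one small module. This is only a sketch: the module name elab_demo and the parameters WIDTH and REGISTERED are hypothetical, chosen for illustration. The comments mark what an elaborator would typically resolve before any logic optimization happens.

```verilog
// Hypothetical example: elab_demo, WIDTH and REGISTERED are illustrative names.
module elab_demo #(parameter WIDTH = 4, parameter REGISTERED = 1)
  (input clk, input [WIDTH-1:0] a, b,
   output [WIDTH-1:0] y, output reg parity);

  wire [WIDTH-1:0] x;   // WIDTH is replaced by its value (4 by default)
  integer i;

  genvar g;
  generate
    // generate loop: expands into WIDTH separate XOR assignments
    for (g = 0; g < WIDTH; g = g + 1) begin : bitslice
      assign x[g] = a[g] ^ b[g];
    end
    // generate conditional: only one of the two branches is kept
    if (REGISTERED) begin : with_reg
      reg [WIDTH-1:0] y_r;
      always @(posedge clk) y_r <= x;
      assign y = y_r;
    end else begin : no_reg
      assign y = x;
    end
  endgenerate

  // procedural FOR loop: unrolled into a chain of WIDTH XOR operations
  always @(*) begin
    parity = 1'b0;
    for (i = 0; i < WIDTH; i = i + 1)
      parity = parity ^ x[i];
  end
endmodule
```

Instantiating it as elab_demo #(.WIDTH(8), .REGISTERED(0)) would elaborate to eight XOR slices and no register, without any change to the source text.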
Pre-Synthesis Steps • Design Check • Checks design for issues that may make it unsynthesizable, but are otherwise legal HDL • Detects multiple drivers to non-tristates • Detects combinational loops • Gives errors or warnings about unsynthesizable constructs like delays, unsupported operators, etc. • Warns about unconnected or constant-value ports • May give warnings about inferred latches • Many of these produce warnings rather than errors; make sure you read the warnings when synthesizing!
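As an illustration of the inferred-latch warning mentioned above, here is a minimal sketch with two hypothetical modules (latch_inferred and latch_free are made-up names). Both are legal HDL and pass the syntax check; only the first is likely to draw a latch warning.

```verilog
// Hypothetical example: q is not assigned when sel is 0, so it must
// hold its old value. A synthesis tool will typically infer a latch.
module latch_inferred(input sel, input d, output reg q);
  always @(*) begin
    if (sel) q = d;
  end
endmodule

// Same interface, but q gets a default value on every path through
// the block, so the logic is purely combinational.
module latch_free(input sel, input d, output reg q);
  always @(*) begin
    q = 1'b0;        // default assignment covers the sel == 0 case
    if (sel) q = d;
  end
endmodule
```

Note that the two modules do not behave identically: the second drives 0 when sel is low instead of holding the previous value. Which behavior is intended is a design decision, which is why the tool warns rather than fixes it.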
Synthesis Process • Inputs • Functional hardware description in HDL • List of design constraints and design rules • Desired clock frequency / maximum delay • Limits on area, power, capacitance • Technology library (logic cells, wire models, etc.) • User-specified synthesis options/strategies • Output • Ideally: A netlist that uses the specified technology library, produces the same behavior as the functional description, and meets the design constraints • Reports that summarize the area and timing of the implementation
Logic Synthesis Steps • Translation • The synthesis tool identifies the behavior of high-level constructs and replaces them with a structural representation from a generic technology library. • Examples: “adder”, “multiplier”, “flip-flop”, “latch” • High-Level Optimizations • The tool performs optimizations at the Boolean equation level • The types of optimizations depend on your strategies • Examples: Reducing the number of logic levels, minimizing the number of Boolean operations, eliminating redundant computations
Logic Synthesis Steps • Mapping • The synthesis tool replaces the generic representations of gates and logic structures with equivalent hardware representations from the provided technology library • The netlist now consists of a structural representation of logic cells (Standard Cell) or LUTs/CLBs (FPGA) • Low-Level Optimizations • The tool performs optimizations at the logic cell level, either to reduce delay or reduce area • Examples: Duplicating logic, re-ordering operations to minimize delay, re-timing registers
A Brief Aside on Mapping • People commonly say that when using Structural Verilog, you know exactly what gates you are getting. • Is this true? • It actually depends on what’s in your Tech Library • If your library contains an XOR gate, then an XOR primitive will be mapped to that gate • But what if your Tech Library only contains NAND gates? Or only Look-up Tables?
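To make the NAND-only question concrete, the sketch below builds XOR from four NAND primitives. The module name xor_from_nand is hypothetical, and a real tool mapping to a NAND-only library is not guaranteed to pick this exact structure. It simply shows that an xor primitive in structural Verilog does not have to end up as an XOR cell.

```verilog
// Hypothetical example: XOR built only from 2-input NAND gates.
module xor_from_nand(input a, b, output y);
  wire n1, n2, n3;
  nand g1(n1, a, b);    // n1 = ~(a & b)
  nand g2(n2, a, n1);   // n2 = ~(a & ~b)
  nand g3(n3, b, n1);   // n3 = ~(~a & b)
  nand g4(y, n2, n3);   // y  = (a & ~b) | (~a & b) = a ^ b
endmodule
```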
Why require Constraints & Strategies? • Synthesis is hard (NP-hard!) • For a circuit of any useful size, the number of possible implementations is enormous • It is too computationally intensive to try them all • Need to know when a solution is good enough to stop • We usually give the tool hints on how to proceed • Often there is no universally “best” solution • Area vs. delay • Throughput vs. latency • Power vs. frequency • Constraints & strategies allow us to manage tradeoffs to find the solution that meets our needs
Constraint Examples • Minimize area
  module mac(input clk, rst, input [31:0] in, output [63:0] out);
    reg [31:0] constreg;
    reg [63:0] mult, add, result;
    reg [2:0] count;

    assign out = result;

    // combinational multiply and accumulate
    always @(*) mult = constreg * in;
    always @(*) add = mult + result;

    always @(posedge clk) begin
      if (rst) begin
        // capture the constant multiplicand during reset
        constreg <= in;
        result <= 0;
        count <= 0;
      end
      else if (count > 0) begin
        // accumulate while the counter runs down
        result <= add;
        count <= count - 1;
      end
      else begin
        // clear the accumulator and start a new run of four
        result <= 0;
        count <= 4;
      end
    end
  endmodule
Setting Design Constraints • set_max_area 20000 • Sets maximum area to 20,000 cell units • set_max_delay 4 -to [all_outputs] • Sets maximum delay of 4 to any output • set_max_dynamic_power 10mW • Sets maximum dynamic power to 10 mW • create_clock "clk" -period 10 • Specifies that port clk is a clock with a period of 10 ns • create_clock -name "my_clk" -period 12 • Creates a virtual clock called my_clk with a period of 12 ns; use with combinational logic
Constraint Examples • CLK_PERIOD = 4 (250 MHz) • MAX_AREA = 80000 • Arrival: 3.73 • Slack: 0.01 • Area: 68122 • Slack = CLK_PERIOD - (Arrival + Library Setup Time) • Library Setup Time is approximately 0.25-0.26 ns for these examples
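As a worked instance of the slack formula on this slide, take the setup time as 0.26 ns, the upper end of the quoted range. That value is an assumption for this particular run, and the symbol names below are illustrative.

```latex
\[
\text{Slack} = T_{\text{clk}} - \left(t_{\text{arrival}} + t_{\text{setup}}\right)
             = 4 - (3.73 + 0.26) = 0.01\ \text{ns}
\]
```

A slack of zero or more means the path meets timing at the requested clock period. A negative slack means the path is too slow by that amount.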
Constraint Examples • CLK_PERIOD = 4 • MAX_AREA = 65000 • Arrival: 3.75 • Slack: 0.00 • Area: 64758
Constraint Examples • CLK_PERIOD = 4 • MAX_AREA = 60000 • Arrival: 3.75 • Slack: 0.00 • Area: 63377
Constraint Examples • Maximize speed (using the same mac module as the Minimize area example)
Constraint Examples • CLK_PERIOD = 4 (250 MHz) • MAX_AREA = 80000 • Arrival: 3.73 (+ 0.26 = 3.99) • Slack: 0.01 • Area: 68122
Constraint Examples • CLK_PERIOD = 3.6 (278 MHz) • MAX_AREA = 80000 • Arrival: 3.46 (+ 0.26 = 3.68) • Slack: -0.08 • Area: 73131
Constraint Examples • CLK_PERIOD = 3.7 (270 MHz) • MAX_AREA = 90000 • Arrival: 3.45 (+ 0.25 = 3.7) • Slack: 0.00 • Area: 75673
Optimization Priorities • Design rules have priority over timing goals • Timing goals have priority over area goals • Design rules have highest priority • To prioritize area constraints: • use the ignore_tns (total negative slack) option when you specify the area constraint: set_max_area -ignore_tns 10000 • To change priorities use set_cost_priority • Example: set_cost_priority -delay • To remove all optimization constraints use remove_constraint
Compiling the Design • Once the optimization specifications are set, the design is compiled • The compile command • Logic-level and gate-level synthesis • Optimizations of the design • The compile_ultra command • Two-pass high-effort compile of the design • May want to compile normally first to get a ballpark figure (higher effort == longer compilation) • What is the purpose of doing multiple passes?
Synthesis Strategies • Even after supplying HDL code, Tech Library, and Constraints, the designer is still responsible for the Synthesis Strategy. • Why do we use Strategies? • The CPU time and memory we can devote to synthesis are limited resources • The designers may already have a good idea about what sort of hardware they want
Compiling the Design • Useful compile options include: • -map_effort low | medium | high (default is medium) • -area_effort low | medium | high (default same as map_effort) • -incremental_mapping (may improve an already-mapped design) • -verify (compares initial and synthesized designs) • -ungroup_all (collapses all levels of design hierarchy)
Top-Down Compilation • Use the top-down compile strategy when compile time and synthesizer memory are not limiting factors • Synthesizes each design unit separately and uses top-level constraints • Basic steps are: • Read in the entire design using analyze/elaborate or: acs_read_hdl -recurse $TOP_DESIGN • Resolve multiple instances of any design references with uniquify • Apply attributes and constraints to the top level • Compile the design using compile or compile_ultra
Example Top-Down Script
  # read in the entire design
  analyze -library WORK -format verilog {E.v D.v C.v B.v A.v TOP.v}
  # elaborate takes the name of the top design, not the file list
  elaborate TOP -library WORK
  current_design TOP
  # links TOP to the libraries and modules it references
  link
  # set design constraints
  set_max_area 2000
  # resolve multiple references
  uniquify
  # compile the design
  compile
Bottom-Up Compile Strategy • The bottom-up compile strategy • Compile the subdesigns separately and then incorporate them • Top-level constraints are applied and the design is checked for violations. • Advantages: • Compiles large designs more quickly (divide-and-conquer) • Requires less memory than top-down compile • Disadvantages • Need to develop local constraints as well as global constraints • May need to repeat process several times to meet design goals • Might use if memory or CPU time are limited
Compile-Once-Don’t-Touch Method • The compile-once-don’t-touch method uses the set_dont_touch command to preserve the compiled subdesign
  current_design top
  characterize U2/U3
  current_design C
  compile
  current_design top
  set_dont_touch {U2/U3 U2/U4}
  compile
• What are advantages and disadvantages?
Resolving Multiple References • In a hierarchical design, subdesigns are often referenced by more than one cell instance
Uniquify Method • The uniquify command creates a uniquely named copy of the design for each instance.
  current_design top
  uniquify
  compile
• Each design optimized separately • What are advantages and disadvantages?
Ungroup Method (“Flattening”) • The ungroup command makes unique copies of the design and removes levels of the hierarchy
  current_design B
  ungroup {U3 U4}
  current_design top
  compile
• What are advantages and disadvantages?
Benefits of Ungrouping Hierarchy
  module logic1(input a, c, e, output reg x);
    always @(a, c, e)
      x = ((~a|~c) & e) | (a&c);
  endmodule

  module logic2(input a, b, c, d, output reg y);
    always @(a, b, c, d)
      y = ((((~a|~c)&b) | ((a|~b)&c))&d) | ((a|~b)&~d);
  endmodule

  module logic(input a, b, c, d, e, f, output reg z);
    wire x, y;
    // module instances need instance names
    logic1 u1(a, c, e, x);
    logic2 u2(a, b, c, d, y);
    always @(x, y, f)
      z = (~f&x) | (f&y);
  endmodule
• With Hierarchy: Area: 36.15, Delay: 0.25 • Without Hierarchy: Area: 34.15, Delay: 0.25
Ungrouping versus Boolean Flattening • Ungrouping is commonly referred to as “Flattening the Hierarchy”, even by tool vendors • Because of this, many people incorrectly think the “set_flatten true” option in Synopsys is the same as “ungroup” • set_flatten true tells Design Vision to flatten the Boolean equations describing your logic down to a two-level expression. That is, to create a Sum of Products expression. • Flattening Boolean equations is a way of reducing delay at the cost of increased area – we’ll talk about it more in a later lecture.
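To illustrate what Boolean flattening means, as opposed to ungrouping, the sketch below writes the same function twice: once in a nested, multi-level form and once as a two-level Sum of Products. The module names and the function itself are hypothetical, and the tool's actual result depends on the library and constraints.

```verilog
// Hypothetical example: multi-level form, four levels of 2-input logic
// and five literals.
module multilevel(input a, b, c, d, e, output y);
  assign y = a & (b | (c & (d | e)));
endmodule

// The same function flattened to a Sum of Products: one level of ANDs
// feeding one OR, but eight literals and wider gates.
module two_level(input a, b, c, d, e, output y);
  assign y = (a & b) | (a & c & d) | (a & c & e);
endmodule
```

The second form has fewer logic levels on the path from any input to y, at the cost of more literals. That is the delay-for-area trade described on this slide.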
Dealing with Structured Logic • Sometimes we do not want the synthesis tool to try to optimize our Boolean equations. • Structured Logic refers to Boolean logic operations that are structured in a certain way to achieve a goal, such as reduced delay or fault tolerance. • Examples: Carry-Lookahead Adder, Wallace Multiplier, duplicated logic • set_structure true (default) – tells the tool it can re-order, factor, or decompose the logic equations • set_structure false – tells the tool to leave the logic alone
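As an example of structured logic, here is a minimal sketch of a 4-bit carry-lookahead adder. The module name cla4 and its port names are hypothetical. Every carry is written out as a flat function of the generate and propagate signals. This is exactly the kind of deliberate structure a designer may not want the tool to factor back into a ripple chain.

```verilog
// Hypothetical example: 4-bit carry-lookahead adder.
module cla4(input [3:0] a, b, input cin,
            output [3:0] sum, output cout);
  wire [3:0] g = a & b;   // generate: this bit creates a carry
  wire [3:0] p = a ^ b;   // propagate: this bit passes a carry along
  wire [4:0] c;

  assign c[0] = cin;
  assign c[1] = g[0] | (p[0] & c[0]);
  assign c[2] = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c[0]);
  assign c[3] = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0])
              | (p[2] & p[1] & p[0] & c[0]);
  assign c[4] = g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
              | (p[3] & p[2] & p[1] & g[0])
              | (p[3] & p[2] & p[1] & p[0] & c[0]);

  assign sum  = p ^ c[3:0];
  assign cout = c[4];
endmodule
```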
Checking your Design • Use the check_design command to verify design consistency. • Usually run both before and after compiling a design • Gives a list of warning and error messages • Errors will cause compiles to fail • Warnings indicate a problem with the current design • Try to fix all of these, since later they can lead to problems • Use check_design -summary or check_design -no_warnings to limit the number of warnings given • Use check_timing to locate potential timing problems
Analyzing your Design [1] • There are several commands to analyze your design • report_design • display characteristics of the current design • operating conditions, wire load model, output delays, etc. • parameters used by the design • report_area • displays area information for the current design • number of nets, ports, cells, references • area of combinational logic, non-combinational, interconnect, total
Analyzing Your Design [2] • report_hierarchy • displays the reference hierarchy of the current design • tells modules/cells used and the libraries they come from • report_timing • reports timing information about the design • default shows one worst case delay path • report_resources • Lists the resources and datapath blocks used by the current design • Can send reports to files • report_resources > cmult_resources.rpt • Lots of other report commands available
Synthesis Scripts • Synthesis scripts provide a convenient method for performing synthesis multiple times • To run the script, enter the directory which contains the Verilog code and type: • dc_shell -tcl_mode -f script.tcl • dc_shell -tcl_mode -f script.tcl > log.txt & • This will start the script and store its output to log.txt
Example Synthesis Script
  analyze -library WORK -format verilog {/.register_file_behave.v}
  elaborate reg_file_behave -architecture verilog -library WORK
  create_clock -name "clk" -period 2 -waveform {0 1} {clk}
  set_dont_touch_network [ find clock clk ]
  set_max_area 30000
  check_design
  uniquify
  compile -map_effort medium
  report_area > area_report.txt
  report_timing > timing_report.txt
  report_constraint -all_violators > violator_report.txt
Design Optimization: FIR Filter • Used in signal processing • Passes through some data but not all (filter!) • Example: Remove noise from image/sound • Uses multipliers and adders • Multiply constant “tap” value against time-delayed input value • In the Verilog, y is out, bk is taps, and x is data
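The slide's names y, bk, and x refer to the standard FIR filter equation, written out here for reference. N is the number of taps (ntaps in the Verilog), and the index n for the current sample is notation assumed for this sketch.

```latex
\[
y[n] = \sum_{k=0}^{N-1} b_k \, x[n-k]
\]
```

Each output sample is a weighted sum of the N most recent input samples, so an N-tap filter needs N multiplications and N-1 additions per output.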
Design Optimization: FIR Filter • We’ll look at three different approaches to implementing this filter • “Initial” • “Small” • “Fast” • We’ll revisit the idea of re-architecting algorithms for better area, latency, and throughput later. • As an exercise, you should take some time on your own to try to understand exactly what is happening in each of the following code segments. • Learning to read and understand someone else’s (confusing) code is an extremely valuable skill
Initial Design: Code [1]
  module fir_init(clk, rst, in, out);
    parameter bitwidth = 8;
    parameter ntaps = 4;
    parameter logntaps = 2;

    input clk, rst;
    input [bitwidth-1:0] in;
    output reg [bitwidth-1:0] out;

    reg [bitwidth-1:0] taps [0:ntaps-1];
    reg [bitwidth-1:0] data [0:ntaps-1];
    reg [logntaps:0] count;
    integer i;
Initial Design: Code [2]
    always @(posedge clk) begin
      if (rst) begin
        // indicate we need to load all the tap values
        count <= 0;
        // reset the data and taps
        for (i = 0; i < ntaps; i = i + 1) begin: resetloop
          data[i] <= 0;
          taps[i] <= 0;
        end
      end
      else if (count < ntaps) begin
        // we need to load the tap values before filtering
        for (i = ntaps-1; i > 0; i = i - 1) begin: loadtaps
          taps[i] <= taps[i-1];
        end
        // load the new value at tap[0]
        taps[0] <= in;
        count <= count+1;
      end
Initial Design: Code [3]
      else begin
        // ready to do the filtering
        // first shift in the new input data value
        for (i = ntaps-1; i > 0; i = i - 1) begin: shiftdata
          data[i] <= data[i-1];
        end
        // load the new value at data[0]
        data[0] <= in;
      end // else: !if(count < ntaps)
    end // always @ (posedge clk)

    // compute the filtered result
    always @(*) begin
      out = 0;
      for (i = 0; i < ntaps; i = i + 1) begin: filterloop
        out = out + (data[i] * taps[ntaps-1 - i]);
      end
    end
  endmodule