High-Level Modeling: General-Purpose Languages, High-Level Synthesis
John Sanguinetti

Gate-level Design
• Gate-level design by schematics
• Gate-level verification in netlist simulators
• Architecture moves up to Verilog
[Diagram: columns ARCHITECTURE, IMPLEMENTATION, VERIFICATION. 1982, Gate-Level row: Verilog, Gate Sim, Verilog, Schematic.]

Register-transfer Level
• Logic synthesis enabled more abstract design
• Verilog architectural language used for RTL design
• Architecture moves up to C++
[Diagram: columns ARCHITECTURE, IMPLEMENTATION, VERIFICATION. 1992, RTL row: C++, Logic Synthesis, Verilog, C++, Verilog. 1982, Gate-Level row: Verilog, Gate Sim, Verilog, Schematic.]

“Higher-level”
• Current architecture language (C++) will emerge as next design language
• Practical high-level synthesis in C++ will trigger the change
[Diagram: columns ARCHITECTURE, IMPLEMENTATION, VERIFICATION. 2002, High-Level row: High-level Synthesis, C++, C++. 1992, RTL row: C++, Verilog, C++, Verilog. 1982, Gate-Level row: Verilog, Gate Sim, Verilog, Schematic.]

The Problem: Lack of Tools
• Starting point is GPL (C++)
• Entry point to backend is HDL/RTL
• Refinement is manual
• Only GPL users:
• academics
• lunatic fringe
[Diagram: design flow from algorithm to gates. Labels: Algorithm, C++ High-level Model, Architecture, Modular decomposition, Structural elaboration, Paper Spec, HDL IP, RTL, Cycle timing, RTL Implementation Model, Resource allocation, Gate, Logic synthesis, Gates.]

GPL candidates
• SpecC
• Handel-C
• Java
• C++/Cynlib
• C++/SystemC
• Extended SystemC
• We’ve made good progress

HLS: The Promise
• High-level Synthesis
• Enables higher levels of design abstraction
• Connects the starting point with the ending point
• Allows architectural exploration
• Eases technology process migration
• Achieves better results with less effort
• Enables faster simulation and design debugging at the behavioral level

HLS: The Experience
• Behavioral synthesis was not successful
• Quality of results (QOR) marginal
• Hard to use, non-intuitive
• Results nearly impossible to verify
• Poisoned the market
• What went wrong?
• Started with the wrong input
• Point tool solution for a design flow problem

HLS: The Future is Now …
• High-level Synthesis
• We have the right starting point
• We can use a common test bench
• We can keep the interfaces constant
• We can produce RTL which meets timing constraints

CynthHL Design Flow
• Automatic generation of verifiable RTL from architectural C++
• Single verification environment for entire design flow
[Diagram: CynthHL design flow from algorithm to gates. Labels: Algorithm, Protocol, Algorithm, Architecture, Modular decomposition, CynthHL, Structural elaboration, Constraints, RTL, Automatically synthesized RTL Implementation, Cycle timing, Resource allocation, C++ to HDL, Gate, Logic synthesis, Gates.]

Design Exploration
• Typical architectural questions:
• What goes in hardware? software?
• How many data path elements?
• How wide should data paths be?
• What protocols should be used?
• How deep should pipelines be?
• More interesting:
• What’s the lowest distortion for a given die size?
• What’s the minimum area for a target frame rate?
• How much can I increase the signal-to-noise ratio with a 10% area increase?

Design Exploration
• These are all speed vs. area tradeoffs
• Speed: latency, throughput
• Area: how much parallel hardware
• Answers aren’t available until RTL has been produced
• Most answers require multiple implementation data points
=> Evaluating an architectural decision is very expensive
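
The speed vs. area tradeoff above can be illustrated with a toy cost model. This is only a sketch under stated assumptions: `DesignPoint`, `cycles_for`, and `explore` are hypothetical names, and the model (identical, independent operations spread evenly over parallel functional units) is a simplification that does not come from the slides.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical cost model, not taken from the slides: `ops` identical,
// independent operations are spread over `units` parallel functional units.
struct DesignPoint {
    std::uint32_t units;    // parallel functional units (a proxy for area)
    std::uint32_t latency;  // cycles to finish all operations (a proxy for speed)
};

// Ceiling division: cycles needed when up to `units` operations run per cycle.
constexpr std::uint32_t cycles_for(std::uint32_t ops, std::uint32_t units) {
    return (ops + units - 1) / units;
}

// Produce one implementation data point per candidate amount of hardware.
std::vector<DesignPoint> explore(std::uint32_t ops, std::uint32_t max_units) {
    std::vector<DesignPoint> points;
    for (std::uint32_t u = 1; u <= max_units; ++u) {
        points.push_back({u, cycles_for(ops, u)});
    }
    return points;
}
```

In this model, `explore(16, 4)` gives latencies of 16, 8, 6, and 4 cycles for 1 to 4 units. Each data point is cheap here only because the model is analytic; in a real flow each point requires producing RTL, which is the expense the slide refers to.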

AES Encryption Algorithm
• Starting point
• 386 lines C++/ESC
• Module, testbench in ESC:
• Input key & block length, then key
• Input plain-text block
• Output encrypted block
• Goal: fastest design in minimum area
• Design exploration
• Unroll loops: enabled constant propagation, increased number of FSM states
• Decrease latency: increase functional units, decrease number of FSM states
• Result
• 3,917 lines of RTL
• 32 functional units
• 5 ROMs
• 100 registers
• Net result:
• 5x speed-up
• 1.2x size increase
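
A minimal sketch of why unrolling a loop can enable constant propagation. The table `kTable` and the functions `mix_rolled`, `mix_unrolled`, and `mix_folded` are hypothetical; the values are placeholders and the code is not the AES design described above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical 4-entry constant table; placeholder values, not AES data.
static const std::uint8_t kTable[4] = {0x11, 0x22, 0x44, 0x88};

// Rolled form: the table index is a run-time loop variable, so the table
// has to exist as a ROM and one XOR unit is reused over four iterations.
std::uint8_t mix_rolled(std::uint8_t x) {
    for (std::size_t i = 0; i < 4; ++i) {
        x = static_cast<std::uint8_t>(x ^ kTable[i]);
    }
    return x;
}

// Unrolled form: every index is a literal, so each lookup is a known constant.
std::uint8_t mix_unrolled(std::uint8_t x) {
    x = static_cast<std::uint8_t>(x ^ kTable[0]);
    x = static_cast<std::uint8_t>(x ^ kTable[1]);
    x = static_cast<std::uint8_t>(x ^ kTable[2]);
    x = static_cast<std::uint8_t>(x ^ kTable[3]);
    return x;
}

// After constant propagation: 0x11 ^ 0x22 ^ 0x44 ^ 0x88 == 0xFF, so the
// four lookups and XORs collapse into a single XOR with one constant.
std::uint8_t mix_folded(std::uint8_t x) {
    return static_cast<std::uint8_t>(x ^ 0xFF);
}
```

All three functions return the same value for every input. The rolled form is the one in which the table has to remain a ROM; whether a given synthesis tool performs the folding shown in `mix_folded` is an assumption here.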

Image Compression Algorithm
• Computationally intensive
• 753 lines of C++/ESC
• Memory-intensive
• Hard speed constraint
• 15 ms/frame
• 8 cycles/pixel @ 17 ns clock
• I/O interface is not defined
• Includes testbench and golden results
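
The two speed figures above can be related with a little arithmetic. The sketch below assumes that 15 ms/frame and 8 cycles/pixel at a 17 ns clock describe the same budget; the frame size is not stated in the slides, so the pixel count it derives is an inference. All identifiers are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Figures taken from the slide.
constexpr std::int64_t kFrameBudgetNs  = 15000000;  // 15 ms per frame
constexpr std::int64_t kClockNs        = 17;        // clock period
constexpr std::int64_t kCyclesPerPixel = 8;         // cycle budget per pixel

// Time available per pixel: 8 cycles * 17 ns = 136 ns.
constexpr std::int64_t kPixelTimeNs = kCyclesPerPixel * kClockNs;

// Largest whole number of pixels that fits in one frame budget.
constexpr std::int64_t max_pixels_per_frame(std::int64_t frame_ns,
                                            std::int64_t cycles_per_pixel,
                                            std::int64_t clock_ns) {
    return frame_ns / (cycles_per_pixel * clock_ns);
}

static_assert(kPixelTimeNs == 136, "8 cycles at 17 ns is 136 ns per pixel");
```

Under that assumption, `max_pixels_per_frame(kFrameBudgetNs, kCyclesPerPixel, kClockNs)` evaluates to 110294, so the constraint corresponds to a frame of roughly 110,000 pixels.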

Initial analysis
• Throughput requirement faster than latency allows
• Suggests some form of pipelining will be needed
• Loops in algorithm should be restructured to have only one inner loop
• Critical performance issue
• Memory usage, not operations (+, *, etc.)
• Pipelining makes memory usage more intense
[Diagram: processing stages. Labels: Input, rgb2yc; Vertical filter; Horizontal filter; Output.]
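
One way to picture restructuring toward a single inner loop is loop merging. The sketch below is an illustration only: `stage_a` and `stage_b` are hypothetical per-pixel placeholders, not the rgb2yc or filter arithmetic from the design, and real filters with neighbouring-pixel dependencies would need more care than this.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Placeholder per-pixel stages (hypothetical arithmetic).
inline std::int32_t stage_a(std::int32_t p) { return 2 * p + 1; }
inline std::int32_t stage_b(std::int32_t p) { return p - 3; }

// Two separate loops: a full intermediate buffer sits between the stages,
// so every pixel is written to memory and read back once more.
std::vector<std::int32_t> run_separate(const std::vector<std::int32_t>& in) {
    std::vector<std::int32_t> tmp(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        tmp[i] = stage_a(in[i]);
    }
    std::vector<std::int32_t> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[i] = stage_b(tmp[i]);
    }
    return out;
}

// Merged loop: one loop body, no intermediate buffer, same results.
std::vector<std::int32_t> run_merged(const std::vector<std::int32_t>& in) {
    std::vector<std::int32_t> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[i] = stage_b(stage_a(in[i]));
    }
    return out;
}
```

The merged form performs the same operations but removes the intermediate memory traffic, which matters when memory usage rather than arithmetic is the critical issue.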

Design Exploration
• With each architecture
• Synthesize one or more RTL implementations
• Use CynthHL’s output to determine critical issues (memory vs. operations)
• Verify with same testbench
• Net result:
• 10,682 lines Verilog/RTL
• 108 functional units
• 9 RAMs, 1 ROM
• 160 registers
• Run time: 33.8 s
• Merge loops
• Modify memory architecture
• Pipeline
• Verify each transformation
[Figure label: 15 ms/frame]

Synergy
• General-purpose programming language
• High-level synthesis
• Together, high-level design is a reality
• Separately, they are just curiosities
• SystemC + CynthHL = High-level Design

CynthHL Product Status
• Currently in Beta
• Beta released January 2002
• Available Today – Everything you need:
• Synthesis to “correct gates”
• Design exploration
• Predictable timing helps timing closure
• Beta period being used to improve usability in “real world” design and verification flows
• Official product announcement 2H 2002

Integrated Design and Verification Environment
• System-level model used for verification
• Spend time at the algorithmic level to “get it right”
• Reuse verification environment at lower levels
• Same TB used for algorithm and synthesized RTL
• System-level model used for design
• Once the architecture is verified, automatically create RTL implementation(s)
• Explore trade-offs between design goals by creating multiple implementations
• Automated path to CORRECT gates
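
The idea of one testbench serving several abstraction levels can be sketched in plain C++. Everything here is hypothetical: `AlgorithmModel`, `RefinedModel`, `TestVector`, and `run_testbench` are illustrative names, `RefinedModel` merely stands in for a lower-level implementation, and none of this is the CynthHL or SystemC API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One stimulus value together with its golden (expected) result.
struct TestVector {
    std::uint32_t input;
    std::uint32_t golden;
};

// Algorithm-level model of a hypothetical function: y = 3x + 1.
struct AlgorithmModel {
    std::uint32_t process(std::uint32_t x) const { return x * 3 + 1; }
};

// Stand-in for a refined implementation: same interface, shift-and-add.
struct RefinedModel {
    std::uint32_t process(std::uint32_t x) const { return (x << 1) + x + 1; }
};

// The same testbench drives any model exposing process(); it returns the
// number of vectors whose output differs from the golden result.
template <typename Dut>
std::size_t run_testbench(const Dut& dut, const std::vector<TestVector>& vectors) {
    std::size_t mismatches = 0;
    for (const TestVector& v : vectors) {
        if (dut.process(v.input) != v.golden) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

Because the interface stays constant, `run_testbench` can be called unchanged on `AlgorithmModel` and on `RefinedModel` with the same vectors, and a result of zero mismatches for both indicates that the refinement preserved behaviour on those vectors.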