Synthesis of Transaction-Level Models to FPGAs
Prof. Jason Cong
Yiping Fan, Guoling Han, Wei Jiang, Zhiru Zhang
VLSI CAD Lab, Computer Science Department
University of California, Los Angeles
Outline
• Transaction-level model (TLM)
  • SystemC TLM
  • Metropolis Meta Model
• Synthesis from TLM
  • RDR/MCAS: our existing architectural synthesis approach
  • xPilot: ongoing synthesis infrastructure for TLM
SystemC Framework
• SystemC history
• OO system/HW modeling and simulation
• SystemC under development by CAD vendors/researchers
  • Synopsys
  • Frontier Design
  • CoWare (Belgium)
• Released to public Sept. '99
• Open source distribution @ www.systemc.org
• Version 2 out July '01
Channels and Modules
• Basic building blocks:
  • Module (class) instances, communicating via channel (class) instances
• Modules' functionality coded as concurrent processes
• Processes communicate via channels or events
Primitive Channels in SystemC Library
• Ordinary signal (wire) of type <T>
  • Fill in data type T when instantiated
  • Point-to-point or multi-point (1 writer, n readers)
• Signal bus (arbitrary width)
• FIFO, for producer/consumer connection
• Pseudo-channels
  • Mutex & semaphore, for interprocess sync
  • Accessed using channel syntax
• Complex "hierarchical" channels composed of primitive channels, processes, modules
Events and Processes
• Events: abstract occurrences used for
  • Process triggering (like VHDL sensitivity list)
  • Channel communication
  • Interprocess synchronization
• A process can call wait() to block on an event
• Event occurrence tells the simulator to schedule simulation of the relevant process
• Process execution
  • Not called directly from your code
  • Triggered for simulation by events on ports, channels, or explicit named events
  • Registered in the constructor of the enclosing module (associate method with events)
• Thread process → infinite loop
  • Must call wait() to yield control
• Method process → runs to completion
  • Less scheduling overhead
Data Types in SystemC
• SystemC supports
  • Native C/C++ types
  • SystemC types
• SystemC types
  • Data types for system modeling
  • 2-value ('0', '1') logic/logic vector
  • 4-value ('0', '1', 'Z', 'X') logic/logic vector
  • Arbitrary-sized integer (signed/unsigned)
  • Fixed-point types (templated/untemplated)
• Objective: to reflect HW registers & ALU operations
Functional Level and RTL Modeling in SystemC
• Functional level
  • Sequential, algorithmic, software-like
  • Explore HW/SW architectures, proof of algorithms, performance modeling & analysis
• Register transfer level
  • Complete detailed functional description of hardware
  • Every register, bus, bit for every clock cycle
  • Use C++ switch/case for FSM implementation
  • At this point, can switch to HDL, but staying in SystemC leverages test benches
  • Prepare for HW synthesis step by using only synthesizable constructs
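The switch/case style of FSM coding mentioned for the RTL level can be sketched in plain C++ (not SystemC, so it compiles without the library). The three states, the `Fsm` struct, and the `step` function are hypothetical names chosen for illustration; each call to `step` stands for one clock edge.

```cpp
#include <cassert>

// Hypothetical three-state controller; state names are illustrative only.
enum class State { Idle, Busy, Done };

struct Fsm {
    State state = State::Idle;
    int   count = 0;  // models a small down-counter register

    // One call models one clock edge: the next state is computed
    // from the current state and the inputs.
    void step(bool start, int cycles) {
        assert(cycles >= 1);
        switch (state) {
        case State::Idle:
            if (start) { count = cycles; state = State::Busy; }
            break;
        case State::Busy:
            if (--count == 0) state = State::Done;
            break;
        case State::Done:
            state = State::Idle;
            break;
        }
    }
};
```

In a SystemC model the same switch would typically sit inside a clocked process; that wiring is omitted here.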
Transaction Level Modeling in SystemC
• Transaction level
  • Model includes architectural components
  • Maintain component interface accuracy
    • E.g., buses modeled as channels (read/write operations)
  • Behavioral style inside a component
  • Simulates 100-10,000x faster than RTL
  • Provide execution platform for SW development
TLM – Raise the Level of Architectural Modeling
• What is TLM?
  • Communication uses function calls
    • burst_read(char* buf, int addr, int len);
• Why is TLM interesting?
  • Simulation: fast and compact
  • Integrate HW and SW models
  • Early platform for SW development
  • Early system exploration and verification
  • Verification reuse
  • Synthesis …
• Reference: www.systemc.org
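A minimal sketch of function-call communication, written in plain C++ rather than SystemC. Only the `burst_read` signature comes from the slide; the `BusIf` interface, `burst_write`, the `Memory` model, and `checksum` are assumptions added for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical bus interface: one transaction is one function call,
// not a sequence of pin-level signal changes.
struct BusIf {
    virtual ~BusIf() = default;
    virtual void burst_read(char* buf, int addr, int len) = 0;
    virtual void burst_write(const char* buf, int addr, int len) = 0;  // assumed counterpart
};

// Simple memory model implementing the interface (the channel side).
class Memory : public BusIf {
public:
    explicit Memory(std::size_t size) : data_(size, 0) {}
    void burst_read(char* buf, int addr, int len) override {
        assert(in_range(addr, len));
        for (int i = 0; i < len; ++i) buf[i] = data_[addr + i];
    }
    void burst_write(const char* buf, int addr, int len) override {
        assert(in_range(addr, len));
        for (int i = 0; i < len; ++i) data_[addr + i] = buf[i];
    }
private:
    bool in_range(int addr, int len) const {
        return addr >= 0 && len >= 0 &&
               static_cast<std::size_t>(addr) + static_cast<std::size_t>(len) <= data_.size();
    }
    std::vector<char> data_;
};

// A master sees only the interface, so the memory could later be
// replaced by a more detailed bus model without changing this code.
int checksum(BusIf& bus, int addr, int len) {
    std::vector<char> buf(len);
    bus.burst_read(buf.data(), addr, len);
    int sum = 0;
    for (char c : buf) sum += c;
    return sum;
}
```

Because the master depends only on `BusIf`, the same software-facing code can run against models at different levels of detail, which is the reuse argument made in the bullets.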
Typical Design Flow Using TLM
• Functional model
  • Captures system behaviour
• TLM (Transaction-Level Model)
  • Bus transactions
  • Accurate interaction with SW portion
  • Simulates rapidly
  • Can create TLM model initially
Introduction to Metropolis
• A UCB and GSRC project, http://www.gigascale.org/metropolis/
• Platform-based design [ASV]
  • Platforms have sufficient flexibility to support a series of applications/products
  • Choose a platform by design space exploration
  • Both of the above require reusable models
• Orthogonalization of concerns
  • Computation vs. communication
  • Behavior vs. coordination
  • Behavior vs. architecture
  • Capability vs. cost
Metropolis Meta Model
• A combination of imperative program and declarative constraints
• Imperative program:
  • objects (process, media, quantity, statemedia)
  • netlist
  • await
  • block and label
  • interface function call
  • quantity annotation
• Declarative constraints
  • Linear Temporal Logic (LTL)
  • (synch)
  • Logic of Constraints (LOC)
A Metropolis Design Tutorial
[Figure: MyMapNetlist enclosing MyFncNetlist. Labels: P1, P2, M, Env1, Env2]
A Metropolis Design Tutorial
[Figure: MyMapNetlist relating MyFncNetlist (P1, P2, M, Env1, Env2) to MyArchNetlist (mP1, mP2, Bus, Bus Arbiter, Mem, Cpu, OsSched). Other labels: T2Y, Y2T, read(), write(), Th,Wk]
Mapping constraints in MyMapNetlist:
B(P1, M.write) <=> B(mP1, mP1.writeCpu);
E(P1, M.write) <=> E(mP1, mP1.writeCpu);
B(P1, P1.f) <=> B(mP1, mP1.mapf);
E(P1, P1.f) <=> E(mP1, mP1.mapf);
B(P2, M.read) <=> B(mP2, mP2.readCpu);
E(P2, M.read) <=> E(mP2, mP2.readCpu);
B(P2, P2.f) <=> B(mP2, mP2.mapf);
E(P2, P2.f) <=> E(mP2, mP2.mapf);
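One way to read the `<=>` constraints is that the two events on either side must occur together. The sketch below checks that reading on a recorded trace. It is plain C++, not Metropolis meta-model code; the `Event`, `Step`, and `satisfies_synch` names and the trace representation are assumptions for illustration, and `B`/`E` are taken to mean the beginning and end of an action.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// An event is (kind, process, action), e.g. ('B', "P1", "M.write").
using Event = std::tuple<char, std::string, std::string>;
// One simulation step holds the set of events that occur together.
using Step = std::set<Event>;
using SynchPair = std::pair<Event, Event>;

// Returns true if, in every step, the two events of each pair
// either both occur or both do not occur.
bool satisfies_synch(const std::vector<Step>& trace,
                     const std::vector<SynchPair>& pairs) {
    for (const Step& step : trace)
        for (const SynchPair& p : pairs)
            if (step.count(p.first) != step.count(p.second)) return false;
    return true;
}
```

A trace in which `B(P1, M.write)` occurs in a step without `B(mP1, mP1.writeCpu)` would be rejected under this reading.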
Outlook of the First Metropolis Release
[Figure: Meta model infrastructure. Labels: meta model language, front end, abstract syntax trees, back ends 1 to N, SystemC simulation, LOC checking, SPIN interface, meta model debugger]
• A design tutorial
• http://www.gigascale.org/metropolis/
• Sample MoC:
  • multi-media (Yapi, TTL)
  • Synchronous
• Sample architectural libraries:
  • coarse-simple cpu, bus, memory, arbiters
  • time quantity
TLM Conclusions
• SystemC is the de facto system-level design standard
  • Pushed by many CAD tool vendors
  • Used widely in industry and academia
    • E.g., Intel handheld system project [ICCAD'04]
  • Unified language to model a system at different levels
  • Improving path to HW synthesis from SystemC source code
  • Fits with the trend to take system design to a higher level
• Metropolis is a novel academic framework for models of computation
  • Capable of representing TLM as well
  • Provides a comprehensive starting point for synthesis
Outline
• Transaction-level model (TLM)
  • SystemC TLM
  • Metropolis Meta Model
• Synthesis from TLM
  • xPilot: our ongoing synthesis infrastructure for TLM
  • RDR/MCAS: our existing architectural synthesis approach
xPilot: TLM to RTL Synthesis Flow
[Figure: flow from TLM in SystemC/Metropolis through Frontend to SSDM, then to RTL, then to FPGAs]
• Arch-independent passes
  • SSDM checking
  • Loop unrolling/pipelining
  • Strength reduction/bitwidth analysis
  • Speculative-execution transformation
  • …
• Arch-dependent passes
  • Memory analysis/allocation
  • Scheduling/Binding/Memory analysis/allocation
  • Register/port binding
  • Traditional/Low power/RDR-pipe or placement driven
  • …
• Arch-generation passes: RTL/constraints generation
  • Verilog/VHDL/SystemC
  • Altera/Xilinx
  • General/Synopsys/Magma
  • …
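Two of the arch-independent passes named above, strength reduction and bitwidth analysis, can be illustrated on single values. This is a simplified sketch, not xPilot's implementation; the function names `min_bits`, `is_power_of_two`, and `mul_pow2` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Bitwidth analysis (simplified): smallest number of bits that can hold
// every unsigned value in [0, max_value].
unsigned min_bits(std::uint64_t max_value) {
    unsigned bits = 1;  // even the constant 0 occupies one bit
    while (max_value >>= 1) ++bits;
    return bits;
}

bool is_power_of_two(std::uint32_t c) { return c != 0 && (c & (c - 1)) == 0; }

// Strength reduction: x * c becomes x << log2(c) when c is a power of two,
// which replaces a multiplier with wiring in hardware.
std::uint32_t mul_pow2(std::uint32_t x, std::uint32_t c) {
    assert(is_power_of_two(c));
    unsigned shift = min_bits(c) - 1;  // equals log2(c) for a power of two
    return x << shift;
}
```

A real pass would operate on IR values with known ranges rather than on concrete numbers; the arithmetic is the same.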
Integration of xPilot with Metropolis
[Figure: Metropolis meta model infrastructure with xPilot/SSDM attached as a synthesis back end. Labels: Meta model language, Front end, Abstract syntax trees, SystemC Simulation, LOC Checking, SPIN Interface, Synthesis, xPilot/SSDM, Predictable RTL Synthesis, Compilation for RP, IP Assembly, IP Library, Latency Insensitive Design, RDR/MCAS, GALS, Extended Instruction, Reconfigurable Coprocessor, Reconfigurable Interconnect, RTL Handoff, Simulation, Timing Constraints, Physical Constraints, RTL, HW Implementation, FPGA, ASICs]
SSDM Zoomed In – CDFG
• 2-level CDFG representation
  • 1st level: control flow graph
  • 2nd level: data flow graph
Example source:
if (cond1) bb1(); else bb2();
bb3();
switch (test1) {
  case c1: bb4(); break;
  case c2: bb5(); break;
  case c3: bb6(); break;
}
bb7();
[Figure: control flow graph of the example. cond1 branches to bb1() (T) or bb2() (F), both reach bb3(); test1 branches to bb4() (c1), bb5() (c2), or bb6() (c3), all reach bb7()]
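A two-level CDFG can be sketched as control-flow nodes that each own a small data-flow graph. The types below (`DfgOp`, `BasicBlock`, `Cdfg`) are hypothetical and are not SSDM's actual classes; `build_example` constructs only the control-flow level of the if/switch example, following the edges drawn in the figure.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// 2nd level: one data-flow operation inside a basic block.
struct DfgOp {
    std::string opcode;         // e.g. "add", "call"
    std::vector<int> operands;  // indices of producer ops in the same block
};

// 1st level: a basic block is a CFG node that owns a small DFG.
struct BasicBlock {
    std::string name;
    std::vector<DfgOp> ops;
    std::vector<std::size_t> succs;  // control-flow successors (block indices)
};

struct Cdfg {
    std::vector<BasicBlock> blocks;
    std::size_t add_block(const std::string& name) {
        BasicBlock b;
        b.name = name;
        blocks.push_back(b);
        return blocks.size() - 1;
    }
    void add_edge(std::size_t from, std::size_t to) {
        assert(from < blocks.size() && to < blocks.size());
        blocks[from].succs.push_back(to);
    }
};

// Control-flow level of the example: an if/else followed by a 3-way switch.
Cdfg build_example() {
    Cdfg g;
    std::size_t cond1 = g.add_block("cond1");
    std::size_t bb1 = g.add_block("bb1");
    std::size_t bb2 = g.add_block("bb2");
    std::size_t bb3 = g.add_block("bb3");
    std::size_t test1 = g.add_block("test1");
    std::size_t bb4 = g.add_block("bb4");
    std::size_t bb5 = g.add_block("bb5");
    std::size_t bb6 = g.add_block("bb6");
    std::size_t bb7 = g.add_block("bb7");
    g.add_edge(cond1, bb1);  // T
    g.add_edge(cond1, bb2);  // F
    g.add_edge(bb1, bb3);
    g.add_edge(bb2, bb3);
    g.add_edge(bb3, test1);
    g.add_edge(test1, bb4);  // c1
    g.add_edge(test1, bb5);  // c2
    g.add_edge(test1, bb6);  // c3
    g.add_edge(bb4, bb7);
    g.add_edge(bb5, bb7);
    g.add_edge(bb6, bb7);
    return g;
}
```

The `ops` vectors are left empty here; a frontend would fill them with the data-flow operations of each block.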
SSDM Features Different from Software IR
• Top-level: netlist of concurrent processes
• Process port/interface semantics
  • FIFO: FifoRead() / FifoWrite()
  • BUFF: BuffRead() / BuffWrite()
  • Memory: MemRead() / MemWrite()
• Bit vector manipulation
  • Bit extraction / concatenation / insertion
  • Bit-width property for every value
• Cycle-level notation
  • Scheduling / binding information / delay
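The three bit-vector operations listed above can be sketched on values that carry their own bit-width. The `Bits` struct and the `extract`, `concat`, and `insert` helpers are hypothetical names, and widths are assumed to lie in 1..32; SSDM's actual representation is not shown in the slides.

```cpp
#include <cassert>
#include <cstdint>

// A value that carries its own bit-width (assumed 1..32 here).
struct Bits {
    std::uint32_t value;
    unsigned width;
};

std::uint32_t mask(unsigned width) {
    assert(width >= 1 && width <= 32);
    return width == 32 ? 0xFFFFFFFFu : ((1u << width) - 1u);
}

// Extraction: bits [hi:lo] of x; the result width is hi - lo + 1.
Bits extract(Bits x, unsigned hi, unsigned lo) {
    assert(lo <= hi && hi < x.width);
    unsigned w = hi - lo + 1;
    return Bits{(x.value >> lo) & mask(w), w};
}

// Concatenation: {a, b} with a in the upper bits.
Bits concat(Bits a, Bits b) {
    assert(a.width + b.width <= 32);
    return Bits{((a.value & mask(a.width)) << b.width) | (b.value & mask(b.width)),
                a.width + b.width};
}

// Insertion: overwrite bits [lo + v.width - 1 : lo] of x with v.
Bits insert(Bits x, Bits v, unsigned lo) {
    assert(lo + v.width <= x.width);
    std::uint32_t m = mask(v.width) << lo;
    return Bits{(x.value & ~m) | ((v.value << lo) & m), x.width};
}
```

Tracking the width next to the value is what lets later passes size registers and functional units exactly.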
Our Architectural Synthesis Approaches – RDR / MCAS
• Consideration of multi-cycle communication during architectural (or behavioral) synthesis
• Regular Distributed Register (RDR) micro-architecture [Cong et al, ISPD'03]
  • Highly regular
  • Direct support of multi-cycle on-chip communication
• MCAS: Architectural Synthesis for Multi-cycle Communication
  • Efficiently maps the behavioral descriptions to RDR uArch
  • Integrates architectural synthesis (e.g. resource binding, scheduling) with physical planning
RDR/MCAS: Support for Heterogeneous Integration with Multi-cycle Communication & Automatic Interconnect Pipelining
[Figure: RDR micro-architecture with islands numbered 1 to 4. Labels: Pipeline Register Station (PRS), FSM, LCC, Reg. File, H channel, V channel, Adaptor, IP Library]
• Distribute registers to each "island"
• Choose the island size such that
  • Single cycle for intra-island computation and communication
  • Multi-cycle communication between islands
• Support interconnect pipelining
  • Inter-island pipeline register station (PRS) for global communications
  • PRS performs autonomous store-and-forward
• MCAS: Multi-cycle architectural synthesis integrated with global placement
• Experimental results
  • MCAS vs. conventional flow:
    • 36% reduction in clock period
    • 30% reduction in total latency
  • MCAS-Pipe vs. MCAS:
    • 28.8% reduction in long global wirelength
    • 19.3% reduction in total wirelength
• Can also support IP integration using latency insensitive technique [Carloni, ICCAD'99]
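The store-and-forward behaviour of pipeline register stations can be illustrated with a chain of registers that each pass their word to the next station on every clock tick. This is a generic shift-register model under assumed names (`Word`, `PipelinedLink`, `tick`), not the PRS design from the RDR papers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A word travelling on the link; valid == false marks a bubble.
struct Word {
    bool valid;
    int data;
};

// Hypothetical model of a global link pipelined by `stations` register stations.
class PipelinedLink {
public:
    explicit PipelinedLink(std::size_t stations) : regs_(stations) {
        assert(stations >= 1);  // all stations start as bubbles
    }
    // Advances one clock cycle: every station forwards its stored word,
    // `in` enters the first station, and the word leaving the last
    // station is returned.
    Word tick(Word in) {
        Word out = regs_.back();
        for (std::size_t i = regs_.size() - 1; i > 0; --i) regs_[i] = regs_[i - 1];
        regs_[0] = in;
        return out;
    }
private:
    std::vector<Word> regs_;
};
```

With N stations, a word injected on one tick leaves the link N ticks later, which is the multi-cycle latency that scheduling has to account for.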
Synthesis Flow: MCAS-Pipe
[Figure: flow] SystemC / VHDL → CDFG generation → CDFG → Resource allocation & functional unit binding → ICG → Scheduling-driven placement → Locations → Placement-driven rescheduling & rebinding → Global interconnect sharing → Register and port binding → Datapath & FSM generation → RTL VHDL & floorplan constraints
• Global interconnect sharing
  • Enable multiple data communications to share one physical link (a wire with pipeline registers)
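The idea of letting several data communications share one physical link can be sketched as interval partitioning: transfers whose cycle ranges do not overlap may use the same link. The code below is a generic greedy heuristic swapped in for illustration, not the MCAS-Pipe algorithm, and it assumes all transfers run between the same pair of islands. `Transfer` and `share_links` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// A data transfer occupies a link during cycles [start, end).
struct Transfer {
    int start;
    int end;
};

// Returns, for each transfer (in input order), the index of the link
// it is assigned to. Transfers are processed by increasing start cycle
// and placed on the first link that is already free.
std::vector<std::size_t> share_links(const std::vector<Transfer>& transfers) {
    std::vector<std::size_t> order(transfers.size());
    for (std::size_t i = 0; i < order.size(); ++i) order[i] = i;
    std::sort(order.begin(), order.end(), [&](std::size_t a, std::size_t b) {
        return transfers[a].start < transfers[b].start;
    });
    std::vector<int> link_free_at;  // cycle at which each link becomes free
    std::vector<std::size_t> assignment(transfers.size());
    for (std::size_t idx : order) {
        const Transfer& t = transfers[idx];
        assert(t.start < t.end);
        std::size_t link = 0;
        while (link < link_free_at.size() && link_free_at[link] > t.start) ++link;
        if (link == link_free_at.size()) link_free_at.push_back(0);  // open a new link
        link_free_at[link] = t.end;
        assignment[idx] = link;
    }
    return assignment;
}
```

For three transfers occupying cycles [0,2), [1,3), and [2,4), the first and third end up on one link and the second on another, so two links suffice instead of three.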
Related Publications
• Regular distributed register (RDR) architecture and MCAS synthesis algorithms
  • ISPD'03, ICCAD'03
• RDR-Pipe and MCAS-Pipe synthesis algorithms
  • DAC'04
• Lopass: high-level synthesis for low-power FPGAs
  • ISLPED'03
• Multiplexor optimization through register/port binding
  • ASPDAC'04
• Bitwidth-aware scheduling and binding algorithms
  • ASPDAC'05
Conclusions
• Higher-level abstraction is needed in the current SO(P)C design flow
  • SystemC is becoming the SLD standard; TLM in particular is widely used
  • Metropolis is a platform-based design framework
• It is time to build a new generation of behavioral synthesis systems from TLM
• xPilot:
  • Ongoing project
  • An architectural synthesis infrastructure from TLM to RTL (FPGAs)