
Training Software Version v2.2


finn-burris
Presentation Transcript


  1. Training Software Version v2.2

  2. Training Overview Key Concepts Edit and Compile Source Create Architecture Map to Architecture Schedule Operations Build the RT-Level Verify the Design Create and Use a User Library Supported C subset

  3. Key Concepts Edit and Compile Source Create Architecture Map to Architecture Schedule Operations Build the RT-Level Verify the Design Create and Use a User Library Supported C subset

  4. Electronic Product Design • High-complexity applications • Time-2-Market, Time-2-Profit • Power-efficient, high-performance, cost-effective, flexible architectures • Low-cost, low-power deep-sub-micron silicon assembly • [Figure: chip with ROM, RAM, embedded core, I/O logic and custom logic blocks]

  5. Design Flow • Abstraction levels: algorithm, architecture, RT-level, gates, layout • Steps shown: behavioral synthesis, RT-level synthesis, layout generation • [Figure: architecture with branch logic, ALU, MULT, IN, OUT, RAM, ROM]

  6. Time-to-Market • Raising the abstraction level • Code compactness • Algorithmic description FIR filter • 100 lines of C code • RT-level description FIR filter • 5,200 lines of HDL • “Blackbox” • Better simulation performance • Easier design transfer and re-use BEHAVIOR ( C SUBSET ) RT-LEVEL

  7. Flexibility • Optimal area for application • Low-power design • More processing power/throughput • Same starting point: • FPGA • ASIC $ : cheaper custom solution BEHAVIOR ( C SUBSET ) RT-LEVEL

  8. Flexibility: Example • Behavioral synthesis: 8000 gates before, 4000 gates after (reduction: 50%) • RT-level synthesis: 80 gates before, 70 gates after (reduction: 13%)

  9. Application Area • Data path elements are shared over clock cycles • Moderate decision making is involved • [Figure: controller (FSM) exchanging control and flags with a data path made of cores, register files, RAM/ROM and address/data registers]

  10. Typical Applications • ASSP: Application Specific Standard Product • Relatively complex data/signal processing • GSM, DECT, wireless LAN • Speech recognition, compression, processing • JPEG, image processing • Portable medical electronics • ...

  11. Design Constraints • Design considerations: • Algorithm level • Frame rate • Frame = 1 execution of your algorithm • 1 frame consumes 1 value for each input, produces 1 value for each output • e.g. GSM LTP: 1 data frame (160 samples) every 20 ms • Maximal latency = delay on the signal caused by the algorithm • RT-level • Clock rate • e.g. 50 MHz clock • Cycle budget = Clock rate / Frame rate • The number of clock cycles available to execute one frame • e.g. for GSM LTP: 4000 cycles
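
The cycle-budget relation on this slide can be checked with a few lines of C++. This is only a sketch: the function names are hypothetical and not part of the tool. Taking the slide's two example figures together, a 20 ms frame period is 50 frames per second, and a 50 MHz clock then gives 1,000,000 cycles per frame. The 4000-cycle figure quoted for GSM LTP would correspond to a 200 kHz clock under the same relation, so the examples on the slide presumably come from different setups.

```cpp
#include <cstdint>

// Hypothetical helper: frame rate (frames per second) for a frame period in ms.
// Assumes period_ms is non-zero and divides 1000 evenly.
std::uint64_t frame_rate_from_period_ms(std::uint64_t period_ms) {
    return 1000 / period_ms;
}

// Cycle budget = clock rate / frame rate (both in Hz).
// Integer division: a partial clock cycle cannot be used.
// Assumes frames_per_second is non-zero.
std::uint64_t cycle_budget(std::uint64_t clock_hz, std::uint64_t frames_per_second) {
    return clock_hz / frames_per_second;
}
```

For instance, `cycle_budget(50000000, frame_rate_from_period_ms(20))` evaluates the 50 MHz, 20 ms case.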

  12. Target Processor Architecture branch logic ALU MULT IN OUT RAM ROM

  13. Structure of a Cluster

  14. Internal Design Flow • Inputs: system specification, embedded software, ANSI C source, HW resource libraries (datapath resources: arithmetic, memory; legacy HDL; vendor HDL) • Steps: Edit/Compile, Create Architecture, Map to Architecture, Schedule Operations, Build RTL code, Logic Synthesis (FPGA, ASIC) • Iterations: source code tuning, architecture optimization, performance analysis

  15. Internal Design Flow (2) • [Figure: five numbered steps (Compilation, Architecture Creation, Mapping, Scheduling, Building) around a central data structure; inputs are C source, resource libraries and pragmas; outputs are VHDL and Verilog]

  16. Defaults, Options and Pragmas • Increasing order of priority: • Tool defaults • Option settings (if any) • Pragmas for specific cases

  17. Hardware Libraries • Default library • Supplied by Frontier • Two versions: - for Xilinx FPGA flow - for ASIC flow • Sufficient to map all supported C operators • User libraries • Existing hardware blocks • Custom hardware blocks for better speed/area/power trade-off

  18. Project organization artd_cache

  19. Key Concepts Create Architecture Map to Architecture Schedule Operations Build the RT-Level Verify the Design Create and Use a User Library Supported C subset Edit and Compile Source

  20. Key Concept • In a first step, A|RT Designer converts the behavioral description of your algorithm into an internal representation. In doing so, it checks whether the code is C/C++ compliant and whether non-synthesizable constructs are present • You can describe your algorithm in C/C++, optionally enriched with A|RT Library fixed-point types in C style or SystemC style • To use A|RT Library types: #include <fxp.h> /* C/C++ version */ #include <sc_fxp.h> /* SystemC version */

  21. C Compiler Optimizations • Dead code elimination • Constant propagation • only for temporary expressions with constants • b = a + 2 + 3 => b = a + 5
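
The two optimizations named on this slide can be illustrated in plain C++. This is a sketch of the general techniques, not of the tool's internal behavior; all function names are hypothetical. Each pair of functions returns the same value, which is what allows a compiler to substitute the second form for the first.

```cpp
// Constant propagation: the slide's example, b = a + 2 + 3.
int before_folding(int a) {
    int b = a + 2 + 3;   // two constant operands in one temporary expression
    return b;
}

// The form a compiler may reduce it to: b = a + 5.
int after_folding(int a) {
    int b = a + 5;
    return b;
}

// Dead code elimination: 'unused' is computed but never read,
// so the multiplication contributes nothing to the result.
int with_dead_code(int a) {
    int unused = a * 7;
    (void)unused;        // silences the unused-variable warning
    return a + 5;
}

// Equivalent function after the dead assignment has been removed.
int without_dead_code(int a) {
    return a + 5;
}
```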

  22. C Compiler Options (1) • Name of the function to be compiled (default: last function in the C source) • Specification of the include search path • Multiple entries are separated by semicolons • Specification is relative to the project subdirectory • Example: /home/john/include;..;$MY_INCLUDES/include • Macros to be defined/undefined, semicolon separated • Example for defines: FXPTRACE;MY_DEFINE=1 • Enables C test bench generation • I/O can be read in binary or decimal format • Saves the source file obtained after CPP processing • Enables strict ANSI C compliance

  23. C Compiler Options (2) • Data flow analysis identifies and accurately represents the parallelism of the C code by determining the exact data dependencies between the variables, in order to achieve: - better performance - optimal use of the target processor

  24. Data Flow Analysis void calc_address(const T_AD i, const T_AD j, T_AD& address) { address = const1*i + const2*j; } void mydesign(…) { ... for (i=0; i<16; i++) { for (j=0; j<16; j++) { calc_address(i,j,address); a = A[address]; ….. // calculation of b A[address] = b; } } } DFA will check whether or not write address is different from read address for every iteration! This will determine how much loop folding can be performed.
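
To make the dependency question concrete, the check can be imitated by brute force: enumerate every iteration of the loop nest and see whether two iterations ever compute the same address. This is only an illustration, assuming the address expression from the slide (const1*i + const2*j); the function name is hypothetical, and the tool's actual data flow analysis is not described in the slides.

```cpp
#include <set>

// Returns true when address = const1*i + const2*j is different in every
// (i, j) iteration, i.e. a write in one iteration can never hit the element
// read in another iteration. In that case iterations are independent with
// respect to the array and can be overlapped more aggressively.
bool addresses_are_distinct(int const1, int const2, int n_i, int n_j) {
    std::set<int> seen;
    for (int i = 0; i < n_i; ++i) {
        for (int j = 0; j < n_j; ++j) {
            // insert() reports through .second whether the value was new
            if (!seen.insert(const1 * i + const2 * j).second) {
                return false;   // two iterations share an address
            }
        }
    }
    return true;
}
```

With the 16 x 16 loop nest of the slide, const1 = 16 and const2 = 1 give a distinct address per iteration, whereas const1 = const2 = 1 does not, because iterations (0,1) and (1,0) both yield address 1.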

  25. Pragmas in C Source • #pragma OUT <var_name_1> <var_name_2> … • Used to indicate function arguments that are strictly outputs • This is not checked by the compiler! Example: void array (const Int<16> in[4], Int<16> out1[4], Int<16> out2[4]) { #ifdef __SYNTHESIS__ #pragma OUT out1 out2 #endif for(int i=0;i<4;i++){ out1[i]=in[i]-i; out2[i]=in[i]+i; } } void addsub (const Int<8> a, const Int<8> b, Int<8>& c, Int<8>& d) { #ifdef __SYNTHESIS__ #pragma OUT c d #endif c=a+b; d=a-b; }

  26. Key Concepts Edit and Compile Source Map to Architecture Schedule Operations Build the RT-Level Verify the Design Create and Use a User Library Supported C subset Create Architecture

  27. Key Concept • In this step, you instantiate the hardware resources that define the target architecture you want to use • You only have to instantiate the central elements of the hardware clusters (auxiliary resources such as register files, muxes and tristate buffers are generated automatically in a later step): • Cores (ALU, MULT, …) • Memories (RAM, ROM, …) • Ports (INPORT, OUTPORT) • You also instantiate one type of controller

  28. Architecture Model

  29. Instantiating Resources • Resources can be instantiated from: • The default library: artd_library (for ASIC flow) or artd_xilinx_library (for Xilinx FPGA flow) • A user library • The libraries must have been selected in the Create Architecture options

  30. Resources in the Default Library (1) • Cores • alu, alusat, • mult, multp, mac2, mac3 • acu • Memories • rom, ram • romctrl • dpram_r_w, dpram_r_rw, dpram_w_rw, dpram_rw_rw • dprom, dpromctrl

  31. Resources in the Default Library (2) • Ports • inport, inport_nohs, inport_noaddr, inport_noaddr_nohs • outport, outport_nohs, outport_noaddr, outport_noaddr_nohs • Controllers • mbc_11, mbc_12, mbc_22, mbc_23

  32. Pragma Syntax Table • I : integer (e.g. 10) • IL: integerlist (e.g. [10,20,6] ) • IW: integer or wildcard (e.g. 10 or * or _) • C : quoted string (e.g. "acu") • CL: quoted stringlist (e.g. ["in1:8","in2:10"]) • EXPR : expression (e.g. _*_)

  33. Pragmas (1) • instantiate(C, C, C); • instantiate("libraryName", "resourceName", "instanceName"); • This pragma instantiates a resource defined in a library • The default library is called artd_library or artd_xilinx_library • Multiple instances of the same resource can be created • EXAMPLE: • instantiate("artd_xilinx_library","multp","multp_1"); • instantiate("artd_library","mbc_12","ctrl"); • instantiate("my_own_library","multiplier","mymult");

  34. Pragmas (2) • instantiate_function(C, C); • instantiate_function("functionName", "instanceName"); • This pragma instantiates a virtual resource, not defined in a library • All calls to the named function will be mapped on this virtual resource as single-cycle operations • Only a single function can be associated with a virtual resource • Allows design exploration without actually having to create a library element • EXAMPLE: • instantiate_function("cordic","cordic_1");

  35. Pragmas (3) • merge_regfiles(CL, C); • merge_regfiles(["registerfileName"], "newRegisterfileName"); • Merges a list of register files into a new register file with the specified name • May lead to fewer registers but possibly a longer schedule • EXAMPLE: • merge_regfiles(["reg_a_ram_1","reg_dx_acu_1"], "addr_reg"); • [Figure: before the merge, ram_1 and acu_1 have separate register files (reg_d, reg_a, reg_dx, reg_dz); after the merge, reg_a of ram_1 and reg_dx of acu_1 are replaced by the shared addr_reg]

  36. Pragmas (4) • set_regfileports(C, [IN,OUT], I); • set_regfileports("regFileName", IN|OUT, nrports); • This pragma allows you to generate multiport register files • This pragma overrules the default register file settings of one input port and one output port • EXAMPLE: • set_regfileports("merged_reg",IN,2); • set_regfileports("merged_reg",OUT,2); This will result in a multiport register file called "merged_reg" with two input ports and two output ports

  37. Pragmas (5) • connect_bus(C, CL, CL); • connect_bus("busName", ["writer"], ["reader"]); • Allows you to define a bus and its connections • With this pragma you can restrict resources from writing to specific buses, or you can merge a number of buses into one single bus • By using multiple connect_bus pragmas you can define a partial or a complete bus network. The outport of a resource that still has no bus connection after the last connect_bus pragma will automatically receive a private bus • EXAMPLE: • connect_bus("ram2_bus",["acu_2:dout"],["reg_a_ram_2:d0","reg_dx_acu_2:d0"]); Defines a bus called "ram2_bus" that is written to by the output of acu_2 and read by the address port of ram_2 and the first input port of acu_2

  38. Pragmas (6) • no_connection(C, CL); • no_connection("writer", ["reader"]); • With this pragma you can prohibit connections between one output of a resource (defined by the first argument!) and a list of inputs • EXAMPLE: • no_connection("romctrl_1:dout",["reg_a_ram_2:d0","reg_dx_acu_2:d0"]); Using this pragma, no connection will be present between the output of romctrl_1 and either the address register of ram_2 or the first input of acu_2

  39. Default Architecture • The following resources from the default (ASIC) library are automatically instantiated when a new project is created: • alu, mult • acu • romctrl • ram, rom • inport, outport • mbc_23

  40. Example Pragma File //INPORT and OUTPORT without address generation instantiate("artd_library","inport_noaddr","inport_1"); instantiate("artd_library","outport_noaddr","outport_1"); //ACU and ROMCTRL for RAM and ROM addressing instantiate("artd_library","acu","acu_ram"); instantiate("artd_library","acu","acu_rom"); instantiate("artd_library","romctrl","romctrl_ram"); instantiate("artd_library","romctrl","romctrl_rom"); //Cores and Memories instantiate("my_library","mac","my_mac"); instantiate("artd_library","rom","rom_1"); instantiate("artd_library","ram","ram_1"); //Controller instantiate("artd_library","mbc_23","ctrl"); //dedicated address generation cluster connect_bus("bus_romctrl_rom",["romctrl_rom:dout"],["reg_*_acu_rom:d0"]); connect_bus("bus_dout_acu_rom",["acu_rom:dout"],["reg_*_acu_rom:d0","reg_a_acu_rom:d0"]); no_connection("acu_ram:dout",["reg_a_rom_1:*"]); no_connection("romctrl_ram:dout",["reg_*_acu_rom:*","reg_a_rom_1:*"]);

  41. Views Architecture view:

  42. Views • Architecture view • Graphical representation of the selected architecture • In this view you can select and highlight individual components and resources. You can also jump to the architecture report for a detailed textual overview

  43. Reports (1) Architecture Report :

  44. Reports (2) • Architecture report • Lists all selected resource instances and their registers • Lists for each instance/register: • input ports and connected register files/muxes • output ports and connected buses • Resources from the default library are listed with unspecified types and with their complete instruction set • Resources from user libraries are listed with types and instruction list as specified in the library

  45. Key Concepts Edit and Compile Source Create Architecture Schedule Operations Build the RT-Level Verify the Design Create and Use a User Library Supported C subset Map to Architecture

  46. Key Concepts • In the mapping step the following tasks are performed: • Memory management: variables and temporary variables (introduced by the compilation step) are allocated to the available memory resources • Core resource assignment: operations from the design are assigned to corresponding core resources and translated into RTs (register transfers) • Multiplexer introduction: muxes are introduced if more than 1 bus is connected to the input of a register, or if 2 or more variables with different types are transferred to that input over a bus connected to it
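
The multiplexer rule stated on this slide reduces to a simple predicate. The sketch below is an interpretation of that rule only: the function name and the representation of buses and types as strings are assumptions, not part of A|RT Designer.

```cpp
#include <set>
#include <string>
#include <vector>

// Decides whether a register input needs a mux, following the slide's rule:
//  - more than one bus is connected to the input, or
//  - variables of two or more different types arrive over a connected bus.
// Duplicate names are collapsed first, so listing a bus twice does not count.
bool needs_mux(const std::vector<std::string>& connected_buses,
               const std::vector<std::string>& transferred_types) {
    std::set<std::string> buses(connected_buses.begin(), connected_buses.end());
    std::set<std::string> types(transferred_types.begin(), transferred_types.end());
    return buses.size() > 1 || types.size() > 1;
}
```

A single bus carrying only `Int<8>` values needs no mux under this reading, while a single bus carrying both `Int<8>` and `Int<16>` values does.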

  47. Memory Management • [Figure: memory resources ordered by access speed against area per memory location. Addressed by the controller: scalars in RegFile, constants in ROMCTRL. Addressed by the data path: arrays in RAM, ROM, INPORT/OUTPORT]

  48. Core Resource Assignment • Resource assignment is completely determined by a set of internal mapping rules and by user pragmas • The rules are divided into two groups: • The first set applies to the mapping of the core resources in the default library. This set of rules is transparent to the user but not accessible • The second set applies to the mapping of operations on user-defined resources and is an essential part of the pragmas of the corresponding user-defined library

  49. Mapping Rules • Operations or instructions on resources from the standard library are handled as taking one clock cycle. Exception: MAC (has a pipeline register) • By default, operations and implicit operations are mapped to the first instance of a resource that can execute the operation • First means: first instantiated in the pragma file of the previous step • Implicit operations: - ROM/RAM addressing: initialize address, compute next address - FOR loops: initialize loop counter, update, test - Implicit constants for all instances
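
The default "first instance that can execute the operation" rule can be sketched as a first-fit search over the instances in instantiation order. The `Instance` struct, the function name and the operation names used with it are hypothetical; the tool's real rule set is internal and not shown in the slides.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical model of one instantiated resource and the operations it supports.
struct Instance {
    std::string name;
    std::vector<std::string> ops;
};

// Returns the name of the first instance, in instantiation order, that supports
// the operation, or an empty string if no instance can execute it.
std::string map_operation(const std::vector<Instance>& instances,
                          const std::string& op) {
    for (const Instance& inst : instances) {
        if (std::find(inst.ops.begin(), inst.ops.end(), op) != inst.ops.end()) {
            return inst.name;
        }
    }
    return "";
}
```

Because the search stops at the first match, reordering the instantiate pragmas changes which instance receives an operation that several instances support.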

  50. Multiplexer Introduction • In a last stage of the mapping step, muxes are introduced where needed • Their function is threefold: • bus selection • data alignment • type manipulation: performed by coding cast operations
