
ARC: A Performance Analysis Environment for Adaptive Computing Systems

ARC: A Performance Analysis Environment for Adaptive Computing Systems. Ranga Vemuri Jeff Walrath Digital Design Environments Laboratory ECECS Department, ML. 30 University of Cincinnati Cincinnati, Ohio 45221-0030 Phone: (513)-556-4784 Fax: (513)-556-7326 Email: ranga.vemuri@uc.edu


Presentation Transcript


  1. ARC: A Performance Analysis Environment for Adaptive Computing Systems. Ranga Vemuri, Jeff Walrath. Digital Design Environments Laboratory, ECECS Department, ML. 30, University of Cincinnati, Cincinnati, Ohio 45221-0030. Phone: (513)-556-4784. Fax: (513)-556-7326. Email: ranga.vemuri@uc.edu. Web: http://www.ececs.uc.edu/~ddel

  2. What need does ARC address? Diagram labels: ACS Compilers; ACS Applications; ACS Design Tools; ACS Performance Modeling & Analysis; ACS Operating Systems; ACS Developers.

  3. ACS Performance Analysis. Diagram labels: Field Programmable Device; Configure; Reconfigure; Reconfigure. Axes: Power, Time.

  4. ACS Abstractions. Layers: Adaptive Software Layers; Adaptive Systems (H/w and S/w); Reconfigurable Computer; Reconfigurable Devices.

  5. ACS Applications: Adaptive System for Imaging Applications. Diagram labels: Config 0 through Config 5; Forward Discrete Cosine Transform; Quantization; Zig-Zag Transform; Run Length Encoding; Huffman Encoding; branch labels "More Blocks" and "No More Blocks". Axes: Power, Time.

  6. ARC System. Diagram blocks: ACS Element Performance Models (PDL+); ACS Structure (PNF, GUI, VHDL, EDIF, ...); ARC; API; ACS Application (C++); Results Database; Visualization Tool (Gnuplot/Khoros).

  7. ACS Element Performance Models. Diagram labels: Modules a, b, c, d; Module c shown in Mode c1 and in Mode c2; Ports; Carriers W, X, Y.

  8. ACS System Performance Model. Diagram labels: a, b, c, d, e.

  9. Summary of ARC • Performance Description Language, PDL+ • ACS Architecture Specifications (GUI, PNF, VHDL, ...) • ARC Software (Compiler, Composer, Evaluator, and Scheduler) • API for Application Interaction • Visualization Interface

  10. Illustrative Example • Simple LUT-style FPGA architecture • RC with multiple FPGAs and fixed interconnect. • ACS with host processor and RC coprocessor. • Hardware/Software Tasks executing on the ACS. Typical Performance Related Questions: 1. How does a proposed hardware/software binding of tasks perform? 2. Which member of an FPGA family should be selected to meet desired throughput?

  11. Simple FPGA Architecture. Diagram labels: FPGA Inputs; four CLBs; FPGA Outputs.

  12. LUT-Based FPGA Cell • Delay Through The CLB. Example mode: <<[0, 1, 1, 0], 1>>. Diagram labels: Inputs; Look-Up Table Function Generator; LUT Bits (0 1 1 0, indexed 0, 1, ..., k); Flip Flop; MUX; Output.

  13. FPGA Performance Model

      module fpga : fpga_mode := []
        modules
          clbs{}: clb;
        ports
          inputs{}: ioport;
          outputs{}: ioport;
        rules
          clbs{x}'mode = mode[x'id];
          clbs{}'trigger = trigger;
        attributes
          primitive id: int;
          time: real;
          qdynamic clock_period: real := 0.0;
        rules
          inputs{}'time = 0;
          time = max_real(foreach x:clb in clbs {x'time});
          clock_period = max_real([time, curr clock_period]);
      endmodule;

      port ioport
        attributes
          time: real;
      endport;

      module lut_bit : boolean := 0
        attributes
          primitive id: int;
      endmodule;

      type
        mode_list[]: boolean;
        clb_mode: record
          fg_mode : mode_list;
          ff_mode : boolean;
        endrecord;
        fpga_mode[]: clb_mode;
      endtype;

      module function_generator : mode_list := []
        modules
          lut{}: lut_bit;
        ports
          inputs{}: ioport;
          output : ioport;
        attributes
          primitive delay_per_lut_bit: real;
          delay: real;
        rules
          lut{x}'mode = mode[x'id];
          lut{x}'trigger = trigger;
          delay = #lut * delay_per_lut_bit;
          output'time = max_real(foreach x:ioport in inputs {x'time}) + delay;
      endmodule;

      module flip_flop : boolean := 0
        ports
          input, output: ioport;
        attributes
          primitive delay: real;
          time: real;
        rules
          output'time = 0.0;
          time = input'time + delay;
      endmodule;

      module multiplexer
        ports
          input1, input2, output: ioport;
        attributes
          primitive delay: real;
          select: boolean;
        rules
          output'time = if select then input1'time + delay
                        else input2'time + delay endif;
      endmodule;

      module clb : clb_mode := <<[],0>>
        modules
          fg : function_generator;
          ff : flip_flop;
          mux: multiplexer;
        ports
          inputs{}: ioport;
          output : ioport;
        attributes
          primitive id: int;
          time: real;
        rules
          fg'mode = mode.fg_mode;
          ff'mode = mode.ff_mode;
          ff'trigger = trigger;
          fg'trigger = trigger;
          mux'select = curr ff'mode;
          time = ff'time;
      endmodule;
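
The timing rules in the PDL+ model on slide 13 can be traced by hand with a small Python sketch. This is not ARC or PDL+ itself, only a plain restatement of the arithmetic in the rules. The function names are hypothetical, and the wiring (function generator output feeding the flip-flop input) is an assumption, since the slide shows the rules but not the interconnect.

```python
# Illustrative restatement of the slide-13 timing rules; not the ARC evaluator.
# Assumption: the function generator output drives the flip-flop input.

def function_generator_output_time(input_times, num_lut_bits, delay_per_lut_bit):
    """output'time = max(inputs' time) + #lut * delay_per_lut_bit"""
    delay = num_lut_bits * delay_per_lut_bit
    return max(input_times) + delay


def flip_flop_time(input_time, ff_delay):
    """time = input'time + delay"""
    return input_time + ff_delay


def multiplexer_output_time(select, input1_time, input2_time, mux_delay):
    """output'time = (input1'time if select else input2'time) + delay"""
    return (input1_time if select else input2_time) + mux_delay


def clb_time(input_times, num_lut_bits, delay_per_lut_bit, ff_delay):
    """clb time = ff'time, with the flip-flop fed by the function generator (assumed)."""
    fg_out = function_generator_output_time(input_times, num_lut_bits, delay_per_lut_bit)
    return flip_flop_time(fg_out, ff_delay)


def fpga_clock_period(clb_times, curr_clock_period=0.0):
    """time = max over CLBs; clock_period = max(time, curr clock_period)"""
    time = max(clb_times)
    return max(time, curr_clock_period)


if __name__ == "__main__":
    # Hypothetical delays in arbitrary time units.
    t0 = clb_time([0.0, 0.0], num_lut_bits=4, delay_per_lut_bit=0.5, ff_delay=1.0)
    t1 = clb_time([0.0, 0.0], num_lut_bits=2, delay_per_lut_bit=0.5, ff_delay=1.0)
    print(fpga_clock_period([t0, t1]))
```

Because `clock_period` takes the maximum of the new `time` and its current value, it can only grow across successive evaluations, so it ends up recording the worst case seen over all configurations evaluated so far.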

  14. Speed-Grade Selection Analysis
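
The slide gives only the title, so the following Python sketch is one plausible reading of a speed-grade selection analysis, tied to the question on slide 10 about which member of an FPGA family meets a desired throughput. The grade names, delay values, and both function names are hypothetical. The critical-path formula (LUT bits times per-bit delay, plus flip-flop delay) follows the slide-13 rules under the assumption that the function generator feeds the flip-flop.

```python
# Illustrative sketch only: all grade names and delays are made up.

def critical_path(lut_bits_per_clb, delay_per_lut_bit, ff_delay):
    """Worst CLB delay: #lut * delay_per_lut_bit + flip-flop delay."""
    return max(n * delay_per_lut_bit + ff_delay for n in lut_bits_per_clb)


def select_speed_grade(grades, lut_bits_per_clb, target_period):
    """Return the slowest grade whose clock period meets the target, or None.

    grades maps a grade name to (delay_per_lut_bit, ff_delay).
    """
    feasible = []
    for name, (delay_per_lut_bit, ff_delay) in grades.items():
        period = critical_path(lut_bits_per_clb, delay_per_lut_bit, ff_delay)
        if period <= target_period:
            feasible.append((period, name))
    if not feasible:
        return None
    # Largest feasible period = slowest part that still meets the target.
    return max(feasible)[1]


if __name__ == "__main__":
    grades = {
        "grade_slow": (0.5, 1.0),
        "grade_mid": (0.25, 0.5),
        "grade_fast": (0.125, 0.25),
    }
    design = [4, 8, 16]  # LUT bits used by each CLB in the design
    print(select_speed_grade(grades, design, target_period=5.0))
```

Picking the slowest feasible grade is a design choice made for this sketch, on the assumption that slower parts are preferable when they suffice. A real analysis might weigh other criteria.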

  15. RC Coprocessor • Maximum clock speed for a given board configuration. • Critical path delay (clock speed) for a given configuration. Diagram labels: four FPGAs; four MEMORY blocks; INTERCONNECT.

  16. ACS Architecture. Diagram blocks: Processor; Memory; Reconfigurable Co-Processor.

  17. Codesign Application. Legend: • Software Task • Hardware Task • Channel. Attributes: • Instruction Sequence • Configuration Data • Time Steps • Bandwidth. Diagram labels: Task1 (SW), Task2 (HW), Task3 (SW), Task4 (HW), Task5 (HW), Task6 (SW), Task7 (SW), connected by channels.

  18. Codesign Tradeoff Analysis
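
The slide gives only the title. As a rough illustration of the first question on slide 10 (how a proposed hardware/software binding of tasks performs), the Python sketch below estimates the completion time of a task graph under a given binding. Everything in it is an assumption rather than ARC's method: the task names and timings are hypothetical, tasks are assumed never to contend for the processor or coprocessor, and a channel costs `data / bandwidth` only when its two endpoint tasks are bound to different sides.

```python
# Illustrative sketch only: a longest-path estimate, not ARC's scheduler.

def evaluate_binding(exec_time, channels, binding, bandwidth):
    """Estimate completion time of a task graph for one SW/HW binding.

    exec_time: {task: {"SW": time, "HW": time}}, listed in topological order.
    channels:  list of (src, dst, data_units).
    binding:   {task: "SW" or "HW"}.
    """
    finish = {}
    for task in exec_time:  # dicts keep insertion order (assumed topological)
        ready = 0.0
        for src, dst, data in channels:
            if dst != task:
                continue
            # Pay the transfer cost only across the SW/HW boundary (assumption).
            transfer = data / bandwidth if binding[src] != binding[dst] else 0.0
            ready = max(ready, finish[src] + transfer)
        finish[task] = ready + exec_time[task][binding[task]]
    return max(finish.values())


if __name__ == "__main__":
    # Hypothetical three-task chain with made-up timings.
    exec_time = {
        "T1": {"SW": 4.0, "HW": 1.0},
        "T2": {"SW": 6.0, "HW": 2.0},
        "T3": {"SW": 3.0, "HW": 3.0},
    }
    channels = [("T1", "T2", 8.0), ("T2", "T3", 4.0)]
    for binding in (
        {"T1": "SW", "T2": "SW", "T3": "SW"},
        {"T1": "SW", "T2": "HW", "T3": "SW"},
        {"T1": "HW", "T2": "HW", "T3": "SW"},
    ):
        print(binding, evaluate_binding(exec_time, channels, binding, bandwidth=4.0))
```

Comparing the estimate across candidate bindings shows the tradeoff: moving a task to hardware shortens its execution time but can add channel transfer time at each SW/HW boundary.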

  19. ARC Demonstrations • Small Scale Examples: • LUT-based FPGAs • Context-switching FPGAs • Programmable interconnects • Simple processor and memory models • ACS software performance models • Large Scale Demonstrations: • Xilinx 4000 series FPGA performance model • AMS WildForce RC models for use in the SPARCS partitioning and synthesis system

  20. SPARCS: Synthesis and Partitioning System for RCs. Design flow: Behavioral-Level Specification; High-Level Synthesis (UC); RT-Level Specification; Logic Synthesis (Synopsys); Gate-Level Specification; Layout Synthesis (UC/Xilinx); Bitstreams. Partitioning System: Temporal Partitioning; Spatial Partitioning; Light-Weight Behavioral/Logic/Layout Synthesis Algorithms. Other blocks: Architecture Specification; ARC (ACS Performance Analysis); AMS WildForce Board.

  21. Further Information…. Visit http://www.ececs.uc.edu/~ddel/arc.html Would like software? Have ideas for demonstrations? Please Contact: ranga.vemuri@uc.edu 513-556-4784
