
ATHENa - Automated Tool for Hardware EvaluatioN

ATHENa is an open-source tool, written in Perl, for the automated generation of optimized results across multiple hardware platforms. It runs synthesis, implementation, and timing analysis of FPGA designs in batch mode, verifies designs through simulation, and searches for the best tool options and target clock frequency. It is integrated with a database of results, which supports extraction, tabulation, and ranking of designs.

lauralane

Presentation Transcript


  1. ECE 545 Lecture 12 ATHENa - Automated Tool for Hardware EvaluatioN

  2. Resources • ATHENa website • http://cryptography.gmu.edu/athena

  3. ATHENa – Automated Tool for Hardware EvaluatioN Supported in part by the National Institute of Standards & Technology (NIST)

  4. ATHENa Team • John, MS CpE student • Venkata “Vinny”, MS CpE student • Ekawat “Ice”, PhD CpE student • Marcin, PhD ECE student • Michal, PhD exchange student from Slovakia • Rajesh, PhD ECE student

  5. ATHENa – Automated Tool for Hardware EvaluatioN (http://cryptography.gmu.edu/athena) • Open-source benchmarking tool, written in Perl, aimed at AUTOMATED generation of OPTIMIZED results for MULTIPLE hardware platforms • Currently under development at George Mason University.

  6. Why Athena? “The Greek goddess Athena was frequently called upon to settle disputes between the gods or various mortals. Athena Goddess of Wisdom was known for her superb logic and intellect. Her decisions were usually well-considered, highly ethical, and seldom motivated by self-interest.” (from “Athena, Greek Goddess of Wisdom and Craftsmanship”)

  7. Generation of Results Facilitated by ATHENa “working” with ATHENa… vs. old days…

  8. Basic Dataflow of ATHENa [diagram] Actors: User, Designer, ATHENa Server. Labeled flows and blocks: Download scripts and configuration files • HDL + scripts + configuration files • HDL + FPGA Tools • FPGA Synthesis and Implementation • Result Summary + Database Entries • Database Entries • Database query • Ranking of designs • Interfaces + Testbenches

  9. constraint files • configuration files • testbench • synthesizable source files • database entries (machine-friendly) • result summary (user-friendly)

  10. ATHENa Major Features (1) • synthesis, implementation, and timing analysis in batch mode • support for devices and tools of multiple FPGA vendors: • generation of results for multiple families of FPGAs of a given vendor • automated choice of a best-matching device within a given family

  11. ATHENa Major Features (2) • automated verification of designs through simulation in batch mode • support for multi-core processing • automated extraction and tabulation of results • several optimization strategies aimed at finding: optimum options of tools; best target clock frequency; best starting point of placement

  12. Generation of Results Facilitated by ATHENa • batch mode of FPGA tools • ease of extraction and tabulation of results: text reports, Excel, CSV (Comma-Separated Values) • optimized choice of tool options: GMU_optimization_1 strategy

  13. Relative Improvement of Results from Using ATHENa Virtex 5, 256-bit Variants of Hash Functions Ratios of results obtained using ATHENa suggested options vs. default options of FPGA tools

  14. Other (Somewhat) Similar Tools • ExploreAhead (part of PlanAhead) • Design Space Explorer (DSE) • Boldport Flow • EDAx10 Cloud Platform

  15. Distinguishing Features of ATHENa • Support for multiple tools from multiple vendors • Optimization strategies aimed at the best possible performance rather than design closure • Extraction and presentation of results • Seamless integration with the ATHENa database of results

  16. Traditional Development and Benchmarking Flow [flow diagram] Blocks: Informal Specification • Test Vectors • Manual Design • Functional Verification • HDL Code • Post Place & Route Results • Manual Optimization • FPGA Tools • Timing Verification • Netlist

  17. Extended Traditional Development and Benchmarking Flow [flow diagram] Blocks: Informal Specification • Test Vectors • Manual Design • Functional Verification • HDL Code • Post Place & Route Results • Option Optimization (GMU ATHENa) • FPGA Tools • Timing Verification • Netlist

  18. How To Start Working With ATHENa? One-Time Tasks • Download and unzip ATHENa: http://cryptography.gmu.edu/athena/ • Read the Tutorial! • Install the required tools (see Tutorial - Part 1 – Tools Installation) • Run ATHENa_setup

  19. How To Start Working With ATHENa? Repetitive Tasks • Prepare or modify your source files & source_list.txt • Modify design.config.txt + possibly other configuration files • Run ATHENa

  20. design.config.txt: Your Design
     # directory containing synthesizable source files for the project
     SOURCE_DIR = <examples/sha256_rs>
     # A file list containing list of files in the order suitable for synthesis and implementation
     # low level modules first, top level entity last
     SOURCE_LIST_FILE = source_list.txt
     # project name
     # it will be used in the names of result directories
     PROJECT_NAME = SHA256
     # name of top level entity
     TOP_LEVEL_ENTITY = sha256
     # name of top level architecture
     TOP_LEVEL_ARCH = rs_arch
     # name of clock net
     CLOCK_NET = clk
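
The settings on this slide follow a simple "KEY = value" layout with "#" comments. As a rough illustration of how such a file could be read, here is a minimal Python sketch. It is not ATHENa's own (Perl) parser; the function name parse_config is hypothetical, and stripping the "<...>" placeholder brackets is an assumption about how the slide values should be interpreted.

```python
def parse_config(text):
    """Parse 'KEY = value' lines, skipping blank lines and '#' comments.

    Assumption: values shown as <...> on the slides are placeholders,
    so the angle brackets are stripped.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)  # split on the first '=' only
        settings[key.strip()] = value.strip().strip("<>").strip()
    return settings


sample = """\
# directory containing synthesizable source files for the project
SOURCE_DIR = <examples/sha256_rs>
SOURCE_LIST_FILE = source_list.txt
PROJECT_NAME = SHA256
TOP_LEVEL_ENTITY = sha256
TOP_LEVEL_ARCH = rs_arch
CLOCK_NET = clk
"""

config = parse_config(sample)
print(config["TOP_LEVEL_ENTITY"], config["SOURCE_DIR"])
```

A flat dictionary is enough for this slide. The vendor and family sections shown later repeat keys such as FPGA_VENDOR, so a real reader would need nested sections rather than one dictionary.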

  21. design.config.txt: Timing Formulas
     #formula for latency
     LATENCY = TCLK*65
     #formula for throughput
     THROUGHPUT = 512/(TCLK*65)
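
To make the two formulas concrete, the Python sketch below evaluates them for one clock period. The constants 65 and 512 come from the slide (65 clock cycles per 512-bit block). The clock period of 10 ns is a made-up input, and the factor of 1000 that converts bits/ns into Mbit/s is an assumption about units, based on the [ns] and [Mbits/s] labels used in report_timing.txt.

```python
def latency_ns(tclk_ns, cycles=65):
    """LATENCY = TCLK*65, with TCLK given in nanoseconds."""
    return tclk_ns * cycles


def throughput_mbps(tclk_ns, block_bits=512, cycles=65):
    """THROUGHPUT = 512/(TCLK*65).

    With TCLK in ns the raw value is in bits/ns (Gbit/s); multiplying
    by 1000 expresses it in Mbit/s (unit handling is an assumption).
    """
    return block_bits / (tclk_ns * cycles) * 1000.0


tclk = 10.0  # hypothetical clock period in ns, i.e. 100 MHz
print("latency [ns]:", latency_ns(tclk))
print("throughput [Mbit/s]:", round(throughput_mbps(tclk), 3))
```

Halving the clock period halves the latency and doubles the throughput, which is why the frequency search matters for both metrics.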

  22. design.config.txt: Application & Optimization Target
     # OPTIMIZATION_TARGET = speed | area | balanced
     OPTIMIZATION_TARGET = speed
     # OPTIONS = default | user
     OPTIONS = default
     # APPLICATION = single_run | exhaustive_search | placement_search | frequency_search |
     #               GMU_Optimization_1 | GMU_Xilinx_optimization_1
     APPLICATION = single_run
     # TRIM_MODE = off | zip | delete
     TRIM_MODE = zip

  23. design.config.txt: FPGA Families (Xilinx)
     # commenting the next line removes all families of Xilinx
     FPGA_VENDOR = xilinx
     #commenting the next line removes a given family
     FPGA_FAMILY = spartan3
     # FPGA_DEVICES = <list of devices> | best_match | all
     FPGA_DEVICES = best_match
     SYN_CONSTRAINT_FILE = default
     IMP_CONSTRAINT_FILE = default
     REQ_SYN_FREQ = 120
     REQ_IMP_FREQ = 100
     MAX_SLICE_UTILIZATION = 0.8
     MAX_BRAM_UTILIZATION = 0.8
     MAX_MUL_UTILIZATION = 1
     MAX_PIN_UTILIZATION = 0.9
     END FAMILY
     END VENDOR

  24. design.config.txt: FPGA Families (Altera)
     # commenting the next line removes all families of Altera
     FPGA_VENDOR = altera
     #commenting the next line removes a given family
     FPGA_FAMILY = Stratix III
     # FPGA_DEVICES = <list of devices> | best_match | all
     FPGA_DEVICES = best_match
     SYN_CONSTRAINT_FILE = default
     IMP_CONSTRAINT_FILE = default
     REQ_IMP_FREQ = 120
     MAX_LOGIC_UTILIZATION = 0.8
     MAX_MEMORY_UTILIZATION = 0.8
     MAX_DSP_UTILIZATION = 0
     MAX_MUL_UTILIZATION = 0
     MAX_PIN_UTILIZATION = 0.8
     END FAMILY
     END VENDOR

  25. Library Files: device_lib/xilinx_device_lib.txt, device_lib/altera_device_lib.txt • Files created during ATHENa setup • Characterize FPGA families and devices available in the version of Xilinx and Altera tools installed on your computer • Currently supported tool versions: Xilinx WebPACK from 9.1 to 14.7; Xilinx Design Suite from 11.1 to 14.7; Altera Quartus II Web Edition from 8.1 to 14.0; Altera Quartus II Subscription Edition from 9.1 to 14.0 • If a library for a given version is not available yet, use a library from the closest available version

  26. Library Files: device_lib/xilinx_device_lib.txt
     VENDOR = Xilinx
     #Device, Total Slices, Block RAMs, DSP, Dedicated Multipliers, Maximum User I/O Pins
     ITEM_ORDER = SLICE, BRAM, DSP, MULT, IO
     FAMILY = spartan3
     xc3s50pq208-5, 768, 4, 0, 4, 124
     xc3s200ft256-5, 1920, 12, 0, 12, 173
     xc3s400fg456-5, 3584, 16, 0, 16, 264
     xc3s1000fg676-5, 7680, 24, 0, 24, 391
     xc3s1500fg676-5, 13312, 32, 0, 32, 487
     END_FAMILY
     FAMILY = virtex5
     xc5vlx30ff676-3, 4800, 32, 32, 0, 400
     xc5vfx30tff665-3, 5120, 68, 64, 0, 360
     xc5vlx30tff665-3, 4800, 36, 32, 0, 360
     xc5vlx50ff1153-3, 7200, 48, 48, 0, 560
     xc5vlx50tff1136-3, 7200, 60, 48, 0, 480
     END_FAMILY
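
The device library and the MAX_*_UTILIZATION limits together suggest how FPGA_DEVICES = best_match might work. The Python sketch below is one plausible reading, not ATHENa's documented algorithm: it assumes the devices of a family are listed smallest first and picks the first one whose capacity, scaled by the utilization limits, covers the design's needs. The function names, the resource demand (borrowed from the spartan3 row of report_resource_utilization.txt purely for illustration), and the DSP limit of 1.0 are all assumptions.

```python
LIB = """\
ITEM_ORDER = SLICE, BRAM, DSP, MULT, IO
FAMILY = spartan3
xc3s50pq208-5, 768, 4, 0, 4, 124
xc3s200ft256-5, 1920, 12, 0, 12, 173
xc3s400fg456-5, 3584, 16, 0, 16, 264
xc3s1000fg676-5, 7680, 24, 0, 24, 391
xc3s1500fg676-5, 13312, 32, 0, 32, 487
END_FAMILY
"""
ORDER = ["SLICE", "BRAM", "DSP", "MULT", "IO"]


def parse_family(text, family):
    """Return [(device, {resource: capacity})] for one FAMILY section."""
    devices, active = [], False
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("FAMILY"):
            active = line.split("=", 1)[1].strip() == family
        elif line == "END_FAMILY":
            active = False
        elif active and line and not line.startswith("#"):
            name, *counts = [field.strip() for field in line.split(",")]
            devices.append((name, dict(zip(ORDER, map(int, counts)))))
    return devices


def best_match(devices, needed, max_util):
    """First device (assumed smallest-first) that fits within the limits."""
    for name, capacity in devices:
        if all(needed[r] <= capacity[r] * max_util[r] for r in ORDER):
            return name
    return None


# Limits from the spartan3 section of design.config.txt; DSP limit assumed.
limits = {"SLICE": 0.8, "BRAM": 0.8, "DSP": 1.0, "MULT": 1.0, "IO": 0.9}
# Hypothetical demand, borrowed from the spartan3 utilization report row.
demand = {"SLICE": 74, "BRAM": 4, "DSP": 0, "MULT": 7, "IO": 20}

spartan3 = parse_family(LIB, "spartan3")
print(best_match(spartan3, demand, limits))
```

With these inputs the smallest device is rejected because 4 block RAMs exceed 80% of its 4 available, so the next device in the list is chosen.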

  27. Result Files: report_resource_utilization.txt
     xilinx : spartan3
     +---------+-----------------+-----+------+---+--------+---+-------+----+-------+----+------+---+----+----+
     | GENERIC | DEVICE          | RUN | LUTs | % | SLICES | % | BRAMs | %  | MULTs | %  | DSPs | % | IO | %  |
     +---------+-----------------+-----+------+---+--------+---+-------+----+-------+----+------+---+----+----+
     | default | xc3s200ft256-5* | 1   | 142  | 3 | 74     | 3 | 4     | 33 | 7     | 58 | 0    | 0 | 20 | 11 |
     +---------+-----------------+-----+------+---+--------+---+-------+----+-------+----+------+---+----+----+
     xilinx : spartan6
     +---------+------------------+-----+------+---+--------+---+-------+---+-------+---+------+----+----+----+
     | GENERIC | DEVICE           | RUN | LUTs | % | SLICES | % | BRAMs | % | MULTs | % | DSPs | %  | IO | %  |
     +---------+------------------+-----+------+---+--------+---+-------+---+-------+---+------+----+----+----+
     | default | xc6slx9csg324-3* | 1   | 41   | 1 | 22     | 1 | 4     | 6 | 0     | 0 | 9    | 56 | 20 | 10 |
     +---------+------------------+-----+------+---+--------+---+-------+---+-------+---+------+----+----+----+
     xilinx : virtex5
     +---------+-------------------+-----+------+---+--------+---+-------+----+-------+---+------+----+----+----+
     | GENERIC | DEVICE            | RUN | LUTs | % | SLICES | % | BRAMs | %  | MULTs | % | DSPs | %  | IO | %  |
     +---------+-------------------+-----+------+---+--------+---+-------+----+-------+---+------+----+----+----+
     | default | xc5vlx20tff323-2* | 1   | 101  | 1 | 56     | 1 | 4     | 15 | 0     | 0 | 9    | 37 | 20 | 11 |
     +---------+-------------------+-----+------+---+--------+---+-------+----+-------+---+------+----+----+----+
     xilinx : virtex6
     +---------+-------------------+-----+------+---+--------+---+-------+---+-------+---+------+---+----+---+
     | GENERIC | DEVICE            | RUN | LUTs | % | SLICES | % | BRAMs | % | MULTs | % | DSPs | % | IO | % |
     +---------+-------------------+-----+------+---+--------+---+-------+---+-------+---+------+---+----+---+
     | default | xc6vlx75tff784-3* | 1   | 44   | 1 | 21     | 1 | 4     | 1 | 0     | 0 | 9    | 3 | 20 | 5 |
     +---------+-------------------+-----+------+---+--------+---+-------+---+-------+---+------+---+----+---+

  28. Result Files: report_timing.txt
     REQ SYN FREQ - Requested synthesis clk. freq.
     SYN FREQ - Achieved synthesis clk. freq.
     REQ SYN TCLK - Requested synthesis clk. period
     SYN TCLK - Achieved synthesis clk. period
     REQ IMP FREQ - Requested implement. clk. freq.
     IMP FREQ - Achieved implement. clk. freq.
     REQ IMP TCLK - Requested implement. clk. period
     IMP TCLK - Achieved implement. clk. period
     LATENCY - Latency [ns]
     THROUGHPUT - Throughput [Mbits/s]
     TP/Area - Throughput/Area [(Mbits/s)/CLB slices]
     Latency*Area - Latency*Area [ns*CLB slices]
     xilinx : spartan3
     +---------+-----------------+-----+--------------+----------+--------------+----------+--------------+----------+--------------+----------+---------+------------+------------+--------------+
     | GENERIC | DEVICE          | RUN | REQ SYN FREQ | SYN FREQ | REQ SYN TCLK | SYN TCLK | REQ IMP FREQ | IMP FREQ | REQ IMP TCLK | IMP TCLK | LATENCY | THROUGHPUT | TP/Area    | Latency*Area |
     +---------+-----------------+-----+--------------+----------+--------------+----------+--------------+----------+--------------+----------+---------+------------+------------+--------------+
     | default | xc3s200ft256-5* | 1   | default      | 207.370  | default      | 4.822    | default      | 112.448  | default      | 8.893    | 17.786  | 449.792    | 6.078      | 1316.164     |
     +---------+-----------------+-----+--------------+----------+--------------+----------+--------------+----------+--------------+----------+---------+------------+------------+--------------+
     xilinx : spartan6
     +---------+------------------+-----+--------------+----------+--------------+----------+--------------+----------+--------------+----------+---------+------------+------------+--------------+
     | GENERIC | DEVICE           | RUN | REQ SYN FREQ | SYN FREQ | REQ SYN TCLK | SYN TCLK | REQ IMP FREQ | IMP FREQ | REQ IMP TCLK | IMP TCLK | LATENCY | THROUGHPUT | TP/Area    | Latency*Area |
     +---------+------------------+-----+--------------+----------+--------------+----------+--------------+----------+--------------+----------+---------+------------+------------+--------------+
     | default | xc6slx9csg324-3* | 1   | default      | 75.751   | default      | 13.201   | default      | 78.119   | default      | 12.801   | 25.602  | 312.476    | 14.203     | 563.244      |
     +---------+------------------+-----+--------------+----------+--------------+----------+--------------+----------+--------------+----------+---------+------------+------------+--------------+
     xilinx : virtex5
     +---------+-------------------+-----+--------------+----------+--------------+----------+--------------+----------+--------------+----------+---------+------------+------------+--------------+
     | GENERIC | DEVICE            | RUN | REQ SYN FREQ | SYN FREQ | REQ SYN TCLK | SYN TCLK | REQ IMP FREQ | IMP FREQ | REQ IMP TCLK | IMP TCLK | LATENCY | THROUGHPUT | TP/Area    | Latency*Area |
     +---------+-------------------+-----+--------------+----------+--------------+----------+--------------+----------+--------------+----------+---------+------------+------------+--------------+
     | default | xc5vlx20tff323-2* | 1   | default      | 156.347  | default      | 6.396    | default      | 126.952  | default      | 7.877    | 15.754  | 507.808    | 9.068      | 882.224      |
     +---------+-------------------+-----+--------------+----------+--------------+----------+--------------+----------+--------------+----------+---------+------------+------------+--------------+
     xilinx : virtex6
     +---------+-------------------+-----+--------------+----------+--------------+----------+--------------+----------+--------------+----------+---------+------------+------------+--------------+
     | GENERIC | DEVICE            | RUN | REQ SYN FREQ | SYN FREQ | REQ SYN TCLK | SYN TCLK | REQ IMP FREQ | IMP FREQ | REQ IMP TCLK | IMP TCLK | LATENCY | THROUGHPUT | TP/Area    | Latency*Area |
     +---------+-------------------+-----+--------------+----------+--------------+----------+--------------+----------+--------------+----------+---------+------------+------------+--------------+
     | default | xc6vlx75tff784-3* | 1   | default      | 158.053  | default      | 6.327    | default      | 135.410  | default      | 7.385    | 14.770  | 541.638    | 25.792     | 310.170      |
     +---------+-------------------+-----+--------------+----------+--------------+----------+--------------+----------+--------------+----------+---------+------------+------------+--------------+
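
The derived columns of report_timing.txt can be cross-checked against the raw ones. The Python sketch below assumes that "Area" means the SLICES count from report_resource_utilization.txt and that frequency in MHz equals 1000 divided by the clock period in ns. Both relations are inferred from the numbers on slides 27 and 28, not from a statement in the slides, and the function name derived_metrics is hypothetical.

```python
# (IMP TCLK [ns], LATENCY [ns], THROUGHPUT [Mbit/s], SLICES) per family,
# copied from report_timing.txt and report_resource_utilization.txt.
RESULTS = {
    "spartan3": (8.893, 17.786, 449.792, 74),
    "spartan6": (12.801, 25.602, 312.476, 22),
    "virtex5": (7.877, 15.754, 507.808, 56),
    "virtex6": (7.385, 14.770, 541.638, 21),
}


def derived_metrics(tclk_ns, latency_ns, throughput_mbps, slices):
    """Recompute IMP FREQ, TP/Area and Latency*Area from the raw columns."""
    return {
        "imp_freq_mhz": 1000.0 / tclk_ns,             # period in ns -> MHz
        "tp_per_area": throughput_mbps / slices,      # (Mbit/s) per slice
        "latency_area": latency_ns * slices,          # ns * slices
    }


for family, row in RESULTS.items():
    metrics = derived_metrics(*row)
    print(family, {name: round(value, 3) for name, value in metrics.items()})
```

The recomputed values agree with the reported IMP FREQ, TP/Area, and Latency*Area columns to three decimal places, which supports the slice-based reading of "Area" for these Xilinx runs.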

  29. Result Files: report_options.txt
     COST TABLE - parameter determining the starting point of placement
     Synthesis Options - options of the synthesis tool
     Map Options - options of the mapping tool
     PAR Options - options of the place & route tool
     xilinx : spartan3
     +---------+-----------------+-----+------------+------------------------------+-------------------------+--------------+
     | GENERIC | DEVICE          | RUN | COST TABLE | Synthesis Options            | Map Options             | PAR Options  |
     +---------+-----------------+-----+------------+------------------------------+-------------------------+--------------+
     | default | xc3s200ft256-5* | 1   | 1          | -opt_level 1 -opt_mode speed | -c 100 -pr b -cm speed  | -w -ol std   |
     +---------+-----------------+-----+------------+------------------------------+-------------------------+--------------+
     xilinx : spartan6
     +---------+------------------+-----+------------+------------------------------+---------------+--------------+
     | GENERIC | DEVICE           | RUN | COST TABLE | Synthesis Options            | Map Options   | PAR Options  |
     +---------+------------------+-----+------------+------------------------------+---------------+--------------+
     | default | xc6slx9csg324-3* | 1   | 1          | -opt_level 1 -opt_mode speed | -c 100 -pr b  | -w -ol std   |
     +---------+------------------+-----+------------+------------------------------+---------------+--------------+
     xilinx : virtex5
     +---------+-------------------+-----+------------+------------------------------+-------------------------+--------------+
     | GENERIC | DEVICE            | RUN | COST TABLE | Synthesis Options            | Map Options             | PAR Options  |
     +---------+-------------------+-----+------------+------------------------------+-------------------------+--------------+
     | default | xc5vlx20tff323-2* | 1   | 1          | -opt_level 1 -opt_mode speed | -c 100 -pr b -cm speed  | -w -ol std   |
     +---------+-------------------+-----+------------+------------------------------+-------------------------+--------------+
     xilinx : virtex6
     +---------+-------------------+-----+------------+------------------------------+---------------+--------------+
     | GENERIC | DEVICE            | RUN | COST TABLE | Synthesis Options            | Map Options   | PAR Options  |
     +---------+-------------------+-----+------------+------------------------------+---------------+--------------+
     | default | xc6vlx75tff784-3* | 1   | 1          | -opt_level 1 -opt_mode speed | -c 100 -pr b  | -w -ol std   |
     +---------+-------------------+-----+------------+------------------------------+---------------+--------------+

  30. Result Files: report_execution_time.txt
     Synthesis Time - Time of Synthesis
     Implementation Time - Time of Implementation
     Elapsed Time - Total Time
     xilinx : spartan3
     +---------+-----------------+-----+----------------+---------------------+--------------+
     | GENERIC | DEVICE          | RUN | Synthesis Time | Implementation Time | Elapsed Time |
     +---------+-----------------+-----+----------------+---------------------+--------------+
     | default | xc3s200ft256-5* | 1   | 0d 0h:0m:12s   | 0d 0h:0m:36s        | 0d 0h:0m:48s |
     +---------+-----------------+-----+----------------+---------------------+--------------+
     xilinx : spartan6
     +---------+------------------+-----+----------------+---------------------+--------------+
     | GENERIC | DEVICE           | RUN | Synthesis Time | Implementation Time | Elapsed Time |
     +---------+------------------+-----+----------------+---------------------+--------------+
     | default | xc6slx9csg324-3* | 1   | 0d 0h:0m:21s   | 0d 0h:1m:13s        | 0d 0h:1m:34s |
     +---------+------------------+-----+----------------+---------------------+--------------+
     xilinx : virtex5
     +---------+-------------------+-----+----------------+---------------------+--------------+
     | GENERIC | DEVICE            | RUN | Synthesis Time | Implementation Time | Elapsed Time |
     +---------+-------------------+-----+----------------+---------------------+--------------+
     | default | xc5vlx20tff323-2* | 1   | 0d 0h:0m:39s   | 0d 0h:1m:50s        | 0d 0h:2m:29s |
     +---------+-------------------+-----+----------------+---------------------+--------------+
     xilinx : virtex6
     +---------+-------------------+-----+----------------+---------------------+--------------+
     | GENERIC | DEVICE            | RUN | Synthesis Time | Implementation Time | Elapsed Time |
     +---------+-------------------+-----+----------------+---------------------+--------------+
     | default | xc6vlx75tff784-3* | 1   | 0d 0h:0m:22s   | 0d 0h:3m:22s        | 0d 0h:3m:44s |
     +---------+-------------------+-----+----------------+---------------------+--------------+
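
When many runs are compared, the "0d 0h:0m:48s" stamps in this report are easier to handle as plain seconds. The Python sketch below converts them and checks that synthesis time plus implementation time equals the elapsed time for each family shown above. The helper name to_seconds is hypothetical, and the regular expression only covers the exact stamp layout seen on this slide.

```python
import re

# Matches stamps such as "0d 0h:1m:13s" (layout taken from the report).
_STAMP = re.compile(r"(\d+)d (\d+)h:(\d+)m:(\d+)s")


def to_seconds(stamp):
    """Convert a 'Dd Hh:Mm:Ss' execution-time stamp to seconds."""
    match = _STAMP.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"unrecognized time stamp: {stamp!r}")
    days, hours, minutes, seconds = map(int, match.groups())
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


# (synthesis, implementation, elapsed) copied from the report above.
TIMES = {
    "spartan3": ("0d 0h:0m:12s", "0d 0h:0m:36s", "0d 0h:0m:48s"),
    "spartan6": ("0d 0h:0m:21s", "0d 0h:1m:13s", "0d 0h:1m:34s"),
    "virtex5": ("0d 0h:0m:39s", "0d 0h:1m:50s", "0d 0h:2m:29s"),
    "virtex6": ("0d 0h:0m:22s", "0d 0h:3m:22s", "0d 0h:3m:44s"),
}

for family, (syn, imp, total) in TIMES.items():
    consistent = to_seconds(syn) + to_seconds(imp) == to_seconds(total)
    print(family, to_seconds(total), "s", "consistent" if consistent else "mismatch")
```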

  31. design.config.txt: Functional Simulation (1)
     # FUNCTIONAL_VERIFICATION_MODE = <on | off>
     FUNCTIONAL_VERIFICATION_MODE = <off>
     # directory containing source files of the testbench
     VERIFICATION_DIR = <examples/sha256_rs/tb>
     # A file containing a list of testbench files in the order suitable for compilation;
     # low level modules first, top level entity last.
     # Test vector files should be located in the same directory and listed
     # in the same file, unless a fixed path is used. Please refer to the tutorial for more detail.
     VERIFICATION_LIST_FILE = <tb_srcs.txt>
     # name of testbench's top level entity
     TB_TOP_LEVEL_ENTITY = <sha_tb>
     # name of testbench's top level architecture
     TB_TOP_LEVEL_ARCH = <behavior>

  32. design.config.txt: Functional Simulation (2)
     # MAX_TIME_FUNCTIONAL_VERIFICATION = <$time $unit>
     # supported units are: ps, ns, us, and ms
     # if blank, simulation will run until it finishes,
     # i.e., no changes in signals (clock is stopped and no more inputs are coming in).
     MAX_TIME_FUNCTIONAL_VERIFICATION = <>
     # Perform only verification (synthesis and implementation parameters are ignored)
     # VERIFICATION_ONLY = <ON | OFF>
     VERIFICATION_ONLY = <off>
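
The MAX_TIME_FUNCTIONAL_VERIFICATION setting takes a "<$time $unit>" value or is left blank. As an illustration of those two cases, the Python sketch below normalizes such a value to picoseconds and returns None for a blank entry, which according to the slide means "run until the simulation finishes". The function name, the integer-only amount, and the choice of picoseconds as the common unit are assumptions made for this example.

```python
# Units listed on the slide, expressed in picoseconds.
UNITS_IN_PS = {"ps": 1, "ns": 10**3, "us": 10**6, "ms": 10**9}


def parse_max_time(value):
    """Return the simulation time limit in ps, or None if left blank."""
    value = value.strip().strip("<>").strip()
    if not value:
        return None  # blank: simulate until no more signal changes occur
    amount, unit = value.split()
    if unit not in UNITS_IN_PS:
        raise ValueError(f"unsupported unit: {unit}")
    return int(amount) * UNITS_IN_PS[unit]


print(parse_max_time("<>"))        # blank limit
print(parse_max_time("<100 us>"))  # hypothetical 100 microsecond limit
```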

  33. ATHENa – Database of Results

  34. ATHENa Database http://cryptography.gmu.edu/athenadb

  35. ATHENa Database – Result View
     • Algorithm parameters
     • Design parameters: optimization target; architecture type; datapath width; I/O bus widths; availability of source code
     • Platform: vendor, family, device
     • Timing: maximum clock frequency; maximum throughput
     • Resource utilization: logic blocks (Slices/LEs/ALUTs); multipliers/DSP units
     • Tools: names & versions; detailed options
     • Credits: designers & contact information

  36. ATHENa Database – Compare Feature • Matching fields in grey • Non-matching fields in red and blue

  37. ATHENa Database of Results for Authenticated Ciphers • Already available at http://cryptography.gmu.edu/athena • Similar to the database of results for hash functions • Results can be entered by designers themselves • The ATHENa Option Optimization Tool supports automatic generation of results suitable for uploading to the database

  38. Ordered Listing with a Single-Best (Unique) Result per Each Algorithm

  39. Possible Future Customizations • The same basic database can be customized and adapted for other domains, such as: Digital Signal Processing • Bioinformatics • Communications • Scientific Computing, etc.

  40. Source Codes

  41. GMU Source Codes and Block Diagrams • GMU source codes for all Round 3 SHA-3 candidates & SHA-2 are available at the ATHENa website: http://cryptography.gmu.edu/athena • Included in this release: basic architectures, folded architectures, unrolled architectures • Each code supports two variants: with 256-bit and 512-bit output • Each source code is accompanied by comprehensive hierarchical block diagrams

  42. ATHENa Result Replication Files • Scripts and configuration files sufficient to easily reproduce all results (without repeating optimizations) • Automatically created by ATHENa for all results generated using ATHENa • Stored in the ATHENa Database
     In the same spirit of Reproducible Research as:
     • J. Claerbout (Stanford University), “Electronic documents give reproducible research a new meaning,” in Proc. 62nd Ann. Int. Meeting of the Soc. of Exploration Geophysics, 1992, http://sepwww.stanford.edu/doku.php?id=sep:research:reproducible:seg92
     • Patrick Vandewalle (EPFL), Jelena Kovacevic (CMU), and Martin Vetterli (EPFL), “Reproducible research in signal processing - what, why, and how,” IEEE Signal Processing Magazine, May 2009, http://rr.epfl.ch/17/

  43. Benchmarking Goals Facilitated by ATHENa. Comparing multiple: • cryptographic algorithms • hardware architectures or implementations of the same cryptographic algorithm • hardware platforms from the point of view of their suitability for the implementation of a given algorithm (e.g., choice of an FPGA device or FPGA board) • tools and languages in terms of the quality of results they generate (e.g., Verilog vs. VHDL, Synplicity Synplify Premier vs. Xilinx XST, ISE v. 13.1 vs. ISE v. 14.7)
