Benchmarking Tools and Assessment Environment for Configurable Computing

MAPLD '98, September 15-16, 1998

Sanjaya Kumar, Subburajan Ponnuswamy, Chirag Nanavati, John Golusky, Mark Vojta, Luiz Pires
E-mail: skumar@htc.honeywell.com
Honeywell Technology Center
3660 Technology Drive
Minneapolis, MN 55418
Program Overview

• Provides a publicly available suite of benchmarks for evaluating configurable computing infrastructures, both tools and architectures
• Addresses benchmark specification, procedures, metrics, and wide availability
• Extends benchmarking technology to configurable computing
• Benchmarks are being implemented on a configurable computing platform
• Six benchmarks have been developed; four more are planned
Unique Aspects

• Utilizes stressmarks to supplement existing functional benchmarks
• First effort to specify and measure a set of characteristics relevant to configurable computing systems
• Addresses a broad range of configurable architectures, beyond just single FPGAs
• Approach is based on an unbiased and technology independent benchmark specification methodology
• Provides a better understanding of how configurable computing systems relate to others within the design space

[Figure: stressmark categories - Versatility, Capacity, Timing Sensitivity, Interfacing, Scalability, and others]
Versatility Stressmark

• Measures an infrastructure's ability to perform a sequence of distinct computational steps efficiently
• A space-time trade-off stressmark, possibly employing run-time reconfiguration
• Based on a wavelet image compression algorithm
• Minimum QoS must be maintained (PSNR and compressed bit rate)
• Metrics include:
  • Total elapsed time
  • Reconfiguration overhead
  • FPGA area utilized

[Figure: compression pipeline - Original Image -> Wavelet Transform -> Quantization -> Run-Length Encoding -> Entropy Coding -> Compressed Image]
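The benchmark specification document defines the exact wavelet filters, quantizer, and QoS thresholds; as a rough illustration of the data flow above, the C sketch below performs a one-level 1-D Haar transform and a PSNR check. The filter choice, the test data, and the 30 dB floor are assumptions for illustration only.

```c
/* Minimal sketch of the versatility stressmark's data flow in C: a one-level
 * 1-D Haar wavelet step and a PSNR check against an assumed QoS floor. */
#include <math.h>
#include <stdio.h>

#define MIN_PSNR_DB 30.0   /* assumed QoS floor, not from the spec */

/* One-level Haar transform: n samples in[] -> n/2 averages + n/2 details. */
static void haar_1d(const double *in, double *out, int n)
{
    for (int i = 0; i < n / 2; i++) {
        out[i]         = (in[2 * i] + in[2 * i + 1]) / sqrt(2.0); /* low-pass  */
        out[n / 2 + i] = (in[2 * i] - in[2 * i + 1]) / sqrt(2.0); /* high-pass */
    }
}

/* PSNR between original and reconstructed images (8-bit pixel range). */
static double psnr(const unsigned char *orig, const unsigned char *rec, int npix)
{
    double mse = 0.0;
    for (int i = 0; i < npix; i++) {
        double d = (double)orig[i] - (double)rec[i];
        mse += d * d;
    }
    mse /= npix;
    return mse > 0.0 ? 10.0 * log10(255.0 * 255.0 / mse) : INFINITY;
}

int main(void)
{
    double img[8] = {10, 12, 14, 16, 100, 102, 50, 52}, coeff[8];
    haar_1d(img, coeff, 8);
    for (int i = 0; i < 8; i++)
        printf("%7.2f%c", coeff[i], i == 7 ? '\n' : ' ');

    /* In the real stressmark, quantization, run-length encoding, and entropy
     * coding follow; here we just perturb the pixels to stand in for lossy
     * reconstruction and check the assumed QoS floor. */
    unsigned char orig[8], rec[8];
    for (int i = 0; i < 8; i++) { orig[i] = (unsigned char)img[i]; rec[i] = orig[i] + 1; }
    double p = psnr(orig, rec, 8);
    printf("PSNR = %.1f dB (%s QoS floor of %.0f dB)\n",
           p, p >= MIN_PSNR_DB ? "meets" : "below", MIN_PSNR_DB);
    return 0;
}
```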
Capacity Stressmark

• Measures an architecture's usable capacity using a Huffman compression algorithm
• Alphabet defined with K characters, each with a frequency f of occurrence
• Huffman compression tree and the look-up table derived from it (below) are constructed using software
• Objective is to implement the largest look-up table possible
• Metrics include largest table size and look-up speed; three different approaches:
  • Standard VHDL/automatic P&R
  • Standard VHDL/manual P&R
  • Custom

[Figure: example Huffman tree and derived look-up table]

Char  Code  Length
A     00    2
B     010   3
C     0110  4
D     0111  4
E     10    2
F     110   3
G     111   3
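As a rough illustration of the software side of this stressmark, the C sketch below builds Huffman codes for a small alphabet and prints a char/code/length table like the one above. The seven-symbol alphabet and its frequencies are assumptions; the actual benchmark defines its own K-character alphabet, and the exact bit patterns depend on tie-breaking during tree construction.

```c
/* Minimal sketch of generating a Huffman look-up table in software. */
#include <stdio.h>

#define NSYM 7

typedef struct { int freq, left, right; } Node;   /* left < 0 => leaf */

/* Walk the tree, printing one table row (char, code, length) per leaf. */
static void emit(const Node *t, int i, char *code, int depth)
{
    if (t[i].left < 0) {
        code[depth] = '\0';
        printf("%c     %-6s%d\n", 'A' + i, code, depth);
        return;
    }
    code[depth] = '0'; emit(t, t[i].left,  code, depth + 1);
    code[depth] = '1'; emit(t, t[i].right, code, depth + 1);
}

int main(void)
{
    int freq[NSYM] = {27, 12, 6, 6, 27, 11, 11};  /* assumed frequencies */
    Node t[2 * NSYM];
    int alive[2 * NSYM] = {0}, n = NSYM;
    for (int i = 0; i < NSYM; i++) { t[i] = (Node){freq[i], -1, -1}; alive[i] = 1; }

    for (int m = 1; m < NSYM; m++) {              /* NSYM-1 merges */
        int a = -1, b = -1;
        for (int i = 0; i < n; i++) {             /* two lowest-frequency live nodes */
            if (!alive[i]) continue;
            if (a < 0 || t[i].freq < t[a].freq) { b = a; a = i; }
            else if (b < 0 || t[i].freq < t[b].freq) b = i;
        }
        t[n] = (Node){t[a].freq + t[b].freq, a, b};
        alive[a] = alive[b] = 0; alive[n] = 1; n++;
    }

    char code[64];
    printf("Char  Code  Length\n");
    emit(t, n - 1, code, 0);                      /* root is the last node built */
    return 0;
}
```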
Timing Sensitivity Stressmark

• Measures an infrastructure's ability to implement a "time-critical" computation
• Based on the CORDIC (COordinate Rotation DIgital Computer) algorithm for vector rotation
• Pipelined stages stress both the architecture and the CAD tools (place and route is an important issue)
• Metrics include latency, throughput, and area utilized; two different approaches:
  • Automatic P&R
  • Manual P&R

[Figure: pipeline - Memory -> Read -> Pre-Rotate -> CORDIC stages 1 through 12 -> Scale -> Write -> Memory]
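A minimal C sketch of the underlying computation follows: a fixed-point CORDIC rotation iterated over twelve stages, matching the stage count in the figure. The Q2.14 format and the test angle are illustrative assumptions, not values from the benchmark specification; the sketch also assumes arithmetic right shifts for negative operands.

```c
/* Minimal sketch of a fixed-point CORDIC vector rotation; in hardware each
 * loop iteration corresponds to one pipeline stage. */
#include <math.h>
#include <stdio.h>

#define STAGES 12
#define FRAC   14                          /* assumed Q2.14 fixed point */
#define TO_FIX(x) ((int)lround((x) * (1 << FRAC)))

int main(void)
{
    /* arctan(2^-i) table and the combined gain K, precomputed offline. */
    int atan_tab[STAGES];
    double k = 1.0;
    for (int i = 0; i < STAGES; i++) {
        atan_tab[i] = TO_FIX(atan(ldexp(1.0, -i)));
        k *= sqrt(1.0 + ldexp(1.0, -2 * i));
    }
    int inv_gain = TO_FIX(1.0 / k);

    int x = TO_FIX(1.0), y = 0;
    int z = TO_FIX(0.62831853);            /* rotate (1, 0) by 36 degrees */

    /* Rotation mode: drive the residual angle z toward zero stage by stage. */
    for (int i = 0; i < STAGES; i++) {
        int d  = (z >= 0) ? 1 : -1;
        int nx = x - d * (y >> i);
        int ny = y + d * (x >> i);
        z -= d * atan_tab[i];
        x = nx; y = ny;
    }

    /* Undo the CORDIC gain (the "Scale" stage in the figure). */
    long sx = ((long)x * inv_gain) >> FRAC;
    long sy = ((long)y * inv_gain) >> FRAC;
    printf("cos ~= %.4f, sin ~= %.4f\n",
           sx / (double)(1 << FRAC), sy / (double)(1 << FRAC));
    return 0;
}
```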
Interfacing Stressmark

• Measures an infrastructure's ability to implement an application on a platform consisting of GPPs, ASPs, and FPGAs
• Based on the RT Parallel Benchmark Suite's Constant False Alarm Rate (CFAR) kernel
• CAD tools include those that perform hardware/software partitioning and mapping
• Metrics include total elapsed time, communication time, and speedup due to configurable elements

[Figure: processing chain - SAR Frame -> Compute Local Stats -> Detect Anomalies -> Erode -> Dilate -> Determine Centroids -> Cluster Centroids; Hw/Sw partitioning/mapping CAD tools target a GPP + ASP + FPGA platform]
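For reference, the sketch below shows a cell-averaging CFAR detection loop in C, the kind of kernel this stressmark partitions across the GPP/ASP/FPGA platform. The window sizes, threshold factor, and test data are assumptions, not the RT Parallel Benchmark Suite's actual CFAR parameters.

```c
/* Minimal sketch of a 1-D cell-averaging CFAR detector. */
#include <stdio.h>

#define N      32
#define GUARD  2     /* guard cells on each side of the cell under test */
#define TRAIN  4     /* training cells on each side */
#define ALPHA  4.0   /* assumed threshold scaling factor */

int main(void)
{
    double power[N];
    for (int i = 0; i < N; i++) power[i] = 1.0;   /* flat noise floor */
    power[17] = 20.0;                             /* one injected target */

    for (int cut = GUARD + TRAIN; cut < N - GUARD - TRAIN; cut++) {
        double noise = 0.0;
        /* Average the training cells on both sides, skipping the guard cells. */
        for (int k = GUARD + 1; k <= GUARD + TRAIN; k++)
            noise += power[cut - k] + power[cut + k];
        noise /= 2.0 * TRAIN;

        if (power[cut] > ALPHA * noise)
            printf("detection at cell %d (power %.1f, threshold %.1f)\n",
                   cut, power[cut], ALPHA * noise);
    }
    return 0;
}
```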
Scalability Stressmark

• Measures an infrastructure's ability to implement an application on a multi-device platform
• Based on the Fast Fourier Transform (complex, fixed-point)
• CAD tools include those that perform partitioning and mapping (P&M)
• Metrics include total elapsed time, speedup, efficiency, and area utilized; possible approaches:
  • Automatic vs manual P&M
  • Automatic vs manual P&R

[Figure: flow - Application -> Partitioning/Mapping CAD Tools -> Multi-device Platform]
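A compact C sketch of the kernel follows: an in-place radix-2 complex FFT over Q15 fixed-point data, scaling by 1/2 per stage to avoid overflow. The transform length, Q-format, and scaling policy are assumptions; the benchmark specification fixes its own size, precision, and partitioning across devices.

```c
/* Minimal sketch of a radix-2 complex fixed-point FFT. */
#include <math.h>
#include <stdio.h>

#define N    16                            /* assumed length (power of 2) */
#define FRAC 15
#define TO_FIX(x) ((int)lround((x) * (1 << FRAC)))

int main(void)
{
    const double PI = 3.14159265358979323846;
    int re[N], im[N];
    for (int i = 0; i < N; i++) {          /* test input: one complex tone */
        re[i] = TO_FIX(0.5 * cos(2.0 * PI * 3 * i / N));
        im[i] = TO_FIX(0.5 * sin(2.0 * PI * 3 * i / N));
    }

    /* Bit-reversal permutation. */
    for (int i = 1, j = 0; i < N; i++) {
        int bit = N >> 1;
        for (; j & bit; bit >>= 1) j ^= bit;
        j ^= bit;
        if (i < j) {
            int t = re[i]; re[i] = re[j]; re[j] = t;
            t = im[i]; im[i] = im[j]; im[j] = t;
        }
    }

    /* Butterfly stages; each stage halves the data to stay in Q15 range. */
    for (int len = 2; len <= N; len <<= 1) {
        for (int start = 0; start < N; start += len) {
            for (int k = 0; k < len / 2; k++) {
                int wr = TO_FIX(cos(-2.0 * PI * k / len));
                int wi = TO_FIX(sin(-2.0 * PI * k / len));
                int a = start + k, b = a + len / 2;
                long tr = ((long)wr * re[b] - (long)wi * im[b]) >> FRAC;
                long ti = ((long)wr * im[b] + (long)wi * re[b]) >> FRAC;
                long ur = re[a], ui = im[a];
                re[a] = (int)((ur + tr) >> 1); im[a] = (int)((ui + ti) >> 1);
                re[b] = (int)((ur - tr) >> 1); im[b] = (int)((ui - ti) >> 1);
            }
        }
    }

    for (int i = 0; i < N; i++)            /* energy should appear in bin 3 */
        printf("bin %2d: %+.4f %+.4fj\n", i,
               re[i] / (double)(1 << FRAC), im[i] / (double)(1 << FRAC));
    return 0;
}
```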
CAD Benchmark

• Measures the ability of an infrastructure to implement a time-consuming CAD application
• Based on Boolean satisfiability (SAT)
• Commonly used for automatic test pattern generation and logic synthesis/verification
• Search-intensive problem
• Benchmark developed in conjunction with Princeton University
• Metrics include total elapsed time and area utilized; three different approaches:
  • Standard automatic
  • Standard manual
  • Custom

[Figure: example Boolean satisfiability instance over variables a through f]
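To make the computation concrete, the C sketch below exhaustively searches the truth assignments of a tiny hard-coded CNF formula. The formula and variable count are assumptions for illustration; the actual benchmark (developed with Princeton University) maps the clause-evaluation logic into configurable hardware.

```c
/* Minimal sketch of SAT as a search problem: try all assignments of a small
 * CNF formula and report a satisfying one if it exists. */
#include <stdio.h>

#define NVARS    4
#define NCLAUSES 5

/* Clauses as literal lists: +k means variable k, -k means its complement
 * (variables are numbered 1..NVARS); 0 terminates a clause. */
static const int clauses[NCLAUSES][4] = {
    { 1,  2,  0},  {-1,  3,  0},  {-2, -3,  0},  { 3,  4,  0},  {-3, -4,  0}
};

static int assign[NVARS + 1];          /* 0 = false, 1 = true */

static int satisfied(void)
{
    for (int c = 0; c < NCLAUSES; c++) {
        int ok = 0;
        for (int j = 0; clauses[c][j] != 0; j++) {
            int v = clauses[c][j];
            if ((v > 0 && assign[v]) || (v < 0 && !assign[-v])) { ok = 1; break; }
        }
        if (!ok) return 0;
    }
    return 1;
}

static int solve(int var)
{
    if (var > NVARS) return satisfied();
    for (int val = 0; val <= 1; val++) {   /* try both truth values */
        assign[var] = val;
        if (solve(var + 1)) return 1;
    }
    return 0;
}

int main(void)
{
    if (solve(1)) {
        printf("satisfiable:");
        for (int v = 1; v <= NVARS; v++) printf(" x%d=%d", v, assign[v]);
        printf("\n");
    } else {
        printf("unsatisfiable\n");
    }
    return 0;
}
```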
Tools and Platform Information

• Synopsys Design Compiler on a SUN UltraSPARC running Solaris 2.6
• Xilinx XACT and M1 place and route tools on a PC containing a 166 MHz Pentium running Windows 95
• Annapolis Micro Systems WILDFORCE board (1 Xilinx 4025 and 4 Xilinx 4013 devices, 8 MBytes of memory)
• Preliminary results for many of the benchmarks have been tabulated; they are being reviewed by Dr. José Muñoz (DARPA PM) and further refined
Versatility Implementation - 2D Wavelet

              Atmel 6010   SUN UltraSPARC   SUN UltraSPARC
Clock Freq    16 MHz       333 MHz          167 MHz
Exec Time     226 ms       45 ms            97 ms
Utilization   62%          NA               NA

• Atmel results obtained from Honeywell SASSO (Gary Gardner) as part of NASA's RHrFPGA program
• Several others within the ACS community are implementing the versatility stressmark; their results are not available at this time
Plans for Existing ACS Benchmark Suite

• Update benchmarks as needed (benchmark specification documents, "C" code, VHDL code, miscellaneous fixes)
• Work with AFRL to provide a mechanism for submitting results through the DARPA/AFRL benchmarking web page (www.rl.af.mil/programs/hpcbench), Ralph Kohler - POC
• Address any suggestions that you may have to improve the benchmarks/web site
Additional Benchmarks

• BM #1: Micro-kernel benchmark, based on the discrete cosine transform (DCT); working with Herman Schmit/Seth Goldstein (CMU)
• BM #2: INFOSEC benchmark, based on SHA-1; in discussion with Alan Hunsberger (NSA), Burt Kalisky (RSA Labs), and Anant Agarwal (MIT)
• BM #3: Data-dependent computations benchmark, based on an electronic counter-measures application; in discussion with Rick Pancoast (Lockheed Martin)
• BM #4: Variable-precision arithmetic benchmark; in discussion with Rajeev Jain (UCLA) and Phillip Duncan (Angeles Design Systems)
• There appears to be a continuing need to develop benchmarks corresponding to "system level" applications
• The new benchmarks will be implemented on an Annapolis Micro Systems STARFIRE, a PCI-based board utilizing Virtex FPGAs
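As a rough illustration of the BM #1 micro-kernel, the C sketch below computes a direct 8-point 1-D DCT-II. The transform length, normalization, and use of floating point are assumptions; the benchmark specification will define its own DCT variant and precision.

```c
/* Minimal sketch of a direct 8-point 1-D DCT-II with orthonormal scaling. */
#include <math.h>
#include <stdio.h>

#define N 8

static void dct_1d(const double *in, double *out)
{
    const double pi = 3.14159265358979323846;
    for (int k = 0; k < N; k++) {
        double sum = 0.0;
        for (int n = 0; n < N; n++)
            sum += in[n] * cos(pi * (2 * n + 1) * k / (2.0 * N));
        /* Orthonormal scaling: sqrt(1/N) for k = 0, sqrt(2/N) otherwise. */
        out[k] = sum * (k == 0 ? sqrt(1.0 / N) : sqrt(2.0 / N));
    }
}

int main(void)
{
    double x[N] = {8, 16, 24, 32, 40, 48, 56, 64}, X[N];
    dct_1d(x, X);
    for (int k = 0; k < N; k++)
        printf("X[%d] = %8.3f\n", k, X[k]);
    return 0;
}
```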
Schedule

[Figure: program timeline with quarterly milestones from Feb 99 through Aug 00, covering the Benchmark Analysis Summary Document, completion of BM #1 through BM #4, and the Final Report]
Summary

• Preliminary implementation results tabulated (being reviewed)
• Benchmarks can be downloaded: www.rl.af.mil/programs/hpcbench
• Deliverables:
  • Benchmarking methodology document
  • Benchmark specification documents
  • "C" and VHDL code
• Four additional benchmarks being developed
• For more information, visit: www.htc.honeywell.com/projects/acsbench

[Figure: ACS benchmarking technology shown as a DARPA/NASA design/evaluation tool relating ACS architectures to others, a trade-off and selection technology for application developers and procurement agencies, and a means of demonstrating the advantages of ACS technology to developers of embedded HPC technologies]