260 likes | 407 Views
Adaptive Computing. Systems. 1998 Military and Aerospace Applications of Programmable Devices and Technologies Conference. Dr. José L. Muñoz DARPA/ITO Sept 1998. Situation Today. Architecture is already provided… software must do the best it can within those given constraints.
E N D
Adaptive Computing Systems 1998 Military and Aerospace Applications of Programmable Devices and Technologies Conference Dr. José L. Muñoz DARPA/ITO Sept 1998
Situation Today... Architecture is already provided… software must do the best it can within those given constraints
Performance benefits of hardware….. … Flexibility of software
June 1997:“Computers that modify their hardware circuits as they operate are opening a new era in computer design.” December 1997:“Now comes the tantalizing possibility of combining speed and versatility on a single chip.”
ACS: Vision Benefits • Commodity technology dynamically specialized • Life cycle performance upgrades • Adapt to new threats • Extend mission capabilities Application Sea of Gates Place & Route Instantly “wired” to application Application-Enabled Configurable Computer
ASICs Performance benefits of hardware…. Flexibility of software General Purpose Processors ACS Technology Performance/Flexibility high Performance low Flexibility less more
Adaptive/Reconfigurable Computing-DoD Applications & Benefits Application Specific High Performance Processing C4I IR Missile Warning IR Search & Track FLIR Electronic Warfare CNI Auto Tgt Recognition Simulation Accelerator Design Process Impact Rapid Prototyping ASIC Emulation Simulation Acceleration Adaptive Algorithm Processing Aerospace Impact High-Performance Processing Flexible Interfaces Fault Tolerance Legacy System Emulation & Upgrade Multi-Mode Applications Adaptive Processing Benefits More Operational Flexibility Easier Upgrade Path Shortened Design Cycles Reduced Size, Power & Weight Reduce Parts Obsolescence Lower Life Cycle Cost Fewer Spares
Thru 2 Block Update Cycles Cost for ASIC Development is 3 times that for FPGA based Solution. --- Doesn’t Include Cost for HARDWARE RETROFIT Required for ASIC based Solution !!! Cumulative Development Cost 4 3 2 1 0 ASIC FPGA Technology Cumulative Development Schedule for FPGA is 2 1/2 times shorter than that required for ASIC based technology solution. c Cumulative Schedule 40 30 20 10 0 ASIC FPGA Technology
Goals Performance benefits of Hardware with the flexibility of Software Sample ACS Challenge problem: ATR/ 1 cu.ft. 500X better 100X - 1000X Performance improvement over micro-processor based systems Defense testbeds: ACS Challenge Problems Temporal re-use: Dynamic adaptation at runtime Power/area efficiency Domain specific development environments
Motivation Speedup Source: RAW benchmark suite http://cag-www.lcs.mit.edu/raw * - Brigham Young University 1 10 100 DES Encryption (4/96) Integer FFT (4/32) Integer Matrix Mult (4/16) Sort (15/256) Shortest Path (16/256) Genetic Algorithm (TSP) *
ACS: Software in Hardware . . . y = ax2 + bx + c C: y = a*x*x + b*x + c; load r1, x load r2, x -- x*x mult r1, r2 load r2, a mult r1, r2 -- a*x*x in r1 load r2, x load r3, b mult r2, r3 -- b*x in r2 add r1, r2 -- a*x*x + b*x in r1 load r2, c add r1, r2 -- y in r1 * * b x c 2 x bx * a + 2 x a bx + c What happens if x, a, b and c are presented as 3-bit compacted data? + 2 ax + bx + c
Test Image Sample ACS Benefit: ATR Template Matching Bright Template Surround Template Template A Template B Zone 1 Common to A/B Zone 2 Unique to Template A Zone 3 Unique to Template B UCLA PCI/Myrinet board Template additions: Template A = Zone 1 + Zone 2 Template B = Zone 1 + Zone 3
Partitioning Stage Adder Tree Creation Group Rotations into Logical Configurations Exploit Pixel Overlap to Design Adder Tree Template A Template B Domain Specific Environmentfor Template Matching Surround Template Bright Template Sets of 4 rotations 72 rotations Load Templates and Information Adder Tree Create VHDL for Dynamic FPGA
Sample Benefit:Variable Precision UCLA Mojave Surround Template Bright Template Images Correlation dependence on number of bitplanes used in calculation) 3 4 2 1 7 5 Benefits: Less area, reduced power 20% speedup in computation time/configuration 33% increase in throughput AT NO LOSS IN CORRELATION PEAK DETECTION!!
DSP Chips Only DSP With Micro- Accel Next Gen DSP Chips Only Next Gen DSP With Micro- Accel Current DSP Device Technology 1999 DSP Device Technology SHARC Chips Micro-Accelerators 603 0 46 8 Beamforming with STAP 64-bit Floating Point Arithmetic HH SHARC Chips Micro-Accelerators 242 0 18 8 * Sample ACS Benefit:Custom Precision Arithmetic Beam Forming with No STAP Processing 10 Targets Equally Spaced in Doppler Main Beam Clutter Obscures Center Targets Beamforming with STAP* 8-bit Mantissa Floating Point Arithmetic Beamforming with STAP* 4-bit Mantissa Floating Point Arithmetic * 32 Range Gates Used for Weight Generation
* C B Configuration A Selector * * MA * + MA + Key challenge:Reconfiguration Time A X B C X A B C Reconfiguration times taking 75% of processing time. Current reconfiguration times measured in msec are unacceptable for many applications… need nsec… 3 ORDERS OF MAGNITUDE IMPROVEMENT
ASICs uproc/DSPs { { { Configurable Logic (future, if not addressed) 4 orders of magnitude speedups are required in this area Key Challenge:Compilation times • From a high level language description to a working implementation • includes “place and route” times 1 hour 1 month 1 day 1 year seconds 100 101 102 103 104 105 106 107 { Configurable Logic (today)
Khoros/MatLab Application (U. Tenn, NWU, Colo St USC-ISI.) Estimator (Princeton U Honeywell.) C Synopsis, Princeton Precompiled ACS components (TSI Telesys, Annapolis MicroSystems) Compiler (Rice, U. Cinn, ..) Mapper (Rice, U. Tenn.,U. Cinn.) The ACS Challenge VHDL/Verilog Place & Route (National Semi-Conductor, Gatefield, Uwash, Lucent.) FPGA CPU FPGA
Insertion Opportunities Sonar Beamforming Ultra Wide- Band Coherent RF NUWC IR ATR NVL LANL Sandia ACS Researchers UCLA BYU SAR/ ATR Sandia Multi- dimensional Image Processing LANL ISI Component Developers Lockheed Martin SLAAC Developers Electronic Counter- measures Challenge Problem Owners Applications
DSP ACS Sensor Device Device Control Control Processor Processor Network Host Control Control Processor Processor ACS ACS Device Device SLAAC Architecture Host Control Network Interface Processor Processor Network Interface Processor Myricom L5 Baseboard SLAAC1 Board Myricom L4 Orca Board UCLA Board
* SAR Area Coverage Rate (sqnm / day @1 ft Res.) 1000 40,000 40X (FOA, Number of Target Classes 6 30 5X (Indexer, Ident.) Level / Difficulty of CC&D Low High 100X (Indexer) 10X (Ident.) Surveillance Challenge Problem Problem - SAR / ATR 40,000 sqnm / day @ 1 ft. Resolution * Corresponds to a data rate of 40 Megapixels / sec System Parameter Current Challenge Scale Factor Indexer, Ident.) 4 Orders of Magnitude problem scale… in 1 Order of Magnitude smaller volume!!
Circa 2000 . . . Network Enabled Surveillance Challenge Circa 1998 Circa 1997 ORCA based two-level board • Network based FPGA accelerator • LaNai L4/Lucent Orca 40K • 6U VME (4 nodes/slot) • Implementing Sandia indexing alg. • 1300 - 1600 matches/sec/node • (Motorola PowerPC@200 MHz can • do 700 matches/sec) 7 cu. ft. 140,000 cu. ft!! JSTARS SAR ATR Processor
ACS Challenge Problems • Surveillance Challenge Problem (Sandia National Lab) • IR Automatic Target Recognition: Tank Application (Night Vision Lab) • Sonar Adaptive Beamforming (Naval Undersea Warfare Center) • INFOSEC Separation Challenge (National Security Agency) • INFOSEC Architectures for Security (NSA) • Video: Face Recognition (NSA) • Video: Text Recognition (NSA) • Fault-tolerant/Low-power Applications (JPL) • RF Transient Signal Analysis (Los Alamos National Lab) • Plume Detection and Laser Spectral Analysis (LANL) www.ito.arpa.mil/ResearchAreas96/AdaptiveComputingSys.html
KEY TECHNOLOGICAL CHALLENGES Building Blocks Inadequate Verification Technology Immature/Risky Tough ToProgram Minimal Runtime Support ProblemFormulation
Roadmap Variable precision arithmetic Defense testbeds Runtime adaption M gates/chip Multimode adaptive radio ATR in 1 cu. ft. Point-of-use encryption Fault tolerance Sonar: adaptive beamforming Challenge problems FY97 FY98 FY99 FY00 FY01 20X ATR • IR-ATR demo • ACS bnchmrk • ATR/cu. ft. • Sonar proc • 100X ATR • ACS est tool Assessment & Ease of Use • STAP kernel • Reconfg mdl • JPEG demo • HW obj lib • Khoros • MatLab • Var prec lib • C cmplr-SUIF • Functional Prg Env • Image alg proto • Speculative exe ACS Software • NAPA 1000 • DSP/fpga bd • fpga/mem/risc • Heterog sys Granularity Hybrid • GP/fpga Fine • uscale FT chip • 400K/chip • 1.6M gate/bd 1 M gate chip 1st 400K
Performance benefits of hardware….. … Flexibility of software