Challenges and opportunities for FPGA platforms Ivo Bolsens Xilinx Research Labs
Thanks to • Bill Carter • David Eden • Erich Goetting • Alireza Kaviani • Bernie New • Cameron Patterson • Steve Trimberger • Tim Tuan Xilinx Confidential
Overview • FPGAs ride the tide • Opportunities • Challenges
ASICs buck the tide, FPGAs ride the tide • Process Technology • Performance • Architecture • Cost • Flexibility • Market trends
Moore’s Law: a tale of two numbers (what process people don’t tell you) [Figure: transistor cross-section (gate, source, drain, substrate) showing gate leakage through Tox and channel leakage; CD values 320nm, 240nm, 160nm, 80nm; Tox values 6.5, 4.5, 2.7, 1.3 nm]
Trend: Line Widths Smaller Than the Wavelength of Light [Chart: process geometry (micron, 0.1 to 0.7) and optical processing wavelength versus year, 1988 to 2002]
Painting a one cm line with a three cm brush… Courtesy: IBM
Gate Oxide • About 10 molecular layers of SiO2 in this 150nm example • In 90nm technology the oxide is about half as thick [Image labels: polysilicon gate, gate oxide, silicon crystal]
Xilinx FPGAs are ahead of the curve • Virtex-II FPGA to market 1 year earlier • Cu/Low-K • Xilinx is developing 90nm in 2002 [Chart: process technology feature size (nm, 350 down to 70) versus year (1997 to 2005), Xilinx compared with the SIA Roadmap]
Where are we today • XC2V8000 (350M transistors): 105K logic cells, 3Mb block RAM, 168 multipliers, 340 LVDS at 840Mb/s • XC2VP125: 125K logic cells, 10Mb block RAM, 556 multipliers, 442 LVDS at 840Mb/s, 24 MGTs at 3.125Gb/s, 4 PowerPC CPUs
FPGAs are leading Intel’s Roadmap Source: Intel
Gate count requirement for ASICs • FPGAs can address a very large part of the ASIC market today Source: IMS
Performance requirement for ASICs • FPGAs can address a very large part of the ASIC market today Source: IMS 2000
A Decade of Progress [Chart: capacity, speed and price on a log scale from 1x to 1000x, January 1991 to January 2001, annotated with XC4000, Spartan, Virtex & Virtex-E (excl. Block RAM) and Virtex-II (excl. Block RAM)]
The Cost/Volume Crossover [Chart: relative cost (0.1 to 1000, log scale) versus unit volume (10 to 1,000K, log scale), with one curve for ASIC cost and one for FPGA cost]
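The crossover idea on this slide can be sketched with a simple cost model. This is only an illustration: it assumes each option has a one-time NRE cost plus a constant unit cost, and every dollar figure below is hypothetical rather than taken from the slide.

```python
def per_unit_cost(nre, unit_cost, volume):
    """Amortized cost of one device: NRE spread over the run, plus unit cost."""
    return nre / volume + unit_cost


def crossover_volume(asic_nre, asic_unit, fpga_nre, fpga_unit):
    """Volume where ASIC and FPGA totals are equal, or None if they never cross."""
    if fpga_unit <= asic_unit or asic_nre <= fpga_nre:
        return None
    return (asic_nre - fpga_nre) / (fpga_unit - asic_unit)


# Hypothetical inputs, chosen only to show the shape of the two curves.
ASIC_NRE, ASIC_UNIT = 1_000_000, 10
FPGA_NRE, FPGA_UNIT = 0, 50

v = crossover_volume(ASIC_NRE, ASIC_UNIT, FPGA_NRE, FPGA_UNIT)
print(f"crossover at about {v:,.0f} units")

for volume in (10, 100, 1_000, 10_000, 100_000, 1_000_000):
    asic = per_unit_cost(ASIC_NRE, ASIC_UNIT, volume)
    fpga = per_unit_cost(FPGA_NRE, FPGA_UNIT, volume)
    cheaper = "FPGA" if fpga < asic else "ASIC"
    print(f"{volume:>9,} units: ASIC {asic:>12,.2f}  FPGA {fpga:>8,.2f}  -> {cheaper}")
```

Below the crossover volume the ASIC's NRE dominates and the FPGA is cheaper per unit; above it the ASIC's lower unit cost wins.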
Are Transistors Free? • 1 pin versus 10,000 transistors: 10,000X
Performance Scaling Source: ITRS
Localisation of storage and computing [Figure: arrays of compute (+) and store elements at wire lengths l and l/2, with the resulting scaling of Tconnect (nsec) and heat/area] Courtesy: IMEC
Post-PC Era: a mass market for one person [Chart: number of devices per human (0.01, 1, >100) from 1960 to 2010, moving from mainframe to PC to smart things; market requirements grow from compute power to +DSP, +communications, +ambient intelligence]
Electronics Industry Dynamics • New Products • Take less time to reach high volumes • Shorter Product Life Cycles • Many standards / More Interoperability [Chart: market size ($) versus time, showing a dramatic increase in new standards across successive products: smart cards (DES), cable decoders, digital VCR with custom features (pay-per-view), and residential gateway (broadband access, satellite/cable + digital VCR). Standards accumulate from NTSC alone to NTSC, DES, ATAPI, DBS, DOCSIS, HomePNA, HomeRF, HomePLUG, Bluetooth, Hiperlan2, DSL...]
Complex ASIC Design: The Shrinking Window of Innovation • Average iterations between design and layout = 20 (Source: Electronic Systems, Jan 99) [Pie chart of design effort: design authoring 20%, place and route 17%, synthesis 16%, simulation 14%, static timing analysis 7%, gate simulation 5%, floorplanning 5%, extraction 5%, transistor simulation 5%, interconnect analysis 3%, power analysis 3%]
Simpler/Faster Design Flows • 2:1 proven Time-to-Market Advantage • No silicon design or verification steps • More design flexibility through later design freeze [Diagram: ASIC flow: Spec, Design and Verification, Silicon Prototype, System Integration, Silicon Production. FPGA flow: Spec, Design and Verification, System Integration. The design freeze point is marked on each flow]
Today’s Product Lifecycle: Profit for first to market • 37% of new digital products were late to market • Entering the market first can result in up to a 40% greater total profit contribution over the product’s life vs. the #2 entrant [Chart: profit versus time, with reduced profit for latecomers]
Today’s Product Lifecycle: IRL extends product life in market • 37% of new digital products were late to market • Entering the market first can result in up to a 40% greater total profit contribution over the product’s life vs. the #2 entrant [Chart: profit versus time]
Virtex-II Pro PowerPC Technology: IBM PowerPC™ 405 RISC CPU • 32-bit RISC CPU, Harvard Architecture • 130nm CMOS with 1.5V Operation • 456 Dhrystone MIPS at 300MHz • 32 x 32-bit General Purpose Registers • Hardware Multiply / Divide • 5-Stage Execution Pipeline • 16KB D-Cache, 16KB I-Cache • Memory Management Unit (MMU) • High-Bandwidth Interface to Logic • Built-In Hardware Timers • Built-In JTAG Debug and Trace support • 3.8 sq mm = 1% of 2VP100 [Block diagram: I-Cache 16KB, fetch & decode, timers and debug logic, D-Cache 16KB, MMU, execution unit (32x32b GPR, ALU, MAC)]
High Performance [Bar chart, Dhrystone MIPS on a log scale from 100 to 1600: Altera Excalibur ARM 9: 220; Virtex-II Pro PowerPC 405: 456 (1 CPU), 912 (2 CPUs), 1824 (4 CPUs)]
“Low PowerPC”: 0.59mW/MIPS • Full-Custom IBM CPU Design • 1.5V 130nm CMOS Technology • Low-K Dielectric • IP-Immersion • 100mW = 1 LED indicator …or 169 MIPS! [Chart: power (mW, 0 to 400) versus performance (Dhrystone MIPS, 0 to 400)]
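The figures quoted on the PowerPC slides can be cross-checked with a few lines of arithmetic. This sketch assumes power scales linearly with delivered MIPS at 0.59mW/MIPS and that multiple CPUs add their Dhrystone ratings with no losses; both are simplifications and the function names are illustrative.

```python
MW_PER_MIPS = 0.59        # from the "Low PowerPC" slide
DMIPS_AT_300MHZ = 456     # single PowerPC 405 at 300MHz
CLOCK_MHZ = 300


def mips_for_budget(power_mw):
    """MIPS available within a power budget, assuming linear scaling."""
    return power_mw / MW_PER_MIPS


def power_for_mips(mips):
    """Power in mW needed for a given MIPS rating, assuming linear scaling."""
    return mips * MW_PER_MIPS


def aggregate_dmips(n_cpus):
    """Combined rating of n CPUs, assuming ideal (loss-free) scaling."""
    return n_cpus * DMIPS_AT_300MHZ


print(f"100 mW buys about {int(mips_for_budget(100))} MIPS")
print(f"one CPU at full speed draws about {power_for_mips(DMIPS_AT_300MHZ):.0f} mW")
print(f"efficiency: {DMIPS_AT_300MHZ / CLOCK_MHZ:.2f} DMIPS/MHz")
for n in (1, 2, 4):
    print(f"{n} CPU(s): {aggregate_dmips(n)} Dhrystone MIPS")
```

Under these assumptions the 100mW budget of one LED indicator corresponds to the 169 MIPS quoted on the slide, and 1, 2 and 4 CPUs give the 456, 912 and 1824 Dhrystone MIPS shown in the bar chart.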
IP-Immersion: Embed multiple IP blocks of arbitrary shape with high-bandwidth connectivity to FPGA core logic, memory & I/O • Technologies Enabling IP-Immersion • Active Interconnect™ Segmented Routing • Metal ‘Headroom’ [Cross-section: silicon substrate, poly, Metal 1 through Metal 9, with an advanced hard-IP block (e.g. PowerPC CPU) in the lower layers]
System Architecture Options • “Logic-Centric Architecture” • PowerPC Executes Entirely out of Cache • No FPGA Logic, Memory, or I/O Used • 10-20 Pages of C-Code or More • Use as Complex Algorithmic Engine • Web Server • Encryption/Decryption • Packet Processor • “CPU-Centric Architecture” • PowerPC forms Heart of Embedded System • On & Off-Chip Peripherals • External Interfaces • e.g. PCI, 3GIO, Gb Ethernet, ZBT SRAM • CoreConnect™ On-Chip Bus • Ties System Together • Peripherals implemented in FPGA Logic • Typically Runs Embedded OS [Diagrams: PPC blocks with external devices and external interfaces for each option]
The Virtex-II Pro Advantage: PowerPC with Application-Specific Hardware Acceleration [Diagram: a concatenated FEC engine (Viterbi, interleaver, Reed-Solomon) with its code stack (C++). Traditional processing: control tasks, Viterbi, interleaver and Reed-Solomon all run on the PowerPC processor and RAM. XTREMEProcessing™ on Virtex-II Pro: Viterbi, interleave and Reed-Solomon run in HW acceleration while the PowerPC handles control tasks. Horizontal axis: time]
HW/SW Interfacing • Provides Specialized Connectivity Between PowerPC & FPGA Logic • Dual-Port BlockRAM Memory • CPU & Logic Each Own 1 Port • High-Bandwidth 6.4Gb/sec • Low-Latency • Non-Caching • Designed for Communications Data Processing • Enables PowerPC & FPGA Logic to Work Together on Complex Problems [Block diagram: PowerPC core (I-Cache 16KB, fetch & decode, timers and debug logic, D-Cache 16KB, MMU, execution unit with 32x32b GPR, ALU, MAC) connected through BlockRAMs to acceleration logic over 6.4Gb/sec links]
APU Controller • Micro-controller style interface to fabric for control plane applications • Benefits: • Up to 10x faster than memory mapped interface • Saves PLB bandwidth for code execution • Minimizes pipeline stalls [Block diagram: processor block containing the 405 core, PLB and APU controller, attached to a hardware coprocessor]
Creating Complete Communications Solutions • Upper Layers on PowerPC (ftp, telnet, rlogin, mail, etc.) • TCP/IP Stack on PowerPC • Link Layer in FPGA Logic (GbE MAC) • RocketIO is PHY (1000Base-SX/LX) [Diagram: Gb Ethernet (1000BaseLX/SX/CX) protocol stack: upper layers, TCP, IP, MAC, PHY]
InfiniBand Example: CPU Makes Communications Practical, Easier, & Cheaper • InfiniBand TCA built with CPU + fabric …or built with fabric only • CPU Based Solution: 8 Times Less Area Sources: Intel, Xilinx
Configurable Platform [Diagram: four numbered steps: Create System Architecture, Specify System Architecture, Define Addresses, Configure Peripherals]
The MicroBlaze™ High Performance Soft CPU • PPC 405: 32-Bit RISC, 130nm Process, 300+ MHz Core, 420 D MIPS • CoreConnect™ Technology [Block diagram: two PPC 405 cores and several MicroBlaze cores with a UART, interrupt controller and arbiter on a local OPB bus]
Incremental Design lessens the impact of design changes • “Next Generation” technology • Easy set-up through floorplanning along HDL hierarchy boundaries • Changes only affect the module that was changed • The remainder of the design stays locked and intact • Timing repeatability • Preserves routing • Faster turnaround for localized design changes
Partial Reconfigurability: FPGA Flexibility for the Field • Re-program part of an FPGA while it’s still running • Virtex-II and Virtex-E [Diagram: fixed logic and PR logic regions separated by user-definable boundaries]
System Exploration [Block diagram of the generic design. Tx path: payload assembly, payload qualify, data format, line coding. Rx path: payload buffer, payload quality, data alignment, line decoding. Also shown: bus, system interfaces, payload processing]
Traditional Architecture [Block diagram: the generic design (Tx: payload assembly, payload qualify, data format, line coding; Rx: payload buffer, payload quality, data alignment, line decoding) mapped onto a Motorola PowerQUICC MPC860 system: µP bus, U-Bus, CPM, processor, RAM, FLASH, EEPROM, memory interface, AAL5 processor, G704 framer, G703 LIU, payload processing, other peripherals, PCI bus, PCI bridge, system interfaces] CPM = Communications Processor Module
Traditional Architecture [Same block diagram as the previous slide, with the data direction marked] CPM = Communications Processor Module
Optimized Architecture [Block diagram: the generic design (Tx: payload assembly, payload qualify, data format, line coding; Rx: payload buffer, payload quality, data alignment, line decoding) mapped inside the FPGA boundary: µP bus, PowerPC processor, MicroB processor, dual-port block RAM, memory interface, G704 framer, G703 LIU, payload processing, fast I/F, FIFO, other peripherals, PCI bus, PCI bridge, system interfaces; external RAM, FLASH and EEPROM]
Interconnect and power Source: Bill Daly
Interconnect and performance Source: Bill Daly
Power Analysis • Typical design • 5.9µW/CLB/MHz [FPGA00] • Fabric power is ~69% of total power • 2V6000 example: 5.9µW/CLB/MHz, 8448 CLBs, 100MHz, fabric at 69% of total: about 7.5W
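The 2V6000 estimate can be reproduced as a short calculation. This sketch assumes the intended reading is fabric power = per-CLB coefficient × CLB count × clock frequency, with total power obtained by dividing by the 69% fabric share; the operators did not survive in the slide text, so that reading is an assumption.

```python
def fabric_power_w(uw_per_clb_mhz, clbs, mhz):
    """Fabric dynamic power in watts from a per-CLB, per-MHz coefficient in microwatts."""
    return uw_per_clb_mhz * 1e-6 * clbs * mhz


def total_power_w(fabric_w, fabric_fraction):
    """Total power if the fabric accounts for the given fraction of it."""
    return fabric_w / fabric_fraction


# Figures quoted on the slide for the 2V6000.
fabric = fabric_power_w(5.9, 8448, 100)
total = total_power_w(fabric, 0.69)
print(f"fabric: {fabric:.2f} W, total: {total:.2f} W")
```

Under this reading the fabric draws about 5W and the total comes to roughly 7.2W, slightly below the 7.5W printed on the slide, so the slide figure may be rounded or may use slightly different inputs.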
Dynamic Power • Normalized to 2001 • Best fit is a quadratic trend line • Predicts 5X by 2007 [Data points: 1996: 4000EX, 1997: 4000XL, 1998: 4000XV, 1999: Virtex, 2000: Virtex-E, 2001: Virtex-II]
Static Power • Normalized to 2001 • Best fit is a power trend • Predicts 100X by 2007 • Future data points projected using linear trend for 1/VTH [Chart: 1996 to 2007]
Static versus Dynamic
The Age of Accumulation • Gates, Routing: XC2000 • Special Arithmetic Functions: XC4000 • Memory: XC4000 • High Performance I/O: Virtex • System Clock Management: Virtex • uProc.: Virtex-II Pro • Mixed Signal FPGA