Delve into the FPGA market in 2005 and future projections. Explore power consumption, packaging, design methodology, and hard cores development in just 20 minutes. Understand the landscape dominated by Xilinx and Altera, with insights on innovation and market maturation. Discover the growth potentials, challenges, and opportunities in the booming $41 billion market.
FPGAs in 2005 and Beyond… Peter Alfke Xilinx, Inc. August 2005
Agenda • The Market • Reducing Power Consumption • Packaging for Signal Integrity • Fast and Efficient Hard Cores • Design Methodology • The Future …and all this in 20 Minutes
The Programmable Marketplace, Q1 Calendar Year 2005 [Figure: two pie charts of market share. PLD Segment: Xilinx, Altera, Lattice, Actel, QuickLogic (2%), Other (2%). FPGA Sub-Segment: Xilinx, Altera, All Others.] Two dominant suppliers, indicating a maturing market. Source: company reports; latest information available, computed on a 4-quarter rolling basis.
PLD Market Share [Figure: stacked bar chart of market share (%) by calendar year, 1998 to 2004, for Xilinx, Altera, and All Others. Total market size by year: $2.1B, $2.6B, $4.1B, $2.6B, $2.3B, $2.6B, $3.1B.] Source: Gartner Dataquest
A Maturing Market • Dominated by two players, Xilinx and Altera • With 51% and 32% share = 83% combined • Remaining players scramble for niches • All non-dedicated players have given up: Intel, T.I., Motorola, NSC, AMD, Cypress, Philips… • Late-comers have been absorbed or failed: Dynachip, PlusLogic, Triscend, SiliconSpice (absorbed); Chameleon, Quicksilver, Morphics, Adaptive Silicon (failed) The pace of innovation is set by the leaders
Rapidly Rising Cost of IC Design… [Figure: stacked bar chart of design cost ($M, axis 0 to 26) by feature dimension and transistor count: 0.35 µm (2M), 0.25 µm (5M), 0.18 µm (20M), 0.13 µm (40M), 90 nm (80M). Cost components: Architecture, Verification, Physical, Validation, Prototype.] …makes most ASIC solutions too expensive. Source: IBS: MS-FSA3.35
Mainstream Requirements • Versatility, high performance, and low cost • Popular sub-functions @ ASIC performance and cost • User-friendly and capable tools • Many available cores, helpful tech support • Easy (partial) re-programmability • Signal integrity on the pc-board • Compatible I/O levels and standards • Many size, speed, temp and package options • Small-quantity part availability for fast prototyping FPGAs replace ASICs and ASSPs
Low-Priority Niches • Single-chip, non-volatile, instant-on • is left to CPLDs, usually only for small systems • One-Time Programmable (OTP) • antifuse technology, limited appeal for aerospace • Ultra-low power for battery operation • conflicts with high performance (high leakage current) • Limited security, without using encryption • Antifuses and Flash are not meant for serious protection …of marginal interest to the main players
Unlimited Growth Opportunities in a $41 Billion Market • ASICs $15B • ASSPs $15B • PLD Market $5B • Embedded Processing $3B • Very High Performance DSP $3B Market Data Source: Forward Concepts, Gartner Dataquest Forecast Database (May ’05), iSuppli Forecast Database (Mar ’05); 2008 Projection
Fab-Less is a Winner… • FPGAs require leading-edge technology to overcome inefficiencies relative to ASICs and ASSPs in transistor count, delay, and extra I/O flexibility • Leading-edge fabs are expensive ($2B to $3B) • and become obsolete in a few years • Fabs like TSMC and UMC are very profitable, pioneering aggressive technology and spreading the risk • FPGA companies do R&D, design, test, and marketing Provides stability, fast boom-time growth, and survival in a recession!
Four Pillars of Progress • IC technology 90…65…45 nm • lower defect density, less leakage current • Chip architecture and CMOS circuitry • Software and Intellectual Property • cores • Innovative systems design / Applications • Massive parallelism in DSP, etc Much more than just “Moore’s Law”
IC Technology • 90 nm today, 65 nm 2006, 45 nm later • Moore’s Law is still alive for years to come • Low defect density achieves high yield, low cost • 3.3-V compatibility and tolerance are getting problematic • Thick oxide for I/O, thin oxide for logic fabric • Thin oxide increases performance, but also leakage • Medium oxide ideal for config. cells, pass transistors Reducing leakage current is a major goal
Total Power Consumption • Battery-operated systems: current consumption • Concerned about operating life per battery charge • Static current usually dominant. Very temp. dependent. • 90-nm technology is not a good match for ultra-low power. • Plug-in-the-wall systems: junction temperature • Primarily concerned about heat removal, heat sinks, air flow • Dynamic current usually dominant, very voltage dependent • Sleep mode, clock gating, partial shut-down, reduce Vcc Power consumption is the leading concern at 90-nm
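The two concerns on this slide can be put into rough numbers. The sketch below is illustrative only: the battery capacity, supply current, ambient temperature, and thermal resistance are assumed values, not Virtex-4 data, and the junction-temperature relation is the usual first-order package model (ambient temperature plus power times junction-to-ambient thermal resistance).

```python
def battery_life_hours(capacity_mah: float, current_ma: float) -> float:
    """Operating life per charge for a battery-operated system."""
    return capacity_mah / current_ma


def junction_temp_c(ambient_c: float, power_w: float, theta_ja_c_per_w: float) -> float:
    """First-order junction temperature for a plug-in-the-wall system."""
    return ambient_c + power_w * theta_ja_c_per_w


# Assumed example values (hypothetical, for illustration only).
life = battery_life_hours(capacity_mah=2000.0, current_ma=250.0)
tj = junction_temp_c(ambient_c=50.0, power_w=5.0, theta_ja_c_per_w=10.0)

print(f"battery life: {life} h")
print(f"junction temperature: {tj} C")
```

Halving the current doubles the battery life, while halving the power halves the temperature rise above ambient, which is why the two kinds of system watch different quantities.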
Explosive Growth in Leakage [Figure: normalized power (log scale, 10^-6 to 10^2) versus year (1990 to 2020) and technology node (500, 350, 250, 180, 130, 90, 65, 45, 22 nm), with one curve for dynamic power and one for static power (leakage).] Source: International Technology Roadmap for Semiconductors (ITRS) 2001, 2002. Courtesy: “Moore’s Law Meets Static Power,” Computer, December 2003, IEEE Computer Society
Virtex-4 Drives Power Lower: Virtex-4 Power Advantages • Significant power reduction compared to other FPGAs • Static power reduced by 40% from previous generation through innovations in process and circuit design • Less than half of competing 90nm FPGA’s leakage current • Excellent thermal package characteristics • Dynamic power reduced by 50% at a given frequency • 20% less power for designs running 50% faster • Embedded functions reduce power dramatically, by a factor of 5 to 20 • Innovative tools for system power estimation Power consumption is a major design consideration
ASMBL Using Flip-Chip • Application-Specific Modular BLock Architecture • Groups specific circuit blocks in dedicated columns • Logic, DSP, BRAM, Clocking, DCMs, I/O, MGTs, PowerPC, Configuration • I/O columns distributed throughout the device (Flip-Chip)
Three Virtex-4 Families • Application-Specific Modular Block Architecture makes it easier to create sub-families • LX has logic, BlockRAMs, DSP-Blocks, I/O • SX has more DSP Blocks and BlockRAMs, less logic • FX adds powerful system features: • PPC, Ethernet controller, 11 Gbps transceivers Virtex-4 = eight ‘LX, three ‘SX, six ‘FX circuits: 17 family members available in 2005
Covering a Wide Range [Figure: throughput versus arithmetic performance, showing the regions covered by the FX, LX, and SX families across network processing, scientific processing, and supercomputing.]
Virtex-4 Package Improvements • “Sparse Chevron” pinout arrangement: tight coupling of outputs with Vcc and Ground • Multiple power planes inside the package • Low-inductance capacitors in the package
Outputs Tightly Coupled to Distributed Vcc and Ground • 1000 fast I/Os demand attention to signal integrity • Careful line termination • reduced inductive crosstalk or ground-bounce from SSO = simultaneously switching outputs • Each I/O pin must be surrounded by Vcc/Ground • Small current return loop = low crosstalk Virtex-4 packages reduce crosstalk 83%
State-of-the-Art Packaging [Figure: pin maps comparing the Virtex-4™ FF1513 “SparseChevron™” package with the traditional Xilinx FF1517 package (Virtex-II Pro). Legend: Vcco, GND, Vccint, Vccaux, I/O.]
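Reduced Inductance Through Better GND & Power Coupling • Loop area A = height × distance (h × d) • Inductance L is proportional to A • Smaller current loop area leads to reduced PCB inductance (L) [Figure: power/ground current loops of height h and spacing d, compared with another FPGA’s pin-out.]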
Breakout of Total Mutual Inductance [Figure: cross-section from the die bumps (175 µm bump pitch) through the package balls (1 mm ball pitch) to PCB signal layers between ground planes, annotated 0.15 nH, 1.16 nH, 2 nH, and 15.6 nH.] • 82% of noise is affected by the pin-out and created in package balls and PCB vias • The 15.6 nH contribution is reduced to 4.9 nH with the Xilinx SparseChevron pin-out (> 3x improvement)
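The percentages on this slide can be checked with a few lines of arithmetic. This sketch assumes that the four inductance values add up to the total, and that the 15.6 nH term is the package-ball and PCB-via contribution that the SparseChevron pin-out reduces to 4.9 nH; the slide's figure does not survive in the transcript, so that mapping is an inference.

```python
# Mutual-inductance contributions quoted on the slide, in nH.
contributions_nh = [0.15, 1.16, 2.0, 15.6]

# Assumption: the largest term is the package balls plus PCB vias.
balls_and_vias_nh = 15.6
sparse_chevron_nh = 4.9

total_nh = sum(contributions_nh)
share = balls_and_vias_nh / total_nh
improvement = balls_and_vias_nh / sparse_chevron_nh

print(f"total: {total_nh:.2f} nH")
print(f"share set by pin-out: {share:.0%}")
print(f"improvement factor: {improvement:.2f}x")
```

Under these assumptions the share comes out at about 82% and the improvement at a little over 3x, consistent with the two figures quoted on the slide.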
Dedicated Circuits in FPGAs • “Hard” cores offer density, speed, lower power • Equal to 90-nm ASICs, but far less expensive • Expandable, pipelined Multiplier/Accumulator • Dual-ported BlockRAM with FIFO controller • ChipSync I/O serializer/deserializer + IDELAY • Multi-Gigabit transceivers, 0.6 to 11 Gbps • PowerPC µProcessor and Ethernet controller Dedicated circuits provide a big performance boost
Multiplier/Accumulator: Full Custom Design Results in Higher Performance • Scalable 500 MHz performance is impossible with standard-cell libraries and a standard-cell design flow • Pipeline registers enable 500 MHz performance • Integrated cascade routing enables scalable performance • Arithmatica™ Parallel Counter: 20% faster performance and uses less area • Arithmatica™ A+Adder: 20% faster than any other implementation • 2x the performance of Virtex-II Pro
Fast and Flexible BRAM • Enhanced architecture for 500 MHz performance, pipelined in & out • Two totally independent ports: read/write, read/read, write/write • “Read-previous-data” during write • Built-in Hamming Error Correction • 2 BlockRAMs can be combined • Optional 500 MHz FIFO logic supports fully asynchronous clocking
FIFO Controller in Each BRAM • FIFOs are ideal for crossing clock domains • but fast asynchronous design is tricky and error prone • (ugly decoding glitches and metastability problems) • Xilinx has extensive experience in FIFO design • We put a high-performance, reliable, proven asynchronous FIFO controller into each BlockRAM Tested for >10^14 “going-empty” cycles @ 500 MHz
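One reason asynchronous FIFO design is error prone is that a binary address counter can change several bits at once, so a pointer sampled from the other clock domain mid-transition may decode to a value that never existed. A common textbook remedy is to pass the pointers across the domains in Gray code, where each increment changes exactly one bit. The sketch below only illustrates that property; the slide does not say how the BlockRAM FIFO controller is implemented, and the 16-entry depth is an assumed example.

```python
def bin_to_gray(n: int) -> int:
    """Convert a binary count to its reflected Gray code."""
    return n ^ (n >> 1)


def bits_changed(a: int, b: int) -> int:
    """Number of bit positions that differ between two values."""
    return bin(a ^ b).count("1")


DEPTH = 16  # assumed FIFO depth; must be a power of two for clean wrap-around

for addr in range(DEPTH):
    nxt = (addr + 1) % DEPTH
    # Every Gray-coded step, including the wrap from 15 to 0, flips one bit.
    assert bits_changed(bin_to_gray(addr), bin_to_gray(nxt)) == 1

# By contrast, the binary step from 7 to 8 flips four bits at once.
print(bits_changed(7, 8), bits_changed(bin_to_gray(7), bin_to_gray(8)))
```

With a single changing bit, a pointer sampled during a transition resolves to either the old or the new value, never an unrelated one, which avoids the decoding glitches the slide mentions.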
System Synchronous vs. Source Synchronous Clocking • System-synchronous: one common clock arrives “simultaneously” • Source-synchronous: a dedicated clock for each datapath • Clock delay is irrelevant if it is the same as the data delay
Advanced Parallel I/O Interfaces • Universal connectivity • ChipSync™ technology • SerDes on every pin • XCITE DCI termination • Extreme performance • Up to 1 Gbps LVDS • Up to 600 Mbps single-ended • Widest set of supported standards • PCI, PCI-X, SFI-4, HSTL, SSTL, LVCMOS, LVTTL… • Support for 26 electrical standards • Vcco = 3.3 V, 2.5 V, 1.8 V,
ChipSync™ I/O on Every Pin • Pre-engineered solution for source-synchronous interfacing • Controllable delay line • Embedded in every I/O • Key advantages • Easier design • Higher performance • Resource savings [Figure: pre-designed, built-in SSIO logic blocks, shown with SPI 4.2 and DDR memory interfaces. I/O SERDES: frequency division, serialize/deserialize. Precision Delay: bit/word align, DPA. I/O Clocking: I/O clocks, regional clocks, clock-capable I/Os.]
Multi-Gigabit Transceivers • Virtex-4 RocketIO™ • Full-duplex serial transceiver blocks with integrated SERDES and Clock and Data Recovery • 622 Mbps to >10 Gbps I/O • Widest speed range • Compatible with Virtex-II Pro • Supports chip-to-chip, backplane, chip-to-optics SONET
Microprocessor & Ethernet • Ethernet Media Access Controller • EMAC with PowerPC • PowerPC runs Protocol Stacks • EMAC used to debug and control PPC or • PPC to configure and control the EMAC • EMAC without PowerPC • Any application with control state machine or soft processor in the FPGA fabric • Two EMACs per PowerPC • Redundancy • Bridging Hard Ethernet Controller saves 6000 slices [Figure: PowerPC 405 block with two hard EMACs.]
Designing with FPGAs • Design architecture-specific, not FPGA-generic • Most recent improvement comes from hard cores • Generic design approach sacrifices speed and cost • Pipeline: it gives a free performance boost • Avoid cycle-to-cycle timing dependencies • Use parallel structures for highest performance • Use synchronous design, global clocks Right design methodology boosts performance
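The pipelining advice can be illustrated with a simple timing model: the clock period is set by the slowest register-to-register path, so splitting one long combinational path into shorter stages raises the achievable clock rate at the price of extra cycles of latency. The delays below are assumed round numbers, and the model ignores register setup and clock-to-output overhead.

```python
def fmax_mhz(stage_delays_ns: list[float]) -> float:
    """Maximum clock rate, limited by the slowest pipeline stage."""
    return 1000.0 / max(stage_delays_ns)


def latency_cycles(stage_delays_ns: list[float]) -> int:
    """Clock cycles for one result to pass through all stages."""
    return len(stage_delays_ns)


# Assumed example: an 8 ns combinational path, then the same logic
# split into four balanced 2 ns stages.
unpipelined = [8.0]
pipelined = [2.0, 2.0, 2.0, 2.0]

print(fmax_mhz(unpipelined), latency_cycles(unpipelined))
print(fmax_mhz(pipelined), latency_cycles(pipelined))
```

In this model the pipelined version accepts a new input four times as often. The boost is "free" in the sense that FPGA logic cells typically already include a flip-flop, so the added registers often cost little extra area.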
2006 and Beyond… • Moore’s Law will lead us to 45 nm and lower cost • More hard cores offer performance, density, and low power consumption equal to the best ASICs • Emphasis on massive parallelism in DSP, etc. • Dynamic re-configuration, and SEU-hardening. • Innovative solutions to reduce leakage current • Plus a few surprises… Unlimited FPGA growth opportunities
FPGAs have come a long way… Thank you!
Virtex-4 Drives Power Lower: Benefits of Reduced Power • Reduced thermal concerns • Smaller or no heat sinks needed • Simpler system thermal design (airflow, fans) • Easier power supply design • Smaller supply circuitry • Reduced components • Less PCB space • Lower cost power system • High-end power supplies cost $0.50 to $1.00 per watt • Higher system reliability [Figure: Non-Virtex-4 versus Virtex-4 comparisons showing reduced need for heat sinks and smaller power supplies.]
XCITE Digitally Controlled Impedance • 3rd generation DCI • Series, parallel, differential termination • Temperature / voltage compensation • Fewer resistors on-board • Easier PCB design • Termination at source or load • Works in conjunction with I/O standards • Examples: HSTL, SSTL, etc. Many Selectable Options
Increased Functionality with Dramatic Power Reduction • Challenges • Static power (leakage) grows exponentially with process generations • Dynamic power grows with frequency (P = C·V²·f) • Virtex-4 cuts power by 50% • Measured 40% lower static power with Triple-Oxide technology • 90 nm: 50% lower dynamic power (lower core voltage + less capacitance) • Up to 10x lower dynamic power with integrated hard IP (fewer transistors per function) [Figure: power consumption versus frequency, comparing 130 nm FPGAs with Virtex-4 and showing a 50% reduction.]
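The dynamic-power relation on this slide (power proportional to capacitance times voltage squared times frequency) shows why a lower core voltage and less capacitance together can roughly halve dynamic power at a given frequency. The sketch below uses assumed, illustrative values: a core voltage drop from 1.5 V to 1.2 V and a 20% reduction in switched capacitance. Neither number comes from the slides.

```python
def dynamic_power(c: float, v: float, f: float) -> float:
    """Dynamic power from switched capacitance, supply voltage, and frequency."""
    return c * v**2 * f


# Assumed values, normalized so the older process has C = 1 and f = 1.
old = dynamic_power(c=1.0, v=1.5, f=1.0)
new = dynamic_power(c=0.8, v=1.2, f=1.0)

ratio = new / old
print(f"new/old dynamic power at the same frequency: {ratio:.3f}")
```

Because voltage enters squared, the assumed 20% voltage reduction alone cuts power to 64%, and the capacitance reduction brings the ratio to about 0.51.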
Power Distribution Improvements: Planes and Decoupling • New on-package low-intrinsic-inductance capacitors • Low inductance capacitor connections to package power planes • Significantly improve high frequency response for package power system • Z0 integrity: more robust controlled impedance transmission lines • Better reference plane continuity and lower effective plane inductance • Improved return path quality of these planes [Figure: package cross-section showing the decoupling capacitor pack, die, and package substrate.]