Briefing: Independent NASA Test of RTSX-SU FPGAs
Summary of Industry Tiger Team Results
Aerospace Actel Team
Team Lead: Larry Harzstark
2-16-05
Problem Statement
• Several Contractors Reported Failures of a New Generation (RT54SX-S/A54SX-A) of Actel FPGAs Subsequent to Successful Programming
• These Devices Are Ubiquitous and Are Used by Nearly Every Program (SMC, NRO, NASA, …) With Recently Built Hardware
• Rapid Identification of Root Cause and Remediation Needed to Keep Programs on Schedule
Planned Activity
• Collect Data on FPGA Failure Conditions and Successful Operating Conditions
• Determine Root Cause of Failures
• Perform Destructive Physical Analysis (DPA) of Normal and Failed Devices
• Gain Physical Understanding of Failure Mechanism
• Establish a Better Reliability Database and Determine Effect of Electrical Stress on Reliability
• Screen Out Infant Mortality
• Understand Long-Term Reliability
• Develop a Potential Screen (If Possible)
• Assist Programs in Assessment of Fly-as-is Risks
Phase 1: MEC Old Algorithm
• Original Hypothesis
• Failures Caused by Users Operating Parts Outside of Actel Specification
• Small Sub-population of Antifuses Fail Quickly Under Electrical Overstress
• All Contractors Were Operating Devices With Out-of-spec Transients
• All Reported Failures Occurred in First 200 Hours
• Aerospace-led Tiger Team Developed a Test Plan
• Perceptive Test Vehicle for Antifuse Degradation
• Measure Failure Rate With Increasing Electrical Overstress
• NRO Program Sought Early Results and Sponsored Subset of Tests - Testing Began 27 May
• All Testing Performed by Actel
• Aerospace Developed New DPA Techniques to Understand Failure Mechanism
Colonel’s MEC Test Plan: Old Algorithm, Started 27 May
[Test-flow diagram; column labels: SSU-Stressed, Stress-free]
• 4B1 @ 25 MHz: 17 I/Os switching, 12.5% I/O toggle rate, -1V undershoot
• 4B2 @ 50 MHz: 70 I/Os switching, 50% toggle rate, -2V undershoot
• Flow labels: 550 parts; Project 7, 600 hrs; Project 4B1; Project 4B2, 2000 hrs
• Project 7 – 30 failures observed in 600 hours
• Project 4B1 – 5 failures observed in 2000 hours
• Project 4B2 – 3 failures observed in 2000 hours
Results proved original hypothesis of user overstress as the cause of failures not valid
Colonel’s Project 4 Failures
• Failures consistent with continuation of Project 7 Weibull curve
• No evidence that stress (SSU) caused any additional failures
* After 600 Hours in Project 7
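The "continuation of the Weibull curve" statement can be read as a conditional probability: parts that survived 600 hours in Project 7 keep failing at the rate the same Weibull curve predicts for hours beyond 600. A minimal sketch of that reading follows. The shape and scale values are hypothetical placeholders, not fitted to any data in this briefing, and the function names are illustrative.

```python
from math import exp

def weibull_cdf(t, shape, scale):
    """Fraction of parts expected to have failed by time t (hours)."""
    return 1.0 - exp(-((t / scale) ** shape))

def conditional_failure(t_prior, t_extra, shape, scale):
    """P(fail within t_extra more hours | survived t_prior hours)."""
    surv_prior = 1.0 - weibull_cdf(t_prior, shape, scale)
    surv_total = 1.0 - weibull_cdf(t_prior + t_extra, shape, scale)
    return 1.0 - surv_total / surv_prior

# Hypothetical parameters, chosen only for illustration.
# A shape below 1 models infant mortality (a decreasing hazard rate).
SHAPE = 0.5
SCALE = 1.0e6  # hours

fresh = weibull_cdf(2000, SHAPE, SCALE)
after_600 = conditional_failure(600, 2000, SHAPE, SCALE)

print(f"Fresh parts, 2000 hrs:          {fresh:.4f}")
print(f"After 600 hrs, 2000 more hrs:   {after_600:.4f}")
```

With a shape parameter below 1, parts that already survived 600 hours are expected to fail less often over the next 2000 hours than fresh parts would, which is why a lower failure count in Project 4 does not by itself indicate a different mechanism.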
Phase 2: New Algorithm
• Actel Released Modified Programming Algorithm on 19 May to Eliminate the Early Infant Mortality Failures
• Actel Believed Old Algorithm Allowed Too Much Power to Be Delivered to Antifuse During the Initial Breakdown Phase of Programming
• Actel Had Performed Experiments Using Their Standard Qualification Test to Compare Old and New Algorithms Using A54SX72A (2X dynamic antifuse over industry RT54SX32S testing)
• Old Algorithm: 16 Failures in 623 Devices in 168 Hours
• New Algorithm: 0 Failures in 705 Devices in 1000 Hrs
• Actel Subsequently Used More Perceptive Project 7 Test (A54SX72A)
• 5 Failures in 231 New-algorithm Devices in 124 Hours
• All Low-current Type Antifuses
• Tiger-team Tests Observed Failures in Both High-current and Low-current Antifuses
• DPAs Showed Same Characteristics As Old-algorithm Failures
• Lower Failure Rate Than Old-algorithm, but Still Unacceptable
MEC New algorithm not recommended by Aerospace due to high failure rates
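The failure counts on this slide can be compared using an exact (Clopper-Pearson) one-sided upper bound on the failure fraction. This is a minimal sketch, assuming each device is an independent pass/fail trial and ignoring the different test durations (168, 1000, and 124 hours), so the three results are not directly comparable. The function names and the 90% confidence level are illustrative choices, not taken from the briefing.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))

def upper_bound(failures, n, conf=0.90):
    """One-sided exact upper bound on the true failure fraction.

    Solves binom_cdf(failures, n, p) = 1 - conf for p by bisection;
    the CDF decreases monotonically as p increases.
    """
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if binom_cdf(failures, n, mid) > 1.0 - conf:
            lo = mid  # bound lies above mid
        else:
            hi = mid
    return hi

# Counts as quoted on the slide: (label, failures, devices)
tests = [
    ("Old algorithm, qualification test", 16, 623),
    ("New algorithm, qualification test", 0, 705),
    ("New algorithm, Project 7 test", 5, 231),
]

for label, failures, n in tests:
    observed = failures / n
    bound = upper_bound(failures, n)
    print(f"{label}: observed {observed:.2%}, 90% upper bound {bound:.2%}")
```

Zero observed failures still leaves a nonzero upper bound, which is one reason a more perceptive test vehicle could later expose failures that the qualification test did not.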
Tiger-Team MEC Parts Testing, Started 31 Aug
[Test-flow diagram; labels: Vcca=3.0, 500 parts]
• 4B2 Old Algo, 25 °C
• 4B2 Old Algo, 85 °C
• 4B2 New Algo, 25 °C
Electrical test points at 0, 1, 25, 49, 168, 500, 1000, 1500, … hours
Tiger Team Project 4 Test Failures
New algorithm did not eliminate failures
Statistical Analysis
**Note: Estimates for the old algorithm are based on the Tiger Team Project 4 tests (which have a much smaller sample size than the Colonel’s test) and could be as large as 8.8% according to the Colonel’s data.
Phase 3: UMC Foundry
• Actel Announced UMC Version of Radiation Tolerant FPGAs on 19 Jul
• Several Years of Successful UMC Experience With Commercial FPGA Designs
• Millions of Parts Shipped
• Received MIL-Q Certification From DSCC on 1 Sep
• Over 1.3M Device-hours Without Antifuse Failure Using Actel Qualification Test Vehicle
• Longest Test Time Was 2000 Hours
• Actel Has Tested With No Failures Using Tiger-team P7 and P4 Test Vehicles (see latest Actel data)
• Over 250K Device-hours on 666 Devices
• Longest Test Time Is 1268 Hours on 100 Devices
UMC test results very promising
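Zero failures over a known number of device-hours gives an upper bound on the failure rate, not a rate of zero. The sketch below assumes a constant (exponential) failure rate, which is a simplification: the earlier slides describe infant-mortality behavior, where the rate changes over time. The 90% confidence level and the function name are illustrative choices, and the device-hour totals are the "over" figures from the slide, so the real bounds would be slightly tighter.

```python
from math import log

def zero_failure_rate_bound(device_hours, conf=0.90):
    """Upper bound on a constant failure rate (per device-hour)
    when no failures were seen in `device_hours` of testing.

    From P(no failures) = exp(-rate * hours) = 1 - conf.
    """
    return -log(1.0 - conf) / device_hours

# Device-hour totals as quoted on the slide (both are lower limits).
campaigns = [
    ("Actel qualification test vehicle", 1.3e6),
    ("Tiger-team P7 and P4 test vehicles", 250e3),
]

for label, hours in campaigns:
    rate = zero_failure_rate_bound(hours)
    fit = rate * 1e9  # FIT = failures per 10^9 device-hours
    print(f"{label}: rate <= {rate:.2e} per hour ({fit:.0f} FIT) at 90% confidence")
```

The bound scales inversely with accumulated device-hours, so the smaller tiger-team campaign supports a weaker statement than the qualification-vehicle total even though both saw no failures.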
The Way Ahead
• Most SMC and NRO Programs Have Switched to UMC FPGAs or ASICs
• A Few Experimental Programs Still Evaluating Options
• Three Test Programs for UMC FPGAs in Process
• NASA Tests – Rich Katz to Discuss
• Aerospace Life Test on 240 Commercial UMC Parts
• Space Qualification Program