Trace-Based Framework for Concurrent Development of Process and FPGA Architecture Considering Process Variation and Reliability • Lerong Cheng (1), Yan Lin (1), Lei He (1), and Yu Cao (2) • (1) EE Department, UCLA • (2) EE Department, ASU • Address comments to lhe@ee.ucla.edu
Outline • Introduction • Review of existing work • Process models • Concurrent development of process and architecture • Power and delay • Process variation • Concurrent development for reliability • Device aging • Permanent soft error rate (SER) • Interaction between process variation and reliability • Conclusion
Review of Previous Work • Device and architecture co-optimization • Power and delay [Cheng DAC’05] • Process variation [Wong ICCAD’05] • Soft error rate [Lin ICCAD’07]
Limitations of Ptrace • Ptrace requires a stable SPICE model that covers all process corners • A SPICE model is not available at the early stage of process development • Circuit simulation for all process corners is time-consuming • The accuracy of full circuit simulation is not needed for quick architecture evaluation • Does not handle realistic variation: non-Gaussian variation sources, spatial correlation • Does not handle device aging
Extended Ptrace (Ptrace2) • [Figure: Ptrace2 flow diagram, chip level, from input to output. Input labels: process parameters; trace (circuit element statistics, critical path structure, switching activity); process variation. Stage labels: transistor electrical characteristics; circuit-level power and delay estimation; chip-level power and delay estimation; variation analysis; reliability (device aging, soft error rate). Output labels: leakage power; dynamic power; delay; power distribution; delay distribution; reliability.]
Early-Stage Circuit Modeling • ITRS MASTAR4 model [ITRS MASTAR4 2005] • Inputs: Lgate, Tox, Nbulk, Xjext, W, Racc, T, Vdd • Outputs: Ioff, Ion, Igon, Igoff, Cg, Cdiff
Circuit-Level and Chip-Level Power and Delay • Circuit-level power and delay: inverter; pass transistor driven by an inverter • Chip-level power and delay: similar to the original Ptrace [Cheng DAC’05, Wong ICCAD’05]
Experimental Setting • 20 MCNC benchmarks, all assumed to be placed on the same chip • ITRS high-performance 32 nm technology (HP32) • Architecture: cluster size N=6, LUT size K=7, wire segment length W=4 • Device: Vdd = 1.0, 1.05, 1.1 V; Lgate = 31, 32, 33 nm • Baseline: ITRS HP32
Delay and Power Tradeoff • 3.1X energy span and 1.3X delay span within the search space
Power and Delay Optimization • Device tuning reduces the energy-delay product by 29.4%
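The device tuning step can be pictured as an exhaustive sweep over the Vdd and Lgate values from the experimental setting, keeping the point with the smallest energy-delay product. The Python sketch below is illustrative only: `toy_evaluate` is a hypothetical stand-in for the chip-level power and delay estimation, its trends are arbitrary placeholders, and the helper names (`sweep`, `span`, `min_edp`, `edp_reduction`) are not from the slides.

```python
from itertools import product

# Search grid taken from the experimental setting slide.
VDD_VALUES = (1.0, 1.05, 1.1)   # V
LGATE_VALUES = (31, 32, 33)     # nm


def toy_evaluate(vdd, lgate_nm):
    """Hypothetical stand-in for chip-level estimation; returns (energy, delay).

    The trends below are arbitrary placeholders, not a device model.
    """
    energy = vdd ** 2 * (32.0 / lgate_nm)
    delay = (lgate_nm / 32.0) ** 2 / vdd
    return energy, delay


def sweep(evaluate=toy_evaluate):
    """Evaluate every (Vdd, Lgate) pair and return one result dict per pair."""
    results = []
    for vdd, lgate in product(VDD_VALUES, LGATE_VALUES):
        energy, delay = evaluate(vdd, lgate)
        results.append({"vdd": vdd, "lgate": lgate, "energy": energy,
                        "delay": delay, "edp": energy * delay})
    return results


def span(values):
    """Ratio of the largest to the smallest value (energy or delay span)."""
    values = list(values)
    return max(values) / min(values)


def min_edp(results):
    """Return the result with the smallest energy-delay product."""
    return min(results, key=lambda r: r["edp"])


def edp_reduction(reference_edp, tuned_edp):
    """Fractional energy-delay product reduction relative to a reference."""
    return 1.0 - tuned_edp / reference_edp


if __name__ == "__main__":
    results = sweep()
    best = min_edp(results)
    print("min-ED setting (toy model):", best["vdd"], "V,", best["lgate"], "nm")
    print("energy span (toy model):", span(r["energy"] for r in results))
```

With a real estimator plugged in for `toy_evaluate`, `span` would give the kind of energy and delay spans quoted for the search space, and `edp_reduction` the reduction relative to a chosen baseline point.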
Experimental Setting • Variation sources • Doping density Nbulk: 3σg = 5% of nominal value, 3σr = 3% of nominal value • Gate channel length Lgate: 3σg = 0.8 nm, 3σr = 0.6 nm • Simulation: Monte Carlo with M = 10,000 samples
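A minimal sketch of how such a Monte Carlo sampling could be set up for Lgate. It rests on several assumptions not stated on the slide: σg is read as a component shared by all elements of a chip and σr as an independent per-element component, both are drawn from Gaussian distributions for simplicity, the nominal Lgate is taken as 32 nm, and the number of elements per chip is an arbitrary illustration parameter.

```python
import math
import random

LGATE_NOMINAL_NM = 32.0    # assumed nominal gate length
LGATE_3SIGMA_G_NM = 0.8    # from the slide
LGATE_3SIGMA_R_NM = 0.6    # from the slide
M_SAMPLES = 10_000         # from the slide


def sample_chip_lgates(rng, n_elements):
    """Draw Lgate for every element of one chip sample.

    One shared offset per chip (assumed meaning of sigma_g) plus an
    independent offset per element (assumed meaning of sigma_r).
    """
    shared = rng.gauss(0.0, LGATE_3SIGMA_G_NM / 3.0)
    return [LGATE_NOMINAL_NM + shared + rng.gauss(0.0, LGATE_3SIGMA_R_NM / 3.0)
            for _ in range(n_elements)]


def run_monte_carlo(m=M_SAMPLES, n_elements=4, seed=0):
    """Return m chip samples, each a list of per-element Lgate values in nm."""
    rng = random.Random(seed)
    return [sample_chip_lgates(rng, n_elements) for _ in range(m)]


def combined_3sigma(three_sigma_g, three_sigma_r):
    """3-sigma of the sum of two independent Gaussian components."""
    return math.hypot(three_sigma_g, three_sigma_r)


if __name__ == "__main__":
    chips = run_monte_carlo()
    print("chip samples:", len(chips))
    print("combined 3-sigma (nm):",
          combined_3sigma(LGATE_3SIGMA_G_NM, LGATE_3SIGMA_R_NM))
```

Under these assumptions the two Lgate components combine in quadrature, so the total 3σ per element is sqrt(0.8² + 0.6²) = 1.0 nm, and elements on the same chip are positively correlated through the shared offset.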
Power and Delay Variation • The min-ED device setting significantly reduces leakage variation at the cost of a small increase in delay variation
NBTI and HCI • Negative-bias temperature instability (NBTI) increases the threshold voltage of PMOS devices [Wang DAC’06] • Hot-carrier injection (HCI) increases the threshold voltage of NMOS devices [Wang CICC’07] • Inputs: Lgate, Tox, Nbulk, Xjext, W, Racc, T, Vdd • Outputs: ΔVth(NBTI), ΔVth(HCI)
Vth Increase Caused by NBTI and HCI • The Vth increase is most significant in the first year • Device burn-in can be applied to reduce the impact of device aging
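A front-loaded Vth shift is what a power-law time dependence would produce, and the same picture suggests why burn-in helps. The sketch below assumes ΔVth ∝ t^n with n = 1/6, an exponent often quoted for NBTI and used here purely for illustration; the slides do not state the aging models of [Wang DAC’06] and [Wang CICC’07]. The function names and the notion of an equivalent burn-in time in years are hypothetical.

```python
def normalized_shift(t_years, lifetime_years=10.0, n=1.0 / 6.0):
    """Vth shift at time t as a fraction of the shift at the full lifetime.

    Assumes a power law, delta_Vth proportional to t**n (illustrative only).
    """
    return (t_years / lifetime_years) ** n


def in_field_shift(t_years, burn_in_years, lifetime_years=10.0, n=1.0 / 6.0):
    """Shift seen in the field after the device was pre-aged by burn-in.

    burn_in_years is a hypothetical equivalent stress time: the fast early
    part of the curve is consumed before the device is deployed.
    """
    total = normalized_shift(t_years + burn_in_years, lifetime_years, n)
    already = normalized_shift(burn_in_years, lifetime_years, n)
    return total - already


if __name__ == "__main__":
    for year in (1, 2, 5, 10):
        print(year, "yr:", round(normalized_shift(year), 3))
    print("10 yr in field after burn-in:", round(in_field_shift(10.0, 0.1), 3))
```

Under this assumed exponent, roughly two thirds of the 10-year shift accumulates in the first year, so any pre-aging that consumes the early part of the curve lowers the shift left to occur in the field.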
Impact of Device Burn-in • The high-performance device setting is more sensitive to device aging • Device aging leads to 8.5% delay degradation after 10 years • Device burn-in reduces delay degradation after 10 years from 8.5% to 5.5%
Permanent Soft Error Rate • Single-event upsets (SEUs) due to cosmic rays or high-energy particles may affect configuration SRAMs in FPGAs and result in permanent soft errors • Inputs: Lgate, Tox, Nbulk, Xjext, W, Racc, T, Vdd • Outputs: SER
SER under Different Device Settings • SER is similar for both device settings
Impact of Device Aging on Power and Delay Variation • Device aging significantly reduces leakage variation and slightly increases delay variation
Impact of Device Aging and Process Variation on SER • Neither device aging nor process variation has a significant impact on permanent SER
Conclusion • A trace-based framework has been developed to enable concurrent development of process and FPGA architecture • Device tuning achieves a significant energy-delay product reduction • Applying device burn-in reduces delay degradation after 10 years from 8.5% to 5.5% • Device aging significantly reduces leakage variation but has almost negligible impact on delay variation • Neither device aging nor process variation has a significant impact on permanent SER