A Defect Tolerant and Performance Tunable Gate Architecture for End-of-Roadmap CMOS. Adit D. Singh, Electrical and Computer Engineering, Auburn University, AL 36849. National Science Foundation grants CNS 0708962 and CCF 0811454.
Motivation • No visibility on technology beyond CMOS • CMOS appears here to stay! • Scaling projected to continue • At least a decade of design likely in nano-scale “End-of-Roadmap” CMOS
End-of-Roadmap CMOS Characterized by • Atomic scale feature sizes (~100 Si atoms in 45nm) • Physical limits in material properties • Random dopant fluctuations • Extreme sub-wavelength lithography Potential for • High manufacturing defectivity • Operational wear out & degradation • Highly random performance variation Addressed here by • Defect Tolerance • Performance Tuning
Clock Speed and Parameter Variation • Clock rate determined by slowest path • Manufacturing variability forces different clock rates: “Speed Binning” • Speed Binning works for systematic variability • Less effective for random variability [Figure: combinational logic between flip-flops (FF) driven by a common clock]
Speed Binning: Traditional Systematic Variability • Device parameters track within a chip • All gates slow or all fast • Some chips slow, some fast • Average clock rate over many manufactured parts = clock rate for average parameter values
Random Variability • Random parameter variability within chip • Every copy of a large circuit highly likely to have a few very slow paths • Average clock frequency << clock rate for average parameter values (for large circuits)
Random Variability • Statistical example: 1 in 100 gates is very slow • 18 gate design: a few slow parts • 150 gate design: virtually all slow parts
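The contrast between the two design sizes follows from a short calculation. The sketch below is illustrative only: it assumes each gate is very slow independently with probability 1/100, and the function name is hypothetical rather than taken from the slides.

```python
def prob_at_least_one_slow(n_gates, p_slow=0.01):
    """Probability that a design of n_gates contains one or more very slow gates.

    Assumes (for illustration) that gates are slow independently of each other.
    """
    return 1.0 - (1.0 - p_slow) ** n_gates


for n in (18, 150):
    print(f"{n} gate design: P(at least one slow gate) = {prob_at_least_one_slow(n):.2f}")
```

Under this assumption the probability is roughly 0.17 for the 18 gate design and roughly 0.78 for the 150 gate design, which matches the qualitative labels on the slide.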
Random Variability: Speed vs. Size • Large circuits statistically more likely to have one or more slow outlier paths [Figure: average worst case path delay vs. circuit size (log scale, 1 to 10000)]
Post Manufacture Performance Tuning: “Delay Fault Tolerance” • Capability to allow speed-up of statistically slow outlier paths [Figure: average worst case path delay vs. circuit size (log scale, 1 to 10000), with the GOAL curve marked]
Defect Tolerant & Tunable CMOS [Figure: gate schematic. Labels: P-Net, N-Net, two programmable switches, two “sized and programmable” elements]
Defect Free Operation • Traditional CMOS [Figure: gate schematic with P-Net, N-Net and sized and programmable elements. Switch labels in order of appearance: ON, OFF, OFF, ON]
Defect in P-Net • Pseudo nMOS operation • Pull-up sized for ratio logic • Rpu ~ 4 Rpd [Figure: gate schematic with P-Net and N-Net. Switch labels in order of appearance: OFF, ON, OFF, ON]
Defect in N-Net • Pseudo pMOS operation • Pull-down sized for ratio logic • Rpd ~ 4 Rpu [Figure: gate schematic with P-Net and N-Net. Switch labels in order of appearance: ON, OFF, ON, OFF]
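The 4:1 sizing in the two defect cases can be illustrated with a simple resistive divider. This is a minimal sketch, assuming the pull-up and pull-down behave as linear resistors when both conduct; that model and the function name are assumptions for illustration, not part of the slides.

```python
def divider_output(r_pu, r_pd, vdd=1.0):
    """Output voltage when pull-up and pull-down conduct at the same time.

    Linear resistor model: the output node sits on a divider between
    the supply (through r_pu) and ground (through r_pd).
    """
    return vdd * r_pd / (r_pu + r_pd)


# Pseudo nMOS case (defect in P-Net): Rpu ~ 4 Rpd, output pulled low
v_low = divider_output(r_pu=4.0, r_pd=1.0)

# Pseudo pMOS case (defect in N-Net): Rpd ~ 4 Rpu, output pulled high
v_high = divider_output(r_pu=1.0, r_pd=4.0)

print(f"pseudo nMOS output low  = {v_low:.2f} Vdd")
print(f"pseudo pMOS output high = {v_high:.2f} Vdd")
```

With this model the degraded logic levels sit at 0.2 Vdd and 0.8 Vdd, which is why the always-on device has to be the weaker (higher resistance) one in ratio logic.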
Performance Tuning: Slow P-transistor • Redundant PMOS speeds up rising transitions • Speed up greatest for very slow outlier transistor • Some slow down of opposite falling transitions [Figure: gate schematic with P-Net and N-Net. Switch labels in order of appearance: ON, ON, OFF, ON]
Performance Tuning: Slow P-transistor • Assume nominal Rpu = Rpd and R_tuning = 4 Rpd • Extra delay for a defective (slow) pull-up, untuned vs. tuned:

| Defective Rpu | Extra delay, untuned | Extra delay, tuned |
|---|---|---|
| 1.5X | 0.5X | 0.10X |
| 2X | 1X | 0.33X |
| 4X | 3X | 1.00X |
| 6X | 5X | 1.40X |
| 8X | 7X | 2.20X |
| 1X (nominal) | 0X | -0.2X (speedup) |
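One plausible reading of these numbers is that the tuning device acts as a resistance of 4X nominal in parallel with the slow pull-up, with delay proportional to the effective pull-up resistance. The sketch below works through that model; the model itself and the function name are assumptions, not stated on the slide.

```python
def extra_delay(r_pu, r_tuning=None):
    """Extra rising delay relative to nominal, in units of the nominal delay.

    r_pu and r_tuning are resistances in multiples of the nominal Rpu.
    Delay is assumed proportional to the effective pull-up resistance;
    with tuning, r_tuning is placed in parallel with r_pu.
    """
    if r_tuning is None:
        r_eff = r_pu
    else:
        r_eff = r_pu * r_tuning / (r_pu + r_tuning)
    return r_eff - 1.0


for k in (1.5, 2, 4, 6, 8, 1):
    print(f"Rpu = {k}X: untuned {extra_delay(k):+.2f}X, tuned {extra_delay(k, 4.0):+.2f}X")
```

This model reproduces the listed 0.33X, 1.00X, 1.40X and -0.2X entries. It gives about 0.09X for the 1.5X row and about 1.67X for the 8X row, where the slide lists 0.10X and 2.20X, so the slide values may come from a somewhat different model.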
Performance Tuning: Slow P-transistor • Assume 10 level path (the delays below are consistent with one gate at 4X Rpu in the extra-delay table above: 3X extra untuned, 1.00X extra tuned) • Untuned delay = 13X • Tuned delay = 11X • Tuning 2 additional gates (-0.2X each): Tuned delay = 10.6X
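The path delays quoted here can be reproduced by adding the per-gate extra delays to the nominal path delay. This sketch assumes the slow gate is the 4X Rpu row of the extra-delay table (3X untuned, 1.00X tuned) and that each tuned nominal gate contributes -0.2X; the slide does not state these choices explicitly, and the function name is hypothetical.

```python
def path_delay(extra_delays, n_levels=10):
    """Path delay in units of the nominal gate delay.

    extra_delays lists the extra delay contributed by each affected gate;
    all other gates on the path contribute exactly one nominal delay.
    """
    return n_levels * 1.0 + sum(extra_delays)


untuned = path_delay([3.0])                # one 4X gate, no tuning
tuned_one = path_delay([1.0])              # same gate with tuning enabled
tuned_three = path_delay([1.0, -0.2, -0.2])  # plus tuning on 2 nominal gates

print(f"{untuned:.1f}X, {tuned_one:.1f}X, {tuned_three:.1f}X")
```

Under these assumptions the three results are 13X, 11X and 10.6X, in line with the figures on the slide.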
Simulation Experiments • Simplified simulation of inverter chains • Transistor parameters drawn from a Normal Distribution - different variance values • Circuit size measured by number of chains • For each “circuit” worst case untuned and tuned delays obtained.
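An experiment of this kind could be sketched as the Monte Carlo loop below. This is not the author's simulator. It is a rough illustration under stated assumptions: each stage delay is proportional to a resistance drawn from a Normal distribution with mean 1 and standard deviation 1/6, chains have 8 stages, and tuning places a resistance of 4X nominal in parallel with only the slowest stage of every chain. All names and the tuning policy are hypothetical.

```python
import random


def chain_delays(n_stages, sigma, rng, r_tuning=4.0):
    """Return (untuned, tuned) delay of one inverter chain in nominal-stage units."""
    # Per-stage relative resistance; clipped so a rare negative draw stays physical.
    stages = [max(0.05, rng.gauss(1.0, sigma)) for _ in range(n_stages)]
    untuned = sum(stages)
    # Assumed tuning policy: speed up only the slowest stage of the chain.
    worst = max(stages)
    tuned = untuned - worst + worst * r_tuning / (worst + r_tuning)
    return untuned, tuned


def average_worst_case(n_chains, n_stages=8, sigma=1.0 / 6.0, trials=200, seed=0):
    """Average, over many circuit instances, of the slowest chain delay."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    sum_untuned = 0.0
    sum_tuned = 0.0
    for _ in range(trials):
        delays = [chain_delays(n_stages, sigma, rng) for _ in range(n_chains)]
        sum_untuned += max(u for u, _ in delays)
        sum_tuned += max(t for _, t in delays)
    return sum_untuned / trials, sum_tuned / trials


for size in (1, 10, 100):
    untuned, tuned = average_worst_case(size)
    print(f"{size:4d} chains: untuned {untuned:.2f}, tuned {tuned:.2f}")
```

The circuit size is the number of chains, as on the slide, and the worst case delay of each instance is the maximum over its chains. Varying `sigma` corresponds to the "different variance values" bullet.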
Post Manufacture Performance Tuning: “Delay Fault Tolerance” • Simulate and average over a large number of instances for each “circuit” size [Figure: average worst case path delay vs. circuit size (log scale, 1 to 10000), with the GOAL curve marked]
Observed Delay Variations: Tuned and Untuned • 8 stage inverter chains • Standard deviation = 1/6 mean [Figure: delay (sec, ×10^-10, axis from 1.9 to 2.7) plotted on a log-scale horizontal axis from 10^1 to 10^5; a 20% gap is annotated]
Observed Delay Variations for Different Sigmas [Figure: delay (sec, ×10^-10, axis from 1 to 10) vs. number of circuits (log scale, 1 to 10^4)]
Conclusion • Defect Tolerant & Tunable CMOS Gate • End-of-Roadmap Applications [Figure: gate schematic. Labels: P-Net, N-Net, two programmable switches, two “sized and programmable” elements]