A 16-Bit Low-Power Microcontroller with Monolithic MEMS-LC Clocking
Eric D. Marsman¹, Robert M. Senger¹, Michael S. McCorquodale², Matthew R. Guthaus¹, Rajiv A. Ravindran¹, Ganesh S. Dasika¹, Scott A. Mahlke¹, Richard B. Brown³
¹University of Michigan, ²Mobius Microsystems, ³University of Utah
IEEE International Symposium on Circuits and Systems, May 23rd – May 26th, 2005, Kobe, Japan
Overview • Motivation • Microsystem Architecture • Microcontroller • Clock Generation • Dynamic Frequency Scaling (DFS) • Microsystem Measured Results • Microcontroller • Compiler Utilization • Instruction Level Power Modeling • Clock Generation • DFS • Future Directions • Conclusion
Motivation • Wireless Integrated Microsystems (WIMS) • Environmental Sensors: Heavy Metals, µGas Chromatograph • Biomedical Implants: Cochlear Implant, Deep Brain Implants
Motivation (cont) • Power minimization • Frequency scaling • Voltage scaling • Memory architecture • Process technology • Leakage current mitigation (figure: commercially available cores)
Microsystem Architecture • 16-bit, 3-stage pipeline • Software controlled register interface to clock generator • Peripheral communication interfaces for flexibility
Microcontroller Architecture • Primarily a Load-Store architecture • 77 instructions, 8 addressing modes • Data and address registers split into two windows • Hardware support for one level of interrupts and subroutines • Banked memory architecture with additional external memory interface • Energy/area tradeoffs compared to single 64kB bank • Low-power loop cache for commonly executed instructions (figure annotations: 15.9% more area, 69.2% less power)
Monolithic Clock Generation • Complementary, cross coupled, negative-transconductance tank • Frequency trimming via modulation of tail current with vtrim • CMOS compatible • 1.056GHz oscillation frequency • Buffer amplifier removes amplitude variation
Dynamic Frequency Scaling • Fully synthesized logic, no custom design • Synchronization chain ensures glitch free output • Optional external clock input
Dynamic Frequency Scaling (cont) • Glitch suppression example
Microsystem Measured Results • TSMC 0.18µm MM/RF bulk CMOS • 3.5 million transistors • Operates up to 92MHz • 33.9mW core power consumption @ 92MHz & 1.8V • 1.4mW core power consumption @ 10MHz & 1.1V • 17.28mW MEMS clock source power consumption @ 1.8V • 740µW sleep power consumption @ 1.1V (figure: die photo, 3.54mm dimension label)
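The two active operating points above can be compared with a little arithmetic. The sketch below computes energy per clock cycle at each point and checks how well a first-order dynamic-power model (P proportional to f·Vdd²) predicts the 10MHz figure from the 92MHz one. The model and all function names are assumptions introduced here for illustration; the slides report only the measured numbers.

```python
# Measured operating points from the slide: (frequency in Hz, Vdd in V, core power in W).
FAST = (92e6, 1.8, 33.9e-3)
SLOW = (10e6, 1.1, 1.4e-3)


def energy_per_cycle(freq_hz, power_w):
    """Average energy spent per clock cycle, in joules."""
    return power_w / freq_hz


def scaled_dynamic_power(ref, freq_hz, vdd):
    """First-order estimate P ~ f * Vdd^2, scaled from a reference point.

    This ignores leakage and short-circuit power; it is an assumed model,
    not something stated in the slides.
    """
    f_ref, v_ref, p_ref = ref
    return p_ref * (freq_hz / f_ref) * (vdd / v_ref) ** 2


e_fast = energy_per_cycle(FAST[0], FAST[2])
e_slow = energy_per_cycle(SLOW[0], SLOW[2])
p_est = scaled_dynamic_power(FAST, SLOW[0], SLOW[1])

print(f"Energy/cycle @ 92MHz, 1.8V: {e_fast * 1e12:.1f} pJ")
print(f"Energy/cycle @ 10MHz, 1.1V: {e_slow * 1e12:.1f} pJ")
print(f"Estimated power @ 10MHz, 1.1V: {p_est * 1e3:.2f} mW (measured: 1.4 mW)")
```

Under this simple model the estimate lands at roughly 1.38mW, close to the measured 1.4mW, which suggests that combined frequency and voltage scaling accounts for most of the reported saving.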
Microcontroller Measured Results • Static loop cache utilization provides 4 to 20% energy savings • Vdd scaling across different frequencies allows for adjustment to program workload requirements (figures: loop cache energy savings; power vs. Vdd across frequency ranges)
WIMS C Compiler • Windowed versus non-windowed machine • 19% reduction in power consumption • 30% performance improvement • Dynamic instruction placement in 512B loop cache achieves 43% energy savings over static placement (figure: energy savings in 64B loop cache)
Instruction Level Power Modeling • Divide ISA into groups of similar instructions • No-ops model inter-instruction pipeline switching • Account for memory access energy separately (figures: memory access energy; energy per instruction group) (footnotes: 1. Fetch energy counted separately. 2. Excludes memory access energy as this is memory dependent.)
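The modeling approach on this slide can be sketched as a lookup-and-sum: map each instruction to a group, multiply group counts by a per-group energy, and add memory access energy as a separate term. Every number, group name, and mnemonic below is hypothetical and chosen only for illustration; the measured per-group energies appear in the slide's figures and are not reproduced in this text.

```python
from collections import Counter

# HYPOTHETICAL per-group core energies in picojoules (not the measured values).
GROUP_ENERGY_PJ = {"alu": 100.0, "load_store": 150.0, "branch": 120.0, "nop": 60.0}

# HYPOTHETICAL energy per memory access, kept separate because it depends on the memory.
MEM_ACCESS_ENERGY_PJ = 200.0

# HYPOTHETICAL mapping from mnemonics to instruction groups.
INSTRUCTION_GROUP = {
    "add": "alu", "sub": "alu",
    "ld": "load_store", "st": "load_store",
    "br": "branch", "nop": "nop",
}


def estimate_energy_pj(trace, mem_accesses):
    """Return (core_energy, memory_energy) in pJ for an executed instruction trace."""
    counts = Counter(INSTRUCTION_GROUP[op] for op in trace)
    core = sum(GROUP_ENERGY_PJ[group] * n for group, n in counts.items())
    memory = MEM_ACCESS_ENERGY_PJ * mem_accesses
    return core, memory


trace = ["ld", "add", "add", "st", "br", "nop"]
core_pj, mem_pj = estimate_energy_pj(trace, mem_accesses=2)
print(f"core: {core_pj:.0f} pJ, memory: {mem_pj:.0f} pJ, total: {core_pj + mem_pj:.0f} pJ")
```

Keeping the memory term separate mirrors the slide's point that memory access energy depends on which memory is accessed, so it can be swapped out without touching the per-group core figures.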
Clock Generation Results • No external reference • No PLL/DLL • High frequency accuracy • Low start-up latency • Low temperature coefficient • Broad operating temperature range • Low jitter • Minimal area overhead (3% of die) • Low Power • All Si technology
MEMS Fabrication • Post processing etch using PAD cut • Suspended inductor • Varactor etch unsuccessful • No etch chemistry for MiM oxy-nitride dielectric • Use transconductance modulation instead
DFS Results • Glitch free switching • Switching latency is 5/(2f0), or 37.45ns for this implementation
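The latency figure can be checked with a short calculation. Reading the expression as 5/(2·f0), the quoted 37.45ns implies a value of f0 well below the 1.056GHz oscillator frequency. The slides do not say which clock f0 refers to, so the interpretation in the comments (a divided-down clock) is an assumption, and the function names are introduced here for illustration.

```python
def switching_latency_s(f0_hz):
    """Switching latency in seconds, reading the slide's expression as 5 / (2 * f0)."""
    return 5.0 / (2.0 * f0_hz)


def implied_f0_hz(latency_s):
    """Invert the latency expression to recover f0 from a quoted latency."""
    return 5.0 / (2.0 * latency_s)


# f0 implied by the 37.45 ns quoted on the slide.
f0 = implied_f0_hz(37.45e-9)
print(f"Implied f0: {f0 / 1e6:.2f} MHz")

# For comparison: the latency if f0 were the 1.056 GHz oscillator itself.
# The mismatch suggests f0 is a lower, presumably divided-down, clock (assumption).
t_osc = switching_latency_s(1.056e9)
print(f"Latency at 1.056 GHz: {t_osc * 1e9:.2f} ns")
```

The implied f0 comes out near 66.8MHz, so five half-periods of that clock account for the quoted latency.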
Future Directions • Add DSP for Cochlear Implants and other bio-medical devices • Include ring oscillator for a lower power alternative • ISA improvements to reduce compiler bottlenecks • Address register support • Separate data and address register windows • DMA instructions • Decrease sleep mode power • Explore Microsystem design in advanced technologies (figure: preliminary next generation system, 3.0mm dimension label)
Conclusion • Described a highly-functional, low-power Microsystem ideally suited for remote and bio-medical applications • DFS allows on-the-fly, low-latency adaptation to workload requirements from 33.9mW @ 92MHz to 1.4mW @ 10MHz or sleep mode at 740µW • Monolithic clock reference decreases system size, cost, and power consumption compared to other techniques • Power-aware compiler takes advantage of low-power architectural features to achieve maximum power reduction
Acknowledgements • NSF ERC for WIMS • MOSIS Educational Program • Artisan Components • TSMC • Cadence • Synopsys • Mentor Graphics • Coventor