Toward Energy-Aware Software-Based Fault Tolerance in Real-Time Systems
Osman S. Unsal, Israel Koren, C. Mani Krishna
Architecture and Real-Time Systems Laboratory, Department of Electrical and Computer Engineering, University of Massachusetts, Amherst
The Problem
• Real-Time (RT) systems are energy- and thermally constrained.
  • Many RT applications run on battery-powered platforms.
  • RT systems require a small form factor.
• Fault-Tolerance (FT) is an important design parameter in RT systems.
  • Many RT applications are life-critical.
  • Many RT systems operate in hostile (industrial, space) environments.
  • FT ensures error-free operation in the face of faults.
Fault-Tolerance in RT Systems
• Hardware-based fault tolerance
  • Massive redundancy (duplex, TMR)
  • Requires additional hardware for the error-checking mechanism
  • Very power-inefficient
• Software-based fault tolerance
  • Application-Level Fault Tolerance (ALFT), an amalgam of time and software redundancy
ALFT Characteristics
• Each task has a primary and a secondary copy
• A secondary may be an exact copy of its primary, or it may be scaled down
  • Resolution reduction
  • Precision reduction
• A secondary may be aborted if its primary finishes execution successfully
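The abort rule above can be illustrated with a small sketch. It is not the authors' implementation: the function name, its parameters, and the assumption that a secondary starts at a fixed time and is cancelled the instant its primary completes are all hypothetical choices made for illustration.

```python
def secondary_run_time(primary_finish, secondary_start, secondary_exec, primary_ok=True):
    """Time a secondary actually executes before it completes or is aborted.

    Assumptions (not from the slides): the secondary starts at
    `secondary_start`, needs `secondary_exec` time units, and is aborted
    at `primary_finish` if the primary succeeded.
    """
    if not primary_ok:
        # The primary failed, so the secondary must run to completion.
        return secondary_exec
    # Otherwise the secondary only runs while it overlaps with the primary.
    overlap = primary_finish - secondary_start
    return max(0.0, min(secondary_exec, overlap))
```

Under these assumptions, a secondary scheduled to start after its primary finishes never executes in the fault-free case, which is why reducing primary/secondary overlap matters for energy.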
The System Model
• Distributed RT system
• Tasks are periodic and have deadlines
• Each primary has one secondary
• A primary and its secondary are assigned to separate processors
• Focus is on scheduling; results are compared against EDF
• Tasks have random periods and execution times
• Six-processor configuration
Energy Model
• The longer a task executes, the more energy it consumes.
  • Energy is assumed to scale linearly with task execution time
  • Appropriate for COTS processors
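A minimal sketch of this linear model follows. The function names and the constant `power_per_unit` are assumptions introduced here; the slides only state that energy scales linearly with execution.

```python
def linear_energy(executed_times, power_per_unit=1.0):
    """Energy under the linear model: proportional to total executed time.

    `power_per_unit` is a hypothetical constant (energy per time unit).
    """
    return power_per_unit * sum(executed_times)


def relative_energy(candidate_times, baseline_times):
    """Energy of one schedule relative to a baseline schedule."""
    return linear_energy(candidate_times) / linear_energy(baseline_times)
```

Because the model is linear, the constant cancels in `relative_energy`, so comparing two schedules only requires the time each task copy actually executed.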
A Simple Energy-Saving Heuristic: Shortest Execution-Time First (SEF)
[Figure: relative energy consumption]
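The slides do not define SEF beyond its name. One plausible reading, sketched below, is a priority rule that runs the ready task with the smallest execution time first, in contrast to EDF, which runs the task with the earliest deadline first. The `Task` type, the single-processor non-preemptive setting, and the common release time of 0 are all assumptions made for illustration, not the authors' method.

```python
from typing import NamedTuple


class Task(NamedTuple):
    name: str
    exec_time: float
    deadline: float


def sef_order(tasks):
    """Assumed SEF rule: shortest execution time first."""
    return sorted(tasks, key=lambda t: t.exec_time)


def edf_order(tasks):
    """EDF rule: earliest deadline first."""
    return sorted(tasks, key=lambda t: t.deadline)


def completion_times(ordered):
    """Finish time of each task when run back to back from time 0."""
    now = 0.0
    finished = {}
    for task in ordered:
        now += task.exec_time
        finished[task.name] = now
    return finished
```

In this simplified setting, ordering by execution time lets short primaries finish earlier, which would give their secondaries on other processors an earlier chance to be aborted. This sketch does not check deadlines, which any real scheduler would need to do.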
Case study: Asymmetric Digital Subscriber Line Modem Application
Effect of Task Granularity on Energy Savings (Secondary Size 80%)
Effect of Task Granularity on Overlap Reduction (Secondary Size 80%)
Summary
• An initial analysis of the energy efficiency of various fault-tolerance mechanisms has been made
• Power-aware scheduling heuristics for ALFT schemes have been developed
• Current activity:
  • On-line scheduling heuristics
  • Power-aware DVS (dynamic voltage scaling) for FT systems