Energy-Efficient Soft Real-Time CPU Scheduling for Mobile Multimedia Systems. Authors: Wanghong Yuan, Klara Nahrstedt Appears in SOSP 2003 Presented by: Samuel Kim. Table of Contents. About the Authors Introduction Algorithm Implementation Results Related Works Summary.
Energy-Efficient Soft Real-Time CPU Scheduling for Mobile Multimedia Systems Authors: Wanghong Yuan, Klara Nahrstedt Appears in SOSP 2003 Presented by: Samuel Kim
Table of Contents • About the Authors • Introduction • Algorithm • Implementation • Results • Related Works • Summary
About the Authors • Wanghong Yuan • B.S., M.S. Peking University • Ph.D. University of Illinois at Urbana-Champaign • Software Engineer at Google • Klara Nahrstedt • Ph.D. University of Pennsylvania • Professor at University of Illinois at Urbana-Champaign
Introduction • Multimedia Becoming A Standard in Mobile Computing • Audio • Video • Data • Goal on Mobile Systems • Manage System Resources • Quality of Service - High Performance • Energy Efficiency - Battery Life
Greater Control Over System Resources • Hardware Adaptability • CPU Voltage Scaling • E = a•C•V²•f•t • Software Adaptability • Application Quality Levels • Statistical Performance Requirements • Soft Real-Time Guarantees
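As a rough illustration of why voltage scaling saves energy, the sketch below evaluates the standard dynamic-energy relation E = a•C•V²•f•t for one job run at two operating points. The voltage and frequency pairs, the cycle count, and the unit values for a and C are made-up numbers for illustration, not settings of the processor used in the talk.

```python
def dynamic_energy(activity, capacitance, voltage, frequency_hz, seconds):
    """Dynamic energy E = a * C * V^2 * f * t."""
    return activity * capacitance * voltage ** 2 * frequency_hz * seconds


def job_energy(cycles, voltage, frequency_hz, activity=1.0, capacitance=1.0):
    """Energy for a job of `cycles` cycles; it runs for t = cycles / f seconds."""
    seconds = cycles / frequency_hz
    return dynamic_energy(activity, capacitance, voltage, frequency_hz, seconds)


# Hypothetical operating points: (1.4 V, 1 GHz) versus (1.2 V, 500 MHz).
fast = job_energy(1_000_000, voltage=1.4, frequency_hz=1.0e9)
slow = job_energy(1_000_000, voltage=1.2, frequency_hz=5.0e8)

# For a fixed cycle count f * t is constant, so only V^2 changes the energy.
ratio = slow / fast
print(f"slow/fast energy ratio: {ratio:.3f}")
```

Because the job needs the same number of cycles at either speed, the saving comes entirely from the lower voltage that the lower frequency permits.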
How Do We Approach System Resource Management? • Adapt resources based on system layers • Most approaches in research adapt a single layer • Possible to adapt across multiple layers?
Multiple Layer Adaptation Requires Coordination • Conflicts When Adapting Multiple System Layers • OS scales down the CPU • Application increases its QoS • Different Objectives • Minimize Energy Consumption • Maximize Quality/Performance • Coordinate Objectives at a Higher Level
The Purpose of GRACE Framework: Cross-Layer Adaptation • Global Resource Adaptation via CoopEration • Current Approaches: adaptation over 1 or 2 layers • GRACE: global cooperation of resources through a Coordinator • Layers shown in figure: Application, Network Protocols, Operating System, Architecture/Hardware Figures from S. Adve. “The Illinois GRACE Project: Global Resource Adaptation through CoopEration”, Workshop on Self-Healing Adaptive and Self-Managed Systems, 2002
GRACE-OS: Enhanced CPU Scheduler • Previous Methods • Soft Real-Time (SRT) Scheduling • Dynamic Voltage Scaling (DVS) • GRACE-OS • DVS is integrated into the CPU Scheduler • Continue to keep performance guarantees of SRT Scheduling
Design of GRACE-OS • Profiler • How to estimate cycle usage? • Monitor CPU cycle usage of a task • Estimate demand by online profiling • SRT Scheduler • How to allocate CPU Resources? • Allocate CPU cycles to task based on profiler • Speed Adapter • How to set CPU Speed/Voltage? • Set CPU to minimum required speed based on #cycles allocated
Algorithm: Profiler • How to estimate the cycle usage? • Estimate based on the statistical distribution of demand rather than on instantaneous demand • More stability in CPU speeds • Meets performance requirements of SRT • Profile during run-time
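The profiling idea above can be sketched as a sliding window of per-job cycle counts summarized as a cumulative histogram. This is a minimal illustration, not the GRACE-OS kernel code: the class name `DemandProfiler`, the window size, and the number of histogram groups are all assumptions made for the example.

```python
from collections import deque


class DemandProfiler:
    """Hypothetical online profiler: keeps recent per-job cycle counts."""

    def __init__(self, window=100, groups=10):
        self.samples = deque(maxlen=window)  # oldest samples fall out
        self.groups = groups

    def record(self, cycles_used):
        """Call once per finished job with the cycles that job consumed."""
        self.samples.append(cycles_used)

    def cdf(self):
        """Return [(group_upper_bound, P[X <= bound])] over equal-width groups."""
        if not self.samples:
            return []
        lo, hi = min(self.samples), max(self.samples)
        width = (hi - lo) / self.groups
        if width == 0:
            width = 1  # all samples identical; everything lands in group 0
        counts = [0] * self.groups
        for c in self.samples:
            idx = min(int((c - lo) / width), self.groups - 1)
            counts[idx] += 1
        result, running = [], 0
        for i, n in enumerate(counts):
            running += n
            result.append((lo + (i + 1) * width, running / len(self.samples)))
        return result


profiler = DemandProfiler(window=8, groups=4)
for cycles in (210_000, 250_000, 190_000, 400_000, 230_000, 220_000):
    profiler.record(cycles)
for bound, prob in profiler.cdf():
    print(f"P[X <= {bound:,.0f}] = {prob:.2f}")
```

Summarizing a window of jobs instead of reacting to each job is what keeps the estimate, and therefore the chosen CPU speed, from swinging with every outlier.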
Algorithm: SRT Scheduler • Determine which task to execute • When and how long (# CPU cycles) • GRACE-OS is a stochastic scheduler • Decide # cycles to allocate based on: • Performance requirement, p • Demand distribution of task • F(C) = P[X ≤ C] ≥ p • X, # cycles required for task • C, # cycles allocated to task
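The allocation rule F(C) = P[X ≤ C] ≥ p can be read as "pick the smallest C that covers a fraction p of observed jobs". The sketch below applies that reading to an empirical sample of per-job cycle counts; the function name `cycles_to_allocate` and the sample values are hypothetical, and using raw samples instead of a histogram is a simplification.

```python
import math


def cycles_to_allocate(samples, p):
    """Smallest observed C such that the empirical F(C) = P[X <= C] >= p."""
    if not samples:
        raise ValueError("need at least one profiled job")
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    ordered = sorted(samples)
    # At least k of the n samples must be <= C for F(C) >= p to hold.
    k = math.ceil(p * len(ordered))
    return ordered[k - 1]


# Hypothetical profiled demand, in cycles per job.
demand = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]
for p in (0.5, 0.95, 1.0):
    print(f"p = {p}: allocate {cycles_to_allocate(demand, p)} cycles")
```

A lower p allocates fewer cycles and lets more jobs overrun their budget, which is the trade-off between energy and deadline misses that the performance requirement controls.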
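Algorithm: Dynamic DVS • CPU speeds up as a job consumes more of its allocated cycles • Objective: minimize energy consumption • Constraint: the time needed to execute the allocated cycles must not exceed the time allocated to the task in each period • Frequency is a function of the number of cycles the job has used so far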
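One way to see why the speed should rise with the cycle count is to solve a simplified version of the optimization. The sketch below assumes that energy per cycle grows with the square of the speed, that speeds are continuous, and that the allocated cycles are split into groups, each reached with a known probability (1 minus the demand CDF at the start of the group). Under those assumptions, minimizing expected energy subject to the time budget gives a speed inversely proportional to the cube root of that probability. The function name and the numbers are hypothetical, and this is not necessarily the exact formulation used in GRACE-OS.

```python
def speed_schedule(group_cycles, reach_probs, time_budget):
    """Speed (cycles/second) for each cycle group.

    group_cycles[i]: number of cycles in group i
    reach_probs[i]:  probability a job executes group i (must be > 0)
    time_budget:     seconds available to run all allocated cycles
    """
    if len(group_cycles) != len(reach_probs):
        raise ValueError("one probability per cycle group is required")
    if any(w <= 0 for w in reach_probs) or time_budget <= 0:
        raise ValueError("probabilities and time budget must be positive")
    roots = [w ** (1.0 / 3.0) for w in reach_probs]
    # Scale chosen so that sum(cycles_i / speed_i) equals the time budget.
    scale = sum(b * r for b, r in zip(group_cycles, roots)) / time_budget
    return [scale / r for r in roots]


# Three equal groups; later groups are reached by fewer jobs.
groups = [1_000_000, 1_000_000, 1_000_000]
reach = [1.0, 0.5, 0.1]
speeds = speed_schedule(groups, reach, time_budget=0.01)
for i, s in enumerate(speeds):
    print(f"group {i}: {s / 1e6:.0f} MHz")
```

Early groups run slowly because every job pays for them, while late groups run fast because few jobs ever reach them. A real processor offers only discrete speeds, so each value would have to be rounded up to an available setting.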
Implementation • Testbed • HP Pavilion N5470 Laptop (Athlon Processor) • Red Hat Linux 7.2 • Modified Linux kernel 2.4.18 (GRACE-OS) • Software Architecture of Implementation
Implementation • System calls added to support SRT tasks • start_srt – start real-time mode • exit_srt – exit real-time mode • finish_job – tell scheduler that task finished job • set_budget – allocate cycles for task • set_dvspnt – set CPU speed in task’s speed schedule • Modifying the process control block • 5 attributes
System Call Overhead • System Calls: 900-1300 cycles • Multimedia Processing: 2×10⁵ - 2×10⁸ cycles • 0.0004% - 0.5% of cycles per job
Profiling and Estimation Overhead • Profiling Cost: 26-38 cycles • Overhead for online demand estimation is high (0.1% - 100% of cycles per job) • Demand estimation should be infrequent • Stable models allow for infrequent estimation Figure: Cost of Demand Estimation
Speed Scaling Overhead • Costs 8,000 to 16,000 cycles (~10-50 µs) • Should be invoked infrequently (500 µs in GRACE-OS) • Speed change overhead should improve with processor design
Stability of Demand Distribution • Codec: mpgplay • Cycle usage varies greatly • Demand distribution remains stable
Efficiency of GRACE-OS • Compare to other allocation schemes • Running Single Applications • Misses deadlines 0.3%-0.6% • 92% CPU busy time at lowest CPU speed • 53.4%-71.6% reduction in energy • Running Multiple Applications • Misses deadlines 4.9% • 83.8% CPU busy time at lowest CPU speed
CPU Usage for Multiple Applications • Dynamic DVS spends more time in lowest CPU speed than other DVS schemes
Energy Efficiency of GRACE-OS • toast and madplay – Low CPU demand • GRACE-OS savings limited by CPU settings
Impact of Setting Performance p • Normalized energy increases as p goes from 0.5 to 0.95 • Fewer energy savings from p = 0.95 to p = 1.0 • Need more CPU settings Figure: Impact of p on Normalized Energy
Impact of Mixed Workload • Extra allocation to extra best-effort applications increases energy consumption • Less time for each application • Increases total CPU demand Figure: Impact of Mixed Workload
Related Works: Soft Real-Time Scheduling • Proportional Sharing • A. Chandra, M. Adler, P. Goyal, and P. Shenoy. Surplus fair scheduling: A proportional-share CPU scheduling algorithm for symmetric multiprocessors. In Proc. of 4th Symposium on Operating System Design and Implementation, Oct. 2000. • CPU Reservations • M. Jones, D. Rosu, and M. Rosu. CPU reservations & time constraints: Efficient, predictable scheduling of independent activities. In Proc. of 16th Symposium on Operating Systems Principles, Oct. 1997. • Real-Time Scheduling Algorithms • C. L. Liu and J. W. Layland. Scheduling algorithms for multiprogramming in a hard real-time environment. JACM, 20(1):46–61, Jan. 1973. • Stochastic Scheduling • K. Gardner. Probabilistic analysis and scheduling of critical soft real-time systems. PhD thesis, Department of Computer Science, University of Illinois at Urbana-Champaign, 1999.
Related Works: Dynamic Voltage Scaling • General Purpose DVS based on Average CPU Utilization • D. Grunwald, P. Levis, K. Farkas, C. Morrey III, and M. Neufeld. Policies for dynamic clock scheduling. In Proc. of 4th Symposium on Operating System Design and Implementation, Oct. 2000. • Real Time DVS • P. Pillai and K. G. Shin. Real-time dynamic voltage scaling for low-power embedded operating systems. In Proc. of 18th Symposium on Operating Systems Principles, Oct. 2001. • Stochastic DVS • J. Lorch and A. Smith. Improving dynamic voltage scaling algorithms with PACE. In Proc. of ACM SIGMETRICS 2001 Conference, June 2001. • F. Gruian. Hard real-time scheduling for low energy using stochastic data and DVS processors. In Proc. of Intl. Symp. on Low-Power Electronics and Design, Aug. 2001.
Conclusion • Pros • Optimizes multiple layers of system resources • Conserves energy while ensuring quality of service • Small overhead • Support for multiple tasks • Thorough testing • Cons • Energy savings are estimated rather than measured • Testing limited to multimedia applications • Limited number of tests per codec • 8 runs per test • Largest and smallest values discarded • Limited CPU speed settings decrease energy savings