Cache Utilization-Aware Scheduling for Multicore Processors
Presenter: Chi-Wei Fang, YunTech University, Taiwan
Authors: Edward T.-H. Chu, Wen-wei Lu
2012 IEEE Asia Pacific Conference on Circuits and Systems
Outline Introduction Contribution CUAS Experiment Conclusion
Introduction
Due to the limitations of the semiconductor process, processor speed is not expected to increase significantly.
To further improve processor capability, the chip multiprocessor (CMP) has become widespread in today's computer systems.
Introduction
Intel® Core™2 Quad Processor Q8400 architecture (figure)
In most multicore processors, the last-level cache is shared among cores to reduce possible resource underutilization.
As the figure shows, in the Intel® Core™2 Quad Processor the L2 caches are shared among cores (each L2 cache is shared by two cores).
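As a side note, the cache-sharing topology that such a scheduler relies on can be discovered at run time on Linux through sysfs. The following is a minimal sketch (not part of the original work) that groups CPUs by the shared last-level cache they report; the sysfs paths are the standard Linux ones, and the parsing is deliberately simplified.

```python
# Sketch: discover which cores share a cache of a given level on Linux
# (assumption: standard sysfs layout; not part of the CUAS paper itself).
import glob

def shared_cache_groups(level=2):
    """Return the distinct sets of CPUs that share a cache of the given level."""
    groups = set()
    for index_dir in glob.glob("/sys/devices/system/cpu/cpu*/cache/index*"):
        try:
            with open(index_dir + "/level") as f:
                if int(f.read()) != level:
                    continue
            with open(index_dir + "/shared_cpu_list") as f:
                groups.add(f.read().strip())   # e.g. "0-1" or "0,2"
        except OSError:
            continue
    return sorted(groups)

if __name__ == "__main__":
    # On a Core 2 Quad Q8400 this is expected to print two groups of two cores.
    print(shared_cache_groups(level=2))
```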
Introduction
• When tasks running on different cores read and write the shared cache intensively, excessive cache misses may occur and result in performance degradation
• Reducing the shared cache contention of multicore systems therefore becomes an important design issue
• J. Mars designed CiPE [1] to classify tasks according to their anti-interference ability
• Anti-interference ability is defined as the performance loss a task suffers when it competes for the shared cache with other tasks
Introduction
If tasks have similar anti-interference abilities, it becomes difficult for these methods to generate a proper task assignment.
In addition, how strongly a task interferes with co-scheduled tasks depends on how aggressively it accesses the cache.
A task with little anti-interference ability may or may not seriously interfere with co-scheduled applications.
Motivation
• The optimal algorithm exhaustively searches all possible task assignments and selects the one with the smallest total execution time (a brute-force sketch is given below)
• Because of the gap between existing methods and the optimal algorithm, there is an apparent need to design a task scheduling policy that reduces shared cache contention and improves system performance
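For illustration only, here is a minimal sketch of such an exhaustive baseline. It assumes the total execution time of an assignment can be estimated by a caller-supplied cost function; the name estimate_total_time is a hypothetical placeholder, not something taken from the paper.

```python
# Sketch: exhaustive-search baseline (illustrative only; the cost model is a
# caller-supplied placeholder, not the paper's measured execution times).
from itertools import product

def exhaustive_schedule(tasks, num_cores, estimate_total_time):
    """Try every mapping of task -> core and keep the cheapest assignment."""
    best_cost, best_assignment = float("inf"), None
    for mapping in product(range(num_cores), repeat=len(tasks)):
        assignment = [[] for _ in range(num_cores)]
        for task, core in zip(tasks, mapping):
            assignment[core].append(task)
        cost = estimate_total_time(assignment)
        if cost < best_cost:
            best_cost, best_assignment = cost, assignment
    return best_assignment, best_cost
```

The search space grows as num_cores to the power of the task count, which is why the paper treats this only as an offline reference point rather than a practical scheduler.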
Outline Introduction Contribution CUAS Experiment Conclusion
Contribution
Cache Utilization-Aware Scheduling (CUAS)
• Goal: maximize the difference in unhealthy level between cores that share the same cache, while balancing the workload among cores
• We define the unhealthy level of a core as the sum of the unhealthy scores of the tasks running on that core
• CUAS includes two parts: an application classifier and a task scheduler
• CUAS can reduce cache contention
Outline Introduction Contribution CUAS Experiment Conclusion
CUAS classification
We designed two micro-benchmarks to measure the anti-interference and interference ability of a task (a conceptual sketch of their access patterns is shown below)
• Attack (ATT)
  • Strong interference ability
  • Randomly and intensively pollutes all cache lines
• Defend (DEF)
  • Strong anti-interference ability
  • Sequentially reads and writes each cache line
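The following is only a conceptual sketch of the two access patterns, not the paper's actual benchmarks; a real ATT/DEF would be native code pinned to a core, and the 2 MB buffer size and 64-byte line stride are assumptions (the Q8400's shared L2 capacity and a typical x86 cache line).

```python
# Conceptual sketch of the ATT/DEF access patterns (assumptions: 2 MB buffer
# ~= shared L2 size, 64-byte cache lines; real micro-benchmarks would be native code).
import random

CACHE_SIZE = 2 * 1024 * 1024   # assumed shared L2 capacity (2 MB on the Q8400)
LINE_SIZE = 64                 # assumed cache-line size
NUM_LINES = CACHE_SIZE // LINE_SIZE

def attack(buf, touches=1_000_000):
    """ATT: touch cache-line-sized slots in random order to pollute every line."""
    for _ in range(touches):
        line = random.randrange(NUM_LINES)
        buf[line * LINE_SIZE] ^= 1          # read-modify-write one byte per line

def defend(buf, passes=10):
    """DEF: walk the buffer sequentially, reading and writing each line once per pass."""
    for _ in range(passes):
        for line in range(NUM_LINES):
            buf[line * LINE_SIZE] ^= 1

if __name__ == "__main__":
    buf = bytearray(CACHE_SIZE)
    defend(buf, passes=1)
    attack(buf, touches=10_000)
```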
CUAS classification
• Based on the results of co-scheduling with ATT and DEF, we grade each task's anti-interference and interference ability
(Figure: the task sharing an L2 cache with ATT measures its anti-interference ability; the task sharing an L2 cache with DEF measures its interference ability)
CUAS classification
Three formulas are used to calculate the unhealthy score of a task:
(1) The interference ability I is computed from A'd, the execution time of DEF when it is co-scheduled with the task, and Ad, the execution time of DEF when it executes alone
(2) The anti-interference ability AI is computed from A'i, the execution time of the task when it is co-scheduled with ATT, and Ai, the execution time of the task when it executes alone
(3) The unhealthy score of a task is the sum of its I and AI
A task with a higher unhealthy score has a more negative impact on system performance (a sketched formulation follows)
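The slide does not show the exact expressions for (1) and (2), so the following is only a plausible sketch that assumes both scores are normalized slowdowns; the paper's actual formulas may differ.

```python
# Sketch of the classification scores (assumption: both are normalized
# slowdowns; the paper's exact formulas are not shown on the slide).
def interference_score(def_coscheduled, def_alone):
    """I: relative slowdown of the DEF micro-benchmark caused by the task."""
    return (def_coscheduled - def_alone) / def_alone

def anti_interference_score(task_with_att, task_alone):
    """AI: relative slowdown of the task itself when co-run with ATT."""
    return (task_with_att - task_alone) / task_alone

def unhealthy_score(def_coscheduled, def_alone, task_with_att, task_alone):
    """Unhealthy score = I + AI, as stated on the slide."""
    return (interference_score(def_coscheduled, def_alone)
            + anti_interference_score(task_with_att, task_alone))
```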
The goal of the CUAS scheduler
• Maximize the gap in unhealthy score between cores that share the same cache
• Balance the workload among cores
CUAS steps
• Calculate the number of tasks, a, to be assigned to each core
• First assign the a tasks with the largest unhealthy scores to core 0 of the first cache
• To keep the unhealthy tasks from affecting each other, assign the next a tasks with the largest unhealthy scores to core 0 of another cache
• In the next turn, assign tasks in the reverse direction, from cache n back to cache 1 (a sketch of this assignment order follows)
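A minimal sketch of this serpentine assignment under a few assumptions: each L2 cache is shared by exactly two cores (as on the Q8400), per-core task counts are balanced as evenly as possible, and the function and variable names are illustrative rather than taken from the paper.

```python
# Sketch of the CUAS-style serpentine assignment (assumptions: two cores per
# shared cache, tasks given as (name, unhealthy_score) pairs, per-core task
# counts balanced as evenly as possible; helper names are illustrative).
def cuas_assign(tasks, num_caches):
    ordered = sorted(tasks, key=lambda t: t[1], reverse=True)  # most unhealthy first
    num_cores = num_caches * 2
    # Balance workload: split the task list into num_cores nearly equal chunks.
    base, extra = divmod(len(ordered), num_cores)
    sizes = [base + (1 if i < extra else 0) for i in range(num_cores)]
    # Serpentine core order: core 0 of cache 1..n, then core 1 of cache n..1,
    # so the most unhealthy chunks land on one core of each cache and the
    # least unhealthy chunks on the other core of the same cache.
    order = [(c, 0) for c in range(num_caches)] + \
            [(c, 1) for c in reversed(range(num_caches))]
    assignment = {slot: [] for slot in order}
    pos = 0
    for slot, size in zip(order, sizes):
        assignment[slot] = [name for name, _ in ordered[pos:pos + size]]
        pos += size
    return assignment  # {(cache, core): [task names]}

# Example: six tasks on two shared caches (two cores each), cf. the figure.
tasks = [("t1", .9), ("t2", .8), ("t3", .5), ("t4", .4), ("t5", .2), ("t6", .1)]
for slot, names in sorted(cuas_assign(tasks, num_caches=2).items()):
    print("cache %d core %d:" % slot, names)
```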
CUAS scheduling
(Figure: six classified tasks distributed over Cache 1 and Cache 2, each with a Core 0 and Core 1 sharing an L2 cache; left: classification result, right: scheduling by classification result)
Outline • Introduction • Contribution • CUAS • Experiment • Conclusion
Experiment
• We adopted an Intel Core 2 Quad Q8400 CPU for our experiments
• The four cores are arranged into two groups of two cores, and each group shares a 2 MB L2 cache
• We adopted the SPEC CPU2006 benchmarks for evaluation
Experiment
• The classification result of CUAS (figure)
• CUAS reduces total execution time by up to 46%
Outline • Introduction • Contribution • CUAS • Experiment • Conclusion
Conclusion
• In this work, we design a novel task scheduling policy, called CUAS, to reduce shared cache contention based on two indexes, intra-core cache contention and task interference ability, which primarily determine the utilization of the shared cache
• CUAS first classifies tasks according to their anti-interference ability and interference ability
• CUAS then distributes tasks to cores based on the effects of inter-core and intra-core cache contention
Conclusion
• Our experimental results show that CUAS can significantly reduce shared cache contention and reduce total execution time by up to 46% compared to existing methods
Thanks for your attention
Embedded Operating System Lab at Yuntech University
http://eos.yuntech.edu.tw/eoslab/
Supported by NSC 100-2219-E-224-001