A Hierarchical CPU Scheduler for Multimedia Operating Systems. EE202A, Fall 2001. Prof. Mani Srivastava. Hanbiao Wang, Arun Somasundara.
Motivation • The application classes in a multimedia system (hard real-time, soft real-time, best-effort) need different scheduling algorithms • Hierarchical CPU scheduling supports a different scheduling algorithm for each application class and protects the application classes from one another
Framework • A tree structure in which each node has a weight and a scheduler • Each leaf node represents an aggregation of threads, and hence an application class • Each non-leaf node represents an aggregation of application classes
Requirements of Scheduling at Non-leaf Nodes • Fairness among child nodes • No a priori knowledge of how long a task executes before it blocks • Bounds on minimum throughput and maximum delay • Computational efficiency
SFQ (Start-time Fair Queuing) Algorithm • Virtual time $v(t)$ • Initially 0 • When the CPU is busy at time $t$: the start tag of the thread in service • When the CPU is idle at time $t$: the maximum finish tag assigned to any thread
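SFQ Algorithm contd. • Start tag $S_f$ • $S_f = \max\{ v(A(q_f^j)),\, F_f \}$, stamped on thread $f$ when its $j$-th quantum $q_f^j$ is requested at time $A(q_f^j)$ • Finish tag $F_f$ • Initially 0 • $F_f = S_f + l_f^j / r_f$, updated when $q_f^j$ finishes execution, where $l_f^j$ is the length of the quantum and $r_f$ is the weight of thread $f$ • Threads are serviced in increasing order of start tags; ties are broken arbitrarily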
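The tag rules above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the names `Thread`, `SFQ`, `request`, `pick`, `complete`, and `run` are hypothetical, and the sketch assumes ties are broken in request order (the slides allow any tie-breaking rule).

```python
import heapq
from dataclasses import dataclass


@dataclass
class Thread:
    name: str
    weight: float        # r_f
    start: float = 0.0   # S_f
    finish: float = 0.0  # F_f, initially 0


class SFQ:
    def __init__(self):
        self.vtime = 0.0       # v(t), initially 0
        self.max_finish = 0.0  # largest finish tag assigned so far
        self.ready = []        # heap of (start tag, sequence number, thread)
        self.seq = 0           # assumed tie-breaker: request order

    def request(self, thread):
        """Stamp S_f = max(v, F_f) when the thread requests a quantum."""
        thread.start = max(self.vtime, thread.finish)
        heapq.heappush(self.ready, (thread.start, self.seq, thread))
        self.seq += 1

    def pick(self):
        """Dispatch the runnable thread with the smallest start tag."""
        if not self.ready:
            self.vtime = self.max_finish  # CPU idle: v = max finish tag
            return None
        _, _, thread = heapq.heappop(self.ready)
        self.vtime = thread.start  # CPU busy: v = start tag in service
        return thread

    def complete(self, thread, length):
        """Set F_f = S_f + l / r_f once the quantum length is known."""
        thread.finish = thread.start + length / thread.weight
        self.max_finish = max(self.max_finish, thread.finish)


def run(sched, threads, length, steps):
    """Run always-runnable threads for a number of equal-length quanta."""
    for t in threads:
        sched.request(t)
    order = []
    for _ in range(steps):
        t = sched.pick()
        sched.complete(t, length)
        order.append(t.name)
        sched.request(t)  # the thread immediately asks for another quantum
    return order


order = run(SFQ(), [Thread("A", 1.0), Thread("B", 2.0)], 10.0, 6)
print(order)
```

Note that the finish tag is computed only after the quantum ends, so the scheduler never needs the quantum length in advance.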
SFQ Scheduling Example • $r_A : r_B = 1 : 2$ • $q_A = q_B = 10$ ms • $l_A = l_B = 10$
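One way to work through these parameters by hand is shown below. It is a sketch under stated assumptions, not a reproduction of the original slide figure: it takes $r_A = 1$ and $r_B = 2$ (only the ratio is given), assumes both threads are continuously runnable from time 0 and use their full quantum length of 10, and breaks ties in favor of A. Each row applies $S = \max\{v, F\}$ and $F = S + l/r$.

```latex
\begin{array}{c|c|c|c|c}
\text{quantum} & \text{thread} & S & F = S + l/r & v \text{ during service} \\ \hline
1 & A & 0  & 0 + 10/1 = 10  & 0  \\
2 & B & 0  & 0 + 10/2 = 5   & 0  \\
3 & B & 5  & 5 + 10/2 = 10  & 5  \\
4 & A & 10 & 10 + 10/1 = 20 & 10 \\
5 & B & 10 & 10 + 10/2 = 15 & 10 \\
6 & B & 15 & 15 + 10/2 = 20 & 15
\end{array}
```

Under these assumptions B receives two quanta for every one that A receives, matching the 1:2 weight ratio.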
SFQ Properties • Fair allocation of the CPU regardless of variation in the available bandwidth: for any two threads $f$ and $m$, • $\left| W_f(t_1,t_2)/r_f - W_m(t_1,t_2)/r_m \right| \le l_f^{\max}/r_f + l_m^{\max}/r_m$ • $W_x(t_1,t_2)$ is the aggregate work done by the CPU for thread $x$ in the interval $[t_1,t_2]$, and $l_x^{\max}$ is the maximum length of a quantum for which thread $x$ is scheduled • No a priori knowledge of the quantum length is needed, since threads are scheduled in increasing order of start tags
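The fairness bound can be checked numerically on a small schedule. The sketch below is illustrative only: the helper names are hypothetical, the service order is an assumed SFQ schedule for two always-runnable threads with weights 1 and 2 and quantum length 10, and only intervals that start and end on quantum boundaries are examined.

```python
def fairness_bound(l_max_f, r_f, l_max_m, r_m):
    """Right-hand side of the bound: l_f^max / r_f + l_m^max / r_m."""
    return l_max_f / r_f + l_max_m / r_m


def max_normalized_gap(order, length, weights, f, m):
    """Largest |W_f/r_f - W_m/r_m| over all intervals of whole quanta."""
    worst = 0.0
    for i in range(len(order)):
        work = {f: 0.0, m: 0.0}
        for name in order[i:]:
            work[name] += length
            gap = abs(work[f] / weights[f] - work[m] / weights[m])
            worst = max(worst, gap)
    return worst


# Assumed service order for weights A=1, B=2 and quantum length 10.
order = ["A", "B", "B", "A", "B", "B", "A", "B", "B"]
weights = {"A": 1.0, "B": 2.0}

bound = fairness_bound(10.0, weights["A"], 10.0, weights["B"])
worst = max_normalized_gap(order, 10.0, weights, "A", "B")
print(worst, "<=", bound)
```

For this schedule the largest normalized gap is 10, which stays within the bound of 15.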
SFQ Properties contd.: Bounds on throughput and delay • An interrupted CPU is modeled as a Fluctuation Constrained (FC) server with average rate $C$ (instructions/sec) and burstiness $\delta(C)$ (instructions); its effective bandwidth satisfies • $W(t_1, t_2) \ge C\,(t_2 - t_1) - \delta(C)$ • The throughput of thread $f$ with weight $r_f$ is also FC, with parameters • $\left( r_f,\ r_f \left( \sum_{n} l_n^{\max} + \delta(C) \right) / C + l_f^{\max} \right)$ • Execution completion time of $q_f^j$: • $L(q_f^j) \le EAT(q_f^j) + \left( \sum_{n \ne f} l_n^{\max} + l_f^j + \delta(C) \right) / C$ • where $EAT(q_f^j) = \max\{ A(q_f^j),\ EAT(q_f^{j-1}) + l_f^{j-1} / r_f \}$
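The delay bound is easier to read with numbers plugged in. The following sketch evaluates the $EAT$ recursion and the completion-time bound for one thread; the function names and all numeric values are made up for illustration, and the first quantum's $EAT$ is assumed to equal its arrival time.

```python
def expected_arrival_times(arrivals, lengths, r_f):
    """EAT(q^j) = max(A(q^j), EAT(q^(j-1)) + l^(j-1) / r_f)."""
    eats = []
    for j, arrival in enumerate(arrivals):
        if j == 0:
            eats.append(arrival)  # assumption: no earlier quantum to wait on
        else:
            eats.append(max(arrival, eats[-1] + lengths[j - 1] / r_f))
    return eats


def completion_bound(eat, l_j, other_l_max, capacity, delta):
    """EAT + (sum of other threads' l_max + l_j + delta(C)) / C."""
    return eat + (sum(other_l_max) + l_j + delta) / capacity


# Illustrative values: times in seconds, lengths in instructions.
arrivals = [0.0, 0.0, 5.0]   # A(q^j) for three quanta of thread f
lengths = [100, 100, 100]    # l^j
r_f = 50.0                   # reserved rate of thread f (instructions/sec)
capacity = 200.0             # C
delta = 20.0                 # delta(C)
other_l_max = [100, 80]      # l_n^max for the other two threads

eats = expected_arrival_times(arrivals, lengths, r_f)
bounds = [completion_bound(e, l, other_l_max, capacity, delta)
          for e, l in zip(eats, lengths)]
print(eats, bounds)
```

Each bound adds the same fixed term to the quantum's $EAT$ here because all three quanta have the same length.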
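SFQ Properties contd.: Bounds on throughput and delay • An interrupted CPU is modeled as an Exponentially Bounded Fluctuation (EBF) server with parameters $(C, B, \alpha, \delta(C))$; its effective bandwidth distribution satisfies • $P\{ W(t_1, t_2) < C\,(t_2 - t_1) - \delta(C) - \gamma \} \le B e^{-\alpha\gamma}$ • The throughput of thread $f$ with weight $r_f$ is also EBF, with parameters • $\left( r_f,\ B,\ r_f \alpha / C,\ r_f \left( \sum_{n} l_n^{\max} + \delta(C) \right) / C + l_f^{\max} \right)$ • Execution completion time of $q_f^j$: • $P\{ L(q_f^j) \le EAT(q_f^j) + \left( \sum_{n \ne f} l_n^{\max} + l_f^j + \delta(C) + \gamma \right) / C \} \ge 1 - B e^{-\alpha\gamma}$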
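The probabilistic bound can be turned around to ask how much slack $\gamma$ is needed for a target confidence level. The sketch below solves $1 - B e^{-\alpha\gamma} \ge p$ for $\gamma$ and plugs it into the completion-time bound; the function names and the numeric values are hypothetical.

```python
import math


def slack_for_confidence(B, alpha, p):
    """Smallest gamma >= 0 such that 1 - B * exp(-alpha * gamma) >= p."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return max(0.0, math.log(B / (1.0 - p)) / alpha)


def ebf_completion_bound(eat, l_j, other_l_max, capacity, delta, gamma):
    """EAT + (sum of other threads' l_max + l_j + delta(C) + gamma) / C."""
    return eat + (sum(other_l_max) + l_j + delta + gamma) / capacity


# Illustrative parameters only.
B, alpha, p = 1.0, 0.1, 0.99
gamma = slack_for_confidence(B, alpha, p)
achieved = 1.0 - B * math.exp(-alpha * gamma)
bound = ebf_completion_bound(0.0, 100, [100, 80], 200.0, 20.0, gamma)
print(gamma, achieved, bound)
```

A higher target confidence increases $\gamma$ only logarithmically, so the deadline loosens slowly as $p$ approaches 1.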
Implementation • SFQ scheduling is used at intermediate (non-leaf) nodes • Leaf nodes can use any scheduling algorithm • Each node stores a weight, a start tag, and a finish tag • Each non-leaf node also stores a list of child nodes, a list of runnable child nodes sorted by start tag, and a virtual time, taken from the child node with the minimum start tag • Each leaf node also stores a pointer to a function that schedules its threads
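The per-node state described above could be laid out roughly as follows. This is a sketch under assumptions, not the kernel implementation: `Node`, `set_runnable`, `schedule`, and `update` are hypothetical names, the runnable list is scanned for its minimum instead of being kept sorted, and a node is assumed to stay runnable after each quantum.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass(eq=False)  # identity comparison; avoids recursing through parents
class Node:
    name: str
    weight: float
    parent: Optional["Node"] = None
    start: float = 0.0    # start tag
    finish: float = 0.0   # finish tag
    vtime: float = 0.0    # non-leaf only: virtual time over its children
    children: List["Node"] = field(default_factory=list)
    runnable: List["Node"] = field(default_factory=list)
    pick_thread: Optional[Callable[[], str]] = None  # leaf only

    def add_child(self, child):
        child.parent = self
        self.children.append(child)


def set_runnable(node):
    """Mark a node runnable and propagate that fact toward the root."""
    while node.parent is not None:
        parent = node.parent
        if node in parent.runnable:
            return
        node.start = max(parent.vtime, node.finish)
        parent.runnable.append(node)
        node = parent


def schedule(root):
    """Descend from the root, taking the minimum start tag at each level."""
    node = root
    while node.pick_thread is None:
        node = min(node.runnable, key=lambda c: c.start)
        node.parent.vtime = node.start
    return node, node.pick_thread()


def update(leaf, length):
    """After a quantum, refresh the tags of the leaf and all its ancestors."""
    node = leaf
    while node.parent is not None:
        node.finish = node.start + length / node.weight
        node.start = max(node.parent.vtime, node.finish)
        node = node.parent
```

A leaf plugs in its own policy through `pick_thread`, which is how different scheduling algorithms can coexist under the same SFQ tree.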
Implementation contd. • Functions: hsfq_mknod, hsfq_parse, hsfq_move, hsfq_rmnod (allowed only if the node has no child nodes), hsfq_admin, hsfq_schedule, hsfq_update (updates the start and finish tags of all ancestors), hsfq_setrun, hsfq_sleep • Priority inversion • Not handled between classes • Handled only within the same leaf • With rate-monotonic scheduling: priority inheritance • With SFQ: the weights of the blocked threads are transferred to the blocking thread
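The weight-transfer idea for SFQ leaves might look like the sketch below. It is only an illustration of the principle: `WeightedThread`, `block_on`, and `release` are hypothetical names and do not correspond to any of the hsfq_* functions listed above.

```python
class WeightedThread:
    def __init__(self, name, weight):
        self.name = name
        self.base_weight = weight
        self.donated = {}  # name of blocked thread -> weight it lent

    @property
    def weight(self):
        """Effective weight: own weight plus weights lent by blocked threads."""
        return self.base_weight + sum(self.donated.values())


def block_on(blocked, holder):
    """The blocked thread lends its effective weight to the resource holder."""
    holder.donated[blocked.name] = blocked.weight


def release(holder, blocked):
    """The holder gives the weight back when it releases the resource."""
    holder.donated.pop(blocked.name, None)


low = WeightedThread("low", 1.0)
high = WeightedThread("high", 3.0)

block_on(high, low)         # "high" waits on a resource held by "low"
boosted = low.weight
release(low, high)          # "low" releases the resource
restored = low.weight
print(boosted, restored)
```

While the holder carries the extra weight, its finish tags advance more slowly, so it is scheduled more often and the blocked thread waits for less time.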
Experimental Results • Experimental framework • Sun SPARCstation 10 with 32 MB RAM • Multiuser mode with all system processes running • Dhrystone benchmark • Scheduling structure
Experimental Results contd. • Achieves predictable resource allocation • Throughputs of 5 threads running under both schemes
Experimental Results contd. • Scheduling overhead • impact of more threads • impact of depth
Experimental Results contd. • Achieves fair allocation • nodes isolated from each other and throughput depends only on weight.
Experimental Results contd. • Fair allocation when bandwidth is dynamically changed
Related Work • Weighted Fair Queuing (WFQ) • Not fair when bandwidth fluctuates • Requires quantum lengths to be known a priori • Computing the round number is expensive • Fair Queuing based on Start-time (FQS) • Modification of WFQ for the case where quantum lengths are not known • Schedules in increasing order of start tags • Computationally expensive • Does not provide fairness when bandwidth fluctuates
Related Work contd. • Self Clocked Fair Queuing (SCFQ) • Approximates the round number with the finish tag • Schedules in increasing order of finish tags • Larger delay guarantee than SFQ • Lottery Scheduling • Fairness only over large time intervals • Supports hierarchical partitioning, but with only one scheduling algorithm and with computational overhead
Conclusion • Enables co-existence of heterogeneous schedulers • Protects application classes from each other • Computationally efficient
References • P. Goyal, X. Guo, and H. M. Vin. A Hierarchical CPU Scheduler for Multimedia Operating Systems. Second USENIX Symposium on Operating Systems Design and Implementation (OSDI), Seattle, WA, USA, 28-31 October 1996. Operating Systems Review, vol. 30, special issue, ACM, 1996, pages 107-121. • P. Goyal, H. M. Vin, and H. Cheng. Start-time Fair Queuing: A Scheduling Algorithm for Integrated Services Packet Switching Networks. Proceedings of ACM SIGCOMM '96, pages 157-168, August 1996.