INF5070 – Media Storage and Distribution Systems: Server Resources 12/9 - 2005
Overview • Resources, real-time, “continuous” media streams, … • (CPU) Scheduling • Memory management
Resources • Resource: “A resource is a system entity required by a task for manipulating data” [Steinmetz & Nahrstedt 95] • Characteristics: • active: provides a service, e.g., CPU, disk or network adapter • passive: system capabilities required by active resources, e.g., memory • exclusive: only one process at a time can use it, e.g., CPU • shared: can be used by several concurrent processes, e.g., memory • single: exists only once in the system, e.g., loudspeaker • multiple: several within a system, e.g., CPUs in a multi-processor system
Real–Time • Real-time process:“A process which delivers the results of the processing in a given time-span” • Real-time system:“A system in which the correctness of a computation depends not only on obtaining the result, but also upon providing the result on time” • Many real-time applications, e.g.: • temperature control in a nuclear/chemical plant • driven by interrupts from an external device • these interrupts occur irregularly • defense system on a navy boat • driven by interrupts from an external device • these interrupts occur irregularly • control of a flight simulator • execution at periodic intervals • scheduled by timer-services which the application requests from the OS • ...
Real–Time • Deadline: “A deadline represents the latest acceptable time for the presentation of the processing result” • Hard deadlines: • must never be violated → system failure • results that arrive too late • have no value, e.g., processing weather forecasts • mean severe (catastrophic) system failure, e.g., processing of an incoming torpedo signal in a navy boat scenario • Soft deadlines: • in some cases, the deadline might be missed • not too frequently • not by much time • result still may have some (but decreasing) value, e.g., a late I-frame in MPEG
Real–Time and Multimedia • Multimedia systems • have periodic processing requirements (e.g., each 33 ms in a 30 fps video) • require large bandwidths (e.g., average of 3.5 Mbps for DVD video only) • typically have soft deadlines (may miss a frame) • are non-critical (user may be annoyed, but …) • need predictability (guarantees) • adapt real-time mechanisms to continuous media • priority-based schemes are of special importance
Admission and Reservation • To prevent overload, admission may be performed: • schedulability test: • “are there enough resources available for a new stream?” • “can we find a schedule for the new task without disturbing the existing workload?” • a task is allowed if the utilization remains < 1 • yes – allow new task, allocate/reserve resources • no – reject • Resource reservation is analogous to booking (asking for resources) • pessimistic • avoid resource conflicts by making worst-case reservations • potentially under-utilized resources • guaranteed QoS • optimistic • reserve according to average load • high utilization • overload may occur • perfect • must have detailed knowledge about resource requirements of all processes • too expensive to make/takes much time
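The schedulability test above can be sketched in a few lines. This is a minimal illustration, not the course's implementation: the function name `admit` and the representation of a task as a (processing time, period) pair are assumptions made for the example.

```python
def admit(existing, new_task):
    """Utilization-based schedulability test (illustrative sketch).

    Each task is a hypothetical (processing_time, period) pair in the
    same time unit. The new task is admitted only if the total
    utilization of all tasks stays below 1.
    """
    utilization = sum(e / p for e, p in existing + [new_task])
    return utilization < 1


streams = [(10, 40), (10, 40)]      # two streams using 25 % of the CPU each
print(admit(streams, (15, 40)))     # total 0.875 -> True, reserve resources
print(admit(streams, (20, 40)))     # total 1.0   -> False, reject
```

A pessimistic reservation would pass worst-case processing times to such a test, an optimistic one average values.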
Real–Time and Operating Systems • The operating system manages local resources(CPU, memory, disk, network card, busses, ...) • In a real-time, multimedia scenario, support is needed for: • real-time processing • efficient memory management • This also means support for proper … • scheduling – high priorities for time-restrictive multimedia tasks • timer support – clock with fine granularity and event scheduling with high accuracy • kernel preemption – avoid long periods where low priority processes cannot be interrupted • memory replacement – prevent code for real-time programs from being paged out • fast switching – both interrupts and context switching should be fast • ...
Streaming Data • Start playback at t1 • Consumed bytes (offset) • variable rate • constant rate • Must start retrieving data earlier • Data must arrive before consumption time • Data must be sent before arrival time • Data must be read from disk before sending time [Figure: data offset versus time, showing the read, send, arrive, and consume functions, with playback starting at t1]
Streaming Data • Need buffers to hold data between the functions, e.g., client B(t) = A(t) – C(t), i.e., ∀t: A(t) ≥ C(t) • Latest start of data arrival is given by min[B(t,t0,t1) ; ∀t B(t,t0,t1) ≥ 0], i.e., the buffer must at all times t have more data to consume [Figure: data offset versus time, showing the arrive and consume functions, with arrival starting at t0 and playback starting at t1]
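The buffer condition B(t) = A(t) – C(t) ≥ 0 can be checked numerically. The sketch below assumes discrete time steps and a constant arrival rate, neither of which is required by the slide; `latest_arrival_start` and its parameters are hypothetical names introduced for illustration.

```python
def latest_arrival_start(consume, arrival_rate, horizon):
    """Latest integer start time t0 such that A(t) >= C(t) for every t.

    consume(t): cumulative bytes consumed by time t (hypothetical input).
    arrival_rate: constant arrival rate in bytes per time unit (assumption).
    horizon: last time step that is checked.
    """
    def buffer_ok(t0):
        # B(t) = A(t) - C(t) must never be negative.
        return all(
            arrival_rate * max(0, t - t0) - consume(t) >= 0
            for t in range(horizon + 1)
        )

    candidates = [t0 for t0 in range(-horizon, horizon + 1) if buffer_ok(t0)]
    return max(candidates) if candidates else None


def playback(t):
    """Playback starts at t1 = 10 and consumes 2 bytes per time unit."""
    return 2 * max(0, t - 10)


print(latest_arrival_start(playback, arrival_rate=2, horizon=20))  # 10
print(latest_arrival_start(playback, arrival_rate=1, horizon=20))  # 0
```

When the arrival rate matches the consumption rate, arrival can start at the playback time itself; when the network is only half as fast, the data must start arriving 10 time units before playback.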
Streaming Data • “Continuous Media” and “continuous streams” are ILLUSIONS • retrieve data in blocks from disk • transfer blocks from file system to application • send packets to communication system • split packets into appropriate MTUs • ... (intermediate nodes) • ... (client) • different optimal sizes • pseudo-parallel processes (run in time slices) • need for scheduling (to have timing and appropriate resource allocation) [Figure: data path through application, file system, and communication system]
Scheduling • A task is a schedulable entity (a process/thread executing a job, e.g., a packet through the communication system or a disk request through the file system) • In a multi-tasking system, several tasks may wish to use a resource simultaneously • A scheduler decides which task may use the resource, i.e., determines the order in which requests are serviced, using a scheduling algorithm • Each active resource (CPU, disk, NIC) needs a scheduler (passive resources are also “scheduled”, but in a slightly different way) [Figure: requests pass through the scheduler to the resource]
Scheduling • Scheduling algorithm classification: • dynamic • make scheduling decisions at run-time • flexible to adapt • considers only actual task requests and execution time parameters • large run-time overhead finding a schedule • static • make scheduling decisions off-line (also called pre-run-time) • generates a dispatching table for run-time dispatcher at compile time • needs complete knowledge of task before compiling • small run-time overhead • preemptive • currently executing task may be interrupted (preempted) by higher priority processes • preempted process continues later at the same state • potentially frequent context switching • (almost!?) useless for disk and network cards • non-preemptive • a running task is allowed to finish its time slot (higher priority processes must wait) • reasonable for short tasks like sending a packet (used by disk and network cards) • less frequent switches
Scheduling • Preemption: • tasks wait for processing • scheduler assigns priorities • task with highest priority will be scheduled first • preempt current execution if a higher priority (more urgent) task arrives • real-time and best effort priorities (real-time processes have higher priority; if any exist, they will run) • two kinds of preemption: • preemption points • predictable overhead • simplified scheduler accounting • immediate preemption • needed for hard real-time systems • needs special timers and fast interrupt and context switch handling [Figure: requests pass through the scheduler to the resource, with preemption]
Scheduling • Scheduling is difficult and takes time: [Figure: delay before an arriving RT process runs among processes 1 to N under three schemes: round-robin (delay), priority, non-preemptive (delay), and priority, preemptive (only delay from switching and interrupts)]
Priorities and Multimedia • Multimedia streams need predictable access to resources – high priorities, e.g.: 1. multimedia traffic with guaranteed QoS (may not exist) 2. multimedia traffic with predictive QoS 3. other requests (must not starve) • Within each class one could have a second-level scheduler • 1 and 2: real-time scheduling and fine grained priorities • 3: may use traditional approaches such as round-robin
Scheduling in Windows 2000 • Preemptive kernel • Schedules threads individually • Time slices given in quantums • 3 quantums = 1 clock interval (length of interval may vary) • defaults: • Win2000 server: 36 quantums • Win2000 workstation (professional) : 6 quantums • may manually be increased between threads (1x, 2x, 4x, 6x) • foreground quantum boost (add 0x, 1x, 2x): active window can get longer time slices (assumed needs fast response)
Scheduling in Windows 2000 • 32 priority levels: Round Robin (RR) within each level • Interactive and throughput-oriented: • “Real time” – 16 system levels • fixed priority • may run forever • Variable – 15 user levels • priority may change: thread priority = process priority ± 2 • uses much CPU time → priority drops • user interactions, I/O completions → priority increases • Idle/zero-page thread – 1 system level • runs whenever there are no other processes to run • e.g., clearing memory pages for memory manager [Figure: priority levels, from Real Time (system thread) through Variable (user thread) to Idle (system thread)]
Scheduling in Linux • Preemptive kernel • Threads and processes used to be equal, but Linux uses (in 2.6) thread scheduling • SCHED_FIFO • may run forever, no timeslices • may use its own scheduling algorithm • SCHED_RR • each priority in RR • timeslices of 10 ms (quantums) • SCHED_OTHER • ordinary user processes • uses “nice”-values: 1 ≤ priority ≤ 40 • timeslices of 10 ms (quantums) • Threads with highest goodness are selected first: • realtime (FIFO and RR): goodness = 1000 + priority • timesharing (OTHER): goodness = (quantum > 0 ? quantum + priority : 0) • Quantums are reset when no ready process has quantums left (end of epoch): quantum = (quantum/2) + priority [Figure: scheduling classes SCHED_FIFO, SCHED_RR, and SCHED_OTHER (nice)]
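The goodness rules on this slide translate directly into code. The sketch below only restates the two formulas given above, using the standard Linux policy names SCHED_FIFO, SCHED_RR, and SCHED_OTHER; the function names and the integer division in the quantum reset are assumptions, and this is not the kernel's actual code.

```python
REALTIME_POLICIES = ("SCHED_FIFO", "SCHED_RR")


def goodness(policy, priority, quantum=0):
    """Goodness value as stated on the slide."""
    if policy in REALTIME_POLICIES:
        return 1000 + priority
    # Timesharing threads with no quantum left are not eligible.
    return quantum + priority if quantum > 0 else 0


def reset_quantum(quantum, priority):
    """End-of-epoch reset: quantum = (quantum / 2) + priority."""
    return quantum // 2 + priority   # integer division is an assumption


# Hypothetical ready queue: (name, policy, priority, remaining quantum).
ready = [
    ("editor", "SCHED_OTHER", 20, 6),
    ("batch", "SCHED_OTHER", 30, 0),
    ("video", "SCHED_RR", 5, 0),
]
best = max(ready, key=lambda th: goodness(th[1], th[2], th[3]))
print(best[0])   # video
```

Any real-time thread beats every timesharing thread because of the offset of 1000, regardless of the nice-based priority.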
Scheduling in AIX • Similar to Linux, but has always only used thread scheduling • SCHED_FIFO • SCHED_RR • SCHED_OTHER • BUT, SCHED_OTHER may change “nice” values • running long (whole timeslices) → penalty – nice increase • interrupted (e.g., I/O) gives initial “nice” value back [Figure: scheduling classes SCHED_FIFO, SCHED_RR, and SCHED_OTHER (nice)]
Real–Time Scheduling • Multimedia streams are usually periodic (fixed frame rates and audio sample frequencies) • Time constraints for a periodic task: • s – starting point (first time the task requires processing) • e – processing time • d – deadline • p – period (r – rate (r = 1/p)) • 0 ≤ e ≤ d (often d ≤ p: we’ll use d = p – end of period, but Σd ≤ Σp is enough) • the kth processing of the task • is ready at time s + (k – 1) p • must be finished at time s + (k – 1) p + d • the scheduling algorithm must account for these properties [Figure: timeline marking s, e, d, and p for a periodic task]
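The two timing formulas for the kth processing can be written as a small helper. The function name and the 30 fps example values are assumptions for illustration; only the formulas come from the slide.

```python
def kth_instance(s, p, d, k):
    """Return (ready_time, finish_deadline) of the k-th processing, k >= 1.

    s: starting point, p: period, d: deadline relative to the ready time.
    """
    ready = s + (k - 1) * p
    return ready, ready + d


# Hypothetical 30 fps video task: period 33 ms, deadline at end of period.
for k in (1, 2, 3):
    print(k, kth_instance(s=0, p=33, d=33, k=k))
# 1 (0, 33)
# 2 (33, 66)
# 3 (66, 99)
```

With d = p, the deadline of each processing coincides with the ready time of the next one.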
Real–Time Scheduling • Resource reservation • QoS can be guaranteed • relies on knowledge of tasks • no fairness • origin: time sharing operating systems • e.g., earliest deadline first (EDF) and rate monotonic (RM)(AQUA, HeiTS, RT Upcalls, ...) • Proportional share resource allocation • no guarantees • requirements are specified by a relative share • allocation in proportion to competing shares • size of a share depends on system state and time • origin: packet switched networks • e.g., Scheduler for Multimedia And Real-Time (SMART)(Lottery, Stride, Move-to-Rear List, ...)
Earliest Deadline First (EDF) • Preemptive scheduling based on dynamic task priorities • Task with closest deadline has highest priority → stream priorities vary with time • Dispatcher selects the highest priority task • Assumptions: • requests for all tasks with deadlines are periodic • the deadline of a task is equal to the end of its period (start of the next) • independent tasks (no precedence) • run-time for each task is known and constant • context switches can be ignored
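A small simulation makes the dispatching rule concrete. This is a sketch under the slide's assumptions (periodic tasks, deadline at the end of the period, free context switches) plus one more: time advances in unit steps. The task set and all names are hypothetical.

```python
def edf_schedule(tasks, horizon):
    """Simulate preemptive EDF in unit time steps.

    tasks: dict name -> (processing_time e, period p), deadline = end of period.
    Returns (timeline, missed): timeline[t] is the task run in slot t
    (None when idle), missed lists (name, deadline) pairs that were missed.
    """
    remaining = {}   # (name, absolute deadline) -> remaining processing time
    timeline, missed = [], []
    for t in range(horizon):
        for name, (e, p) in tasks.items():
            if t % p == 0:                    # a new request is released
                remaining[(name, t + p)] = e
        # Requests with work left at their deadline are deadline misses.
        for job in [j for j in remaining if j[1] <= t]:
            missed.append(job)
            del remaining[job]
        if remaining:
            job = min(remaining, key=lambda j: j[1])   # closest deadline
            timeline.append(job[0])
            remaining[job] -= 1
            if remaining[job] == 0:
                del remaining[job]
        else:
            timeline.append(None)
    return timeline, missed


timeline, missed = edf_schedule({"A": (2, 5), "B": (4, 7)}, horizon=35)
print(timeline[:8])   # ['A', 'A', 'B', 'B', 'B', 'B', 'A', 'A']
print(missed)         # []
```

The task set uses about 97 % of the CPU (2/5 + 4/7), yet no deadline is missed within the 35 simulated time units, because the priority of A and B changes with the distance to their deadlines.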
Earliest Deadline First (EDF) • Example: [Figure: timelines of Task A and Task B with their deadlines, and the resulting dispatching; priority A < priority B in some intervals and priority A > priority B in others]
Rate Monotonic (RM) Scheduling • Classic algorithm for hard real-time systems with one CPU [Liu & Layland ‘73] • Pre-emptive scheduling based on static task priorities • Optimal: no other algorithm with static task priorities can schedule tasks that cannot be scheduled by RM • Assumptions: • requests for all tasks with deadlines are periodic • the deadline of a task is equal to the end of its period (start of the next) • independent tasks (no precedence) • run-time for each task is known and constant • context switches can be ignored • any non-periodic task has no deadline
Rate Monotonic (RM) Scheduling • Process priority based on task periods • task with shortest period gets highest static priority • task with longest period gets lowest static priority • dispatcher always selects task requests with highest priority • Example: [Figure: priority versus period length (shortest period, highest priority; longest period, lowest priority); Task 1 with period p1 and Task 2 with period p2, p1 < p2, so Task 1 has highest priority; dispatching timeline]
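The static priority assignment can be illustrated with a unit-step simulation. This is a sketch, not a reference implementation: the task set, the names, and the choice to drop unfinished work when the next request arrives are assumptions for the example.

```python
def rm_schedule(tasks, horizon):
    """Simulate preemptive rate-monotonic scheduling in unit time steps.

    tasks: dict name -> (processing_time e, period p), deadline = end of period.
    Returns (timeline, missed): timeline[t] is the task run in slot t,
    missed lists (name, time) pairs where a request was still unfinished.
    """
    # Static priorities: the shorter the period, the higher the priority.
    by_priority = sorted(tasks, key=lambda name: tasks[name][1])
    remaining = {name: 0 for name in tasks}
    timeline, missed = [], []
    for t in range(horizon):
        for name, (e, p) in tasks.items():
            if t % p == 0:
                if remaining[name] > 0:       # previous request unfinished
                    missed.append((name, t))
                remaining[name] = e           # drop leftover work (assumption)
        ready = [name for name in by_priority if remaining[name] > 0]
        if ready:
            timeline.append(ready[0])         # highest static priority runs
            remaining[ready[0]] -= 1
        else:
            timeline.append(None)
    return timeline, missed


timeline, missed = rm_schedule({"A": (2, 5), "B": (4, 7)}, horizon=8)
print(timeline)   # ['A', 'A', 'B', 'B', 'B', 'A', 'A', 'B']
print(missed)     # [('B', 7)]
```

Task A has the shorter period and therefore always preempts B. With this heavily loaded task set, B has received only 3 of its 4 time units when its first period ends at time 7.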
EDF Versus RM • It might be impossible to prevent deadline misses in a strict, fixed priority system: [Figure: timelines of Task A and Task B with their deadlines under six schemes: fixed priorities with A having priority (with and without dropping), fixed priorities with B having priority (with and without dropping), rate monotonic (same as the first), and earliest deadline first; the fixed priority timelines show deadline misses and wasted time] • RM may give some deadline violations which are avoided by EDF
EDF Versus RM • EDF • dynamic priorities changing in time • overhead in priority switching • QoS calculation – maximal throughput: Σ(all streams i) Ri × ei ≤ 1, R – rate, e – processing time • RM • static priorities based on periods • may map priority onto fixed OS priorities (like Linux) • QoS calculation: Σ(all streams i) Ri × ei ≤ ln(2), R – rate, e – processing time • NOTE: this means that EDF is usually more efficient than RM, i.e., if switches are free, EDF can schedule a workload as long as its utilization is ≤ 1, whereas RM may require the utilization to be ≤ ln(2) to guarantee a schedule for the same workload
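Both QoS calculations reduce to comparing the summed utilization against a bound. The sketch below uses hypothetical stream parameters and function names. Note that ln(2) ≈ 0.693 is the limit of the Liu & Layland bound n(2^(1/n) – 1) for many tasks, and that it is a sufficient test: a task set above the bound is not necessarily unschedulable under RM.

```python
import math


def utilization(streams):
    """streams: list of (rate in requests per second, processing time in s)."""
    return sum(rate * e for rate, e in streams)


def edf_admissible(streams):
    return utilization(streams) <= 1


def rm_admissible(streams):
    # Sufficient (not necessary) test with the ln(2) bound from the slide.
    return utilization(streams) <= math.log(2)


two = [(30, 0.010)] * 2      # two 30 fps streams, 10 ms per frame: U = 0.6
three = [(30, 0.010)] * 3    # three such streams: U = 0.9
print(edf_admissible(two), rm_admissible(two))       # True True
print(edf_admissible(three), rm_admissible(three))   # True False
```

The third stream can be admitted with a guarantee under EDF, while the RM bound no longer guarantees a schedule.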
SMART (Scheduler for Multimedia And Real–Time applications) • Designed for multimedia and real-time applications • Principles • priority – high priority tasks should not suffer degradation due to presence of low priority tasks • proportional sharing – allocate resources proportionally and distribute unused resources (work conserving) • tradeoff immediate fairness – real-time and less competitive processes (short-lived, interactive, I/O-bound, ...) get instantaneous higher shares • graceful transitions – adapt smoothly to resource demand changes • notification – notify applications of resource changes • Proportional shares → no admission control
SMART (Scheduler for Multimedia And Real–Time applications) • Tasks have importance and urgency • urgency – an immediate real-time constraint, short deadline (determines when a task will get resources) • importance – a priority measure • expressed by a tuple: [priority p, biased virtual finishing time bvft] • p is static: supplied by user or assigned a default value • bvft is dynamic: • virtual finishing time: degree to which the share was consumed • bias: bonus for interactive tasks • Best effort schedule based on urgency and importance • find most important tasks – compare tuple: T1 > T2 ⇔ (p1 > p2) ∨ (p1 = p2 ∧ bvft1 > bvft2) • sort after urgency (EDF based sorting) • iteratively select task from candidate set as long as schedule is feasible
Evaluation of Real–Time Scheduling • Tests performed • by IBM (1993) • executing tasks with and without EDF • on a 57 MHz, 32 MB RAM, AIX Power 1 • Video playback program: • one real-time process • read compressed data • decompress data • present video frames via X server to user • process requires 15 timeslots of 28 ms each per second → 42 % of the CPU time
Evaluation of Real–Time Scheduling • 3 load processes (competing with the video playback) • the real-time scheduler reaches all its deadlines • several deadline violations by the non-real-time scheduler [Figure: laxity (remaining time to deadline) versus task number]
Evaluation of Real–Time Scheduling • Varied the number of load processes (competing with the video playback) • NB! The EDF scheduler kept its deadlines [Figure: laxity (remaining time to deadline) versus task number for three cases: only video process, 4 other processes, 16 other processes]
Evaluation of Real–Time Scheduling • Tests again performed • by IBM (1993) • on a 57 MHz, 32 MB RAM, AIX Power 1 • “Stupid” end system program: • 3 real-time processes only requesting CPU cycles • each process requires 15 timeslots of 21 ms each per second → 31.5 % of the CPU time each → 94.5 % of the CPU time required for real-time tasks
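The CPU shares quoted for the two test programs follow from simple arithmetic: timeslots per second times timeslot length. The helper below is only a worked check of those figures; its name and signature are made up for the example.

```python
def cpu_share(slots_per_second, slot_ms, processes=1):
    """Fraction of one CPU needed by periodic timeslots of slot_ms each."""
    return processes * slots_per_second * slot_ms / 1000


print(cpu_share(15, 28))                # video playback test: 0.42
print(cpu_share(15, 21))                # one synthetic process: 0.315
print(cpu_share(15, 21, processes=3))   # all three processes: 0.945
```

At 94.5 % the real-time load alone is well above ln(2) ≈ 0.693 but still below 1, which is the range where an EDF scheduler can keep its deadlines while a fixed-priority bound gives no guarantee.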
Evaluation of Real–Time Scheduling • 1 load process (competing with the real-time processes) • the real-time scheduler reaches all its deadlines [Figure: laxity (remaining time to deadline) versus task number]
Evaluation of Real–Time Scheduling • 16 load processes (competing with the real-time processes) • Regardless of other load, the EDF scheduler reaches its deadlines (laxity almost the same as in the 1 load process scenario) • NOTE: Processes are scheduled in the same order [Figure: laxity (remaining time to deadline) versus task number for process 1, process 2, and process 3]
Delivery Systems [Figure: delivery system with bus(es), connected to the network]
Delivery Systems • several in-memory data movements and context switches • several disk-to-memory transfers [Figure: application in user space; file system and communication system in kernel space; bus(es)]
Memory Caching • How do we manage a cache? • how much memory to use? • how much data to prefetch? • which data item to replace? • … [Figure: application, cache, file system, communication system, disk, and network card; labels: “caching possible”, “expensive”]
Is Caching Useful in a Multimedia Scenario? • High rate data may need lots of memory for caching… • Tradeoff: amount of memory, algorithm complexity, gain, … • Cache only frequently used data – how? (e.g., first (small) parts of a broadcast partitioning scheme, allow “top-ten” only, …) [Figure: maximum total amount of memory that a Dell server could manage in 2004 – and not all of it is used for caching]
Need For Special “Multimedia Algorithms”? • Most existing systems use an LRU-variant • keep a sorted list • replace first in list • insert new data elements at the end • if a data element is re-accessed (e.g., new client or rewind), move back to the end of the list • Extreme example – video frame playout, LRU buffer ordered from shortest time since access (left) to longest time since access (right):
play video (7 frames): 7 6 5 4 3 2 1
rewind and restart playout at 1: 1 7 6 5 4 3 2
playout 2: 2 1 7 6 5 4 3
playout 3: 3 2 1 7 6 5 4
playout 4: 4 3 2 1 7 6 5
• In this case, LRU replaces the next needed frame. So the answer is in many cases YES…
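The playout example can be reproduced with a few lines of code. This sketch keeps the most recently used frame at the front of the list and the replacement victim at the back, which is one possible orientation of the sorted list; the function name and the fixed capacity of 7 frames are assumptions.

```python
def lru_access(buffer, frame, capacity):
    """Access a frame in an LRU buffer.

    buffer[0] is the most recently used frame, buffer[-1] is the frame
    that would be replaced next.
    """
    if frame in buffer:
        buffer.remove(frame)      # re-access: move to the front
    elif len(buffer) >= capacity:
        buffer.pop()              # replace the least recently used frame
    buffer.insert(0, frame)
    return buffer


buffer = []
for frame in range(1, 8):         # play video (7 frames)
    lru_access(buffer, frame, capacity=7)
print(buffer)                     # [7, 6, 5, 4, 3, 2, 1]

for frame in range(1, 5):         # rewind and replay frames 1 to 4
    lru_access(buffer, frame, capacity=7)
    print(frame, buffer, "next victim:", buffer[-1])
# 1 [1, 7, 6, 5, 4, 3, 2] next victim: 2
# 2 [2, 1, 7, 6, 5, 4, 3] next victim: 3
# 3 [3, 2, 1, 7, 6, 5, 4] next victim: 4
# 4 [4, 3, 2, 1, 7, 6, 5] next victim: 5
```

After each replayed frame, the next victim is exactly the frame that playout needs next, so any new data entering the buffer would evict it.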
“Classification” of Mechanisms • Block-level caching considers a (possibly unrelated) set of blocks • each data element is viewed as an independent item • usually used in “traditional” systems • e.g., FIFO, LRU, CLOCK, … • multimedia (video) approaches: • Least/Most Relevant for Presentation (L/MRP) • … • Stream-dependent caching considers a stream object as a whole • related data elements are treated in the same way • research prototypes in multimedia systems • e.g., • BASIC • DISTANCE • Interval Caching (IC) • Generalized Interval Caching (GIC) • Split and Merge (SAM) • SHR
Least/Most Relevant for Presentation (L/MRP) [Moser et al. 95] • L/MRP is a buffer management mechanism for a single interactive, continuous data stream • adaptable to individual multimedia applications • preloads units most relevant for presentation from disk • replaces units least relevant for presentation • client pull based architecture [Figure: client requests to the server and its buffer; a homogeneous stream (e.g., MJPEG video) consists of Continuous Object Presentation Units (COPUs), e.g., MJPEG video frames]
Least/Most Relevant for Presentation (L/MRP) [Moser et al. 95] • Relevance values are calculated with respect to current playout of the multimedia stream • presentation point (current position in file) • mode / speed (forward, backward, FF, FB, jump) • relevance functions are configurable • COPUs – continuous object presentation units [Figure: relevance value (0 to 1.0) versus COPU number (10 to 26) along the playback direction, with the current presentation point and the referenced, history, and skipped COPUs marked]
Least/Most Relevant for Presentation (L/MRP) [Moser et al. 95] • Global relevance value • each COPU can have more than one relevance value • bookmark sets (known interaction points) • several viewers (clients) of the same stream • global relevance value = maximum relevance for each COPU [Figure: relevance (0 to 1) versus COPU number (89 to 106), showing the loaded frames, the current presentation points of streams S1 and S2, and the Bookmark-Set, Referenced-Set, and History-Set]
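Taking the maximum over all relevance values of a COPU can be sketched as follows. The dictionaries, the relevance numbers, and both function names are hypothetical; the actual relevance functions in L/MRP are configurable and are not reproduced here.

```python
def global_relevance(relevance_sets):
    """Combine per-viewer or per-bookmark relevance values.

    relevance_sets: list of dicts mapping COPU number -> relevance in [0, 1].
    The global relevance of a COPU is the maximum over all sets.
    """
    result = {}
    for relevance in relevance_sets:
        for copu, value in relevance.items():
            result[copu] = max(result.get(copu, 0.0), value)
    return result


def replacement_victim(loaded, relevance_sets):
    """Pick the loaded COPU that is least relevant for presentation."""
    rel = global_relevance(relevance_sets)
    return min(loaded, key=lambda copu: rel.get(copu, 0.0))


s1 = {90: 1.0, 91: 0.8}      # hypothetical values around presentation point S1
s2 = {91: 0.2, 100: 1.0}     # hypothetical values around presentation point S2
print(global_relevance([s1, s2]))                          # {90: 1.0, 91: 0.8, 100: 1.0}
print(replacement_victim([90, 91, 95, 100], [s1, s2]))     # 95
```

COPU 91 keeps the higher of its two values, and COPU 95, which no viewer or bookmark references, is the first candidate for replacement.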