Overview: This talk discusses the challenges of managing large volumes of multimedia data in digital libraries and proposes solutions for ensuring quality-of-service guarantees. It covers performance tuning, self-tuning servers, QoS for continuous-data streams, caching and prefetching, and more.
Quality of Service Guarantees for Multimedia Digital Libraries and Beyond
Gerhard Weikum
weikum@cs.uni-sb.de
http://www-dbs.cs.uni-sb.de
Vannevar Bush's Memex (1945): collect all human knowledge into computer storage

Size of today's and tomorrow's applications:
• Library of Congress: 20 TB books + 200 TB maps + 500 TB video + 2 PB audio
• Everything you see or hear: 1 MB/s * 50 years ≈ 2 PB

Challenges:
• size of data
• performance & QoS
• intelligent search
Multimedia Data Management
[Architecture figure: clients connect over a high-speed network to a data server with QoS guarantees; the server holds discrete data, index data, and continuous data in memory buffers backed by a parallel disk system]
The Need for Performance and QoS Guarantees
[Screenshots: "Check Availability (Look-Up Will Take 8-25 Seconds)" and "Internal Server Error. Our system administrator has been notified. Please try later again."]
Observations:
• Service performance is best-effort only
• Response time is unacceptable during peak load because of queueing delays
• Performance is mostly unpredictable!
From Best Effort to Performance & QoS Guarantees
"Our ability to analyze and predict the performance of the enormously complex software systems ... are painfully inadequate" (Report of the US President's Information Technology Advisory Committee)
• Very slow servers are as bad as unavailable servers
• Tuning for peak load requires predictability of the (workload, configuration) → performance function
• Self-tuning requires mathematical models
• Stochastic guarantees are needed for huge numbers of clients
Outline
• The Need for Performance Guarantees
• Self-tuning Servers Using Stochastic Predictions
• QoS for Continuous-Data Streams
• Caching and Prefetching for Discrete Data
• Towards a Science of QoS Guarantees
Performance and Service Quality of Continuous-Data Streams
• Quality of service (QoS): (almost) no "glitches"
• High throughput (= number of concurrently active streams) → admission control
Data Placement and Scheduling
• Partitioning of C-data objects with VBR (variable bit rate) into CTL fragments (of constant time length)
• Coarse-grained striping with round-robin allocation across disks
• Periodic, variable-order scheduling organized in rounds of duration T (= fragment time length)
• Admission control per round: "Yes, go ahead!", "Now go ahead!", or "No way!"
[Figure: fragments of streams 1, 2, 3 scheduled across parallel disks in rounds 0..T, T..2T, 2T..3T]
Admission Control with Stochastic QoS Guarantees

Worst-case QoS: admit at most N streams such that
$N \cdot T_{max} \le T$

Stochastic QoS: admit at most N streams such that
$P[\text{total service time} > T] \le \epsilon$
• tolerable by most multimedia applications
• appropriate with many workload and system parameters being random variables
• allows much better resource utilization compared to worst-case modeling
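To make the contrast concrete, here is a minimal Python sketch (not from the talk) comparing the two admission rules. All distribution parameters are invented for illustration, and a Monte Carlo estimate stands in for the analytic tail bound derived on the following slides.

```python
import random

T = 1.0           # round duration in seconds (assumed)
EPSILON = 0.05    # tolerated violation probability
T_MAX = 0.018     # worst-case per-stream time = 0.008 + 0.006 + 0.004

def sample_service_time(rng):
    """Hypothetical per-stream service time: seek + rotation + transfer,
    each uniformly distributed. Real servers would use measured models."""
    return (rng.uniform(0.001, 0.008)      # seek
            + rng.uniform(0.0, 0.006)      # rotational delay
            + rng.uniform(0.001, 0.004))   # transfer of one CTL fragment

def stochastic_ok(n, rng, trials=2000):
    """Stochastic rule: P[sum of n service times > T] <= EPSILON,
    estimated here by Monte Carlo (the talk uses analytic bounds)."""
    violations = sum(
        sum(sample_service_time(rng) for _ in range(n)) > T
        for _ in range(trials)
    )
    return violations / trials <= EPSILON

rng = random.Random(7)
n_worst = int(T / T_MAX)                   # worst-case rule: N * Tmax <= T
n_stoch = n_worst
while stochastic_ok(n_stoch + 1, rng):     # stochastic rule admits more
    n_stoch += 1
print(f"worst-case admits {n_worst} streams, stochastic admits {n_stoch}")
```

With these toy parameters the stochastic rule admits substantially more concurrent streams than the worst-case rule, which is exactly the resource-utilization argument of the slide.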
Mathematical Tools
X, Y, ...: continuous random variables with non-negative, real values
• (cumulative) distribution function of X: $F_X(x) = P[X \le x]$
• probability density function of X: $f_X(x) = \frac{d}{dx} F_X(x)$
• Laplace-Stieltjes transform (LST) of X: $X^*(s) = E[e^{-sX}] = \int_0^\infty e^{-sx} f_X(x)\,dx$
• Convolution: for independent X, Y the density of X+Y is $f_{X+Y}(z) = \int_0^z f_X(z-u) f_Y(u)\,du$, i.e., $(X+Y)^*(s) = X^*(s) \cdot Y^*(s)$
• Chernoff bound: $P[X > x] \le \inf_{s>0} e^{-sx}\, E[e^{sX}]$
Total Service Time per Round (with N Streams)
$T_{serv} = \sum_{i=1}^{N} (T_{seek,i} + T_{rot,i}) + \sum_{i=1}^{N} T_{trans,i}$
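Under the simplifying assumption that the seek, rotation, and transfer components are independent uniform random variables (hypothetical ranges, as in the Monte Carlo sketch above), the Chernoff bound from the tools slide can be evaluated directly, since the MGF of a sum of independent components is the product of their MGFs:

```python
import math

def mgf_uniform(s, a, b):
    """MGF E[exp(sX)] of X ~ Uniform(a, b)."""
    if s == 0.0:
        return 1.0
    return (math.exp(s * b) - math.exp(s * a)) / (s * (b - a))

def chernoff_bound(n, T, parts, s_grid=None):
    """Chernoff bound on P[T_serv > T] for T_serv = sum of n i.i.d.
    per-stream times, each a sum of independent uniform components:
    P[T_serv > T] <= min over s > 0 of exp(-s*T) * (prod of MGFs)**n."""
    s_grid = s_grid or [0.5 * k for k in range(1, 800)]  # crude grid search
    best = 1.0
    for s in s_grid:
        mgf = 1.0
        for a, b in parts:
            mgf *= mgf_uniform(s, a, b)
        best = min(best, math.exp(-s * T) * mgf ** n)
    return best

# hypothetical seek / rotation / transfer ranges in seconds
PARTS = [(0.001, 0.008), (0.0, 0.006), (0.001, 0.004)]
for n in (55, 85, 95):
    print(n, chernoff_bound(n, T=1.0, parts=PARTS))
```

The bound is conservative (it over-estimates the violation probability), so an admission controller using it still delivers the stated guarantee.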
Stochastic versus Worst-Case QoS Guarantees
[Figure: number of admissible streams N per disk (plate): worst-case vs. analytic (stochastic) vs. real (measured)]
Generalization to Mixed-Workload Servers
[Figure: timeline of rounds 0, T, 2T, 3T, 4T with arrivals of discrete-data requests and departures of completed discrete-data requests]
Additional performance guarantee for discrete-data requests:
$P[\text{response time} \le t] \ge \beta$ (e.g., with t = 2 s and β = 0.95)
Needs clever scheduling and a sophisticated stochastic model to provide both continuous-data and discrete-data guarantees.
QoS & Performance Guarantees for Mixed Workload Servers
• For continuous data: $P[\text{glitch frequency of a stream} > \text{tolerance}] \le \epsilon$ and $P[\text{admission/startup delay of a stream} > \text{tolerance}] \le \epsilon$
• For discrete data: $P[\text{response time} > \text{tolerance } t] \le \epsilon$ (e.g., t = 2 seconds, ε = 5 percent)

Auto-Configuration of Data Server:
A detailed analytic model can derive the minimum-cost server configuration for specified QoS & performance requirements, incl. differentiated QoS for multiple user/request classes.
Outline
• The Need for Performance Guarantees
• Self-tuning Servers Using Stochastic Predictions
• QoS for Continuous-Data Streams
• Caching and Prefetching for Discrete Data
• Towards a Science of QoS Guarantees
The Need for Caching in Storage Hierarchies
[Figure: storage hierarchy from clients through proxy and DL server / search engine to Internet sources, with levels sized 50 GB, 5 TB, and 100 TB (ontologies, XML, etc.); access latency grows sharply down the hierarchy]
Very high access latency! → Caching & Prefetching
Basic Caching Policies
• LRU: drop the page that has been least recently used
• LRU-k: drop the page with the oldest k-th-last reference; estimates heat(p), the reference probability of page p; optimal under the IRM (independent reference model)
[Example timeline: references to pages A, B, X, Y at times 1, 2, 3, 4, 5, 10, 15, 20, 24 ... now]
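A minimal Python sketch of LRU-k eviction (not from the talk); it keeps the last k reference times per page and evicts the cached page whose k-th-last reference is oldest. Refinements of the full algorithm, such as the correlated-reference period, are omitted.

```python
from collections import defaultdict, deque

class LRUK:
    """Sketch of LRU-k: evict the page whose k-th most recent reference
    lies farthest in the past; pages with fewer than k references have
    backward k-distance infinity and are evicted first."""

    def __init__(self, capacity, k=2):
        self.capacity = capacity
        self.k = k
        self.clock = 0
        self.history = defaultdict(deque)   # page -> last k reference times
        self.cache = set()

    def access(self, page):
        """Record a reference; return True on a cache hit."""
        self.clock += 1
        hist = self.history[page]
        hist.append(self.clock)
        if len(hist) > self.k:
            hist.popleft()                  # keep only the last k times
        if page in self.cache:
            return True
        if len(self.cache) >= self.capacity:
            # victim: minimal k-th-last reference time (-inf if < k refs)
            victim = min(
                self.cache,
                key=lambda p: (self.history[p][0]
                               if len(self.history[p]) == self.k
                               else float("-inf")),
            )
            self.cache.remove(victim)
        self.cache.add(page)
        return False

cache = LRUK(capacity=3, k=2)
for p in "ABABCACBDAB":
    cache.access(p)
```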
Theorem: LRU-k Optimality
IRM: pages 1 ... n with reference probabilities $\beta_1 \ge \beta_2 \ge ... \ge \beta_n$ and backward distances $b_1, ..., b_n$ (distance back to the k-th-last reference)
[Example timeline: reference string ... 3 2 3 1 2 2 3 3 2 1, backward distance $b_2$ measured from now]
LRU-k maximizes the cache hit rate among all replacement policies that base their decisions on the last k references to each page.
LRU-k as Maximum Likelihood Estimator
IRM: pages 1 ... n with reference probabilities $\beta_1 \ge ... \ge \beta_n$ and backward distances $b_1, ..., b_n$
For an observation $b_1, ..., b_n$ with $b_i < b_{i+1}$, maximize the likelihood of the observed backward distances; for $k \ll b_i$ this yields the estimator $\hat{\beta}_i \approx k / b_i$, so ranking pages by backward k-distance ranks them by estimated heat.
Cache Size Configuration
• Cost/throughput consideration: keep a page in the cache if its heat justifies the memory cost
• Cost/response-time consideration: keep a page in the cache if the response-time saving justifies the memory cost
• Response-time guarantee: minimum cache size M such that $P[\text{response time} > t] \le \epsilon$
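As an illustration of the third point, a toy Python sketch that searches for the minimum cache size M. The hit-rate curve H(M) and the disk-latency tail below are hypothetical stand-ins for the measured hit-rate function and the stochastic disk model of the next slides.

```python
import math

EPSILON = 0.05
T_GOAL = 2.0                  # response-time tolerance t in seconds

def hit_rate(m):
    """Hypothetical concave hit-rate curve H(M); in practice H(M) is
    measured or derived from LRU-k reference statistics."""
    return 1.0 - math.exp(-m / 2000.0)

def p_disk_exceeds(t):
    """Hypothetical tail model: P[disk service time > t]."""
    return math.exp(-t / 0.8)

def p_response_exceeds(m, t):
    # assume cache hits never exceed t; only misses can violate it
    return (1.0 - hit_rate(m)) * p_disk_exceeds(t)

def min_cache_size(t, eps, m_max=10**7):
    """Binary search: p_response_exceeds is decreasing in M."""
    lo, hi = 1, m_max
    while lo < hi:
        mid = (lo + hi) // 2
        if p_response_exceeds(mid, t) <= eps:
            hi = mid
        else:
            lo = mid + 1
    return lo

print(min_cache_size(T_GOAL, EPSILON))
```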
LRU-k Cache Hit Rate
[Figure: hit rate H(M) [%] as a function of cache size M]
Stochastic Response Time Guarantee
With cache size M, block size S, and a multi-zone disk with known seek-time function, Z tracks of capacity $C_{min} \le C_i \le C_{max}$, and rotation time $T_{rot}$:
• derive the disk service time distribution and its LST $B^*(s)$
• M/G/1 queue: waiting-time LST by Pollaczek-Khinchine, $W^*(s) = \frac{(1-\rho)\,s}{s - \lambda (1 - B^*(s))}$
• Chernoff bound on the tail: $P[\text{response time} > t] \le \inf_{s>0} e^{-st}\, E[e^{s \cdot \text{resp.\ time}}]$
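Rather than inverting the LST, one can also estimate the response-time tail by simulating the M/G/1 queue with the Lindley recursion; a sketch follows, with the arrival rate and service-time distribution assumed for illustration.

```python
import random

def mg1_tail(lam, service_sampler, t, n=200_000, seed=42):
    """Estimate P[response time > t] in an M/G/1 queue by simulation.
    Lindley recursion: W_{n+1} = max(0, W_n + S_n - A_n),
    response time R_n = W_n + S_n."""
    rng = random.Random(seed)
    w, exceed = 0.0, 0
    for _ in range(n):
        s = service_sampler(rng)
        if w + s > t:
            exceed += 1
        a = rng.expovariate(lam)       # exponential interarrival times
        w = max(0.0, w + s - a)
    return exceed / n

def disk_service(rng):
    """Hypothetical disk service time: seek + rotation + transfer;
    stands in for the multi-zone disk model above."""
    return (rng.uniform(0.001, 0.015)
            + rng.uniform(0.0, 0.008)
            + rng.expovariate(1 / 0.005))

print(mg1_tail(lam=30.0, service_sampler=disk_service, t=0.2))
```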
Extended LRU-k-based Policies
• Generalization to variable-size documents: drop documents with the lowest temperature(d) = heat(d) / size(d)
• Generalization to non-uniform / hierarchical storage: drop documents with the lowest benefit(d), weighing heat by the access-latency gain per unit of cache space
• Generalization to cooperative caching in a computer cluster
Speculative Prefetching
• Mask the high access latency of the archive
• Keep long-term beneficial data in cache
• Speculative prefetching, with throttling via time horizon T = "max"(RT_archive)
• Prefetch x iff benefit(x,T) > Σ { benefit(y,T) | y ∈ victims }
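A sketch of the prefetch test, under the assumption that benefit(x,T) weighs the expected number of accesses to x within the horizon T (supplied, e.g., by the Markov-chain prediction on the following slides) by the archive latency a hit would save; all names below are hypothetical.

```python
def benefit(doc, horizon, expected_accesses, latency_saved):
    """Assumed benefit model: expected #accesses to doc within the
    horizon, weighted by the archive latency a cache hit saves."""
    return expected_accesses(doc, horizon) * latency_saved(doc)

def maybe_prefetch(candidate, victims, horizon, expected_accesses, latency_saved):
    """Prefetch iff the candidate's benefit exceeds the total benefit
    of the cached documents that would be evicted to make room."""
    victim_benefit = sum(
        benefit(v, horizon, expected_accesses, latency_saved) for v in victims
    )
    return benefit(candidate, horizon, expected_accesses, latency_saved) > victim_benefit

# toy usage with hypothetical predictions and latencies
ea = lambda doc, T: {"x": 3.0, "y": 0.4, "z": 0.2}.get(doc, 0.0)
lat = lambda doc: 0.5            # archive round trip saved, in seconds
print(maybe_prefetch("x", victims=["y", "z"], horizon=60.0,
                     expected_accesses=ea, latency_saved=lat))
```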
Context-aware Prefetching and Caching
• Model session behavior as a Markov chain with continuous state-residence times (CTMC)
• Superimpose the CTMCs of all active sessions
• Incorporate arrivals of new sessions (session arrival rate λ; extra state for new sessions with $H_{N+1} = c/\lambda$)
[Figure: sessions 1 and 2 traversing document states f, g, h, i, k with transition probabilities such as $p_{if} = 0.8$, mean residence times $H_i = E[\text{time in } i] = 10\,s$, $H_k = 10\,s$, $H_f = 30\,s$, and residence-time distributions $P[\text{time in } i \le t]$]
CTMC-based Access Prediction
Given: states $d_i$ (i = 1, ..., N+c) with transition probabilities $p_{ij}$ and mean residence times $H_i$ (departure rates $\lambda_i = 1/H_i$)
Uniformization: with $\Lambda = \max_i \{\lambda_i\}$, use the uniformized transition matrix $\tilde{P} = I + \frac{1}{\Lambda} Q$ (Q the generator of the CTMC)
Transient analysis for time horizon T:
$N(x,T) = E[\#\text{accesses to } x \text{ in } T] = \lambda_x \int_0^T \pi_t(x)\,dt$, where $\pi_t = \pi_0 \sum_{k=0}^{\infty} e^{-\Lambda t} \frac{(\Lambda t)^k}{k!} \tilde{P}^k$
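A sketch of the transient analysis via uniformization, under one common reading of N(x,T): the expected number of accesses to x is approximated as the departure rate of x times the expected time spent in x during [0,T]. The three-state session chain at the bottom is invented for illustration.

```python
import math
import numpy as np

def expected_accesses(rates, P_jump, pi0, T, tol=1e-9):
    """N(x,T) via uniformization. rates[i] = 1/H_i (departure rate),
    P_jump = embedded transition probabilities p_ij, pi0 = initial
    distribution. Uses the identity
    integral_0^T pi_t dt = (1/Lambda) * sum_k P[Poisson(Lambda*T) > k] * pi0 P^k."""
    lam = float(np.max(rates))                 # uniformization rate Lambda
    n = len(rates)
    # uniformized DTMC: P = I + Q/Lambda, with fictitious self-loops
    P = np.eye(n) + (np.diag(rates) @ (P_jump - np.eye(n))) / lam
    occupancy = np.zeros(n)                    # expected time in each state
    pmf = math.exp(-lam * T)                   # Poisson pmf at k = 0
    tail = 1.0 - pmf                           # P[Poisson(lam*T) > 0]
    vec = pi0.copy()
    k = 0
    while tail > tol and k < 100_000:
        occupancy += tail * vec
        vec = vec @ P
        k += 1
        pmf *= lam * T / k
        tail -= pmf
    occupancy /= lam
    return rates * occupancy                   # departures ~ accesses

# toy 3-state session chain: H = 10s, 10s, 30s; rows of P_jump sum to 1
rates = np.array([0.1, 0.1, 1 / 30])
P = np.array([[0.0, 0.8, 0.2],
              [0.5, 0.0, 0.5],
              [0.3, 0.7, 0.0]])
pi0 = np.array([1.0, 0.0, 0.0])
print(expected_accesses(rates, P, pi0, T=60.0))
```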
MCMin Prefetching and Caching Algorithm
• Access tracking and online bookkeeping for statistics
• Periodic evaluation of N(state(s), T) for active sessions, based on approximate CTMC transient analysis
• Prefetch x iff benefit(x,T) > Σ { benefit(y,T) | y ∈ victims }, with benefit(x,T) derived from N(x,T)
• Prefetching candidates + appropriate device scheduling at the server
Performance Experiments
Simulations based on WWW-server access patterns
[Figure: mean response time [s] as a function of cache size / archive size]
Overhead:
• size of bookkeeping data < 0.02%
• compute time per access ≈ 1 ms
• both dynamically adjustable
Applicability of the LRU-k and MCMin Family
• For Internet and intranet proxies and clients, with careful management of access statistics
• For (stochastically) guaranteed response time w.r.t. heterogeneous data servers, as opposed to best-effort caching
• For data hoarding in mobile clients: when a client goes into low (or zero) connectivity, prefetch near-future relevant data and programs
• For adaptive broadcast of data feeds in networks with asymmetric bandwidth
• For caching of (partial) search results in data warehouses, digital libraries, etc.
Interesting Research Problems
• Optimal (online) decisions about the amount of bookkeeping
• Response-time guarantees for MCMin
• Caching and prefetching for differentiated QoS (multiple user/request classes)
• Caching of (partial) search results and prefetching for (speculative) query evaluation in ranked (XML) retrieval
Outline
• The Need for Performance Guarantees
• Self-tuning Servers Using Stochastic Predictions
• QoS for Continuous-Data Streams
• Caching and Prefetching for Discrete Data
• Towards a Science of QoS Guarantees
Advancing the State of the Art on QoS
Benefits of stochastic models and the derived algorithms/systems over commercial state-of-the-art systems (e.g., Oracle Media Server, MS NetShow Theater Server, etc.):
+ substantially better cost/performance
+ predictable performance
+ major building blocks for a configuration tool for specified QoS guarantees and for self-tuning, zero-admin operation
QoS in (Web) Query Processing
Quality dimensions beyond performance:
• Responsiveness & cost-effectiveness (performance)
• Accuracy (example: Select ...)
• Comprehensiveness (example: association rules of the kind "Software Engineering & Y2K ⇒ Astrology")
• Timeliness
• Credibility
Combined with IR & multimedia; examples:
Where ... P About {"Mining", "19th Century"} ...
Where ... P.Category = "CDs" And P Sounds Like ...
The End
Conceivable killer argument: infinite RAM & network bandwidth and zero latency (for free)
But:
• Predictions are very difficult, especially about the future.
• An engineer is someone who can do for a dime what any fool can do for a dollar.
• "Low-hanging fruit" engineering: 90% solution with 10% intellectual effort
• Self-tuning servers with guaranteed performance need libraries of composable building blocks with predictable behavior and (customizable) QoS guarantees
• "Web engineering" for end-to-end QoS will rediscover stochastic modeling or will fail