Algorithms for Generalized Caching
The Paging/Caching Problem • Set of n pages, cache of size k < n. • Request sequence of pages: 1, 6, 4, 1, 4, 7, 6, 1, … • If the requested page is already in the cache, no penalty. • Else, cache miss: fetch the page into the cache, (possibly) evicting some other page. Goal: Minimize the number of cache misses. Main Question: Which page to evict?
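The cost model on this slide can be made concrete with a small simulation. The sketch below counts cache misses for one particular eviction rule, LRU (evict the least recently used page), on the request sequence shown above. The function name `count_misses_lru` and the choice k = 3 are illustrative assumptions, not part of the talk.

```python
from collections import OrderedDict

def count_misses_lru(requests, k):
    """Count cache misses of LRU with a cache holding k unit-size pages."""
    cache = OrderedDict()  # keys ordered from least to most recently used
    misses = 0
    for page in requests:
        if page in cache:
            cache.move_to_end(page)        # hit: refresh recency, no penalty
        else:
            misses += 1                    # miss: the page must be fetched
            if len(cache) >= k:
                cache.popitem(last=False)  # evict the least recently used page
            cache[page] = True
    return misses

requests = [1, 6, 4, 1, 4, 7, 6, 1]
print(count_misses_lru(requests, k=3))  # 6
```

With k = 3 the misses occur on requests 1, 6, 4, 7, 6, 1; with k = 4 all four distinct pages fit and only the four first-time requests miss.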
Previous Results: Paging Paging (Deterministic) [Sleator, Tarjan 85]: • Any deterministic algorithm is at least k-competitive (competitive ratio ≥ k). • LRU is k-competitive (as are other algorithms). Paging (Randomized): • Randomized Marking is O(log k)-competitive [Fiat, Karp, Luby, McGeoch, Sleator, Young 91]. • Lower bound H_k (the k-th harmonic number) [Fiat et al. 91]; tight results known.
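For readers who have not seen it, the randomized marking algorithm can be sketched in a few lines. This is a minimal illustration for unit-size, unit-cost pages: a requested page is marked; on a miss with a full cache, a uniformly random unmarked page is evicted, and when every cached page is marked a new phase starts by clearing all marks. The function name, the `seed` parameter, and the use of `sorted` (only to make runs reproducible) are assumptions of this sketch.

```python
import random

def count_misses_marking(requests, k, seed=0):
    """Count misses of randomized marking with a cache of k unit-size pages."""
    rng = random.Random(seed)
    cache = set()
    marked = set()   # always a subset of cache
    misses = 0
    for page in requests:
        if page not in cache:
            misses += 1
            if len(cache) >= k:
                if marked == cache:       # all cached pages marked: new phase
                    marked.clear()
                unmarked = sorted(cache - marked)
                victim = rng.choice(unmarked)  # uniformly random unmarked page
                cache.remove(victim)
            cache.add(page)
        marked.add(page)                  # every requested page gets marked
    return misses

print(count_misses_marking([1, 6, 4, 1, 4, 7, 6, 1], k=3))
```

The miss count depends on the random choices, so no fixed output is shown; it is always at least the number of distinct pages requested.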
Generalized Caching First Extension: • Pages have different fetching costs. • Models scenarios in which the fetching cost is not uniform: main memory, disk, internet … Second (Orthogonal) Extension: • Pages have different sizes (in the range [1,k]). • Models web-caching problems (proxy servers, local cache in browser).
Generalized Paging Deterministic: (k+1)-competitive [Cao, Irani 97], [Young 98] Randomized: • General Problem: Nothing better • Special cases [Irani 97]: Bit Model (wt = size): O(log² k); Fault Model (wt = 1): O(log² k) Offline: NP-hard, 4-approximation [Bar-Noy, Bar-Yehuda, Freund, Naor, Schieber 01]
Our Results Thm: O(log² k)-competitive alg. for Generalized Caching. Thm: O(log k) for the bit and fault models. Main Technique: Primal Dual Method for Online Algorithms (formalized by Buchbinder and Naor 05) [based on multiplicative updates, several previous works…] Recent Update: Improved to O(log k) [Adamaszek, Czumaj, Englert, Räcke]
High level approach First step: • O(log k)-competitive for fractional generalized caching. • Maintains fractions on pages. Second Step: Transform fractional solution into Randomized Alg.: • Maintain distribution on cache states that is “consistent” with the fractional solution. • Keep costs comparable We lose O(1) for Bit/Fault model, O(log k) for general model.
Rest of the Talk • Online Primal Dual Method • Formulation of Caching Problem • Knapsack cover inequalities • Final Algorithm
An Abstract Online Problem
Covering LP (non-negative entries):
min 3 x1 + 5 x2 + x3 + 4 x4 + …
s.t. 2 x1 + x3 + x6 + … ≥ 3
x3 + x14 + x19 + … ≥ 8
x2 + 7 x4 + x12 + … ≥ 2
Goal: Find a feasible solution x* with min cost. Requirements: 1) Upon arrival, each constraint must be satisfied. 2) Cannot decrease a variable.
Example
min x1 + x2 + … + xn
s.t. x1 + x2 + x3 + … + xn ≥ 1
x2 + x3 + … + xn ≥ 1
x3 + … + xn ≥ 1
…
xn ≥ 1
Online strategy: Set all xi to 1/n. Increase x2, x3, …, xn to 1/(n-1). … Increase xn to 1.
Online cost ≥ ln n (the sum 1 + 1/2 + 1/3 + … + 1/n). Opt = 1 (xn = 1 suffices).
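The harmonic sum on this slide can be checked directly. The sketch below simulates the uniform strategy described here (when the t-th constraint arrives, raise the variables it contains to a common value that just satisfies it) and compares its total cost to 1 + 1/2 + … + 1/n using exact rational arithmetic. The function names are illustrative assumptions.

```python
from fractions import Fraction

def uniform_raise_cost(n):
    """Online cost of the uniform strategy on the suffix-constraint instance."""
    x = [Fraction(0)] * n
    cost = Fraction(0)
    for t in range(n):               # constraint t: x[t] + ... + x[n-1] >= 1
        target = Fraction(1, n - t)  # spread one unit evenly over n - t variables
        for i in range(t, n):
            cost += target - x[i]    # variables only ever increase
            x[i] = target
    return cost

def harmonic(n):
    """n-th harmonic number 1 + 1/2 + ... + 1/n as an exact fraction."""
    return sum(Fraction(1, j) for j in range(1, n + 1))

print(uniform_raise_cost(3), harmonic(3))  # 11/6 11/6
```

Step t ≥ 2 adds exactly 1/(n − t + 2) to the cost, which is where the harmonic sum comes from, while the offline optimum pays 1.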
General Covering Results Thm [Buchbinder-Naor 05]: O(log n)-competitive fractional alg. (n = number of variables). • Can improve if there is more structure (e.g. O(log k) for caching, O(ln d) for d-sparse matrix) • Can add “box” constraints to the covering LP (e.g. x ≤ 1) • Tradeoff: feasibility vs. competitive ratio • Allow variables to increase or decrease • More general forms than covering/packing
Online Primal Dual Approach Primal updated based on dual. Recall: any dual solution is a lower bound on the optimum. At each step: Primal ≤ c · Dual (if primal and dual are feasible) ⇒ c-competitive fractional algorithm. The algorithm is always similar.
Key Idea for Online Primal Dual
Primal: min Σ_i ci xi. Step t, new constraint: a1x1 + a2x2 + … + ajxj ≥ bt
Dual: new variable yt, adding + bt yt to the dual objective.
How much to increase xi? Raise yt → yt + 1 (additive update) and update the primal so that the change in primal cost tracks the change in dual cost: dxi/dy = ai xi/ci, so x varies as exp(y).
How to initialize A problem: dx/dy is proportional to x, but x = 0 initially. So will x remain equal to 0? Answer: Initialize to 1/n. Complementary slackness: x > 0 only if the dual constraint for x is tight. Set x = 1/n when its dual constraint becomes tight.
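The update rule dx/dy = a·x/c can be illustrated with a discretized sketch. The code below is a simplification under stated assumptions, not the algorithm from the talk: it sets a variable to 1/n the first time it appears in a constraint (rather than waiting for its dual constraint to become tight), raises the dual variable in small steps of size `step`, and does not check dual feasibility, so `dual_objective` is only the accumulated Σ bt·yt and not a certified lower bound. All names are hypothetical.

```python
def online_fractional_covering(costs, constraints, step=0.01):
    """Multiplicative-update sketch for online fractional covering.

    costs: list of positive costs c_i.
    constraints: iterable of (coeffs, b), with coeffs a dict {i: a_i > 0}
                 meaning sum_i a_i * x_i >= b; they arrive one at a time.
    """
    n = len(costs)
    x = [0.0] * n
    dual_objective = 0.0
    for coeffs, b in constraints:
        for i in coeffs:
            if x[i] == 0.0:
                x[i] = 1.0 / n          # simplified initialization (assumption)
        while sum(a * x[i] for i, a in coeffs.items()) < b:
            dual_objective += b * step  # y_t grows by `step`
            for i, a in coeffs.items():
                # discretized dx_i/dy = a_i * x_i / c_i
                x[i] *= 1.0 + step * a / costs[i]
    return x, dual_objective

# Suffix-constraint instance: x_t + ... + x_n >= 1 for t = 1..n.
n = 4
constraints = [({i: 1.0 for i in range(t, n)}, 1.0) for t in range(n)]
x, dual = online_fractional_covering([1.0] * n, constraints)
print(x, dual)
```

Because variables never decrease and all coefficients are non-negative, every earlier constraint stays satisfied once its loop has finished.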
Fractional covering LP At each time we have a profile of what “fraction” of each page has been evicted: Σ_i wi xi ≥ W − k (W = total size of all pages). Real variables: x(i,j) for the j-th request for page i. x(i,j) is initially 0 when the page is requested, then increases as the page is evicted over time. If x(i,j) increases by δ, pay δ·c(i).
Large Integrality Gap Input: Two pages of size k/2+1 requested alternately. 1) Any integral solution must fault on every step, since both pages cannot fit in the cache together. 2) But the LP can hold a 1 − O(1/k) fraction of each page, incurring only O(1/k) cost at each step. So with this LP we cannot hope to show o(k) guarantees.
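The gap can be checked with a few lines of arithmetic. The sketch below assumes unit fetch costs and uses the covering constraint Σ wi xi ≥ W − k: the page just requested must be fully in the cache (x = 0), so the other page alone carries the required eviction. The function name is an illustrative assumption.

```python
from fractions import Fraction

def lp_cost_per_step(k):
    """Fractional eviction forced per request on the two-page instance."""
    size = Fraction(k, 2) + 1        # each page has size k/2 + 1
    total_size = 2 * size            # W = k + 2
    must_evict = total_size - k      # W - k = 2 units of size
    # The requested page has x = 0, so the other page needs size * x >= W - k.
    return must_evict / size

k = 100
print(lp_cost_per_step(k))           # 2/51, i.e. 4/(k+2)
print(2 * (Fraction(k, 2) + 1) > k)  # True: both pages never fit together
```

An integral algorithm pays 1 per request on this input while the LP pays 4/(k+2), a ratio of (k+2)/4.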
Knapsack Cover Inequalities Knapsack Cover Problem: Knapsack of size D, n items with sizes w1, w2, …, wn and costs c1, …, cn. Goal: Choose a min-cost subset that covers the knapsack (i.e. total size ≥ D). LP: min Σ_i ci xi s.t. Σ_i wi xi ≥ D, with xi ∈ {0,1} relaxed to xi ∈ [0,1]. Bad example: Only one item, of size 1000 D. The LP can just choose 1/1000 of this item. Simple fix: min Σ_i ci xi s.t. Σ_i min(wi, D) xi ≥ D
Still Bad Bad example: 2 items, s1 = s2 = D-1, c1 = 0, c2 = L. Any integral solution must choose item 2 (integer cost = L). LP: sets x1 = 1 and x2 = 1/(D-1) (LP cost = L/(D-1)). Fix: consider a subset of items S. Even if we choose all items in S, we still need to cover at least D − W(S) size with the remaining items. Thm: Factor 2 in general [Carr, Fleischer, Leung, Phillips 97]. Exponentially many constraints, but poly time.
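The idea on this slide corresponds to the knapsack cover inequalities of Carr et al.: for every subset S with W(S) < D, the items outside S must satisfy Σ_{i∉S} min(wi, D − W(S)) xi ≥ D − W(S). The brute-force checker below is only an illustration (it enumerates all subsets, so it is exponential in n); the function name and the concrete value D = 10 are assumptions.

```python
from fractions import Fraction
from itertools import combinations

def violated_kc_sets(weights, demand, x):
    """Return all subsets S whose knapsack cover inequality is violated by x."""
    n = len(weights)
    violated = []
    for r in range(n + 1):
        for S in combinations(range(n), r):
            residual = demand - sum(weights[i] for i in S)  # D - W(S)
            if residual <= 0:
                continue  # S alone already covers the knapsack
            lhs = sum(min(weights[i], residual) * x[i]
                      for i in range(n) if i not in S)
            if lhs < residual:
                violated.append(S)
    return violated

D = 10
weights = [D - 1, D - 1]
x_lp = [Fraction(1), Fraction(1, D - 1)]   # the bad LP solution from the slide
print(violated_kc_sets(weights, D, x_lp))  # [(0,)]
```

Taking S = {item 1} leaves a residual demand of 1, and the truncated coefficient min(D − 1, 1) = 1 forces x2 ≥ 1, which rules out the cheap fractional solution.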
Sketch of Primal-Dual algorithm • Consider the LP strengthened with knapsack cover inequalities (exponentially many per time step). • While there exists an unsatisfied primal constraint for a set of pages S and time t: • Increase the dual variable y(t,S), and update the primal according to the general recipe. Thm: Gives an O(log k)-competitive fractional algorithm.
Fractional -> Randomized At each time t, the LP gives a fractional state p = (p1,…,pn). Say p → p’ at fractional cost δ. Randomized alg: the distribution on cache states moves D -> D’; its cost should be close to δ. Ingredients: offline rounding (K.C. inequalities) + online maintainability. Easy: if we lose O(log W) and O(log P) (reduce to uniform). We avoid this for the bit and fault models, but lose O(log k) for the general model.
Concluding Remarks • Primal-dual approach gives simple unifying framework for caching and many online problems. • Strong relation to multiplicative updates technique • Recently, used to obtain the first polylog(n,k) competitive algorithm for k-server problem on general metrics [B., Buchbinder, Naor, Madry 11]