Advanced Algorithms

Advanced Algorithms Piyush Kumar (Lecture 17: Online Algorithms) Welcome to COT5405

On Bounds • Worst Case. • Average Case: Running time over some distribution of input. (Quicksort) • Amortized Analysis: Worst case bound on sequence of operations. • (Bit Increments, Union-Find) • Competitive Analysis: Compare the cost of an on-line algorithm with an optimal prescient algorithm on any sequence of requests. • Today.

Problem 1 • The online dating game. • You get to date fixed number of partners. • You either choose to pick them up or try your luck again. • You can not go back in time. • What strategy would you use to pick?

Problem 2. • You like to Ski. • When weather AND mood permits, you go skiing • If you own the equipment, you take it with you, Otherwise Rent. • You can buy the equipment whenever you decide, but not while skiing.

Costs • 1 Unit to rent, M units to buy • If you go ski I times, what is OPT? OPT = min (I,M) What algorithm should you use to decide whether you Should buy the equipment?

Algorithms • Algorithm 1: • Buy equipment ofter first day. • Competitive algorithm • CostALG(σ) <= ρCostOPT(σ)+b An Algorithm is called ρ-competitive if there exists some constant b such that for every sequence of inputs σ Cost OPT (σ)=Min(I,M) = 1? Cost ALG (σ)=M ρ >= M

Algorithms • Algorithm 2: Rent for (M-1) days and buy on Mth day. • L < M : CostALG(σ) = CostOPT(σ) • L >= M : CostALG(σ) = 2M – 1 CostOPT(σ) = M • Competitive ratio = 2 – 1/M

Ski Rental • Alg 3: Rent for k days and buy on (k+1)th day. • CostALG(σ) = k+M • CostOPT(σ) = min(M,k) • Competitive ratio = 2?

Problem 3: (1D)Monkey Looking for food Hidden What is the best competitive algorithm you can come up With? What is its competitive ratio?

Problem 3.(3D) • Monkey looking for food. Hidden

On Line Algorithms • Work without full knowledge of the future • Deal with a sequence of events • Future events are unknown to the algorithm • The algorithm has to deal with one event at each time. The next event happens only after the algorithm is done dealing with the previous event

On-Line versus off-line • We compare the behavior of the on-line algorithm to an optimal off-line algorithm “OPT” which is familiar with the sequence • The off-line algorithm knows the exact properties of all the events in the sequence

Absolute competitive ratio (for minimization problems) • We measure the performance of an on-line algorithm by the competitive ratio • This is the ratio between what the on-line algorithms “pays” to what the optimal off-line algorithm “pays”

Formally: let be the cost of the on-line algorithm on sequence . Let be the optimal off-line cost on then the competitive ratio is: • Calculus: supremum is similar to maximum but may be achieved in the limit

Problem 4: Caching • K-competitive caching. • Two level memory model • If a page is not in the cache , a page fault occurs. • A Paging algorithm specifies which page to evict on a fault. • Paging algorithms are online algorithms for cache replacement.

Online Paging Algorithms • Assumption: cache can hold k-pages. • CPU accesses memory thru cache. • Each request specifies a page in the memory system. • We want to minimize the page faults.

A Lower bound • Theorem: Let A be a deterministic online paging algorithm. If A is -competitive, then k. • Pf: Let S ={p_1,p_2, … , p_k+1} be a set of k+1 arbitrary memory pages. Assume w.l.g. that A and OPT initially have p_1, … , p_k in their cache. In the worst case A has a page fault on any request t.

Online Algorithm and Competitive Analysis • Theorem. LRU is k-competitive. • Proof:Let  be a subsequence of  on which LRU faults exactly k times. Let p denote page requested just before . • Case 1: LRU faults in sequence  on p. •  requests at least k+1 different pages MIN faults at least once • Case 2: LRU faults on some page, say q, at least twice in . •  requests at least k+1 different pages MIN faults at least once LRU : Least recently used Evicts page whose most recent access was earliest

Theorem. LRU is k-competitive. • Proof:Let  be a subsequence of  on which LRU faults exactly k times. Let p denote page requested just before . • Case 3: LRU does not fault on p, nor on any page more than once. • k different pages are accessed and faulted on, none of which is p • p is in MIN's cache at start of   MIN faults at least once MIN faults  1 times : 0 1 2 1 . . . p . . . LRU faults k times LRU faults k times

Universal Hashing

Dictionary Data Type • Dictionary. Given a universe U of possible elements, maintain a subset S  U so that inserting, deleting, and searching in S is efficient. • Dictionary interface. • Create(): Initialize a dictionary with S = . • Insert(u): Add element u  U to S. • Delete(u): Delete u from S, if u is currently in S. • Lookup(u): Determine whether u is in S. • Challenge. Universe U can be extremely large so defining an array of size |U| is infeasible. • Applications. File systems, databases, Google, compilers, checksums P2P networks, associative arrays, cryptography, web caching, etc.

Hashing • Hash function. h : U  { 0, 1, …, n-1 }. • Hashing. Create an array H of size n. When processing element u, access array element H[h(u)]. • Collision. When h(u) = h(v) but u  v. • A collision is expected after (n) random insertions. This phenomenon is known as the "birthday paradox." • Separate chaining: H[i] stores linked list of elements u with h(u) = i. jocularly seriously H[1] null H[2] suburban untravelled considerating H[3] browsing H[n]

Ad Hoc Hash Function • Ad hoc hash function. • Deterministic hashing. If |U|  n2, then for any fixed hash function h, there is a subset S  U of n elements that all hash to same slot. Thus, (n) time per search in worst-case. • Q. But isn't ad hoc hash function good enough in practice? int h(String s, int n) { int hash = 0; for (int i = 0; i < s.length(); i++) hash = (31 * hash) + s[i]; return hash % n; } hash function ala Java string library

Algorithmic Complexity Attacks • When can't we live with ad hoc hash function? • Obvious situations: aircraft control, nuclear reactors. • Surprising situations: denial-of-service attacks. • Real world exploits. [Crosby-Wallach 2003] • Bro server: send carefully chosen packets to DOS the server, using less bandwidth than a dial-up modem • Perl 5.8.0: insert carefully chosen strings into associative array. • Linux 2.4.20 kernel: save files with carefully chosen names. malicious adversary learns your ad hoc hash function (e.g., by reading Java API) and causes a big pile-up in a single slot that grinds performance to a halt

Hashing Performance • Idealistic hash function. Maps m elements uniformly at random to n hash slots. • Running time depends on length of chains. • Average length of chain =  = m / n. • Choose n  m  on average O(1) per insert, lookup, or delete. • Challenge. Achieve idealized randomized guarantees, but with a hash function where you can easily find items where you put them. • Approach. Use randomization in the choice of h. adversary knows the randomized algorithm you're using, but doesn't know random choices that the algorithm makes

Universal Hashing chosen uniformly at random • Universal class of hash functions. [Carter-Wegman 1980s] • For any pair of elements u, v  U, • Can select random h efficiently. • Can compute h(u) efficiently. • Ex. U = { a, b, c, d, e, f }, n = 2. H = {h1, h2} Pr h  H[h(a) = h(b)] = 1/2 Pr h  H[h(a) = h(c)] = 1Pr h  H[h(a) = h(d)] = 0. . . a b c d e f not universal h1(x) 0 1 0 1 0 1 h2(x) 0 0 0 1 1 1 a b c d e f H = {h1, h2 , h3 , h4} Pr h  H[h(a) = h(b)] = 1/2Pr h  H[h(a) = h(c)] = 1/2 Pr h  H[h(a) = h(d)] = 1/2 Pr h  H[h(a) = h(e)] = 1/2 Pr h  H[h(a) = h(f)] = 0 . . . h1(x) 0 1 0 1 0 1 universal h2(x) 0 0 0 1 1 1 h3(x) 0 0 1 0 1 1 h4(x) 1 0 0 1 1 0

Universal Hashing • Universal hashing property. Let H be a universal class of hash functions; let h  H be chosen uniformly at random from H; and letu  U. For any subset S  U of size at most n, the expected number of items in S that collide with u is at most 1. • Pf. For any element s  S, define indicator random variable Xs = 1 if h(s) = h(u) and 0 otherwise. Let X be a random variable counting the total number of collisions with u. universal(assumes u  S) linearity of expectation Xs is a 0-1 random variable

Designing a Universal Family of Hash Functions no need for randomness here • Theorem. [Chebyshev 1850] There exists a prime between n and 2n. • Modulus. Choose a prime number p  n. • Integer encoding. Identify each element u  U with a base-p integer of r digits: x = (x1, x2, …, xr). • Hash function. Let A = set of all r-digit, base-p integers. For eacha = (a1, a2, …, ar) where 0  ai < p, define • Hash function family. H = { ha : a  A }.

Designing a Universal Class of Hash Functions • Theorem. H = { ha : a  A } is a universal class of hash functions. • Pf. Let x = (x1, x2, …, xr) and y = (y1, y2, …, yr) be two distinct elements of U. We need to show that Pr[ha(x) = ha(y)]  1/n. • Since x  y, there exists an integer j such that xj  yj. • We have ha(x) = ha(y) iff • Can assume a was chosen uniformly at random by first selecting all coordinates ai where i  j, then selecting aj at random. Thus, we can assume ai is fixed for all coordinates i  j. • Since p is prime, aj z = m mod p has at most one solution among p possibilities. • Thus Pr[ha(x) = ha(y)] = 1/p  1/n. ▪ see lemma on next slide

Number Theory Facts • Fact. Let p be prime, and let z  0 mod p. Then z = m mod p has at most one solution 0   < p. • Pf. • Suppose  and  are two different solutions. • Then ( - )z = 0 mod p; hence ( - )z is divisible by p. • Since z  0 mod p, we know that z is not divisible by p;it follows that ( - ) is divisible by p. • This implies  = . ▪ • Bonus fact. Can replace "at most one" with "exactly one" in above fact. • Pf idea. Euclid's algorithm.

Advanced Algorithms