Intro to Computer Algorithms Lecture 16 Phillip G. Bradford Computer Science University of Alabama
Announcements • Advisory Board’s Industrial Talk Series • http://www.cs.ua.edu/9IndustrialSeries.shtm • Next Research Colloquium: • Prof. Nael Abu-Ghazaleh • 10-Nov @ 11:00am • “Active Routing in Mobile Ad Hoc Networks”
CS Story Time • Prof. Jones’ research group • See http://cs.ua.edu/StoryHourSlide.pdf
Next Midterm • Tuesday before Thanksgiving!
Outline • The Dictionary Problem and Hashing • Open Hashing with Chaining • Hashing with Open Addressing • Double Hashing • Introduction to Dynamic Programming
The Dictionary Problem • Motivation • Fundamental Operations • Insert • Search • Delete • Desire Quick Lookup • Large Records & We Focus on the Key
Hash Tables • Two desired properties • Uniform distribution over the table • Let K be the key to hash and h be the hash function • For a table of size m and any slot i in {1,…,m}: P[h(K) = i] = 1/m • The hash function h is easy to compute
Hash Tables • What are some candidates for h? • h(x) = (ax + b) mod m • For appropriate (random?) a and b, and m = p a prime • Let K = c0c1…cn be a string of characters • h(K): • R ← 0 // with an appropriate constant C • For i ← 1 to n do • R ← (R*C + ord(ci)) mod m • Endfor • Return R
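As a concrete illustration of these two candidates, here is a minimal Python sketch (the constants a, b, C and the table size m = 101 are illustrative assumptions, not values from the lecture):

def affine_hash(x, a, b, m):
    # h(x) = (a*x + b) mod m, with m = p a prime and a, b chosen (randomly)
    return (a * x + b) % m

def string_hash(key, m, c=31):
    # Horner-style hash of a string K = c0 c1 ... cn using a constant C
    r = 0                             # R <- 0
    for ch in key:                    # For i <- 1 to n do
        r = (r * c + ord(ch)) % m     # R <- (R*C + ord(ci)) mod m
    return r                          # Return R

print(affine_hash(12345, a=7, b=3, m=101))   # a slot in {0, ..., 100}
print(string_hash("algorithm", m=101))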
Hash Tables • These functions all resemble linear-congruential pseudo-random number generators! • Why is this not surprising? • Consider two different keys Ki and Kj • A hash collision is when h(Ki) = h(Kj) • Two approaches to resolve this situation
Hash Collision Resolution Methods • Separate Chaining • Also called Open Hashing • Given n keys, how long can chains be? • Worst case? • Expected length? • What is the expectation computed over? • Load Factor α = n/m • In an average chain, where might we expect to find an element? • Successful searches: about 1 + α/2 comparisons • Unsuccessful searches: about α comparisons
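A minimal Python sketch of open hashing with chaining (class and method names are illustrative assumptions, not from the slides); each of the m slots holds a chain of (key, value) pairs:

class ChainedHashTable:
    def __init__(self, m=101):
        self.m = m
        self.table = [[] for _ in range(m)]      # m slots, each an (initially empty) chain

    def _h(self, key):
        return hash(key) % self.m                # built-in hash stands in for a uniform h

    def insert(self, key, value):
        chain = self.table[self._h(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:                         # key already present: replace its value
                chain[i] = (key, value)
                return
        chain.append((key, value))

    def search(self, key):
        for k, v in self.table[self._h(key)]:    # walk the chain for slot h(key)
            if k == key:
                return v                         # successful search
        return None                              # unsuccessful search

    def delete(self, key):
        i = self._h(key)
        self.table[i] = [(k, v) for (k, v) in self.table[i] if k != key]

With n keys spread over m slots, each chain has expected length α = n/m, which is where the 1 + α/2 and α comparison counts above come from.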
Uniform Inputs: Balanced Chains • [Diagram: a hash table whose slots each point to a short, roughly equal-length chain, every chain terminated by NULL]
Hash Collision Resolution Methods • Linear Probing • Inserting a key K • Start at h(K); if that slot is full, try h(K)+1, h(K)+2, etc., until an empty slot is found • How do we know when the hash table is full? • What can we track? • U (unsuccessful search) is about (1 + 1/(1-α)²)/2 • S (successful search) is about (1 + 1/(1-α))/2
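A minimal Python sketch of open addressing with linear probing (illustrative names; deletion is omitted because it needs tombstones or rehashing). Tracking the number of stored keys n is one answer to the "is the table full?" question:

class LinearProbingTable:
    def __init__(self, m=101):
        self.m = m
        self.keys = [None] * m
        self.vals = [None] * m
        self.n = 0                          # number of occupied slots

    def _h(self, key):
        return hash(key) % self.m           # built-in hash stands in for h

    def insert(self, key, value):
        i = self._h(key)
        for _ in range(self.m):             # probe at most m slots
            if self.keys[i] is None or self.keys[i] == key:
                if self.keys[i] is None:
                    self.n += 1
                self.keys[i], self.vals[i] = key, value
                return
            i = (i + 1) % self.m            # try h(K)+1, h(K)+2, ... (mod m)
        raise RuntimeError("hash table is full")

    def search(self, key):
        i = self._h(key)
        for _ in range(self.m):
            if self.keys[i] is None:
                return None                 # reached an empty slot: unsuccessful
            if self.keys[i] == key:
                return self.vals[i]         # successful search
            i = (i + 1) % self.m
        return None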
Hash Collision Resolution Methods • Double Hashing • Two pseudo-random functions • h1(x) and h2(x) • Use h1(x) to find the `starting place’ • If done, then stop • Use h2(x) to determine how much to `hop’ from there • L ← h1(x); try position H[L] first • Q ← h2(x) • Otherwise, try positions H[(L + iQ) mod m] • For i ← 1 to floor(m/n)
Hash Collision Resolution Methods • U (unsuccessful search) is about 1 + 1/(1-α) • S (successful search) is about 1/(1-α)
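A minimal sketch of the double-hashing probe sequence in Python; h1, h2, and m = 101 (a prime, so every nonzero hop size is relatively prime to m) are illustrative assumptions:

def double_hash_probes(key, m=101):
    # L <- h1(x): the starting place; Q <- h2(x): the hop size (never 0)
    h1 = hash(key) % m
    h2 = 1 + (hash(key) // m) % (m - 1)
    for i in range(m):
        yield (h1 + i * h2) % m        # position H[(L + i*Q) mod m]

# First few table positions probed when inserting or searching a key
# (actual values depend on Python's hash seed):
print(list(double_hash_probes("algorithm"))[:5])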
A Basic Fact • CLRS2001: I{A} = 1 if A happens • and I{A} = 0 if A does not happen • Coin example: XH = I{X = H} • XH = 1 if the coin comes up heads • XH = 0 if the coin comes up tails • P[XH = 1] = ½ • But also, E[XH] = ½
A Basic Fact • Lemma [See CLRS2001]: For an event A in a sample space S, if XA = I{A}, then E[XA] = P[A] • Proof: • E[XA] = 1*P[A] + 0*P[Not(A)] = P[A]
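A quick simulation of the coin example (illustrative only), checking that the empirical average of XH over many flips matches P[X = H] = ½, as the Lemma predicts:

import random

trials = 100_000
x_h_sum = sum(1 for _ in range(trials) if random.random() < 0.5)   # sum of I{X = H}
print(x_h_sum / trials)    # approximately 0.5 = E[XH] = P[X = H]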
A Look “Under the Hood” • Hashing by chaining takes average-case O(1+α) probes for a successful search • Proof Sketch (full details at the board) • Assume P[h(K) = i] = 1/m • Therefore, by the Lemma, E[I{h(K) = i}] = 1/m
A Look “Under the Hood” • The number of probes in a successful search • E[(1/n) Σ (1 + Σ Xi,j)] • Outer sum: i ← 1 to n • Inner sum: j ← i+1 to n • Why? • Rest at the board… • Double Hashing on Thursday…
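For reference, a sketch of how the board work typically finishes (following the standard CLRS derivation; it uses the Lemma, which gives E[Xi,j] = P[h(Ki) = h(Kj)] = 1/m, plus linearity of expectation):

E\left[\frac{1}{n}\sum_{i=1}^{n}\Bigl(1+\sum_{j=i+1}^{n}X_{i,j}\Bigr)\right]
  = 1+\frac{1}{n}\sum_{i=1}^{n}\sum_{j=i+1}^{n}\frac{1}{m}
  = 1+\frac{n-1}{2m}
  = 1+\frac{\alpha}{2}-\frac{\alpha}{2n}

so a successful search takes Θ(1 + α) probes on average, matching the claim above.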
Introduction to Dynamic Programming • Richard Bellman in the 1950s • Optimizing multistage decision processes • Problems generally have overlapping subproblems
Dynamic Programming Example • Pascal’s Triangle • (a + b)^n • Classic Relation • C(n,k) = C(n-1,k-1) + C(n-1,k) • Base conditions: C(n,0) = C(n,n) = 1
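A minimal bottom-up dynamic-programming sketch of this relation in Python (the function name is an illustrative assumption); each table entry is computed once and reused, which is exactly the overlapping-subproblem structure dynamic programming exploits:

def binomial(n, k):
    # C[i][j] holds C(i, j), filled row by row using
    # C(i, j) = C(i-1, j-1) + C(i-1, j), with C(i, 0) = C(i, i) = 1
    C = [[0] * (k + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(min(i, k) + 1):
            if j == 0 or j == i:
                C[i][j] = 1                          # base conditions
            else:
                C[i][j] = C[i - 1][j - 1] + C[i - 1][j]
    return C[n][k]

print(binomial(5, 2))   # 10 = coefficient of a^3 b^2 in (a + b)^5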