Balanced Families of Perfect Hash Functions and Their Applications
Noga Alon, Shai Gutner
Tel Aviv University, 2007
Lecturer: Ofer Rothschild
Families of perfect hash functions
• Functions from [n] to [k]
• For every S ⊆ [n], |S| = k:
  • Standard notion: at least one function is 1-1 on S
  • New notion: about the same number of functions are 1-1 on S
Balanced Families of Perfect Hash Functions Slides: Ofer Rothschild, 2008
Motivation
• Approximation of counting problems
• Counting the number of times each of the following appears in a graph:
  • Simple cycles of size k
  • Simple paths of size k
  • Some fixed subgraph
• Also in weighted graphs
Previous results and background
• k-restriction problems
• Color-coding [Alon et al. 1995]
• Computational biology
Previous results - Explicit constructions of perfect hash functions
• [Alon et al. 1995]:
  • Size: $2^{O(k)} \log n$
• Best known explicit construction:
  • Size: $e^{k} k^{O(\log k)} \log n$
• Lower bound [Naor et al. 1995]:
  • $\Omega(e^{k} \log n / \sqrt{k})$
Previous results - Finding and counting paths and cycles
• Path: $2^{O(k)}|E|$, $2^{O(k)}|V|$ expected time
• Cycle: $2^{O(k)}|V||E|$, $2^{O(k)}|V|^{\omega}$ expected time
• Derandomization: extra $\log|V|$ factor
• Counting:
  • k ≤ 7: $O(|V|^{\omega})$
  • Exact counting: #W[1]-complete
  • Randomized approximation: fixed-parameter tractable
Previous results - Splitters
• [Naor et al. 1995]
Results
• A δ-balanced (n,k)-family of perfect hash functions (1 < δ ≤ 2):
  • Non-constructive upper bound
  • Explicit construction:
    • Size: $2^{O(k \log\log k)} (\delta-1)^{-O(\log k)} \log n$
    • Time: $2^{O(k \log\log k)} (\delta-1)^{-O(\log k)} n \log n + (\delta-1)^{-O(k/\log k)}$
Results - Applications
• Counting simple paths of length k−1:
  • $2^{O(k \log\log k)} (\delta-1)^{-O(\log k)} |E| \log|V| + (\delta-1)^{-O(k/\log k)}$
• Counting simple cycles of length k:
  • $2^{O(k \log\log k)} (\delta-1)^{-O(\log k)} |E||V| \log|V| + (\delta-1)^{-O(k/\log k)}$
• Polynomial if $k = O(\log n / \log\log\log n)$ and δ is fixed
Definitions
Hashing
• Huge universe U; here U = [n] := {1, 2, …, n}
[Figure: a hash function h maps the universe U into a hash table with cells 0, 1, …, m−1; collisions are marked. Illustration from Uri Zwick, 2008]
Hashing with chaining
[Figure: a hash table with cells 0, 1, …, i, …, m−1. Illustration from Uri Zwick, 2008]
Perfect hash functions
• Perfect hashing: no collisions
[Figure: a set S inside [n], hashed without collisions. Illustration from Uri Zwick, 2008]
Families of perfect hash functions
[Figure: a set S inside [n], mapped into an array T]
• Usually this array will be [k]
δ-balanced (n,k)-family, δ-balanced (n,k,l)-splitter
[Figure: a set S inside [n] and functions $f_1$, $f_2$]
Definitions
• For a family of functions from [n] to [l] and a set S ⊆ [n]:
  • inj(S) = the number of functions that are 1-1 on S
  • split(S) = the number of functions that divide S almost equally (every part has size ⌊|S|/l⌋ or ⌈|S|/l⌉)
Definition 2.1
• δ-balanced (n,k)-family:
  • Functions from [n] to [k]
  • The number of 1-1 functions is almost equal for all S ⊆ [n] with |S| = k: for some constant T > 0, $T/\delta \le \mathrm{inj}(S) \le \delta T$
Definition 2.2
• δ-balanced (n,k,l)-splitter:
  • Functions from [n] to [l]
  • The number of functions that divide S into almost equal parts is almost equal for all S ⊆ [n] with |S| = k: for some constant T > 0, $T/\delta \le \mathrm{split}(S) \le \delta T$
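The balance condition can be checked by brute force on toy families. The sketch below is only an illustration: the names `inj`, `is_balanced`, `all_functions` and `parity_only` are hypothetical, elements and values are numbered from 0, and a function is stored as a tuple of its values. It uses the observation that a suitable T exists exactly when the smallest count is positive and the largest is at most δ² times the smallest.

```python
from itertools import combinations, product

def inj(family, S):
    """Number of functions in `family` that are 1-1 on the set S."""
    return sum(1 for f in family if len({f[x] for x in S}) == len(S))

def is_balanced(family, n, k, delta):
    """True if some T > 0 gives T/delta <= inj(S) <= delta*T for every k-subset S."""
    counts = [inj(family, S) for S in combinations(range(n), k)]
    lo, hi = min(counts), max(counts)
    # Such a T exists exactly when lo > 0 and hi <= delta^2 * lo (take T = sqrt(lo*hi)).
    return lo > 0 and hi <= delta * delta * lo

# Hypothetical toy families of functions from [4] to [2].
all_functions = list(product(range(2), repeat=4))  # every function, 16 in total
parity_only = [(0, 1, 0, 1)]                       # the single function x -> x mod 2
```

The family of all functions is perfectly balanced by symmetry, while the single parity function is not even a perfect hash family, since it collides on {0, 2}.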
Theorems
Lemma 2.3
• For any k < l, let H be an explicit δ-balanced (n, k, l)-splitter of size N and let G be an explicit γ-balanced (l, k)-family of perfect hash functions of size M. We can use H and G to get an explicit δγ-balanced (n, k)-family of perfect hash functions of size NM.
• (n,k,l)-splitter * (l,k)-family = (n,k)-family
• Proof: compose the functions.
• Lemma 2.4: a similar lemma for k > l
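The composition step can be tried on toy sizes. This is a minimal sketch, not the explicit construction: H and G are simply all functions between the respective ranges (hypothetical stand-ins that are balanced by symmetry), and `compose`, `one_to_one`, `inj` and `inj_via_lemma` are illustrative names. It checks the counting identity behind the proof: g∘h is 1-1 on S exactly when h is 1-1 on S and g is 1-1 on h(S).

```python
from itertools import combinations, product

def one_to_one(f, S):
    """True if the function f (a tuple of values) has no collision on S."""
    return len({f[x] for x in S}) == len(S)

def compose(H, G):
    """All compositions g(h(x)) for h in H ([n] -> [l]) and g in G ([l] -> [k])."""
    return [tuple(g[y] for y in h) for h in H for g in G]

def inj(family, S):
    return sum(one_to_one(f, S) for f in family)

n, l, k = 4, 3, 2                      # toy sizes with k < l
H = list(product(range(l), repeat=n))  # stand-in splitter: all functions [n] -> [l]
G = list(product(range(k), repeat=l))  # stand-in family: all functions [l] -> [k]
F = compose(H, G)                      # the composed family, of size N*M

def inj_via_lemma(S):
    # Sum inj_G(h(S)) over the functions h that are 1-1 on S.
    return sum(inj(G, {h[x] for x in S}) for h in H if one_to_one(h, S))
```

If every inner count lies within a factor δ of one constant and every outer count within a factor γ of another, the sum lies within a factor δγ of their product, which is where the δγ in the lemma comes from.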
Probabilistic constructions
• Theorem 3.3. For any 1 < δ ≤ 2, there exists a δ-balanced (n, k)-family of perfect hash functions of size $O(e^{k} \sqrt{k} \log n / (\delta-1)^{2})$.
• Proof plan:
  • $p = k!/k^{k}$ (the probability that a random function is 1-1 on a fixed S)
  • Take M random functions
  • Prove that, with positive probability, these M functions form such a family
Brave Sir Robbin ran away
• Chernoff: Let Y be the sum of mutually independent indicator random variables, μ = E[Y]. For all 1 ≤ δ ≤ 2, $\Pr[\mu/\delta \le Y \le \delta\mu] > 1 - 2e^{-(\delta-1)^{2}\mu/8}$.
• Robbins: For every integer n ≥ 1, $\sqrt{2\pi}\, n^{n+1/2} e^{-n+1/(12n+1)} < n! < \sqrt{2\pi}\, n^{n+1/2} e^{-n+1/(12n)}$.
• E[inj(S)] = pM
• Therefore, the chance that for at least one set S the number of 1-1 functions is not as needed is at most $\binom{n}{k} \cdot 2e^{-(\delta-1)^{2} pM/8}$, which is below 1 for the chosen M. QED
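The union-bound calculation can be sketched numerically. The code below assumes the failure bound C(n,k) · 2e^{−(δ−1)²pM/8} obtained by applying the Chernoff bound above to every k-subset; the function names and the seed are hypothetical. It picks the smallest M that pushes this bound below 1 and samples M random functions so the counts inj(S) can be compared with pM.

```python
import math
import random
from itertools import combinations

def family_size(n, k, delta):
    """Smallest M with C(n,k) * 2 * exp(-(delta-1)^2 * p * M / 8) < 1."""
    p = math.factorial(k) / k**k
    x = 8 * math.log(2 * math.comb(n, k)) / (p * (delta - 1) ** 2)
    return math.floor(x) + 1

def failure_bound(n, k, delta, M):
    """Union bound on the probability that some k-subset S is out of range."""
    p = math.factorial(k) / k**k
    return math.comb(n, k) * 2 * math.exp(-((delta - 1) ** 2) * p * M / 8)

def random_family(n, k, M, seed=0):
    """M independent uniformly random functions from [n] to [k], as tuples."""
    rng = random.Random(seed)
    return [tuple(rng.randrange(k) for _ in range(n)) for _ in range(M)]

def inj_counts(family, n, k):
    """inj(S) for every k-subset S of [n]."""
    return {S: sum(len({f[x] for x in S}) == k for f in family)
            for S in combinations(range(n), k)}
```

For example, `family_size(6, 3, 2.0)` gives the M to pass to `random_family(6, 3, M)`. The bound only says that a good family exists with positive probability, so a particular sample is not guaranteed to be balanced.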
Probabilistic constructions
• Similar theorems for splitters
Explicit constructions
• Theorem 4.1. For any fixed 1 < δ ≤ 2, a δ-balanced (n,k)-family of perfect hash functions of size […] can be constructed deterministically within time […].
Proof begins
• $p = k!/k^{k}$
For any choice of M functions $f_1, \dots, f_M$ and a set S:
• $X_{S,i}$ := 1 if $f_i$ is 1-1 on S, and 0 otherwise
• $X_S = \sum_{i=1}^{M} X_{S,i}$, the number of functions that are 1-1 on S
What do you expect?
• Let's show that even if we take M independent random functions, it will usually be OK
• Later we shall make the choice deterministic
Let's bound
• $1 + u \le e^{u}$
• $e^{-u} \le 1 - u + u^{2}/2$ (for u ≥ 0)
Are we there yet?
The deterministic construction
• This is what we expect from a random choice.
• It will be our upper bound for the deterministic construction
• We shall use a greedy algorithm
• We shall find the functions in this order:
  for (i = 1; i <= M; i++) {      // f_1, f_2, …, f_M
    for (j = 1; j <= n; j++) {    // f_i(1), f_i(2), …, f_i(n)
      find f_i(j)
    }
  }
• At every step we shall find the value that gives the minimal conditional expectation
• The conditional expectation can be computed at each step in time […]
• We start with […]
• At every step the conditional expectation does not increase
• In particular, at the end, […]
What is this Φ again?
• In our case […]
• In particular, for every S: […]
• And with simple manipulations we get to:
  • $pM/\delta \le X_S \le \delta pM$
QED (Theorem 4.1)
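The greedy step can be sketched on toy sizes. The slides' formula for Φ is not preserved here, so this sketch assumes a potential of the form Φ = Σ_S (e^{λ(X_S − pM)} + e^{λ(pM − X_S)}); that form, the parameter `lam` and all function names are assumptions for illustration. The code fixes the values f_i(j) one at a time, each time choosing the value that minimises the conditional expectation of Φ.

```python
import math
from itertools import combinations

def greedy_family(n, k, M, lam):
    """Method of conditional expectations for an assumed potential Phi."""
    p = math.factorial(k) / k**k
    subsets = list(combinations(range(n), k))
    up = 1 - p + p * math.exp(lam)      # E[e^{+lam*X}] for one fully random function
    down = 1 - p + p * math.exp(-lam)   # E[e^{-lam*X}] for one fully random function
    count = {S: 0 for S in subsets}     # X_S over the functions completed so far

    def q(S, f):
        # Pr[f ends up 1-1 on S | the values already fixed in the partial f]
        seen = [f[x] for x in S if x in f]
        if len(set(seen)) < len(seen):
            return 0.0
        r = k - len(seen)               # unfixed elements of S = unused colours
        return math.factorial(r) / k**r

    def phi(f, remaining):
        # Conditional expectation of Phi: `f` is the partial current function,
        # `remaining` functions after it are still fully random.
        total = 0.0
        for S in subsets:
            qs = q(S, f)
            total += (math.exp(lam * (count[S] - p * M))
                      * (1 - qs + qs * math.exp(lam)) * up**remaining)
            total += (math.exp(lam * (p * M - count[S]))
                      * (1 - qs + qs * math.exp(-lam)) * down**remaining)
        return total

    initial = phi({}, M - 1)            # expectation when everything is random
    family = []
    for i in range(M):
        f = {}
        for x in range(n):
            # Choose the value with the smallest conditional expectation.
            f[x] = min(range(k), key=lambda c: phi({**f, x: c}, M - i - 1))
        for S in subsets:
            if len({f[x] for x in S}) == k:
                count[S] += 1
        family.append(tuple(f[x] for x in range(n)))
    final = sum(math.exp(lam * (count[S] - p * M)) +
                math.exp(lam * (p * M - count[S])) for S in subsets)
    return family, count, initial, final
```

Because the conditional expectation at each step is the average over the k candidate values, the minimum never exceeds it, so the final value of Φ is at most the initial expectation. Each term of Φ is bounded by the whole sum, which is what bounds every |X_S − pM|.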
For example
[Figure: worked example of one step. We are only interested in the candidate values. Candidates 1?, 2?, 5?: $E \mathrel{+}= e^{0} = 1$. Candidates 3?, 4?: $E \mathrel{+}= \tfrac{1}{5} e^{-\lambda} + \tfrac{4}{5} e^{0}$.]
Some more theorems
• […] is too much.
• We shall use (4.1) only as a part of the construction
• Theorem 4.2. For any 1 < δ ≤ 2, a δ-balanced $(n, k, \lceil 2k^{2}/(\delta-1) \rceil)$-splitter of size $k^{O(1)} \log n / (\delta-1)^{O(1)}$ can be constructed in time $k^{O(1)} n \log n / (\delta-1)^{O(1)}$.
• (Using error correcting codes)
Some more theorems
• Theorem 4.3. For any k ≥ l and 1 < δ ≤ 2, a δ-balanced (n, k, l)-splitter of size $2^{O(k \log l - \log(\delta-1))} \log n$ can be constructed in time $2^{O(k \log l - \log(\delta-1))} n \log n$.
• (Using almost k-wise independence)
• Corollary 4.4. For any fixed c > 0, a $(1 + c^{-k})$-balanced (n, k, 2)-splitter of size $2^{O(k)} \log n$ can be constructed in time $2^{O(k)} n \log n$.
Theorem 4.5
• For 1 < δ ≤ 2, a δ-balanced (n, k)-family of perfect hash functions of size $2^{O(k \log\log k)} (\delta-1)^{-O(\log k)} \log n$ can be constructed in time $2^{O(k \log\log k)} (\delta-1)^{-O(\log k)} n \log n + (\delta-1)^{-O(k/\log k)}$.
• In particular, for any fixed 1 < δ ≤ 2, the size is $2^{O(k \log\log k)} \log n$ and the time is $2^{O(k \log\log k)} n \log n$.
• Proof: Using all the theorems we construct balanced families and splitters, and then we compose them using the lemmas.
Applications
Applications
• Counting simple paths of length k−1:
  • $2^{O(k \log\log k)} (\delta-1)^{-O(\log k)} |E| \log|V| + (\delta-1)^{-O(k/\log k)}$
• Counting simple cycles of length k:
  • $2^{O(k \log\log k)} (\delta-1)^{-O(\log k)} |E||V| \log|V| + (\delta-1)^{-O(k/\log k)}$
• Polynomial if $k = O(\log n / \log\log\log n)$ and δ is fixed
How many paths must a man walk down before you call him a man?
[Figure: a graph whose vertices are coloured 1, 2, 3. Illustration from Shirly Zilkha, 2008]
• Build a δ-balanced (|V|, k)-family of perfect hash functions using Theorem 4.5; these are the colourings
• Compute the number of colourful paths for every colouring
• $(T/\delta) \cdot (\text{number of paths}) \le \sum_{\text{colourings}} (\text{colourful paths}) \le \delta T \cdot (\text{number of paths})$
• Divide by T
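The counting recipe can be sketched on a small graph. As a stand-in for a family built by Theorem 4.5, the example uses the family of all $k^{n}$ colourings, which is exactly balanced with T = k!·k^{n−k} but exponentially large; that choice, the example graph `cycle4` and the function names are assumptions for illustration. Colourful paths are counted with the usual dynamic program over (end vertex, set of colours used).

```python
from itertools import product

def colourful_paths(adj, colour, k):
    """Number of ordered paths on k vertices whose colours are all distinct."""
    # ways[(v, mask)] = number of colourful paths ending at v with colour set mask
    ways = {(v, 1 << colour[v]): 1 for v in adj}
    for _ in range(k - 1):
        nxt = {}
        for (v, mask), w in ways.items():
            for u in adj[v]:
                bit = 1 << colour[u]
                if not mask & bit:          # colour of u not used yet
                    key = (u, mask | bit)
                    nxt[key] = nxt.get(key, 0) + w
        ways = nxt
    return sum(ways.values())

def estimate_paths(adj, k, family, T):
    """Sum the colourful-path counts over all colourings, then divide by T."""
    total = sum(colourful_paths(adj, f, k) for f in family)
    return total / T

# Hypothetical example: a 4-cycle, counting paths on k = 3 vertices.
cycle4 = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
k = 3
family = list(product(range(k), repeat=4))  # all colourings, delta = 1
T = 6 * 3                                   # k! * k^(n-k) for n = 4, k = 3
ordered = estimate_paths(cycle4, k, family, T)
```

Distinct colours force distinct vertices, so the dynamic program never counts a non-simple walk. It counts each undirected path once per direction, so `ordered / 2` is the number of undirected paths. With a δ-balanced family instead of the complete one, the same quotient is within a factor δ of the true count.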
Concluding remarks
Other counting problems
• The constant T can be easily computed
What's next?
• Can we decrease the balanced family to size $2^{O(k)} \log n$?
• What about $k = \Theta(\log n)$?
References
• Noga Alon and Shai Gutner. Balanced families of perfect hash functions and their applications. In ICALP, volume 4596 of Lecture Notes in Computer Science, pages 435-446. Springer, 2007. http://courses.cs.tau.ac.il/combsem/09a/combsem.html
• Uri Zwick. A lecture on hashing from the course in Data Structures, Tel Aviv University, 2007. http://www.cs.tau.ac.il/courses/0368-2158/08a/
• Shirly Zilkha. Presentation on Color Coding, 2008. http://courses.cs.tau.ac.il/combsem/09a/combsem.html