Uniform algorithms for deterministic construction of efficient dictionaries
Milan Ružić
IT University of Copenhagen
Faculty of Mathematics, University of Belgrade
ESA 2004 / ARCO 2005 presentation
The dictionary problem
• How to store a set S ⊆ U and answer inquiries about membership: “is x ∈ S?”.
• In the dynamic dictionary problem, S may change over time.
• Conditions:
  • Compute on a unit-cost RAM with word length w and a standard instruction set, including multiplication and division.
  • Finite universe U = {0,1}^w.
  • Use space linear in n = |S|.
Randomized solutions
• Started with a static dictionary with O(n) expected construction time, using (nw) random bits [Fredman, Komlós, Szemerédi ’82].
• Reached a dynamic dictionary with:
  • Constant search time.
  • Constant update time with probability 1 - O(n^(-c)).
  • Use of only O(log n + log w) random bits. [Dietzfelbinger et al. ’92]
• However, what if:
  • random bits are not easily available, or
  • performance without a guarantee is unacceptable?
The family of hash functions
• Viewing the problem in a continuous setting - HR .
• A sufficient condition for avoiding collisions:
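The formulas on this slide did not survive extraction, so the following is only a sketch under an assumed form of the family: multiplicative hashing with a real multiplier a, h_a(x) = floor(r * frac(a * x)), where r is the number of buckets. For that assumed form, a standard sufficient condition for two keys x and y not to collide is that frac(a * (x - y)) lies in [1/r, 1 - 1/r]. The names `frac`, `h`, `separated`, and the sample keys are hypothetical, and exact rationals stand in for real numbers.

```python
from fractions import Fraction
from math import floor

def frac(t):
    """Fractional part of a rational number t (always in [0, 1))."""
    return t - floor(t)

def h(a, x, r):
    """Assumed hash form: map key x to a bucket in range(r) using multiplier a."""
    return floor(r * frac(a * x))

def separated(a, d, r):
    """Sufficient (not necessary) condition for keys with difference d to get
    different buckets: frac(a * d) lies in [1/r, 1 - 1/r]."""
    return Fraction(1, r) <= frac(a * d) <= 1 - Fraction(1, r)

# Hypothetical sample data, chosen only for illustration.
keys = [3, 10, 17, 42]
a, r = Fraction(5, 16), 8
buckets = {x: h(a, x, r) for x in keys}
```

The condition depends only on the difference x - y, which is why the later slides talk about a set of differences rather than a set of keys.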
The set of good parameters
• The set of multipliers which generate fewer than m collisions on the set of s differences has measure at least
• We can calculate the measures with numbers of bounded precision.
• The set of “good” parameters contains sufficiently large intervals; that is, there are “good” multipliers which can be represented by a constant number of machine words.
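The bound itself is missing from the extracted slide. As a hedged illustration only, suppose a difference d counts as colliding for multiplier a when frac(a * d) falls outside [1/r, 1 - 1/r] (an assumption, not taken from the slide). Each difference is then colliding on a subset of [0, 1) of measure 2/r, so a Markov-type argument gives a good set of measure at least 1 - 2s/(rm); this may or may not be the bound the slide stated. The sketch below computes the measure exactly with rational arithmetic for small positive integer differences; `bad_intervals` and `good_measure` are hypothetical names.

```python
from fractions import Fraction

def bad_intervals(d, r):
    """Intervals of a in [0, 1) on which frac(a * d) is within 1/r of an integer."""
    eps = Fraction(1, r * d)
    out = []
    for k in range(d + 1):
        lo = max(Fraction(k, d) - eps, Fraction(0))
        hi = min(Fraction(k, d) + eps, Fraction(1))
        if lo < hi:
            out.append((lo, hi))
    return out

def good_measure(diffs, r, m):
    """Exact measure of multipliers in [0, 1) with fewer than m colliding differences."""
    events = []
    for d in diffs:
        for lo, hi in bad_intervals(d, r):
            events.append((lo, 1))    # a bad interval opens
            events.append((hi, -1))   # a bad interval closes
    events.sort()
    measure, count, prev = Fraction(0), 0, Fraction(0)
    for pos, delta in events:
        if count < m:                 # the stretch [prev, pos) was good
            measure += pos - prev
        count += delta
        prev = pos
    if count < m:
        measure += 1 - prev
    return measure

diffs, r, m = [1, 2, 3, 5, 7, 11], 8, 2
markov_bound = 1 - Fraction(2 * len(diffs), r * m)
```

Because every breakpoint is a rational with a small denominator, the whole computation stays within bounded precision, which is the point of the second bullet above.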
Finding a good function
• Problem: Given a set of s differences, deterministically find a multiplier a which produces fewer than m colliding differences.
• Not all differences need to be explicitly stored in memory.
• We use bit-by-bit construction; sometimes several consecutive bits are set at once.
• Choosing a bit is equivalent to choosing a half of the working interval.
• Key observation: sets with relatively small support intervals are insignificant to the current choice.
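The bit-by-bit idea can be sketched as a plain interval-halving search. This is a simplified illustration, not the algorithm from the talk: it stores every difference explicitly, computes exact measures instead of estimates, sets one bit per step, and assumes the same colliding criterion as above (frac(a * d) outside [1/r, 1 - 1/r]) together with 2s/r < m. The names `bad_measure`, `colliding`, and `find_multiplier`, and the `max_bits` cap, are hypothetical.

```python
from fractions import Fraction
from math import floor

def bad_measure(d, lo, hi, r):
    """Exact measure of the a in [lo, hi) for which difference d is colliding."""
    def prefix(t):
        # Measure of colliding values in [0, t), in units of a * d.
        whole = floor(t)
        f = t - whole
        part = min(f, Fraction(1, r)) + max(Fraction(0), f - (1 - Fraction(1, r)))
        return whole * Fraction(2, r) + part
    return (prefix(hi * d) - prefix(lo * d)) / d

def colliding(a, diffs, r):
    """Number of differences d with frac(a * d) outside [1/r, 1 - 1/r]."""
    count = 0
    for d in diffs:
        f = a * d - floor(a * d)
        if f < Fraction(1, r) or f > 1 - Fraction(1, r):
            count += 1
    return count

def find_multiplier(diffs, r, m, max_bits=64):
    """Halve the working interval, always keeping the half with the smaller
    total bad measure, until its left endpoint is a good multiplier."""
    lo, hi = Fraction(0), Fraction(1)
    for _ in range(max_bits):
        mid = (lo + hi) / 2
        left = sum(bad_measure(d, lo, mid, r) for d in diffs)
        right = sum(bad_measure(d, mid, hi, r) for d in diffs)
        if left <= right:
            hi = mid      # next bit is 0: keep the left half
        else:
            lo = mid      # next bit is 1: keep the right half
        if colliding(lo, diffs, r) < m:
            return lo
    raise ValueError("no multiplier found within max_bits bits")

a = find_multiplier([1, 2, 3, 5, 7, 11], r=8, m=2)
```

Keeping the half with the smaller bad measure never raises the average number of colliding differences over the working interval, so that average stays below m throughout. The returned multiplier is a dyadic rational, so it fits in a bounded number of bits.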
Three classes of differences
• The recurrence for measure estimates: 1(p+1) + 2(p+1) + E(p+1) (p) + E(p)
• Several bits are chosen at once when Dmid.
• The O(w) term represents the total cost of finding the leftmost bits of keys.
Reducing the construction time
• We employ a multi-level hashing scheme. The number of levels can be set by adjusting the parameters m and s.
• The structure of the set of differences:
• In the case of O(1) lookup time we set nkn, m 4n and r n.
• Note on evaluation: when the input consists of multi-word keys, full multiplication is usually not necessary.