270 likes | 300 Views
Explore a 10-way trie structure for storing Social Security Numbers efficiently, reducing memory accesses during searches. Learn about memory optimization and branch node structures.
E N D
0 1 2 3 4 5 6 7 8 9 Higher Order Tries • Key = Social Security Number. • 441-12-1135 • 9 decimal digits. • 10-way trie (order 10 trie). Height<= 10.
Social Security Trie • 10-way trie • Height <= 10. • Search=> <= 9 branches on digits plus 1 compare. • 100-way trie • 441-12-1135 • Height <= 6. • Search=> <= 5 branches on digits plus 1 compare.
Social Security Trie • 109-way trie • Height <= 2. • Search=> <= 1 branch on a digit plus 1 compare.
Memory Accesses • During a search, we can compute needed field of a branch node. • Access only that field. T 0 1 2 3 4 5 6 7 8 9 key = 46…
Memory Accesses • #memory accesses = 1 per encountered branch node + those needed for an element node. T 0 1 2 3 4 5 6 7 8 9 key = 46…
Binary Search Trees • Red-black tree • Height <= 2log2109 ~ 60. • Search=> up to 60 compares of 9 digit numbers and up to 60 memory accesses. • AVL tree • Height <= 1.44log2109 ~ 40. • Search=> up to 40 compares of 9 digit numbers and up to 40 memory accesses. • Best binary tree. • Height = log2109 ~ 30.
Higher Order Search Trees • height can be <log2 n • #nodes accessed is reduced • but cache misses/node increases as node size increases • worst-case #compares remains >= log2 n 10 30 50
0 1 2 3 4 5 6 7 8 9 char# #ptr Compressed Social Security Trie Branch Node Structure • char#= character/digit used for branching. • Equivalent to bit# field of compressed binary trie. • #ptr = # of nonnull pointers in the node.
2 5 3 012345678 015234567 Insert 012345678 Insert 012345678. Insert 015234567. Null pointer fields not shown.
2 5 3 012345678 015234567 Insert Insert 015231671.
2 5 3 1 4 6 012345678 015231671 015234567 Insert Insert 079864231.
1 7 2 2 5 3 079864231 1 4 6 012345678 015231671 015234567 Insert Insert 012345618.
2 2 5 3 079864231 1 4 1 7 8 6 012345678 012345618 015231671 015234567 Insert 1 7 Insert 011917352.
Insert 1 7 2 1 2 5 3 079864231 011917352 1 4 1 7 8 6 012345678 012345618 015231671 015234567
1 7 2 1 2 5 3 079864231 011917352 1 4 1 7 8 6 012345678 012345618 015231671 015234567 Delete Delete 011917352.
Delete 1 7 2 2 5 3 079864231 1 4 1 7 8 6 012345678 012345618 015231671 015234567 Delete 012345678.
Delete 1 7 2 2 5 3 079864231 1 4 6 012345618 015231671 015234567 Delete 015231671.
Delete 1 7 2 2 5 3 079864231 012345618 015234567
2 5 3 1 4 6 012345678 015231671 015234567 Variable Length Keys Insert 0123. Problem arises only when one key is a (proper) prefix of another.
2 5 3 Insert 0123. 4 # 1 4 6 5 012345678 0123 015231671 015234567 Variable Length Keys End of key character (#) not shown.
Variable Length Keys One trie per length. T[] 1 2 3 4 5 6 7 8 9 10 Array of tries. Hashtable of tries.
Tries With Edge Information • Add a new field (element) to each branch node. • New field points to any one of the element nodes in the subtree. • Use this pointer on way down to figure out skipped-over characters.
2 5 3 5 4 # 1 4 6 012345678 0123 015231671 015234567 Example element field shown in blue.
Etc. • Expected height of an order m trie is ~logmn. • Limit height to h (say 6). Level h branch nodes point to buckets that employ some other search structure for all keys in subtrie.
Etc. • Switch from trie scheme to simple array when number of pairs in subtrie becomes <= s (say s =6). • Expected # of branch nodes for an order m trie when n is large and m and s are small is n/(s ln m). • Sample digits from right to left (instead of from left to right) or using a pseudorandom number generator so as to reduce trie height.
Web Resource • See Web writeup for additional applications of tries. • Prefix search. • Automatic command (or phone number or URL) completion. • LZW compression.
Web Resource • See Web writeup for alternative node structures for tries. • Array • Chain • Binary search tree • Hash table