Tries
• When searching for the name “Smith” in a phone book, we first locate the group of names starting with “S”, then within those we search for “m”, etc.
• Idea: Perform a search based on a prefix of the key, rather than a comparison of the whole key.
• The branching at each node is determined by a partial key value (one character of the key).
• Each node branches out to as many children as there are characters in the alphabet.
Tries
• A trie is a multi-way tree for storing strings in which
  • there is one node for every common prefix
  • the strings are stored in the leaves
• The order of the trie is m, where m is the size of the alphabet.
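The definition above can be sketched as an array-based trie. This is a minimal sketch, not the slides' own implementation: the lowercase alphabet, the extra `#` slot used as an end-of-string marker, and the names `TrieNode`, `insert` and `contains` are all assumptions made for illustration.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz"   # assumed alphabet, m = 26
END_SLOT = 0                              # assumed extra slot for the end marker '#'
INDEX = {ch: i + 1 for i, ch in enumerate(ALPHABET)}


class TrieNode:
    """Internal node: one pointer per alphabet symbol, plus the end-marker slot."""

    def __init__(self):
        self.children = [None] * (len(ALPHABET) + 1)


def insert(root, word):
    """Walk down one character at a time, creating nodes for new prefixes."""
    node = root
    for ch in word:
        i = INDEX[ch]                     # KeyError if ch is outside the alphabet
        if node.children[i] is None:
            node.children[i] = TrieNode()
        node = node.children[i]
    node.children[END_SLOT] = word        # leaf: the stored string itself


def contains(root, word):
    """Follow the branch selected by each character; never compare whole keys on the way."""
    node = root
    for ch in word:
        i = INDEX.get(ch)
        if i is None:
            return False
        node = node.children[i]
        if node is None:
            return False
    return node.children[END_SLOT] == word


root = TrieNode()
for w in ["mat", "mad", "am", "bad"]:
    insert(root, w)
print(contains(root, "mad"), contains(root, "ma"))   # True False
```

Each node allocates a full array even when most slots are empty, which is the space cost the later slides try to reduce.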
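Tries
• Example: The set of strings {"mat", "mad", "am", "bad"} stored in a BST and in a trie
[Figure: a BST whose nodes hold the whole keys am, bad, mad, mat, next to a trie whose root branches on the first letter (a, b, m) and whose leaves hold am, bad, mad, mat]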
Tries
• Advantage:
  • The height of the tree depends on the length of the keys, not on the number of keys stored
  • A trie can therefore store a very large set of keys while its height (and hence the search time) stays small.
• Applications:
  • spell checker
  • web search engine (search by index word)
  • network router (search by IP address)
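To illustrate the router application, here is a minimal sketch of a longest-prefix lookup in a trie over bit strings. The representation of addresses as strings of '0' and '1', the payload labels, and the names `BitTrieNode`, `add_prefix` and `longest_prefix_match` are assumptions for illustration, not taken from the slides or from any real router.

```python
class BitTrieNode:
    def __init__(self):
        self.children = {}    # '0' or '1' -> BitTrieNode
        self.value = None     # payload if a stored prefix ends at this node


def add_prefix(root, prefix_bits, value):
    """Store `value` under the bit-string prefix."""
    node = root
    for bit in prefix_bits:
        node = node.children.setdefault(bit, BitTrieNode())
    node.value = value


def longest_prefix_match(root, address_bits):
    """Return the payload of the longest stored prefix of the address, or None."""
    node, best = root, root.value
    for bit in address_bits:
        node = node.children.get(bit)
        if node is None:
            break                      # no longer prefix exists
        if node.value is not None:
            best = node.value          # remember the deepest match so far
    return best


table = BitTrieNode()
add_prefix(table, "10", "A")          # hypothetical labels
add_prefix(table, "1011", "B")
print(longest_prefix_match(table, "10110000"))   # B
```

The lookup visits at most one node per address bit, regardless of how many prefixes are stored.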
Example (slightly compressed...)
Alphabet: {A, P, S, T}
Words: {A, APT, AT, PAT, PASS, PAST, PS, SAP, SAT, TAP}
[Figure: trie for these words. Each internal node is an array of pointers indexed by #, A, P, S, T; the leaves hold the words A, APT, AT, PS, PAT, PASS, PAST, SAP, SAT, TAP]
Tries
• Observation: Many nodes have few non-null pointers.
  • We would like to save space
• Idea #1: de la Briandais tree
  • Store only the pointers that are used
  • Maintain all siblings in a linked list
• Disadvantages:
  • the sibling list needs to be traversed linearly
  • we still use extra space for the list pointers
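The linked-list idea can be sketched with a first-child / next-sibling node. This is only an illustrative sketch: the names `DLBNode`, `insert` and `contains`, the dummy root node, and the use of `#` as an end marker are assumptions, and words are assumed not to contain `#` themselves.

```python
END = "#"   # assumed end-of-string marker


class DLBNode:
    def __init__(self, char):
        self.char = char       # symbol stored in this node
        self.child = None      # first node on the next level
        self.sibling = None    # next alternative on the same level


def insert(root, word):
    parent = root
    for ch in word + END:
        prev, node = None, parent.child
        while node is not None and node.char != ch:   # linear scan of siblings
            prev, node = node, node.sibling
        if node is None:                               # symbol not present yet
            node = DLBNode(ch)
            if prev is None:
                parent.child = node
            else:
                prev.sibling = node
        parent = node


def contains(root, word):
    parent = root
    for ch in word + END:
        node = parent.child
        while node is not None and node.char != ch:
            node = node.sibling
        if node is None:
            return False
        parent = node
    return True


root = DLBNode(None)   # dummy root; its child list holds the first letters
for w in ["A", "APT", "AT", "PAT", "PASS", "PAST", "PS", "SAP", "SAT", "TAP"]:
    insert(root, w)
print(contains(root, "PAST"), contains(root, "PA"))   # True False
```

Only symbols that actually occur get a node, at the price of the linear sibling scan and two pointers per node.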
Tries
• Observation: Many nodes have few non-null pointers.
  • We would like to save space
• Idea #2: Compressed Trie
  • Interleave the arrays of the nodes in one shared array, so that the used slots of one node fall into the empty slots of another
• Minor disadvantage:
  • an unsuccessful search may take more steps to end.
Compressed Tries
[Figure: three nodes (node 1, node 2, node 3), each an array indexed by #, A, P, S, T, shown separately and then interleaved into a single array; leaves: A, APT, AT, PA, PS]
p1 stands for pointer to node 1 (likewise p2, p3)
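One way to sketch the interleaving is a first-fit packing of every node's used slots into one shared table. The packing strategy, the helper names (`build_rows`, `pack`, `contains`), and the reading of `#` as an end marker are assumptions for illustration; the slides do not specify them. Leaves keep the whole word, so a search that wanders into another node's slot is still rejected when it reaches a leaf.

```python
SYMBOLS = "#APST"                 # '#' assumed to be the end marker
SLOT = {ch: i for i, ch in enumerate(SYMBOLS)}
_RESERVED = object()              # placeholder while a node's children are packed


def build_rows(words):
    """Ordinary trie as nested dicts: symbol -> child dict, '#' -> stored word."""
    root = {}
    for word in words:
        if set(word) - set("APST"):
            raise ValueError("word uses symbols outside the assumed alphabet")
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node["#"] = word
    return root


def pack(node, table):
    """Place the node's used slots at the first offset where they fit; return it."""
    used = [SLOT[ch] for ch in node]
    base = 0
    while any(base + i < len(table) and table[base + i] is not None for i in used):
        base += 1
    table.extend([None] * max(0, base + len(SYMBOLS) - len(table)))
    for i in used:
        table[base + i] = _RESERVED          # keep children out of these slots
    for ch, child in node.items():
        table[base + SLOT[ch]] = child if ch == "#" else pack(child, table)
    return base


def contains(table, root_base, word):
    base = root_base
    for ch in word + "#":
        i = SLOT.get(ch)
        if i is None:
            return False
        entry = table[base + i]
        if entry is None:
            return False
        if isinstance(entry, str):           # reached a leaf: compare the whole key
            return entry == word
        base = entry                         # may be another node's pointer
    return False


table = []
root_base = pack(build_rows(["A", "APT", "AT", "PA", "PS"]), table)
print(contains(table, root_base, "AT"), contains(table, root_base, "T"))   # True False
```

The lookup of "T" follows a slot that belongs to a different node and only fails at a leaf, which is the extra work on unsuccessful searches that the slide mentions.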
PATRICIA trees
• Observation:
  • Sometimes the actual set of keys is a small subset of the potential set of keys.
  • This may result in a large number of nodes that have only one descendant.
  • We would like to save some space
• Idea:
  • Make the tree more compact by collapsing long chains.
  • The resulting tree is called a PATRICIA tree*
* PATRICIA: Practical Algorithm To Retrieve Information Coded In Alphanumeric
PATRICIA trees
• Idea:
  • Collapse chains of nodes that have only one child
  • For each branch, indicate how many characters should be skipped (i.e. the length of the collapsed chain)
Patricia tree example
[Figure: PATRICIA tree for SAP, SAT, PASS, PAST. The root node (slots #, A, P, S, T) is labeled with skip counts 0 0 3 2 0; the nodes below it are labeled 0 0 0 0 0; leaves: PASS, PAST, SAP, SAT]
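The collapsing idea can be sketched as follows. This is an illustrative sketch under stated assumptions: the skip count is stored on the node a branch leads to and counts only the collapsed characters (a convention that may differ from the numbering in the figure), `#` is assumed as an end marker, and `build_patricia`, `_build` and `contains` are hypothetical names. Because skipped characters are never inspected on the way down, the leaf keeps the whole word and the search ends with one full comparison.

```python
END = "#"   # assumed end-of-string marker; words must not contain it


def build_patricia(words):
    words = sorted(set(words))
    if not words or any(END in w for w in words):
        raise ValueError("need a non-empty set of words without the end marker")
    return _build([w + END for w in words], 0)


def _build(keys, depth):
    """keys are distinct, end with END, and agree on their first `depth` characters."""
    if len(keys) == 1:
        return keys[0][:-1]                      # leaf: the stored word
    pos = depth
    while all(k[pos] == keys[0][pos] for k in keys):
        pos += 1                                 # still inside a single-child chain
    groups = {}
    for k in keys:
        groups.setdefault(k[pos], []).append(k)
    return {
        "skip": pos - depth,                     # length of the collapsed chain
        "children": {ch: _build(g, pos + 1) for ch, g in groups.items()},
    }


def contains(node, word):
    key, pos = word + END, 0
    while isinstance(node, dict):
        pos += node["skip"]                      # jump over the collapsed characters
        if pos >= len(key):
            return False
        node = node["children"].get(key[pos])
        if node is None:
            return False
        pos += 1
    return node == word                          # verify the skipped characters


tree = build_patricia(["PASS", "PAST", "SAP", "SAT"])
print(contains(tree, "PAST"), contains(tree, "PXST"))   # True False
```

The lookup of "PXST" reaches the leaf for PAST, since the skipped "A" is never examined, and is rejected only by the final comparison.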