The TRIE Amihood Amir
Labeled Trees
Edge Labeled Tree: T = (V, E, ℓ), where ℓ: V → Σ and Σ is the alphabet.
Example: Σ = {A, B, C}
[Figure: example tree with labels A, A, B, C, B]
Path String
A path v0,…,vi in an edge labeled tree defines the path string ℓ(v0),…,ℓ(vi) of the labels of the vertices on the path.
Example: [Figure: a path marked in the example tree with labels A, A, B, C, B]
Path string: AAB
Root Paths
A root path v0,…,vi in an edge labeled tree is a path that starts at the root, i.e. v0 is the root of the tree.
Example: [Figure: two paths in the example tree, one marked "Root Path" (label A) and one marked "Not Root Path"]
Longest Common Prefix
Let S = S[1],…,S[m] and T = T[1],…,T[n] be two strings over alphabet Σ. The Longest Common Prefix (LCP) of S and T is the string a[1],…,a[k] such that a[i] = S[i] = T[i] for i = 1,…,k, and such that S[k+1] ≠ T[k+1] (or k = min(m, n), when one string is a prefix of the other).
Example: The LCP of ABCAABCDABCCC and ABCAABCDACACC is ABCAABCDA.
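The LCP definition can be checked with a few lines of code. This is a minimal sketch; the function name longest_common_prefix is chosen here for illustration and does not come from the slides. It uses 0-based Python indexing, so position k in the code corresponds to S[k+1] in the slide notation.

```python
def longest_common_prefix(s: str, t: str) -> str:
    """Return the longest string that is a prefix of both s and t."""
    k = 0
    # Advance while both strings still have a character at position k
    # and those characters agree.
    while k < len(s) and k < len(t) and s[k] == t[k]:
        k += 1
    return s[:k]


print(longest_common_prefix("ABCAABCDABCCC", "ABCAABCDACACC"))  # ABCAABCDA
```

The loop stops either at the first mismatch or when the shorter string runs out, which covers the case where one string is a prefix of the other.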
reTRIEval
We define a trie T of n strings
S1 = S1[1],…,S1[m1]
S2 = S2[1],…,S2[m2]
…
Sn = Sn[1],…,Sn[mn]
over alphabet Σ by induction on n as follows. Let Λ, $ ∉ Σ.
reTRIEval – base case
For n = 1: S1 = S1[1],…,S1[m1]
The trie is a single path from the root Λ, labeled S1[1], …, S1[m1], $.
reTRIEval – inductive case (1)
Assume we have defined the trie Tn of n strings. The trie T_{n+1} of the n+1 strings
S1 = S1[1],…,S1[m1]
S2 = S2[1],…,S2[m2]
…
Sn = Sn[1],…,Sn[mn]
S_{n+1} = S_{n+1}[1],…,S_{n+1}[m_{n+1}]
is defined as follows:
reTRIEval – inductive case (2)
Let Tn be the trie of the n strings
S1 = S1[1],…,S1[m1]
S2 = S2[1],…,S2[m2]
…
Sn = Sn[1],…,Sn[mn]
and let a[1],…,a[k] be the longest of the strings LCP(S_{n+1}, Si), i = 1,…,n.
reTRIEval – inductive case (3)
Concatenate the path
S_{n+1}[k+1], …, S_{n+1}[m_{n+1}], $
to the node where the root path of string a[1],…,a[k] ends. The resulting tree is T_{n+1}.
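The inductive definition translates fairly directly into an insertion routine. The sketch below is one possible reading, not the author's code: it assumes a node is a plain Python dict mapping an edge label to its child, that "$" never occurs inside a stored string, and the names insert and build_trie are made up for illustration.

```python
END = "$"  # end marker, assumed not to be a symbol of the alphabet


def insert(root: dict, s: str) -> None:
    """Add s to the trie rooted at root, following the inductive case."""
    node = root
    k = 0
    # Follow the root path of a[1..k], the longest prefix of s that is
    # already a root path in the trie.
    while k < len(s) and s[k] in node:
        node = node[s[k]]
        k += 1
    # Attach the remaining symbols of s to the node where that path ends.
    for ch in s[k:]:
        node[ch] = {}
        node = node[ch]
    # Close the string with the end marker.
    node[END] = {}


def build_trie(strings) -> dict:
    """Base case: an empty root (the node labeled Λ); then insert one by one."""
    root: dict = {}
    for s in strings:
        insert(root, s)
    return root


print(build_trie(["ABB", "ABBA"]))
# {'A': {'B': {'B': {'$': {}, 'A': {'$': {}}}}}}
```

The end marker is what lets ABB and ABBA coexist: ABB ends in a "$" child of the node that ABBA passes through.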
Trie construction
Example: ABCABC, ABB, ABBA, ABCB, BBAB, BABC
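Since the drawing for this example did not survive, the trie for these six strings can be regenerated with a short script. This is a sketch under assumptions: nodes are nested dicts, "$" is the end marker, and the helper names build, render and count_nodes are hypothetical. The indented output lists each edge label one level below its parent.

```python
def build(strings, end="$"):
    """Build a trie as nested dicts; each key is an edge label."""
    root = {}
    for s in strings:
        node = root
        for ch in s + end:
            # Reuse the child if the edge exists, otherwise create it.
            node = node.setdefault(ch, {})
    return root


def render(node, depth=0):
    """Return the trie as indented lines, children in sorted label order."""
    lines = []
    for label in sorted(node):
        lines.append("  " * depth + label)
        lines.extend(render(node[label], depth + 1))
    return lines


def count_nodes(node):
    """Count the nodes of the trie, including the root."""
    return 1 + sum(count_nodes(child) for child in node.values())


trie = build(["ABCABC", "ABB", "ABBA", "ABCB", "BBAB", "BABC"])
print("\n".join(render(trie)))
print(count_nodes(trie))  # 23
```

The six strings contain 31 symbols in total once each end marker is counted, but shared prefixes such as AB and ABC are stored once, which is why the trie has only 22 nodes below the root.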
Trie construction Time
For a trie T of n strings
S1 = S1[1],…,S1[m1]
S2 = S2[1],…,S2[m2]
…
Sn = Sn[1],…,Sn[mn]
over fixed finite alphabet Σ: O(m1 + m2 + … + mn), since each string Si is inserted in O(mi) time.
Trie Insertion, Lookup, Deletion Time
For string S = S[1],…,S[m]:
Over fixed finite alphabet Σ: O(m)
Over unbounded alphabet Σ: O(m log n)
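To make the O(m) bound concrete, here is a minimal sketch of the three operations. It is an illustration under assumptions, not the slides' implementation: the class name Trie and its methods are hypothetical, children are stored in Python dicts (expected constant time per step, standing in for an array indexed by a fixed alphabet), and "$" is assumed not to occur in any stored string.

```python
class Trie:
    END = "$"  # end marker, assumed not to be in the alphabet

    def __init__(self):
        self.root = {}

    def insert(self, s):
        """Walk or create one edge per symbol of s, then add the end marker."""
        node = self.root
        for ch in s:
            node = node.setdefault(ch, {})
        node[self.END] = {}

    def lookup(self, s):
        """Return True if s was inserted (not merely a prefix of a stored string)."""
        node = self.root
        for ch in s:
            if ch not in node:
                return False
            node = node[ch]
        return self.END in node

    def delete(self, s):
        """Remove s if present; return True if something was removed."""
        path = [self.root]  # path[i] is the node reached after s[:i]
        for ch in s:
            if ch not in path[-1]:
                return False
            path.append(path[-1][ch])
        if self.END not in path[-1]:
            return False
        del path[-1][self.END]
        # Prune nodes that no longer lead to any stored string.
        for i in range(len(s), 0, -1):
            if path[i]:
                break
            del path[i - 1][s[i - 1]]
        return True


t = Trie()
for word in ("ABB", "ABBA", "ABCB"):
    t.insert(word)
print(t.lookup("ABB"))   # True
print(t.lookup("AB"))    # False
t.delete("ABBA")
print(t.lookup("ABBA"))  # False
print(t.lookup("ABB"))   # True
```

Each operation touches at most m + 1 nodes. Over an unbounded alphabet where symbols can only be compared, the children of a node could instead be kept in a balanced search tree or a sorted array, so each step costs O(log n) comparisons; that is where the O(m log n) bound comes from.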
How do we deal with numbers?
A d-digit number is the string composed of its d digits.
Insertion/deletion/lookup time of number m: O(log m), since m has O(log m) digits.
Compare with AVL: O(log n), where n is the number of stored keys.
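The digit-string idea can be sketched as follows. The code assumes non-negative integers written in decimal, nested dicts as nodes, and "$" as the end marker; the function names insert_number and contains_number are hypothetical. The insertion routine returns the number of edges it walked, which equals the number of digits of m.

```python
END = "$"  # end marker; never a decimal digit


def insert_number(root: dict, m: int) -> int:
    """Insert m as its decimal digit string; return the number of edges walked."""
    node = root
    steps = 0
    for d in str(m):
        node = node.setdefault(d, {})
        steps += 1
    node[END] = {}
    return steps


def contains_number(root: dict, m: int) -> bool:
    """Return True if m was inserted, walking one edge per digit."""
    node = root
    for d in str(m):
        if d not in node:
            return False
        node = node[d]
    return END in node


root = {}
for m in (7, 42, 421):
    insert_number(root, m)
print(insert_number(root, 1024))  # 4
print(contains_number(root, 42))  # True
print(contains_number(root, 4))   # False
```

The cost depends only on the length of the number being handled, not on how many numbers are stored, whereas the AVL bound grows with the number of keys.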