DATA STRUCTURE. Instructor: Dai Min Office: XNA602 Fall 2006. CHAPTER 3 Trees & Binary Trees. Binary Tree Binary tree theorem Binary tree Implementation Binary tree traversals Huffman Tree What is Huffman tree Huffman Code
DATA STRUCTURE Instructor: Dai Min Office: XNA602 Fall 2006
CHAPTER 3 Trees & Binary Trees
• Binary Tree
  • Binary tree theorem
  • Binary tree implementation
  • Binary tree traversals
• Huffman Tree
  • What is a Huffman tree
  • Huffman code
• Trees & Forest
  • Tree implementation & traversals
  • Converting a tree to a binary tree
3.1 Terminology
1) Definition of Tree
• A tree is a finite set of one or more nodes such that:
  • There is a specially designated node called the root.
  • The remaining nodes are partitioned into n >= 0 disjoint sets T1, ..., Tn, where each of these sets is a tree.
  • We call T1, ..., Tn the subtrees of the root.
• A forest is a set of n >= 0 disjoint trees.
[Figure: example tree with 13 nodes A to M, with the root and its subtrees marked.]
2) Terminology
• The degree of a node is the number of subtrees of the node.
  • In the example tree, the degree of A is 3 and the degree of C is 1.
• A node with degree 0 is a leaf (terminal node). Any other node is a nonterminal (internal) node.
• A node that has subtrees is the parent of the roots of those subtrees. The roots of these subtrees are the children of the node.
• Children of the same parent are siblings.
• The ancestors of a node are all the nodes along the path from the root to the node.
• Level and depth: the root is on level 1, its children on level 2, and so on.
[Figure: the 13-node example tree with levels 1 to 4 marked. It has 13 nodes, the degree of the tree is 3, and the height of the tree is 4. The figure also labels a leaf, a nonterminal node, a parent, children, siblings, an ancestor, and the level of a node.]
3.2 Binary Trees
3.2.1 Definition
• A binary tree is a finite set of nodes that is either empty or consists of a root together with two binary trees, called the left subtree and the right subtree, which are disjoint from each other and from the root.
Binary Tree Example
• Notation: node, children, edge, parent, ancestor, descendant, depth, leaf node, internal node, subtree.
ADT of Binary Trees
ADT Binary_Tree (abbreviated BinTree) {
  Objects: a finite set of nodes, either empty or consisting of a root node, a left binary tree, and a right binary tree.
  Functions: for all bt, bt1, bt2 ∈ BinTree, item ∈ element
    Create():
    IsEmpty(bt)
    ……
} ADT BinTree
3.2.2 Theorems of Binary Trees
1) Maximum Number of Nodes in a Binary Tree
• The maximum number of nodes on level i of a binary tree is $2^{i-1}$, i >= 1.
• The maximum number of nodes in a binary tree of depth k is $2^k - 1$, k >= 1.
• Both are proved by induction.
2) Relation between the Number of Leaf Nodes and Nodes of Degree 2
• For any nonempty binary tree T, if $n_0$ is the number of leaf nodes and $n_2$ is the number of nodes of degree 2, then $n_0 = n_2 + 1$.
Proof:
• Let n and B denote the total number of nodes and branches in T.
• Let $n_0$, $n_1$, $n_2$ be the number of nodes with no children, one child, and two children respectively.
• Then $n = n_0 + n_1 + n_2$. Every node except the root has exactly one incoming branch, so $B + 1 = n$. Counting branches by the node they leave gives $B = n_1 + 2n_2$.
• Hence $n_1 + 2n_2 + 1 = n = n_0 + n_1 + n_2$, which gives $n_0 = n_2 + 1$.
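The relation can be checked on a concrete tree by counting nodes of each degree. The following is a minimal sketch; the `char` payload and the helper name `count_degrees` are assumptions made for illustration and do not come from the slides.

```c
#include <assert.h>
#include <stddef.h>

/* Node layout in the style of the slides; the char payload is an assumption. */
typedef struct BiTNode {
    char data;
    struct BiTNode *lchild, *rchild;
} BiTNode;

/* Adds the nodes of subtree t to counts[d], where d is the node's degree
   (0, 1 or 2). The caller zeroes counts[] before the first call. */
void count_degrees(const BiTNode *t, int counts[3]) {
    assert(counts != NULL);
    if (t == NULL) return;
    int degree = (t->lchild != NULL) + (t->rchild != NULL);
    counts[degree]++;
    count_degrees(t->lchild, counts);
    count_degrees(t->rchild, counts);
}
```

After the call, `counts[0] == counts[2] + 1` should hold for any nonempty tree, whatever the value of `counts[1]`.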
3) Full Binary Tree vs. Complete Binary Tree
• A full binary tree of depth k is a binary tree of depth k having $2^k - 1$ nodes, k >= 0.
  • Each node is either a leaf or an internal node with exactly two non-empty children.
• A binary tree with n nodes and depth k is complete iff its nodes correspond to the nodes numbered from 1 to n in the full binary tree of depth k.
  • If the height of the tree is k, then all levels except possibly level k are completely full, and the nodes on the bottom level are packed to the left.
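Complete Binary Tree Theorem (1)
• If a complete binary tree has n nodes, then its depth is $\lfloor \log_2 n \rfloor + 1$.
Proof:
• If the depth of the complete binary tree is h, then $2^{h-1} - 1 < n \le 2^h - 1$.
• Hence $2^{h-1} \le n < 2^h$, so $h - 1 \le \log_2 n < h$.
• h is an integer, so $h - 1 = \lfloor \log_2 n \rfloor$ and $h = \lfloor \log_2 n \rfloor + 1$.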
Complete Binary Tree Theorem (2)
• If a complete binary tree with n nodes is represented sequentially, then for any node with index i, 1 <= i <= n, we have:
  • parent(i) is at $\lfloor i/2 \rfloor$ if i != 1. If i = 1, i is the root and has no parent.
  • left_child(i) is at 2i if 2i <= n. If 2i > n, then i has no left child.
  • right_child(i) is at 2i+1 if 2i+1 <= n. If 2i+1 > n, then i has no right child.
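Both theorems translate directly into integer arithmetic. The sketch below assumes 1-based indices and uses a return value of 0 to mean "no such node"; the function names are illustrative and not taken from the slides.

```c
#include <assert.h>

/* 1-based positions in a complete binary tree with n nodes.
   A result of 0 means "no such node" (a convention assumed here). */
int parent_of(int i, int n) {
    assert(1 <= i && i <= n);
    return i / 2;                        /* integer division; 0 for the root */
}

int left_child_of(int i, int n) {
    assert(1 <= i && i <= n);
    return (2 * i <= n) ? 2 * i : 0;
}

int right_child_of(int i, int n) {
    assert(1 <= i && i <= n);
    return (2 * i + 1 <= n) ? 2 * i + 1 : 0;
}

/* Depth of a complete binary tree with n nodes: floor(log2 n) + 1,
   computed by counting how many times n can be halved. */
int depth_of_complete_tree(int n) {
    assert(n >= 1);
    int depth = 0;
    while (n > 0) {
        depth++;
        n /= 2;
    }
    return depth;
}
```

Counting halvings avoids floating-point `log2`, which can round the wrong way near exact powers of two.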
3.2.3 Binary Tree Implementation
1) Array Implementation
• We can embed the nodes of a binary tree into a one-dimensional array by defining a relationship between the position of each parent node and the position of its children (positions are numbered from 1).
  • left_child of node i is 2*i
  • right_child of node i is 2*i+1
  • parent of node i is i/2 (integer division)
Example
• How much space is required for the array to represent a tree of depth h? $2^h - 1$ slots.
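As a rough illustration of the array layout, the sketch below computes the number of slots for a given depth and walks an array-stored tree in preorder. It assumes slot 0 is unused and that `'\0'` marks an empty slot; both conventions, and the function names, are assumptions rather than part of the slides.

```c
#include <assert.h>
#include <stddef.h>

/* Slots needed to store any binary tree of depth h sequentially: 2^h - 1. */
size_t slots_for_depth(unsigned h) {
    assert(h < 8 * sizeof(size_t));
    return ((size_t)1 << h) - 1;
}

/* Preorder walk over a 1-based array tree[1..n]. tree[0] is unused and
   '\0' marks an empty slot (assumed conventions). Visited labels are
   appended to out starting at position len; the new length is returned.
   The caller provides room for at least n characters in out. */
size_t array_preorder(const char *tree, size_t n, size_t i,
                      char *out, size_t len) {
    assert(tree != NULL && out != NULL);
    if (i > n || tree[i] == '\0') return len;
    out[len++] = tree[i];                               /* visit the node  */
    len = array_preorder(tree, n, 2 * i, out, len);     /* left subtree    */
    return array_preorder(tree, n, 2 * i + 1, out, len);/* right subtree   */
}
```

A tree that is far from complete wastes most of the array: a chain of h nodes that always goes right still occupies $2^h - 1$ slots.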
2) Linked Implementation
• Binary tree linked list
• Each node contains three parts: a data field and two pointers, one to its left child and one to its right child. Node layout: lchild | data | rchild.
• The tree is accessed via a pointer to its root node.
[Figure: a binary tree with nodes A to G and its linked representation; ^ marks a NULL pointer.]
Binary tree node declarations
typedef XXX TElemtype;                 /* XXX stands for the element type */
typedef struct BiTNode {
    TElemtype data;
    struct BiTNode *lchild, *rchild;   /* point to the left child and the right child */
} BiTNode, *BiTree;
BiTree root;                           /* equivalent to: BiTNode *root; */
3.3 Binary Tree Traversals
• Any process that visits the nodes in some order and lists every node exactly once is called a traversal.
• Let L, D and R stand for moving left, visiting the node, and moving right. There are six possible orders:
  • DLR, DRL, LDR, LRD, RDL, RLD
• If we adopt the convention of traversing left before right, only three traversals remain:
  • DLR, LDR, LRD
  • Preorder, Inorder, Postorder
1) Traversals
• Preorder traversal: visit each node before visiting its children.
• Inorder traversal: visit the left subtree, then the node, then the right subtree.
• Postorder traversal: visit each node after visiting its children.
Example:
• Preorder: ABHFDECKG
• Inorder: HBDFAEKCG
• Postorder: HDFBKGCEA
2) Algorithms
Preorder Traversal
• If the tree is empty (NULL), return;
• else
  • Visit the root node (D);
  • Preorder traverse its left subtree (L);
  • Preorder traverse its right subtree (R).
• Example (expression tree): - + a * b - c d / e f
void PreOrder(BiTree bt)
{   /* bt is the root of the binary tree */
    if (bt == NULL) return;
    printf("%c", bt->data);     /* visit the root (assumes character data) */
    PreOrder(bt->lchild);       /* preorder traverse the left subtree */
    PreOrder(bt->rchild);       /* preorder traverse the right subtree */
}
Inorder Traversal
• If the tree is empty (NULL), return;
• else
  • Inorder traverse its left subtree (L);
  • Visit the root node (D);
  • Inorder traverse its right subtree (R).
• Example (expression tree): a + b * c - d - e / f
void InOrder(BiTree bt)
{   /* bt is the root of the binary tree */
    if (bt == NULL) return;
    InOrder(bt->lchild);        /* inorder traverse the left subtree */
    printf("%c", bt->data);     /* visit the root (assumes character data) */
    InOrder(bt->rchild);        /* inorder traverse the right subtree */
}
Postorder Traversal
• If the tree is empty (NULL), return;
• else
  • Postorder traverse its left subtree (L);
  • Postorder traverse its right subtree (R);
  • Visit the root node (D).
• Example (expression tree): a b c d - * + e f / -
void PostOrder(BiTree bt)
{   /* bt is the root of the binary tree */
    if (bt == NULL) return;
    PostOrder(bt->lchild);      /* postorder traverse the left subtree */
    PostOrder(bt->rchild);      /* postorder traverse the right subtree */
    printf("%c", bt->data);     /* visit the root (assumes character data) */
}
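The three traversals differ only in where the visit happens relative to the two recursive calls, which the sketch below makes explicit with a single function. It collects labels into a buffer instead of printing them. The `Order` enum, the helper names and the 64-byte buffer are assumptions made for illustration. The test tree is reconstructed from the traversal strings in the earlier example (ABHFDECKG, HBDFAEKCG, HDFBKGCEA), since the figure itself is not part of the text.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct BiTNode {
    char data;
    struct BiTNode *lchild, *rchild;
} BiTNode;

typedef enum { PREORDER, INORDER, POSTORDER } Order;

/* Appends c at out[len], keeping one byte free for the terminating NUL. */
static size_t put(char *out, size_t len, size_t cap, char c) {
    assert(len + 1 < cap);
    out[len] = c;
    return len + 1;
}

/* Appends the labels of subtree t to out in the requested order and
   returns the new length. Only the position of the visit changes. */
size_t traverse(const BiTNode *t, Order order, char *out, size_t len, size_t cap) {
    if (t == NULL) return len;
    if (order == PREORDER) len = put(out, len, cap, t->data);
    len = traverse(t->lchild, order, out, len, cap);
    if (order == INORDER) len = put(out, len, cap, t->data);
    len = traverse(t->rchild, order, out, len, cap);
    if (order == POSTORDER) len = put(out, len, cap, t->data);
    return len;
}

/* Returns 1 when the traversal of t spells `expected`.
   Assumes the tree has fewer than 64 nodes. */
int traversal_equals(const BiTNode *t, Order order, const char *expected) {
    char buf[64];
    size_t len = traverse(t, order, buf, 0, sizeof buf);
    buf[len] = '\0';
    return strcmp(buf, expected) == 0;
}
```

Writing into a buffer rather than calling `printf` keeps the traversal testable and independent of the element's print format.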
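3) Create a binary tree
• Create a binary tree from its preorder list, where '#' marks an empty subtree.
• Example: ABC##DE#G##F###
#include <stdio.h>
#include <stdlib.h>
BiTree CreateBiTree(void)
{
    char ch;
    BiTree T;
    if (scanf("%c", &ch) != 1 || ch == '#')   /* end of input or empty subtree */
        T = NULL;
    else {
        if (!(T = (BiTNode *)malloc(sizeof(BiTNode))))
            exit(OVERFLOW);                   /* OVERFLOW: error code defined elsewhere */
        T->data = ch;
        T->lchild = CreateBiTree();
        T->rchild = CreateBiTree();
    }
    return T;                                 /* returned on every path, including NULL */
}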
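A variant that reads the preorder list from a string instead of standard input is easier to test. The sketch below is one possible form under stated assumptions: the function names are illustrative, and it returns NULL rather than exiting when memory runs out or the input ends early.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct BiTNode {
    char data;
    struct BiTNode *lchild, *rchild;
} BiTNode, *BiTree;

/* Builds a tree from a preorder string in which '#' marks an empty
   subtree. *s is advanced past the characters consumed. Returns NULL for
   an empty subtree, and also (an assumption of this sketch) when the
   input ends early or malloc fails. */
BiTree build_from_preorder(const char **s) {
    assert(s != NULL && *s != NULL);
    if (**s == '\0') return NULL;
    char ch = *(*s)++;
    if (ch == '#') return NULL;
    BiTree t = malloc(sizeof *t);
    if (t == NULL) return NULL;
    t->data = ch;
    t->lchild = build_from_preorder(s);   /* left subtree comes first */
    t->rchild = build_from_preorder(s);   /* then the right subtree   */
    return t;
}

/* Releases every node of the tree (postorder, children before parent). */
void free_tree(BiTree t) {
    if (t == NULL) return;
    free_tree(t->lchild);
    free_tree(t->rchild);
    free(t);
}
```

For the example list `ABC##DE#G##F###`, the whole string is consumed and the root A ends up with an empty right subtree.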
3.4 Huffman Tree & Applications
3.4.1 Huffman Trees
1) What is a Huffman tree?
• Path length: a path from node $n_1$ to $n_k$ in a tree is a sequence of nodes $n_1, n_2, \ldots, n_k$ such that $n_i$ is the parent of $n_{i+1}$ for $1 \le i < k$. The length of this path is the number of edges on the path.
• Weighted path length (WPL): $WPL = \sum_{i=1}^{n} w_i l_i$
  • $w_i$ is the weight of the i-th leaf;
  • $l_i$ is the path length from the root to the i-th leaf.
[Figure: a binary tree with leaves a, b, c, d of weights 7, 5, 2, 4.]
• For the tree in the figure, WPL = 7*3 + 5*3 + 2*1 + 4*2 = 46.
• Huffman tree: the binary tree with the minimum weighted path length for the given leaf weights.
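The WPL definition can be evaluated with one recursive pass that carries the current depth. This is a minimal sketch; the `WNode` type and the function name are assumptions, and internal nodes simply ignore their `weight` field.

```c
#include <assert.h>
#include <stddef.h>

/* Tree node carrying a weight; only leaf weights are used here. */
typedef struct WNode {
    int weight;
    struct WNode *lchild, *rchild;
} WNode;

/* Weighted path length: the sum over all leaves of
   weight * (number of edges from the root). Call with depth = 0. */
int wpl(const WNode *t, int depth) {
    assert(depth >= 0);
    if (t == NULL) return 0;
    if (t->lchild == NULL && t->rchild == NULL)
        return t->weight * depth;                 /* leaf contributes w * l */
    return wpl(t->lchild, depth + 1) + wpl(t->rchild, depth + 1);
}
```

Arranging the leaves as in the slide's computation (7 and 5 at depth 3, 4 at depth 2, 2 at depth 1) gives 46, while placing the heavy leaves near the root (7 at depth 1, 5 at depth 2, 2 and 4 at depth 3) gives 35, the minimum for these weights.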
2) How to construct a Huffman tree?
• Initialize a set of n one-node binary trees F = { T1, T2, …, Tn }, one for each of the given weights w1, w2, ..., wn.
• Do the following n - 1 times:
  • Find two trees T' and T'' in F whose roots have the minimal weights w' and w''.
  • Replace these two trees with a binary tree whose root has weight w' + w'' and whose subtrees are T' and T''.
Example: W = { 6, 7, 8, 14, 3, 11 }
[Figure: the forest after each step. The successive merges are 3 + 6 = 9, 7 + 8 = 15, 9 + 11 = 20, 14 + 15 = 29 and 20 + 29 = 49, so the root of the final Huffman tree has weight 49.]
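The merge loop above can be sketched on a plain array of root weights, without building the tree itself. The function name and the in-place strategy are assumptions made for illustration. The sum of all merged weights equals the WPL of the resulting Huffman tree, because each leaf weight is added once for every level it sits below the root.

```c
#include <assert.h>

/* Repeatedly merges the two smallest weights in w[0..n-1]; w is
   overwritten. The n-1 merged weights are stored in merged[] in the
   order they are produced. Returns their sum, which is the WPL of
   the Huffman tree for the original weights. */
int huffman_merge(int *w, int n, int *merged) {
    assert(w != 0 && n >= 1);
    int total = 0;
    for (int step = 0; step < n - 1; step++) {
        int live = n - step;                 /* trees still in the forest */
        /* Move the smallest weight to w[live-1], the next to w[live-2]. */
        for (int k = 1; k <= 2; k++) {
            int min = 0;
            for (int j = 1; j <= live - k; j++)
                if (w[j] < w[min]) min = j;
            int tmp = w[min];
            w[min] = w[live - k];
            w[live - k] = tmp;
        }
        int sum = w[live - 1] + w[live - 2];
        w[live - 2] = sum;                   /* merged tree replaces the pair */
        merged[step] = sum;
        total += sum;
    }
    return total;
}
```

The linear scans make this quadratic in n, which is fine for a classroom example; a priority queue would be the usual choice for large inputs.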
3.4.2 Applications
1) Huffman Coding
• character -> binary bits (codes) -> character. The first step is coding, the second is decoding.
Fixed-length coding:
• Example 1: ASCII codes use 8 bits per character.
• Example 2: if a file contains 100000 characters drawn from C = {a, b, c, d, e, f}, then with the coding a - 000, b - 001, …… the total code length is 300000 bits.
• Advantage: easy decoding.
• Disadvantage: the codes are expensive (too long to transmit).
Variable-length coding:
• Takes advantage of the relative frequency of characters to save space.
• Uses shorter codes for more frequently occurring characters.
• Advantage: increases the efficiency of the codes.
• Disadvantage: decoding can be ambiguous.
【Example】A coding: a - 01, b - 010, c - 101, d - 100, e - 00, f - 011
• Codes: 010100 can be read as 01 01 00 (a a e) or as 010 100 (b d).
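The ambiguity in the example can be confirmed by counting how many ways a bit string splits into code words. The sketch below uses a small dynamic program over suffixes; the function name and the 63-bit input limit are assumptions of this illustration.

```c
#include <assert.h>
#include <string.h>

/* Number of ways to split `bits` into a sequence of code words taken
   from codes[0..ncodes-1]. A result above 1 means the bit string is
   ambiguous under this code. Inputs are limited to 63 bits here. */
long count_decodings(const char *bits, const char *const codes[], int ncodes) {
    size_t n = strlen(bits);
    assert(n < 64);
    long ways[64] = {0};      /* ways[i]: decodings of the suffix starting at i */
    ways[n] = 1;              /* the empty suffix has exactly one decoding */
    for (size_t i = n; i-- > 0; ) {
        for (int c = 0; c < ncodes; c++) {
            size_t len = strlen(codes[c]);
            if (len > 0 && i + len <= n && strncmp(bits + i, codes[c], len) == 0)
                ways[i] += ways[i + len];
        }
    }
    return ways[0];
}
```

For the coding in the example, the string 010100 has two decodings, which is exactly the ambiguity the slide points out.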
Prefix coding:
• A set of codes is said to meet the prefix property if no code in the set is the prefix of another.
• How to get prefix codes?
  • Code the leaves of a binary tree.
  • The left branch is associated with 0.
  • The right branch is associated with 1.
  • The code of a leaf is the bit string labeling the path from the root to the leaf.
• The codes are prefix codes because each character is associated with a leaf and there is a unique path from the root to each leaf, so the path to one leaf never continues on to another leaf.
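The prefix property is easy to test directly by comparing every pair of code words. The following is a small sketch; the function name is illustrative, and two identical code words are treated as a violation.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if no code word in codes[0..ncodes-1] is a prefix of a
   different code word in the set, and 0 otherwise. Two identical code
   words count as a violation. */
int has_prefix_property(const char *const codes[], int ncodes) {
    assert(codes != 0 || ncodes == 0);
    for (int i = 0; i < ncodes; i++) {
        size_t len = strlen(codes[i]);
        for (int j = 0; j < ncodes; j++) {
            /* codes[i] is a prefix of codes[j] if the first len
               characters of codes[j] match codes[i]. */
            if (i != j && strncmp(codes[i], codes[j], len) == 0)
                return 0;
        }
    }
    return 1;
}
```

The variable-length coding a - 01, b - 010, ... fails this check because 01 is a prefix of 010, while codes read off the leaves of a binary tree always pass it.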
How to minimize the expected code length?
• The expected length of the codes is $w_1 l_1 + w_2 l_2 + \cdots + w_n l_n$.
  • $w_i$ is a measure of the frequency of occurrence of character $C_i$.
  • $l_1, l_2, \ldots, l_n$ are the lengths of the codes for characters $C_1, C_2, \ldots, C_n$, respectively.
• The code with the minimum expected length is the Huffman code.
Huffman Coding
• Construct a Huffman tree using the weights w1, w2, ..., wn associated with the characters C1, C2, ..., Cn.
• Label the left and right branch of every node with 0 and 1, respectively.
• The code for character Ci is the bit string labeling the path from the root to leaf Ci in the Huffman tree.
• Huffman codes provide the minimum expected length because the binary tree is built from the bottom up.
  • The lowest-weight characters are placed at the bottom of the tree first. The more frequently occurring characters are merged in near the top later.
  • Less frequently occurring characters have longer paths from root to leaf. More frequently occurring characters have shorter paths from root to leaf.
Algorithm
#define m (2 * n - 1)       /* total number of nodes; n is the number of characters */
typedef struct
{   int weight;
    int lchild, rchild, parent;
} HTNode, *Huffmantree;
HTNode HT[m + 1];           /* HT[0] is unused; alternatively allocate the table through a Huffmantree pointer */
① Initialize HT (set lchild, rchild, parent and weight to 0)
② Read w1, w2, …, wn and store them in HT[1] ~ HT[n]
③ Do the following n-1 times (i = n+1 ~ 2n-1):
• Among the nodes HT[1] ~ HT[i-1] whose parent is 0, find the two roots of minimal weight w' and w''. S1 and S2 are the positions of these root nodes in HT, respectively.
• Combine HT[S1] and HT[S2] and store the result in HT[i]:
  HT[i].weight = HT[S1].weight + HT[S2].weight
  HT[i].lchild = S1
  HT[i].rchild = S2
  HT[S1].parent = i
  HT[S2].parent = i
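The three steps can be sketched as follows for a fixed example size. `N`, `M`, `select_min`, `build_huffman`, `code_length` and `huffman_code` are names chosen for this illustration, not taken from the slides, and ties between equal weights are broken by the lower index, which is an assumption. Codes are read by walking the `parent` links from a leaf up to the root, with left = 0 and right = 1.

```c
#include <assert.h>
#include <string.h>

#define N 6                 /* number of characters (example size) */
#define M (2 * N - 1)       /* total nodes in the Huffman tree */

typedef struct {
    int weight;
    int lchild, rchild, parent;   /* 0 means "none"; slot 0 is unused */
} HTNode;

/* Index of the lightest root (parent == 0) in ht[1..last], ignoring `skip`. */
static int select_min(const HTNode *ht, int last, int skip) {
    int best = 0;
    for (int j = 1; j <= last; j++) {
        if (ht[j].parent != 0 || j == skip) continue;
        if (best == 0 || ht[j].weight < ht[best].weight) best = j;
    }
    return best;
}

/* Steps 1 to 3: fill ht[1..M] from the N leaf weights in w[0..N-1]. */
void build_huffman(HTNode ht[M + 1], const int w[N]) {
    memset(ht, 0, (M + 1) * sizeof(HTNode));
    for (int i = 1; i <= N; i++) ht[i].weight = w[i - 1];
    for (int i = N + 1; i <= M; i++) {
        int s1 = select_min(ht, i - 1, 0);
        int s2 = select_min(ht, i - 1, s1);
        ht[i].weight = ht[s1].weight + ht[s2].weight;
        ht[i].lchild = s1;
        ht[i].rchild = s2;
        ht[s1].parent = i;
        ht[s2].parent = i;
    }
}

/* Code length of leaf i: the number of parent links up to the root. */
int code_length(const HTNode *ht, int i) {
    assert(1 <= i && i <= N);
    int len = 0;
    while (ht[i].parent != 0) { i = ht[i].parent; len++; }
    return len;
}

/* Writes the code of leaf i into buf, filling it from the last bit
   backwards while climbing to the root. buf needs N bytes. */
void huffman_code(const HTNode *ht, int i, char buf[N]) {
    int len = code_length(ht, i);
    buf[len] = '\0';
    for (int c = i; ht[c].parent != 0; c = ht[c].parent)
        buf[--len] = (ht[ht[c].parent].lchild == c) ? '0' : '1';
}
```

With the earlier weights W = { 6, 7, 8, 14, 3, 11 }, the root ends up in `ht[11]` with weight 49, and the weighted sum of the code lengths is 122.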
2) Huffman Tree in a decision problem
• Problem: convert a score a into a grade. The proportion of inputs in each score range is:
  Grade          E       D       C       B       A
  Score range    0~59    60~69   70~79   80~89   90~100
  Proportion     0.05    0.15    0.40    0.30    0.10
[Figure: decision trees (a), (b) and (c) for this problem, built from the tests a<60, a<70, a<80, a<90 and the range tests 60≤a<70, 70≤a<80, 80≤a<90.]
• If there are 10000 input values, tree (a) needs 31500 comparisons and tree (c) needs 22000 comparisons.
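The two totals follow from weighting the number of comparisons each grade needs by its proportion. The sketch below does this with integer percentages to avoid rounding. The per-grade comparison counts used in the usage note are assumptions about the shape of trees (a) and (c), which are not reproduced in the text; they were chosen because they reproduce the totals stated above.

```c
#include <assert.h>

/* Total comparisons for `inputs` values when class k holds percent[k]
   percent of the inputs and takes cmp[k] comparisons to classify. */
long total_comparisons(const int percent[], const int cmp[],
                       int nclasses, long inputs) {
    long weighted = 0;
    int sum = 0;
    for (int k = 0; k < nclasses; k++) {
        weighted += (long)percent[k] * cmp[k];
        sum += percent[k];
    }
    assert(sum == 100);          /* the proportions must cover all inputs */
    return weighted * inputs / 100;
}
```

Assuming tree (a) tests a<60, a<70, a<80, a<90 in sequence, grades E, D, C, B, A need 1, 2, 3, 4, 4 comparisons, which gives 31500 for 10000 inputs. Assuming tree (c) tests a<80 first, then a<70 and a<60 on one side and a<90 on the other, the grades need 3, 3, 2, 2, 2 comparisons, which gives 22000.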
3.5 General Trees
3.5.1 Tree Implementation
1) Parent Pointer Implementation
• Advantage: the parent of a node is easy to look up.
2) Lists of Children
• Form 1: every node has the layout data | child1 | child2 | ... | childd, where d is the degree of the tree.
  • Problem: NULL pointers waste space.
• Form 2: every node has the layout data | degree | child1 | child2 | ... | childdegree.
  • Problem: the node structure is inconsistent.
• Solution: lists of children.
Example:
[Figure: a tree with nodes a to i stored in an array at positions 1 to 9. Each entry holds data and fc, the head of a linked list of the positions of the node's children; ^ marks the end of a list or an empty list.]
• How do we look up a parent?
Lists of children with parent
[Figure: the same array with a parent field added to each entry, holding the position of the node's parent.]
3) Leftmost Child/Right Sibling
• Each node contains three parts: a data field and two pointers, one to its leftmost child and one to its right sibling. Node layout: data | firstChild | nextSibling.
[Figure: a tree with nodes a to i and its leftmost-child/right-sibling representation; ^ marks a NULL pointer.]
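A minimal sketch of this representation follows. The `CSNode` type and the helper names are assumptions made for illustration; the field names `firstChild` and `nextSibling` follow the node layout above.

```c
#include <assert.h>
#include <stddef.h>

/* Leftmost-child/right-sibling node; the char payload is an assumption. */
typedef struct CSNode {
    char data;
    struct CSNode *firstChild, *nextSibling;
} CSNode;

/* Makes child the last (rightmost) child of parent. */
void add_child(CSNode *parent, CSNode *child) {
    assert(parent != NULL && child != NULL);
    if (parent->firstChild == NULL) {
        parent->firstChild = child;
        return;
    }
    CSNode *s = parent->firstChild;
    while (s->nextSibling != NULL) s = s->nextSibling;
    s->nextSibling = child;
}

/* Degree of a node: the length of the sibling chain under firstChild. */
int degree(const CSNode *node) {
    assert(node != NULL);
    int d = 0;
    for (const CSNode *c = node->firstChild; c != NULL; c = c->nextSibling)
        d++;
    return d;
}

/* Preorder of the general tree: visit the node, then its first child's
   subtree, then its remaining siblings. Appends labels to out and
   returns the new length; the caller provides a large enough buffer. */
size_t tree_preorder(const CSNode *t, char *out, size_t len) {
    if (t == NULL) return len;
    out[len++] = t->data;
    len = tree_preorder(t->firstChild, out, len);
    return tree_preorder(t->nextSibling, out, len);
}
```

Reading `firstChild` as a left pointer and `nextSibling` as a right pointer gives a binary tree, and `tree_preorder` is then exactly the binary preorder traversal of that tree.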
3.5.2 Converting to a Binary Tree
• Any tree can be transformed into a binary tree.
[Figure: a tree with nodes A to E, its corresponding binary tree, and the linked storage of each; ^ marks a NULL pointer.]
Transform a tree into a binary tree
[Figure: step-by-step transformation of a tree with nodes A to I into a binary tree.]