Searching
Motivation
• Find parts for a system
• Find an address for a name
• Find a criminal
  • fingerprint/DNA match
• Locate all employees in a dept.
  • based on a collection of criteria
  • across multiple tables
• Find shortest path (network, roads, etc.)
Linear Search
• Items to be searched are in a list x0, x1, …, xn-1
• Need == and < operators defined for the type
• Start with item 0
• until (end of list or target found)
  • compare another item
• Best case: found on 1st comparison
• Worst case: found on nth comparison (or not found after n comparisons)
Linear Search – vector/array
#include <cstddef>
#include <vector>
// Returns 1 if item is in v, 0 otherwise.
int LinearSearch(const std::vector<int> &v, const int &item) {
    for (std::size_t i = 0; i < v.size(); i++)
        if (item == v[i]) return 1;
    return 0; // not found
}
// # of compares, summed over all n possible target positions: 1+2+3+…+n = ½*n*(n+1)
// average = ½*n*(n+1)/n = (n+1)/2 ≈ n/2
// average search time is O(n/2) = O(n)
Linear Search – single-linked list
// NodePointer is assumed to point to a node with data and next members.
int LinearSearch(NodePointer first, const int &item) {
    for (NodePointer loc = first; loc != NULL; loc = loc->next) {
        if (item == loc->data) return 1;
    }
    return 0; // not found
}
Worst case computing time is still O(n)
Binary search
• Significantly faster than linear
• Repeatedly "halving" the problem
• We can divide a set of n items in half at most log₂ n times
• For performance COMPARISONS, we ignore the base for the log (2 in this case)
• Complexity of binary search is O(log n)
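The halving idea can be sketched as follows. This is a minimal illustration, not code from the slides: the name BinarySearch is hypothetical, the vector is assumed to be sorted in ascending order, and it returns an index (or -1) rather than the 1/0 flag used by LinearSearch.

```cpp
#include <cstddef>
#include <vector>

// Returns the index of item in the sorted vector v, or -1 if it is absent.
// Assumption: v is sorted in ascending order.
int BinarySearch(const std::vector<int> &v, int item) {
    std::size_t low = 0;
    std::size_t high = v.size();            // search the half-open range [low, high)
    while (low < high) {
        std::size_t mid = low + (high - low) / 2;
        if (v[mid] == item) return static_cast<int>(mid);
        if (v[mid] < item)
            low = mid + 1;                  // target can only be in the right half
        else
            high = mid;                     // target can only be in the left half
    }
    return -1;                              // not found
}
```

Each pass through the loop halves the remaining range, which is where the O(log n) bound comes from.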
Some Observations
• Binary usually outperforms linear search
• Both work on sequential (linear) storage; binary search also needs direct access to the middle item
• For binary search, data must be ordered (sorted)
• Searching is done on a "key"
  • a piece of data unique to each item
  • often smaller than the actual data
  • e.g., your B-number vs. whole name
• A non-linear linked structure is better
  • there are several kinds of "tree" structures
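Big Oh - Formal Definition (again)
• f(n) = O(g(n)) if there exist constants c > 0 and n0 such that f(n) ≤ c·g(n) for all n ≥ n0
• Thus, g(n) is an upper bound on f(n) (up to a constant factor, for large n)
• Note: f(n) = O(g(n)) is read "f(n) has a complexity of g(n)"; this is NOT the same as O(g(n)) = f(n)
• The '=' is not the usual "=" operator (it is not symmetric)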
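As a worked example, take the usual definition (f(n) = O(g(n)) when constants c > 0 and n0 exist with f(n) ≤ c·g(n) for all n ≥ n0) and apply it to the average number of compares in linear search, (n+1)/2. The constants chosen below are one valid choice among many.

```latex
\[
f(n) = \frac{n+1}{2} \;\le\; \frac{n+n}{2} = n \quad \text{for all } n \ge 1
\;\Longrightarrow\; f(n) = O(n) \ \text{with } c = 1,\ n_0 = 1 .
\]
```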
Trees
• A data structure which consists of
  • a finite set of elements called nodes or vertices
  • a finite set of directed arcs which connect the nodes
• If the tree is nonempty
  • one of the nodes (the root) has no incoming arc
  • every other node can be reached from the root by following a unique sequence of consecutive arcs
Trees
• Each node has n >= 0 children
• Topmost node is the "root"
• Binary tree: each node has 0, 1, or 2 children
• Engineering uses:
  • Huffman encoding (data compression)
  • expression evaluation
  • sorting & searching
  • electric power distribution grid
  • go/no-go decision-making
Binary Search Tree
• Consider an ordered list of integers
• Examine middle element
• Examine left, right sublist (maintain pointers)
• (Recursively) examine left, right sublists
5 22 45 52 63 75 90
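Binary Search Tree
• Redraw as a treelike shape
• this is a binary tree

             52               (root)
          /      \
        22        75          (75 is the parent of 63, 90)
       /  \      /  \
      5    45  63    90       (leaves; 63 and 90 are the children of 75)

• each child together with its descendants forms a subtree (e.g. 22, 5, 45)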
Binary Search Tree (BST)
• A binary tree
• Left child <= Parent value <= Right child (and likewise for everything in the left and right subtrees)
• Several tree operations available
  • construction
  • test for empty
  • search for item
  • insert
  • delete
  • traverse (visit each node exactly once)
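A minimal sketch of the construction, insert, and search operations, assuming a linked node with integer data. The names BSTNode, Insert, Search, and Destroy are hypothetical, and delete is left out to keep the example short.

```cpp
// Hypothetical node type: data plus pointers to the left and right children.
struct BSTNode {
    int data;
    BSTNode *left;
    BSTNode *right;
    explicit BSTNode(int value) : data(value), left(nullptr), right(nullptr) {}
};

// Inserts value and returns the (possibly new) root of the subtree.
// Smaller values go left; equal or larger values go right.
BSTNode *Insert(BSTNode *root, int value) {
    if (root == nullptr) return new BSTNode(value);
    if (value < root->data)
        root->left = Insert(root->left, value);
    else
        root->right = Insert(root->right, value);
    return root;
}

// Follows one path from the root, discarding a whole subtree at each step.
bool Search(const BSTNode *root, int value) {
    while (root != nullptr) {
        if (value == root->data) return true;
        root = (value < root->data) ? root->left : root->right;
    }
    return false;
}

// Frees the tree: children first, then the node itself.
void Destroy(BSTNode *root) {
    if (root == nullptr) return;
    Destroy(root->left);
    Destroy(root->right);
    delete root;
}
```

Inserting 52, 22, 75, 5, 45, 63, 90 in that order reproduces the tree drawn from the ordered list 5 22 45 52 63 75 90. Inserting the same values in sorted order would instead produce a chain, and searching it would degrade to O(n).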
Binary Tree terms
• Full tree (proper tree, 2-tree, strictly binary)
  • all nodes have exactly 0 or 2 children
• Complete tree
  • all levels (except maybe the last) are filled
  • all leaves are "pushed" left
• Balanced tree
  • heights of the L/R sub-trees of EVERY node differ by no more than 1 level
• Perfect tree
  • all leaves at same depth, and every non-leaf node has 2 children
Implementations
• An array can be used
  • insertion, deletion, re-arranging VERY messy
  • searching, sorting inefficient
  • not useful for "sparse" trees (missing data)
  • very hard to traverse recursively
• Linked tree
  • nodes like those in Stacks, Queues & Lists
  • pointer to left-child
  • pointer to right-child
  • data
Recursive Descent
• A binary tree is either empty or
• Has a data-node (root) with 2 subtree pointers
  • left-tree
  • right-tree
  • the subtrees are disjoint
• Each sub-tree follows the same definition
• Leads to simple recursive search programs
Recursive Tree Traversal
void Traverse(node* ptr) {
    if (ptr == NULL)       // the binary tree is empty
        return;
    else {                 // recursion here
        // process root data (ptr->data); root first, so this is pre-order
        Traverse(ptr->left);
        Traverse(ptr->right);
    }
}
3 possible traversals
• Pre-order
  • data, left-sub-tree, right-sub-tree
  • first-touch
• In-order
  • left-sub-tree, data, right-sub-tree
  • 2nd touch
• Post-order
  • left-sub-tree, right-sub-tree, data
  • last-touch
Traversal Order
• Given expression A – B * C + D
• Operator precedence (highest to lowest) is: ^, then * /, then + -
• This is normal infix order
• Each operand is the child of a parent node
• The parent node holds the corresponding operator
The expression tree

          +
        /   \
       -     D
      / \
     A   *
        / \
       B   C
Remaining traversals
• Prefix: + - A * B C D
• Postfix (Reverse Polish Notation – RPN): A B C * - D +
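The three orders can be checked with a small sketch that builds the tree for A - B * C + D by hand and walks it. The names ExprNode, Order, Walk, and ExampleTraversal are hypothetical; a single Walk function emits the node on its first, second, or last touch depending on the requested order.

```cpp
#include <string>

// Hypothetical expression-tree node: one character per operator or operand.
struct ExprNode {
    char symbol;
    const ExprNode *left;
    const ExprNode *right;
};

enum Order { PRE_ORDER, IN_ORDER, POST_ORDER };

// Appends the symbols of the tree rooted at n to out in the requested order.
void Walk(const ExprNode *n, Order order, std::string &out) {
    if (n == nullptr) return;
    if (order == PRE_ORDER) out += n->symbol;    // first touch
    Walk(n->left, order, out);
    if (order == IN_ORDER) out += n->symbol;     // second touch
    Walk(n->right, order, out);
    if (order == POST_ORDER) out += n->symbol;   // last touch
}

// Builds the tree for A - B * C + D and returns one traversal of it.
std::string ExampleTraversal(Order order) {
    ExprNode a = {'A', nullptr, nullptr};
    ExprNode b = {'B', nullptr, nullptr};
    ExprNode c = {'C', nullptr, nullptr};
    ExprNode d = {'D', nullptr, nullptr};
    ExprNode times = {'*', &b, &c};
    ExprNode minus = {'-', &a, &times};
    ExprNode plus = {'+', &minus, &d};
    std::string out;
    Walk(&plus, order, out);
    return out;
}
```

With this tree, PRE_ORDER yields +-A*BCD, IN_ORDER yields A-B*C+D, and POST_ORDER yields ABC*-D+.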
Stack Applications
• base-ten to base-two conversion
  • remainders need to be printed in the reverse of the order in which they are calculated
• run-time stack of function activation records
  • push when a function is called
  • pop when a function exits
• arithmetic expression evaluation
  • easier to evaluate when stored in postfix (RPN)
  • infix to postfix conversion algorithm uses a stack
  • evaluating postfix is easy using a stack for operands
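The base-ten to base-two conversion is a compact illustration of why a stack fits: repeated division by 2 produces the low-order bit first, and popping the stack reverses that order. A minimal sketch, with ToBinary as a hypothetical name:

```cpp
#include <stack>
#include <string>

// Converts a non-negative integer to its base-two digits as a string.
std::string ToBinary(unsigned int n) {
    if (n == 0) return "0";
    std::stack<int> remainders;
    while (n > 0) {
        remainders.push(static_cast<int>(n % 2));   // low-order bit comes out first
        n /= 2;
    }
    std::string bits;
    while (!remainders.empty()) {                   // popping reverses the order
        bits += static_cast<char>('0' + remainders.top());
        remainders.pop();
    }
    return bits;
}
```

For example, 10 produces the remainders 0, 1, 0, 1 in that order, and popping them gives 1010.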
Infix to Postfix
infix expression: (3 + 4) * 5 - 2
postfix expression: 3 4 + 5 * 2 -
• scan input from L to R
• if operand, output it
• else // must be an operator or "(" or ")"
  • if "(" then push & loop
  • if operator (left-associative)
    • pop & output while the stack top is an operator with prec(top) >= prec(input)
    • push the input
  • if ")"
    • pop & output until "("
    • remove & discard the "("
    • discard the incoming ")"
• if end of input, pop & output until empty
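The conversion steps can be sketched as below. This is an illustration under stated assumptions rather than a full parser: operands are single letters or digits, only the left-associative operators + - * / and parentheses appear, and the input is well formed. The names Precedence, Emit, and InfixToPostfix are hypothetical.

```cpp
#include <cctype>
#include <stack>
#include <string>

// Higher number = binds tighter. "(" gets 0 so no operator ever pops it.
int Precedence(char op) {
    if (op == '*' || op == '/') return 2;
    if (op == '+' || op == '-') return 1;
    return 0;
}

// Appends one token to the output, followed by a separating space.
void Emit(std::string &out, char token) {
    out += token;
    out += ' ';
}

std::string InfixToPostfix(const std::string &infix) {
    std::stack<char> ops;
    std::string out;
    for (char ch : infix) {
        unsigned char uch = static_cast<unsigned char>(ch);
        if (std::isspace(uch)) continue;
        if (std::isalnum(uch)) {
            Emit(out, ch);                          // operand: output it
        } else if (ch == '(') {
            ops.push(ch);
        } else if (ch == ')') {
            while (!ops.empty() && ops.top() != '(') {
                Emit(out, ops.top());
                ops.pop();
            }
            if (!ops.empty()) ops.pop();            // discard the "("
        } else {                                    // operator
            while (!ops.empty() && Precedence(ops.top()) >= Precedence(ch)) {
                Emit(out, ops.top());
                ops.pop();
            }
            ops.push(ch);
        }
    }
    while (!ops.empty()) {                          // end of input: flush the stack
        Emit(out, ops.top());
        ops.pop();
    }
    if (!out.empty()) out.erase(out.size() - 1);    // drop the trailing space
    return out;
}
```

Under these assumptions, the slide's input (3 + 4) * 5 - 2 converts to 3 4 + 5 * 2 -. Supporting the right-associative ^ operator would need a strict > comparison for that operator, which this sketch does not handle.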
Evaluating Postfix
• use a stack to evaluate a postfix expression
• read values into stack until operator reached
• pop 2 values and apply operator
  • be sure to maintain order of operands
  • a-b is not the same as b-a
• push result value onto stack
• repeat until no input remains; the single value left on the stack is the result
postfix expression: 3 4 + 5 * 2 -
• push 3 and 4
• see the +, pop 4 and 3, add, push 7
• push 5
• see the *, pop 5 and 7, multiply, push 35
• push the 2
• see the -, pop 2 and 35, subtract (35 - 2), push 33
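The evaluation loop can be sketched as follows, assuming a well-formed expression whose tokens are separated by spaces and whose operands are non-negative integers. EvaluatePostfix is a hypothetical name, and no error handling is attempted.

```cpp
#include <cctype>
#include <sstream>
#include <stack>
#include <string>

// Evaluates a space-separated postfix expression over + - * / (integer division).
int EvaluatePostfix(const std::string &postfix) {
    std::stack<int> operands;
    std::istringstream tokens(postfix);
    std::string token;
    while (tokens >> token) {
        if (std::isdigit(static_cast<unsigned char>(token[0]))) {
            operands.push(std::stoi(token));
        } else {
            // The right operand was pushed last, so it comes off first.
            int right = operands.top(); operands.pop();
            int left = operands.top(); operands.pop();
            switch (token[0]) {
                case '+': operands.push(left + right); break;
                case '-': operands.push(left - right); break;
                case '*': operands.push(left * right); break;
                case '/': operands.push(left / right); break;
            }
        }
    }
    return operands.top();   // the one remaining value is the result
}
```

For the slide's expression 3 4 + 5 * 2 - this returns 33. Popping the right operand before the left one is what keeps a-b from turning into b-a.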