Searching

Presentation Transcript


  1. Searching

  2. Motivation • Find parts for a system • Find an address for name • Find a criminal • fingerprint/DNA match • Locate all employees in a dept. • based on a collection of criteria • across multiple tables • Find shortest path (network, roads, etc.)

  3. Linear Search • Items to be searched are in a list x0, x1, …, xn-1 • Need == and < operators defined for the type • Start with item 0 • until (end of list or target found) • compare another item • Best case, found on 1st comparison • Worst case, found on nth comparison

  4. Linear Search – vector/array
     #include <cstddef>
     #include <vector>
     using std::vector;
     // Returns 1 if item is in v, 0 otherwise
     int LinearSearch(const vector<int> &v, const int &item)
     {
         for (std::size_t i = 0; i < v.size(); i++)
             if (item == v[i]) return 1;
         return 0; // not found
     }
     // # of compares, summed over all n possible target positions: 1+2+3+…+n = ½*n*(n+1)
     // average = ½*n*(n+1)/n = (n+1)/2 ≈ n/2
     // average search time is O(n/2) = O(n)

  5. Linear Search – single-linked list
     // NodePointer is assumed to point to a node with data and next members
     int LinearSearch(NodePointer first, const int &item)
     {
         for (NodePointer loc = first; loc != NULL; loc = loc->next)
         {
             if (item == loc->data) return 1;
         }
         return 0; // not found
     }
     Worst case computing time is still O(n)

  6. Binary search • Significantly faster than linear • Repeatedly "halving" the problem • We can divide a set of n items in half at most log2 n times • For performance COMPARISONS, we ignore the base for the log (2 in this case) • Complexity of binary search is O(log n)
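
The halving idea above can be sketched in C++ roughly as follows. This is a minimal illustration, assuming the items sit in a sorted std::vector<int>; the name BinarySearch and the convention of returning an index (or -1 when absent) are assumptions made for the example, not taken from the slides.

```cpp
#include <vector>

// Returns the index of item in the sorted vector v, or -1 if it is absent.
// (Hypothetical helper; the name and return convention are illustrative.)
int BinarySearch(const std::vector<int> &v, int item)
{
    int low = 0;
    int high = static_cast<int>(v.size()) - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;   // midpoint without overflowing low + high
        if (v[mid] == item)
            return mid;
        else if (v[mid] < item)
            low = mid + 1;                  // target can only be in the right half
        else
            high = mid - 1;                 // target can only be in the left half
    }
    return -1;                              // range is empty: not found
}
```

Each pass through the loop discards half of the remaining range, which is where the log2 n bound comes from. Called on the list 5 22 45 52 63 75 90, a search for 52 succeeds on the first comparison because 52 is the middle element.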

  7. Some Observations • Binary usually outperforms linear search • Binary search requires sequential (direct-access) storage, such as an array or vector • For binary search, data must be ordered (sorted) • Searching is done on a "key" • a piece of data unique to each item • often smaller than the actual data • e.g., your B-number vs. whole name • A non-linear linked structure is better • there are several kinds of "tree" structures

  8. Big Oh - Formal Definition (again) • f(n) = O(g(n)) if there exist constants c > 0 and n0 such that f(n) <= c*g(n) for all n >= n0 • Thus, g(n) is an upper bound on f(n) • Note: f(n) = O(g(n)) reads "f(n) has a complexity of g(n)"; this is NOT the same as O(g(n)) = f(n) • The '=' is not the usual "=" operator (it is not symmetric)
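
As a worked instance, the standard definition (constants c > 0 and n_0 with f(n) <= c g(n) for all n >= n_0) can be applied to the average linear-search cost from slide 4, f(n) = (n+1)/2, with g(n) = n. The choice c = 1, n_0 = 1 below is just one pair of constants that works, not the only one.

```latex
f(n) = \frac{n+1}{2} \;\le\; \frac{n+n}{2} = n = 1 \cdot g(n)
\quad \text{for all } n \ge 1,
\qquad \text{so } f(n) = O(n) \text{ with } c = 1,\; n_0 = 1 .
```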

  9. Trees • A data structure which consists of • a finite set of elements called nodes or vertices • a finite set of directed arcs which connect the nodes • If the tree is nonempty • one of the nodes (the root) has no incoming arc • every other node can be reached from the root by following a unique sequence of consecutive arcs

  10. Trees • Each node has n >= 0 children • Topmost node is the "root" • Binary tree: each node has 0, 1, or 2 children • Engineering uses: • Huffman encoding (data compression) • expression evaluation • sorting & searching • electric power distribution grid • go/no-go decision-making

  11. Binary Search Tree • Consider an ordered list of integers • Examine middle element • Examine left, right sublist (maintain pointers) • (Recursively) examine left, right sublists • Example list: 5 22 45 52 63 75 90

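  12. Binary Search Tree • Redraw as a treelike shape • this is a binary tree [Figure: binary tree with root 52; left subtree rooted at 22 with children 5 and 45; right subtree rooted at 75, the parent of 63 and 90 (the children of 75); the leaves are 5, 45, 63 and 90]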

  13. Binary Search Tree (BST) • A binary tree • Left-child <= Parent value <= Right child • Several tree operations available • construction • test for empty • search for item • insert • delete • traverse (visit each node exactly once)
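
Two of the operations listed above, insert and search, might look like the following in C++. This is only a sketch under stated assumptions: the BSTNode type and the function names are hypothetical, keys are plain ints, and duplicates are sent to the right subtree (one of several conventions consistent with Left-child <= Parent <= Right child).

```cpp
// Hypothetical node type for a BST of ints (names are illustrative).
struct BSTNode {
    int data;
    BSTNode *left;
    BSTNode *right;
    explicit BSTNode(int d) : data(d), left(nullptr), right(nullptr) {}
};

// Insert item and return the (possibly new) root of this subtree.
// Duplicates go to the right; that is an assumption of this sketch.
BSTNode *Insert(BSTNode *root, int item)
{
    if (root == nullptr)
        return new BSTNode(item);
    if (item < root->data)
        root->left = Insert(root->left, item);
    else
        root->right = Insert(root->right, item);
    return root;
}

// Return true if item is stored in the tree rooted at root.
bool Search(const BSTNode *root, int item)
{
    while (root != nullptr) {
        if (item == root->data)
            return true;
        // Smaller keys live on the left, larger (or equal) on the right.
        root = (item < root->data) ? root->left : root->right;
    }
    return false;
}

// Release every node (children before parent).
void Destroy(BSTNode *root)
{
    if (root == nullptr) return;
    Destroy(root->left);
    Destroy(root->right);
    delete root;
}
```

Inserting 52, 22, 75, 5, 45, 63, 90 in that order reproduces the tree from slide 12, with 52 at the root. Inserting the same keys in sorted order would instead produce a degenerate, list-like tree, so the shape depends on insertion order.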

  14. Binary Tree terms • Full tree (proper tree, 2-tree, strictly binary) • all nodes have exactly 0 or 2 children • Complete tree • all levels (except maybe the last) are filled • all leaves are "pushed" left • Balanced tree • heights of the L/R sub-trees of EVERY node differ by no more than 1 level • Perfect tree – all leaves at same depth
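
The "balanced" condition can be checked directly from its definition. The sketch below is one straightforward way to do it, assuming a hypothetical TNode type and counting the height of an empty tree as 0; it recomputes heights at every node, so it favors clarity over speed.

```cpp
#include <algorithm>
#include <cstdlib>

// Hypothetical node type (illustrative only).
struct TNode {
    int data;
    TNode *left;
    TNode *right;
};

// Number of levels in the tree: 0 for an empty tree, 1 for a single node.
int Height(const TNode *t)
{
    if (t == nullptr) return 0;
    return 1 + std::max(Height(t->left), Height(t->right));
}

// True if, at EVERY node, the left and right subtree heights differ by at most 1.
bool IsBalanced(const TNode *t)
{
    if (t == nullptr) return true;
    int diff = std::abs(Height(t->left) - Height(t->right));
    return diff <= 1 && IsBalanced(t->left) && IsBalanced(t->right);
}
```

For example, a root whose left subtree has two levels and whose right subtree is empty fails the test, even though that left subtree is balanced on its own.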

  15. Implementations • An array can be used • insertion, deletion, re-arranging VERY messy • searching, sorting inefficient • not useful for "sparse" trees (missing data) • very hard to traverse recursively • Linked tree • nodes like those in Stacks, Queues & Lists • pointer to left-child • pointer to right-child • data

  16. Recursive Descent • A binary tree is either empty or • Has a data-node (root) with 2 subtree ptrs • left-tree • right-tree • the subtrees are disjoint • Each sub-tree follows the same definition • Leads to simple recursive search programs

  17. Recursive Tree Traversal
      void Traverse(node* ptr)
      {
          if (ptr == NULL)          // the binary tree is empty
              return;
          else                      // recursion here
          {
              Process(ptr->data);   // process root data (Process is a placeholder)
              Traverse(ptr->left);
              Traverse(ptr->right);
          }
      }

  18. 3 possible traversals • Pre-order • data, left-sub-tree, right-sub-tree • first-touch • In-order • left-sub-tree, data, right-sub-tree • 2nd touch • Post-order • left-sub-tree, right-sub-tree, data • last-touch

  19. Traversal Order • Given expression A – B * C + D • Operator precedence is: ^ * / + - • This is normal infix order • Each operand is the child of a parent node • the parent node holds the corresponding operator

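  20. The expression tree [Figure: root +, with left child - and right child D; - has left child A and right child *; * has children B and C]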

  21. Remaining traversals • Prefix: + - A * B C D • Postfix (Reverse Polish Notation – RPN): A B C * - D +
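
The three orders can be reproduced by running the pre-, in- and post-order traversals over the expression tree for A – B * C + D. The code below is a small sketch; the ExprNode type and the function names are hypothetical, and each traversal simply appends node symbols to a string.

```cpp
#include <string>

// Hypothetical expression-tree node: an operator or a one-letter operand.
struct ExprNode {
    char symbol;
    ExprNode *left;
    ExprNode *right;
};

// Pre-order: data, left sub-tree, right sub-tree (gives prefix form).
void PreOrder(const ExprNode *t, std::string &out)
{
    if (t == nullptr) return;
    out += t->symbol;
    PreOrder(t->left, out);
    PreOrder(t->right, out);
}

// In-order: left sub-tree, data, right sub-tree (gives infix form).
void InOrder(const ExprNode *t, std::string &out)
{
    if (t == nullptr) return;
    InOrder(t->left, out);
    out += t->symbol;
    InOrder(t->right, out);
}

// Post-order: left sub-tree, right sub-tree, data (gives postfix form).
void PostOrder(const ExprNode *t, std::string &out)
{
    if (t == nullptr) return;
    PostOrder(t->left, out);
    PostOrder(t->right, out);
    out += t->symbol;
}
```

With + at the root, - and D as its children, A and * under -, and B and C under *, the three functions produce +-A*BCD, A-B*C+D and ABC*-D+ respectively. Note that the in-order output carries no parentheses, so it matches the original infix only because precedence happens to agree with the tree shape here.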

  22. Stack Applications • base-ten to base-two conversion • remainders need to be printed in the reverse of the order in which they are calculated • run-time stack of function activation records • push when a function is called • pop when a function exits • arithmetic expression evaluation • easier to evaluate when stored in postfix (RPN) • infix to postfix conversion algorithm uses a stack • evaluating postfix is easy using a stack for operands
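
The first application, base-ten to base-two conversion, shows why a stack fits: repeated division yields the least significant bit first, and popping reverses the order. A minimal sketch, assuming a non-negative input and a hypothetical function name ToBinary:

```cpp
#include <stack>
#include <string>

// Convert a non-negative integer to its base-two digit string.
// (Hypothetical helper; the name is illustrative.)
std::string ToBinary(unsigned int n)
{
    if (n == 0) return "0";
    std::stack<int> remainders;
    while (n > 0) {
        remainders.push(static_cast<int>(n % 2));  // least significant bit comes out first
        n /= 2;
    }
    std::string bits;
    while (!remainders.empty()) {                  // popping reverses the order
        bits += static_cast<char>('0' + remainders.top());
        remainders.pop();
    }
    return bits;
}
```

For 26 the remainders are produced as 0, 1, 0, 1, 1 and popped as 11010.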

  23. Infix to Postfix infix expression: (3 + 4) * 5 - 2 postfix expression: 3 4 + 5 * 2 - • scan input from L to R • if operand, output it • else // must be an operator or "(" or ")" • if "(" then push & loop • if operator • while the stack is not empty, top is not "(" and prec(top) >= prec(input): pop & output • push the input • if ")" • pop & output until "(" • remove & discard the "(" • discard the incoming ")" • if end of input, pop & output until empty
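
The conversion steps can be sketched in C++ as below. Several simplifications are assumed: operands are single letters or digits, only the left-associative operators + - * / are handled (the right-associative ^ is left out), the input is well formed, and the names Prec and InfixToPostfix are hypothetical.

```cpp
#include <cctype>
#include <stack>
#include <string>

// Precedence of a binary operator; higher binds tighter.
int Prec(char op)
{
    if (op == '*' || op == '/') return 2;
    if (op == '+' || op == '-') return 1;
    return 0;   // '(' or anything else
}

// Convert infix (single-character operands, + - * /, parentheses)
// into a space-separated postfix string.
std::string InfixToPostfix(const std::string &infix)
{
    std::stack<char> ops;
    std::string out;
    for (char ch : infix) {
        unsigned char uch = static_cast<unsigned char>(ch);
        if (std::isspace(uch)) continue;            // skip blanks
        if (std::isalnum(uch)) {                    // operand: output it
            out += ch; out += ' ';
        } else if (ch == '(') {
            ops.push(ch);
        } else if (ch == ')') {
            while (!ops.empty() && ops.top() != '(') {
                out += ops.top(); out += ' ';
                ops.pop();
            }
            if (!ops.empty()) ops.pop();            // discard the "("
        } else {                                    // operator
            // Pop operators of equal or higher precedence first.
            while (!ops.empty() && ops.top() != '(' &&
                   Prec(ops.top()) >= Prec(ch)) {
                out += ops.top(); out += ' ';
                ops.pop();
            }
            ops.push(ch);
        }
    }
    while (!ops.empty()) {                          // end of input: flush the stack
        out += ops.top(); out += ' ';
        ops.pop();
    }
    if (!out.empty()) out.pop_back();               // drop trailing space
    return out;
}
```

Popping on equal precedence (>= rather than >) is what makes operators of the same level group left to right, so A - B + C becomes A B - C + rather than A B C + -.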

  24. Evaluating Postfix • use a stack to evaluate a postfix expression • read values into stack until operator reached • pop 2 values and apply operator • be sure to maintain order of operands • a-b is not the same as b-a • push result value onto stack • repeat until no input remains; the single value left on the stack is the result postfix expression: 3 4 + 5 * 2 - • push 3 and 4 • see the +, pop 4 and 3, add, push 7 • push 5 • see the *, pop 5 and 7, multiply, push 35 • push the 2 • see the -, pop 2 and 35, subtract (35 - 2), push 33 • end of input: the result is 33
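
A compact C++ sketch of this evaluation follows. It assumes space-separated tokens, integer operands, and the four operators + - * /; the name EvalPostfix is hypothetical, and division by zero is not checked.

```cpp
#include <sstream>
#include <stack>
#include <stdexcept>
#include <string>

// Evaluate a space-separated postfix expression of integers and + - * /.
int EvalPostfix(const std::string &postfix)
{
    std::stack<int> values;
    std::istringstream in(postfix);
    std::string token;
    while (in >> token) {
        if (token == "+" || token == "-" || token == "*" || token == "/") {
            if (values.size() < 2)
                throw std::runtime_error("too few operands");
            int right = values.top(); values.pop();   // popped first = right operand
            int left  = values.top(); values.pop();   // popped second = left operand
            if (token == "+")      values.push(left + right);
            else if (token == "-") values.push(left - right);
            else if (token == "*") values.push(left * right);
            else                   values.push(left / right);
        } else {
            values.push(std::stoi(token));            // operand: push its value
        }
    }
    if (values.size() != 1)
        throw std::runtime_error("malformed expression");
    return values.top();                              // the single remaining value
}
```

The order of the two pops is the detail the slide warns about: the first value popped is the right operand, so 7 2 - evaluates to 5, not -5.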
