
Compsci 201 Trees, Tries, Tradeoffs

Explore the main ideas of DNA LinkStrand and the importance of splicing in low level links. Discover the basics of cut and splice and the benefits of using linked lists. Examine the complexity of string concatenation and the advantages of using StringBuilder. Learn how the JVM can optimize your code and when to optimize.



Presentation Transcript


  1. Compsci 201 Trees, Tries, Tradeoffs Owen Astrachan Jeff Forbes November 1, 2017 Compsci 201, Fall 2017, Trees, Tries, Tradeoffs

  2. R is for … • R: Programming language of choice in Stats • Random: From Monte-Carlo to [0,1) • Recursion: Base case of wonderful stuff • Refactoring: Better not different

  3. Plan for the Day • Tree Review • From Theory to Practice • From Recurrences to Code • What are the main ideas in DNA LinkStrand? • Why Splicing matters with low level links • Trees, Tries, Tradeoffs

  4. From Links to … • What is the DNA/LinkStrand assignment about? • Why do we study linked lists? • How do you work in a group?

  5. Basics of cutAndSplice • Find enzyme like ‘gat’ • Replace with splicee like ‘gggtttaaa’ • Strings and StringBuilder for creating new strings • Complexity of “hello” + “world”, or A + B, where A and B are the string lengths • String: copies A + B characters, StringBuilder append: copies B characters
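
The find-and-replace idea on this slide can be sketched with a StringBuilder. This is a minimal illustration, not the assignment's code: the class name `CutAndSpliceSketch`, the method signature, and the sample strand are assumptions made for the example, and an empty enzyme is treated as "no cuts".

```java
class CutAndSpliceSketch {
    // Replace every occurrence of enzyme in strand with splicee.
    // Assumption for this sketch: an empty enzyme means nothing is cut.
    static String cutAndSplice(String strand, String enzyme, String splicee) {
        if (enzyme.isEmpty()) {
            return strand;
        }
        StringBuilder ret = new StringBuilder();
        int start = 0;
        int pos = strand.indexOf(enzyme, start);
        while (pos >= 0) {
            ret.append(strand, start, pos);   // text before this cut
            ret.append(splicee);              // spliced-in replacement
            start = pos + enzyme.length();    // skip past the enzyme
            pos = strand.indexOf(enzyme, start);
        }
        ret.append(strand, start, strand.length()); // text after the last cut
        return ret.toString();
    }

    public static void main(String[] args) {
        System.out.println(cutAndSplice("aagatccgatt", "gat", "gggtttaaa"));
    }
}
```

Each append copies only the characters being added, so the total work is proportional to the length of the result rather than growing with every cut.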

  6. What do linked lists get us? • Faster run-time, much better use of memory • We splice in constant time? Re-use strings
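
One way to picture the re-use claim is a chain of nodes in which every splice point refers to the same splicee String. The sketch below is an assumption-laden illustration, not the LinkStrand class from the assignment: `LinkSpliceSketch`, `Node`, and `join` are hypothetical names.

```java
class LinkSpliceSketch {
    // Hypothetical node type for this sketch: one piece of the strand per node.
    static class Node {
        String info;
        Node next;
        Node(String s) { info = s; }
    }

    // Build a chain of pieces; each cut adds a node that refers to the
    // same splicee object, so no splicee characters are copied.
    static Node cutAndSplice(String strand, String enzyme, String splicee) {
        Node head = new Node("");   // dummy first node simplifies appending
        Node tail = head;
        int start = 0;
        int pos = enzyme.isEmpty() ? -1 : strand.indexOf(enzyme);
        while (pos >= 0) {
            tail.next = new Node(strand.substring(start, pos)); // piece before the cut
            tail = tail.next;
            tail.next = new Node(splicee);   // constant time: stores a reference only
            tail = tail.next;
            start = pos + enzyme.length();
            pos = strand.indexOf(enzyme, start);
        }
        tail.next = new Node(strand.substring(start)); // piece after the last cut
        return head.next;
    }

    // Concatenate the pieces, only needed when the full string is wanted.
    static String join(Node list) {
        StringBuilder sb = new StringBuilder();
        for (Node n = list; n != null; n = n.next) {
            sb.append(n.info);
        }
        return sb.toString();
    }
}
```

In this sketch the pieces between cuts are still copied once by substring; the saving is that a long splicee inserted many times occupies memory only once.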

  7. String Concatenation Examined • https://coursework.cs.duke.edu/201fall17/d9-linked-trees/blob/master/src/StringPlay.java • Runtime of stringConcat(“hello”,N) • Depends on size of ret: 5, 10, 15, …, 5*N public String stringConcat(String s, int reps) { String ret = ""; for (int k = 0; k < reps; k++) { ret += s; } return ret; }

  8. StringBuilder Examined • https://coursework.cs.duke.edu/201fall17/d9-linked-trees/blob/master/src/StringPlay.java • Runtime of builderConcat(“hello”,N) • 5 + 5 + 5 + … + 5 a total of N times public String builderConcat(String s, int reps) { StringBuilder ret = new StringBuilder(); for (int k = 0; k < reps; k++) { ret.append(s); } return ret.toString(); }
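
The two sums on these slides can be compared directly: repeated String concatenation copies 5 + 10 + … + 5N = 5N(N+1)/2 characters, while the builder copies 5N. The sketch below only counts characters under that simple model; it ignores the builder's occasional internal resizing and the final toString, and the class and method names are made up for the example.

```java
class ConcatCostSketch {
    // Characters copied by "ret += s" repeated reps times, where s has
    // length len: each += builds a new string of the current total size.
    static long stringConcatCopies(int len, int reps) {
        long total = 0;
        long size = 0;
        for (int k = 0; k < reps; k++) {
            size += len;     // ret grows by len
            total += size;   // the whole new ret is copied
        }
        return total;
    }

    // Characters copied by reps calls to append(s): len each time.
    static long builderCopies(int len, int reps) {
        return (long) len * reps;
    }

    public static void main(String[] args) {
        System.out.println(stringConcatCopies(5, 1000)); // 2502500
        System.out.println(builderCopies(5, 1000));      // 5000
    }
}
```

The first count grows quadratically in N and the second linearly, which is the gap the two slides point at.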

  9. Theory and Practice • The JVM can sometimes optimize your code • Don’t optimize what you don’t have to … • http://www.pellegrino.link/2015/08/22/string-concatenation-with-java-8.html • WOTO http://bit.ly/201fall17-nov1-strings

  10. danah boyd • Dr. danah boyd is a Senior Researcher at Microsoft Research, … a Visiting Researcher at Harvard Law School, … Her work examines everyday practices involving social media, with specific attention to youth engagement, privacy, and risky behaviors. She recently co-authored Hanging Out, Messing Around, and Geeking Out: Kids Living and Learning with New Media. "From day one, Mark Zuckerberg wanted Facebook to become a social utility. He succeeded. Facebook is now a utility for many. The problem with utilities is that they get regulated." http://bit.ly/ySwjyl

  11. Trees: no data structure lovelier?

  12. Trees from Bottom to Top • Trees!! Trying to get the best of a few worlds: • efficient lookup: like sorted array • efficient add: like linked list • range-queries: see java.util.NavigableSet • Reminder: hashing is really, really fast • O(1) add, search, delete independent of N • BUT! No order info, worst case can be bad
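
The range-query point can be illustrated with java.util.TreeSet, the standard tree-based implementation of NavigableSet. The animal names are borrowed from later slides; the class name `RangeQuerySketch` is an assumption for the example.

```java
import java.util.Arrays;
import java.util.NavigableSet;
import java.util.TreeSet;

class RangeQuerySketch {
    // A sorted set of sample words; a HashSet could not answer the queries below
    // without examining every element.
    static NavigableSet<String> animals() {
        return new TreeSet<>(Arrays.asList(
            "llama", "giraffe", "tiger", "jaguar", "monkey", "elephant"));
    }

    public static void main(String[] args) {
        NavigableSet<String> set = animals();
        // All words w with "g" <= w < "m"
        System.out.println(set.subSet("g", true, "m", false)); // [giraffe, jaguar, llama]
        // Smallest word >= "k"
        System.out.println(set.ceiling("k"));                  // llama
        // Smallest word overall
        System.out.println(set.first());                       // elephant
    }
}
```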

  13. Binary Trees • Binary trees: Not just for searching, used in many contexts • Game trees, collisions, … • Cladistics, genomics, quad trees, … • Search is O(log n) like sorted array • Average case. Note: worst case can also be O(log n), e.g., use a balanced tree • insertion/deletion O(1), once location found

  14. How do search trees work? [Figure: search tree with nodes “llama”, “giraffe”, “tiger”, “jaguar”, “monkey”, “elephant”, “leopard”, “pig”, “hippo”; “koala” is being inserted] • Change doubly-linked lists, no longer linear • Similar to binary search, everything less goes left, everything greater goes right • How do we search? • How do we insert? • Insert: “koala”
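
The "how do we search?" question can be answered with a short loop: compare with the current node and go left or right. This is a sketch under assumptions, not code from the slides; `SearchSketch`, `contains`, and `sample` are hypothetical names, and the small tree uses six of the animals from the figure.

```java
class SearchSketch {
    static class TreeNode {
        String info;
        TreeNode left, right;
        TreeNode(String s, TreeNode l, TreeNode r) { info = s; left = l; right = r; }
    }

    // Walk down from the root: smaller values are to the left, larger to the right.
    static boolean contains(TreeNode t, String target) {
        while (t != null) {
            int cmp = target.compareTo(t.info);
            if (cmp == 0) {
                return true;
            }
            t = (cmp < 0) ? t.left : t.right;
        }
        return false;   // fell off the tree: target is not present
    }

    // Hand-built search tree: llama at the root, giraffe and tiger below it.
    static TreeNode sample() {
        TreeNode giraffe = new TreeNode("giraffe",
            new TreeNode("elephant", null, null),
            new TreeNode("jaguar", null, null));
        TreeNode tiger = new TreeNode("tiger",
            new TreeNode("monkey", null, null), null);
        return new TreeNode("llama", giraffe, tiger);
    }
}
```

Searching for “koala” in this sample goes left at llama, right at giraffe, right at jaguar, and stops at a null link, which is exactly where an insertion would place it.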

  15. Review: tree terminology [Figure: binary tree with nodes A, B, C, D, E, F, G] • Binary tree is a structure: • empty, or • root node with left and right subtrees • Tree Terminology • parent and child: A is parent of B, E is child of B • leaf node has no children, internal node has 1 or 2 children • path is sequence of nodes (edges), N1, N2, …, Nk, where Ni is parent of Ni+1 • depth (level) of node: length of root-to-node path • level of root is 1 (measured in nodes) • height of node: length of longest node-to-leaf path • height of tree is height of root

  16. A TreeNode by any other name… [Figure: tree with nodes “llama”, “giraffe”, “tiger”] • What does this look like? Doubly linked list? public class TreeNode { TreeNode left; TreeNode right; String info; TreeNode(String s, TreeNode llink, TreeNode rlink) { info = s; left = llink; right = rlink; } }

  17. Tree function: Tree height • Compute tree height (longest root-to-leaf path) int height(TreeNode root) { if (root == null) return 0; else { return 1 + Math.max(height(root.left), height(root.right)); } } • Find height of left subtree, height of right subtree • Use results to determine height of tree

  18. Tree function: Leaf Count • Calculate Number of Leaf Nodes int leafCount(TreeNode root) { if (root == null) return 0; if (root.left == null && root.right == null) return 1; return leafCount(root.left) + leafCount(root.right); } • Similar to height, but has two base cases • Use results of recursive calls to determine # leaves

  19. Tree functions Analyzed int height(TreeNode root) { if (root == null) return 0; else { return 1 + Math.max(height(root.left), height(root.right)); } } • Let T(n) be time for height to run on n-node tree • T(n) = 2T(n/2) + O(1), roughly balanced • T(n) = T(n-1) + T(1) + O(1) = T(n-1) + O(1), unbalanced

  20. Good Search Trees and Bad Trees http://www.9wy.net/onlinebook/CPrimerPlus5/ch17lev1sec7.html

  21. Bad Trees and Good Trees

  22. Don’t do this at home? • Let T(n) be time for height to execute (n-node tree) • T(n) = T(n/2) + T(n/2) + O(1) • T(n) = 2T(n/2) + 1 • T(n) = 2[2T(n/4) + 1] + 1 • T(n) = 4T(n/4) + 2 + 1 • T(n) = 8T(n/8) + 4 + 2 + 1, eureka! • T(n) = 2^k T(n/2^k) + 2^k - 1, why true? • T(n) = nT(1) + O(n) is O(n) if we let n = 2^k • Different recurrence, same solution if unbalanced

  23. Recurrence Table • Develop recurrence, look up solution • Remember: goal is big-Oh, recurrence helps

  24. Balanced Trees and Complexity • A tree is height-balanced if • Left and right subtrees are height-balanced • Left and right heights differ by at most one boolean isBalanced(TreeNode root) { if (root == null) return true; return isBalanced(root.left) && isBalanced(root.right) && Math.abs(height(root.left) - height(root.right)) <= 1; }

  25. What is complexity? • Assume trees “balanced” in analyzing complexity • Roughly half the nodes in each subtree • Leads to easier analysis • How to develop recurrence relation? • What is T(n)? Time func executes on n-node tree • What other work? Express recurrence, solve it • How to solve recurrence relation • Plug, expand, plug, expand, find pattern • Proof requires induction to verify correctness

  26. Recurrence relation • T(n): time for isBalanced to execute (n-node tree) • T(n) = T(n/2) + T(n/2) + O(n) • T(n) = 2T(n/2) + n • T(n) = 2[2T(n/4) + n/2] + n • T(n) = 4T(n/4) + n + n = 4T(n/4) + 2n • T(n) = 8T(n/8) + 3n, eureka! • T(n) = 2^k T(n/2^k) + kn, why true? • T(n) = nT(1) + n log(n), let n = 2^k, so k = log n • So, solution for T(n) = 2T(n/2) + O(n) is • O(n log n), base 2, but base doesn't matter
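
The O(n) term in this recurrence comes from calling height inside every isBalanced call. A common alternative, not shown in the slides, computes the height and the balance check in the same pass, which gives T(n) = 2T(n/2) + O(1) and so O(n). The sketch below assumes the same TreeNode shape as the slides; `BalanceSketch` and `balancedHeight` are hypothetical names.

```java
class BalanceSketch {
    static class TreeNode {
        String info;
        TreeNode left, right;
        TreeNode(String s, TreeNode l, TreeNode r) { info = s; left = l; right = r; }
    }

    // Returns the height in nodes if the tree is height-balanced,
    // or -1 as soon as some subtree is found to be unbalanced.
    static int balancedHeight(TreeNode root) {
        if (root == null) {
            return 0;
        }
        int lh = balancedHeight(root.left);
        if (lh < 0) {
            return -1;
        }
        int rh = balancedHeight(root.right);
        if (rh < 0) {
            return -1;
        }
        if (Math.abs(lh - rh) > 1) {
            return -1;
        }
        return 1 + Math.max(lh, rh);
    }

    // Each node is visited at most once, with constant work per node.
    static boolean isBalanced(TreeNode root) {
        return balancedHeight(root) >= 0;
    }
}
```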

  27. Printing a search tree in order [Figure: search tree with nodes “llama”, “giraffe”, “tiger”, “jaguar”, “monkey”, “elephant”, “leopard”, “pig”, “hippo”] • When is root printed? • After left subtree, before right subtree. void visit(TreeNode t) { if (t != null) { visit(t.left); System.out.println(t.info); visit(t.right); } } • Inorder traversal

  28. Tree traversals [Figure: search tree with nodes “llama”, “giraffe”, “tiger”, “jaguar”, “monkey”, “elephant”] • Inorder visits search tree in order • Visit left-subtree, process root, visit right-subtree: elephant, giraffe, jaguar, llama, monkey, tiger • Navigate following nodes/links? • Visit on passing under • Second time by node

  29. Tree traversals [Figure: search tree with nodes “llama”, “giraffe”, “tiger”, “jaguar”, “monkey”, “elephant”] • Preorder good for reading/writing trees • Process root, then visit left-subtree, visit right-subtree: llama, giraffe, elephant, jaguar, tiger, monkey • Navigate following nodes/links? • Visit on passing-by to left • First time by node

  30. Tree traversals [Figure: search tree with nodes “llama”, “giraffe”, “tiger”, “jaguar”, “monkey”, “elephant”] • Post order good for deleting tree • Visit left-subtree, visit right-subtree, process root: elephant, jaguar, giraffe, monkey, tiger, llama • Navigate following nodes/links? • Visit on passing up • Third time by node
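
The three traversal orders listed on these slides differ only in where the root is processed relative to the two recursive calls. The sketch below collects each order into a list for the six-node tree from the figures; `TraversalSketch` and `sample` are names assumed for the example.

```java
import java.util.ArrayList;
import java.util.List;

class TraversalSketch {
    static class TreeNode {
        String info;
        TreeNode left, right;
        TreeNode(String s, TreeNode l, TreeNode r) { info = s; left = l; right = r; }
    }

    static void inorder(TreeNode t, List<String> out) {
        if (t == null) return;
        inorder(t.left, out);
        out.add(t.info);          // root between the subtrees
        inorder(t.right, out);
    }

    static void preorder(TreeNode t, List<String> out) {
        if (t == null) return;
        out.add(t.info);          // root before the subtrees
        preorder(t.left, out);
        preorder(t.right, out);
    }

    static void postorder(TreeNode t, List<String> out) {
        if (t == null) return;
        postorder(t.left, out);
        postorder(t.right, out);
        out.add(t.info);          // root after the subtrees
    }

    // The tree from the figures: llama at the root.
    static TreeNode sample() {
        TreeNode giraffe = new TreeNode("giraffe",
            new TreeNode("elephant", null, null),
            new TreeNode("jaguar", null, null));
        TreeNode tiger = new TreeNode("tiger",
            new TreeNode("monkey", null, null), null);
        return new TreeNode("llama", giraffe, tiger);
    }

    public static void main(String[] args) {
        List<String> in = new ArrayList<>();
        inorder(sample(), in);
        System.out.println(in);   // [elephant, giraffe, jaguar, llama, monkey, tiger]
    }
}
```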

  31. Tree WOTO http://bit.ly/201f17-nov1-trees Pride in social group? Urban dictionary?

  32. What does insertion look like? • Simple recursive insertion into tree (accessed by root) root = insert(root, "foo"); TreeNode insert(TreeNode t, String s) { if (t == null) t = new TreeNode(s, null, null); else if (s.compareTo(t.info) <= 0) t.left = insert(t.left, s); else t.right = insert(t.right, s); return t; }

  33. Notes on tree insert and search • In each recursive insert call • Tree parameter in the call is either the left or right field of some node in the original tree • Will be assignment to a .left or .right field! • Idiom t = treeMethod(t,…) used • Good trees go bad, what happens and why? • Insert alpha, beta, gamma, delta, epsilon, … • https://coursework.cs.duke.edu/201fall17/d9-linked-trees/blob/master/src/TreePlay.java
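
The "good trees go bad" question can be explored by measuring height after different insertion orders. This sketch reuses the recursive insert and height shapes from the slides; `InsertOrderSketch` and `heightAfterInserting` are hypothetical names, and it is not the linked TreePlay.java.

```java
class InsertOrderSketch {
    static class TreeNode {
        String info;
        TreeNode left, right;
        TreeNode(String s, TreeNode l, TreeNode r) { info = s; left = l; right = r; }
    }

    // Recursive insert using the t = treeMethod(t, ...) idiom.
    static TreeNode insert(TreeNode t, String s) {
        if (t == null) {
            t = new TreeNode(s, null, null);
        } else if (s.compareTo(t.info) <= 0) {
            t.left = insert(t.left, s);
        } else {
            t.right = insert(t.right, s);
        }
        return t;
    }

    // Height measured in nodes: empty tree is 0, single node is 1.
    static int height(TreeNode root) {
        if (root == null) return 0;
        return 1 + Math.max(height(root.left), height(root.right));
    }

    // Insert the words in the given order and report the resulting height.
    static int heightAfterInserting(String... words) {
        TreeNode root = null;
        for (String w : words) {
            root = insert(root, w);
        }
        return height(root);
    }

    public static void main(String[] args) {
        // Greek-alphabet order: every node ends up with a single child.
        System.out.println(heightAfterInserting("alpha", "beta", "gamma", "delta", "epsilon")); // 5
        // Same words, middle value first: a much shallower tree.
        System.out.println(heightAfterInserting("delta", "alpha", "epsilon", "beta", "gamma")); // 3
    }
}
```

With five words the first order produces a chain five nodes tall, so search degrades toward the linear cost of a linked list.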

  34. Insert and Removal • For insertion we can use iteration (see BSTSet) • Traverse left or right and “look ahead” to add • Removal is tricky, depends on number of children • Straightforward when zero or one child • Complicated when two children, find successor • See set code for complete cases • If right child, straightforward • Otherwise find node that’s left child of its parent (why?)
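
The removal cases above can be sketched recursively. This is one textbook approach, not the BSTSet code the slide refers to: with zero or one child the node is replaced by its other subtree, and with two children the node takes the value of its successor (the smallest value in its right subtree), which is then removed from that subtree. `RemoveSketch`, `sample`, and `inorder` are assumed names.

```java
import java.util.ArrayList;
import java.util.List;

class RemoveSketch {
    static class TreeNode {
        String info;
        TreeNode left, right;
        TreeNode(String s, TreeNode l, TreeNode r) { info = s; left = l; right = r; }
    }

    // Returns the root of the tree with s removed (unchanged if s is absent).
    static TreeNode remove(TreeNode t, String s) {
        if (t == null) {
            return null;
        }
        int cmp = s.compareTo(t.info);
        if (cmp < 0) {
            t.left = remove(t.left, s);
        } else if (cmp > 0) {
            t.right = remove(t.right, s);
        } else {
            // Zero or one child: splice the node out.
            if (t.left == null) return t.right;
            if (t.right == null) return t.left;
            // Two children: copy the successor's value, then remove the successor.
            TreeNode succ = t.right;
            while (succ.left != null) {
                succ = succ.left;
            }
            t.info = succ.info;
            t.right = remove(t.right, succ.info);
        }
        return t;
    }

    static TreeNode sample() {
        TreeNode giraffe = new TreeNode("giraffe",
            new TreeNode("elephant", null, null),
            new TreeNode("jaguar", null, null));
        TreeNode tiger = new TreeNode("tiger",
            new TreeNode("monkey", null, null), null);
        return new TreeNode("llama", giraffe, tiger);
    }

    // Inorder listing, used to check that the search-tree order survives removal.
    static List<String> inorder(TreeNode t) {
        List<String> out = new ArrayList<>();
        collect(t, out);
        return out;
    }

    private static void collect(TreeNode t, List<String> out) {
        if (t == null) return;
        collect(t.left, out);
        out.add(t.info);
        collect(t.right, out);
    }
}
```

Removing “llama” from the sample tree replaces the root's value with “monkey”, its successor, and the inorder listing stays sorted.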

  35. Wordladder Story • Ladder from ‘white’ to ‘house’ • white, while, whale, shale, … • I can do that… optimally • My brother was an English major • My ladder is 16, his is 15, how? • There's a ladder that's 14 words! • The key is ‘sough’ • Guarantee optimality! • QUEUE Compsci 201, Fall 2017, Linked Lists & More

  36. Queue for shortest path public boolean findLadder(String[] words, String first, String last) { Queue<String> qu = new LinkedList<>(); Set<String> set = new HashSet<>(); qu.add(first); while (qu.size() > 0) { String current = qu.remove(); if (oneAway(current, last)) return true; for (String s : words) { if (!set.contains(s) && oneAway(current, s)) { qu.add(s); set.add(s); } } } return false; }

  37. Shortest Path reprised • How does Queue ensure we find shortest path? • Where are words one away from first? • Where are words two away from first? • Why do we need to avoid revisiting a word, when? • Why do we use a set for this? Why a HashSet? • Alternatives? • If we want the ladder, not just whether it exists • What's path from white to house? We know there is one.

  38. Shortest path proof • All words one away from start on queue first iteration • What is first value of current when loop entered? • All one-away words dequeued before two-away • See previous assertion, property of queues • Two-away before 3-away, … • Each 2-away word is one away from a 1-away word • So all enqueued after one-away, before three-away • Any w seen/dequeued that's n-away is: • Seen before every (n+k)-away word for k >= 1!

  39. Keeping track of ladder • Find w, a one-away word from current • Enqueue w if not seen • Call map.put(w,current) • Remember keys are unique! • Put word on queue once! • map.put("lot", "hot") • map.put("dot", "hot") • map.put("hat", "hot")

  40. Reconstructing Word Ladder • Run WordLaddersFull • https://coursework.cs.duke.edu/201fall17/wordladders/blob/master/src/WordLaddersFull.java • See map and call to map.put(word,current) • What about when returning the ladder, why is the returned ladder in reverse order? • What do we know about code when statement adding (key,value) to map runs?
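
The map-based bookkeeping from the last two slides can be sketched as a breadth-first search that records, for each word, the word it was reached from. This is an illustration and not the linked WordLaddersFull.java: `LadderSketch`, the `prev` map, and this `oneAway` are assumptions. Following `prev` from the end back to the start yields the ladder backwards, so the sketch adds each word at the front of the list.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Queue;

class LadderSketch {
    // Assumed helper: same length and exactly one differing position.
    static boolean oneAway(String a, String b) {
        if (a.length() != b.length()) return false;
        int diff = 0;
        for (int k = 0; k < a.length(); k++) {
            if (a.charAt(k) != b.charAt(k)) diff++;
        }
        return diff == 1;
    }

    // Returns a shortest ladder from first to last, or an empty list if none exists.
    static List<String> findLadder(String[] words, String first, String last) {
        Queue<String> qu = new LinkedList<>();
        Map<String, String> prev = new HashMap<>();  // word -> word it was reached from
        qu.add(first);
        prev.put(first, null);                       // first has no predecessor
        while (!qu.isEmpty()) {
            String current = qu.remove();
            if (oneAway(current, last)) {
                LinkedList<String> ladder = new LinkedList<>();
                ladder.addFirst(last);
                // Walk predecessors back to first; addFirst undoes the reversal.
                for (String w = current; w != null; w = prev.get(w)) {
                    ladder.addFirst(w);
                }
                return ladder;
            }
            for (String s : words) {
                // Keys are unique: a word already in the map was seen, so skip it.
                if (!prev.containsKey(s) && oneAway(current, s)) {
                    prev.put(s, current);
                    qu.add(s);
                }
            }
        }
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        String[] words = {"hot", "dot", "dog", "lot", "log", "cog"};
        System.out.println(findLadder(words, "hit", "cog")); // [hit, hot, dot, dog, cog]
    }
}
```

When map.put(s, current) runs, s has just been seen for the first time, so current lies on a shortest path from first to s; that is why the reconstructed ladder is a shortest one.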
