
Data Structures: Trees and Grammars


kanan

Presentation Transcript


  1. Data Structures: Trees and Grammars • Readings: Sections 6.1, 7.1-7.4 (more from Ch. 6 later)

  2. Goals for this Unit • Continue focus on data structures and algorithms • Understand concepts of reference-based data structures (e.g. linked lists, binary trees) • Some implementation for binary trees • Understand the usefulness of trees and hierarchies as data models • Recursion used to define data organization

  3. Taxonomy of Data Structures • From the text: • Data type: collection of values and operations • Compare to Abstract Data Type! • Simple data types vs. composite data types • Book: Data structures are composite data types • Definition: a collection of elements that are some combination of primitive and other composite data types

  4. Book’s Classification of Data Structures • Four groupings: • Linear Data Structures • Hierarchical • Graph • Sets and Tables • When defining these, note an element has: • one or more information fields • relationships with other elements

  5. Note on Our Book and Our Course • Our book’s strategy • In Ch. 6, discuss principles of Lists • Give an interface, then implement from scratch • In Ch. 7, discuss principles of Trees • Later, in Ch. 9, see what Java gives us • Our course’s strategy • We did Ch. 9 first. Saw List interfaces and operations • Then, Ch. 8 on maps and sets • Now, trees with some implementation too

  6. Trees Represent… • Concept of a tree very common and important. • Tree terminology: • Nodes have one parent • A node’s children • Leaf nodes: no children • Root node: top or start; no parent • Data structures that store trees • Execution or processing that can be expressed as a tree • E.g. method calls as a program runs • Searching a maze or puzzle

  7. Trees are Important • Trees are important for cognition and computation • computer science • language processing (human or computer) • parse trees • knowledge representation (or modeling of the “real world”) • E.g. family trees; the Linnaean taxonomy (kingdom, phylum, …, species); etc.

  8. Another Tree Example: File System • [Figure: directory tree rooted at C:\ containing MyMail, CS120, CS216, school, pers, lab1, lab2, lab3, list.h, list.cpp, calc.cpp] • What about file links (Unix) or shortcuts (Windows)?

  9. Another Tree Example: XML and HTML documents How is this a tree? What are the leaves? <HTML> <HEAD>…</HEAD> <BODY> <H1>My Page</H1> <P> Blah <PRE>blah blah</PRE> End </P> </BODY> </HTML>

  10. Tree Data Structures • Why this now? • Very useful in coding • TreeMap in Java Collections Framework • Example of recursive data structures • Methods are recursive algorithms

  11. Tree Definitions and Terms • First, general trees vs. binary trees • Each node in a binary tree has at most two children • General tree definition: • Set of nodes T (possibly empty?) with a distinguished node, the root • All other nodes form a set of disjoint subtrees Ti • each a tree in its own right • each connected to the root with an edge • Note the recursive definition • Each node is the root of a subtree

  12. Picture of Tree Definition • [Figure: root r with subtrees T1, T2, T3] • And all subtrees are recursively defined as: • a node with… • subtrees attached to it

  13. Tree Terminology • A node’s parent • A node’s children • Binary tree: left child and right child • Sibling nodes • Descendants, ancestors • A node’s degree (how many children) • Leaf nodes or terminal nodes • Internal or non-terminal nodes

  14. Recursive Data Structure Recursive Data Structure: a data structure that contains a pointer or reference to an instance of itself: public class TreeNode<T> { T nodeItem; TreeNode<T> left, right; TreeNode<T> parent; … } • Recursion is a natural way to express many algorithms. • For recursive data-structures, recursive algorithms are a natural choice
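Because each node refers to nodes of the same type, operations over a subtree fall out of the structure almost directly. The following is a minimal sketch, not the book's code: the class name `SketchTreeNode` and the methods `count()` and `height()` are hypothetical additions modeled on the slide's `TreeNode<T>` (the parent reference is omitted for brevity).

```java
// Hypothetical node class modeled on the slide's TreeNode<T>.
// count() and height() are illustrative additions, not from the textbook.
class SketchTreeNode<T> {
    T nodeItem;
    SketchTreeNode<T> left, right;

    SketchTreeNode(T nodeItem) {
        this.nodeItem = nodeItem;
    }

    // Number of nodes in the subtree rooted here: this node plus both subtrees.
    int count() {
        int total = 1;
        if (left != null) total += left.count();
        if (right != null) total += right.count();
        return total;
    }

    // Height measured in edges: a leaf has height 0.
    int height() {
        int leftHeight = (left == null) ? -1 : left.height();
        int rightHeight = (right == null) ? -1 : right.height();
        return 1 + Math.max(leftHeight, rightHeight);
    }
}
```

Each method handles its own node and delegates the rest to its children, which mirrors the recursive definition of the data structure.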

  15. General Trees • Representing general trees is a bit harder • Each node has a list of child nodes • Turns out that: • Binary trees are simpler and still quite useful • From now on, let’s focus on binary-trees only
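For comparison, a general tree node can be sketched with a list of children in place of fixed left and right references. This is only an illustration under assumed names (`GeneralTreeNode`, `addChild`, `degree`, `countLeaves` are all hypothetical), not a class from the book.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical general-tree node: any number of children, kept in a list.
class GeneralTreeNode<T> {
    T item;
    List<GeneralTreeNode<T>> children = new ArrayList<>();

    GeneralTreeNode(T item) {
        this.item = item;
    }

    // Creates a child holding childItem, attaches it, and returns it.
    GeneralTreeNode<T> addChild(T childItem) {
        GeneralTreeNode<T> child = new GeneralTreeNode<>(childItem);
        children.add(child);
        return child;
    }

    // A node's degree is its number of children.
    int degree() {
        return children.size();
    }

    // Counts the leaf nodes in the subtree rooted here.
    int countLeaves() {
        if (children.isEmpty()) return 1;
        int leaves = 0;
        for (GeneralTreeNode<T> child : children) {
            leaves += child.countLeaves();
        }
        return leaves;
    }
}
```

The loop over `children` is what makes general trees slightly more work than binary trees, where there are only two fixed cases to handle.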

  16. ADT Tree • Remember the definition of an ADT? • Model of information: we just covered that • Operations? See pages 405-406 in textbook • Many are similar to ADT List or any data structure • The “CRUD” operations: create, read, update, delete • Important about this list of operations • some are in terms of one specified node, e.g. hasParent() • others are “tree-wide”, e.g. size(), traversal

  17. Classes for Binary Trees (pp. 416-431) • class LinkedBinaryTree (p. 425, bottom) • reference to root BinaryTreeNode • methods: tree-level operations • class BinaryTreeNode (p. 416) • data: an object (of some type) • left: references root of left-subtree (or null) • right: references root of right-subtree (or null) • parent: references this node’s parent node • Could this be null? When should it be? • methods: node-level operations

  18. Two-class Strategy for Recursive Data Structures • Common design: use two classes for a Tree or List • “Top” class • has reference to “first” node • other things that apply to the whole data-structure object (e.g. the tree-object) • both methods and fields • Node class • Recursive definitions are here as references to other node objects • Also data (of course) • Methods defined in this class are recursive

  19. Binary Tree and Node Class • LinkedBinaryTree class has: • reference to root node • reference to a current node, a cursor • non-recursive methods like: boolean find(tgt) // see if tgt is in the whole tree • Node class has: • data, references to left and right subtrees • recursive versions of methods like find: boolean find(tgt) // is tgt here or in my subtrees? • Note: BinaryTree.find() just calls Node.find() on the root node! • Other methods work this way too
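The delegation from the tree-level find() to the node-level find() might look like the sketch below. The class names `BinTree` and `BinNode` are placeholders chosen here, not the book's `LinkedBinaryTree` and `BinaryTreeNode`, and the cursor is left out. Since this is a plain binary tree with no ordering assumed, the node version searches both subtrees.

```java
// Hypothetical node class: holds the recursive version of find().
class BinNode<T> {
    T data;
    BinNode<T> left, right;

    BinNode(T data) {
        this.data = data;
    }

    // Is tgt here or in one of my subtrees?
    boolean find(T tgt) {
        if (data.equals(tgt)) return true;
        if (left != null && left.find(tgt)) return true;
        return right != null && right.find(tgt);
    }
}

// Hypothetical "top" class: holds the root and the tree-level operations.
class BinTree<T> {
    BinNode<T> root;

    // Non-recursive: handles the empty tree, then delegates to the root node.
    boolean find(T tgt) {
        return root != null && root.find(tgt);
    }
}
```

The tree class only has to deal with the empty-tree case; all the real work happens in the node class.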

  20. Why Does This Matter Now? • This illustrates (again) important design ideas • The tree itself is what we’re interested in • There are tree-level operations on it (“ADT level” operations) • The implementation is a recursive data structure • There are recursive methods inside the lower-level classes that are closely related (same name!) to the ADT-level operation • Principles? abstraction (hiding details), delegation (helper classes, methods)

  21. ADT Tree Operations: “Navigation” • Positioning: • toRoot(), toParent(), toLeftChild(), toRightChild(), find(Object o) • Checking: • hasParent(), hasLeftChild(), etc. • equals(Object tree2) • Book calls this a “deep compare” • Do two distinct objects have the same structure and contents?
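A "deep compare" can be sketched recursively: two subtrees match when both are empty, or when their root data are equal and their left and right subtrees match in turn. The names `EqNode` and `sameTree` below are hypothetical; the book's equals() may be organized differently.

```java
import java.util.Objects;

// Hypothetical node class used only to illustrate a deep compare.
class EqNode<T> {
    T data;
    EqNode<T> left, right;

    EqNode(T data, EqNode<T> left, EqNode<T> right) {
        this.data = data;
        this.left = left;
        this.right = right;
    }

    // True when the two subtrees have the same shape and equal contents.
    static <T> boolean sameTree(EqNode<T> a, EqNode<T> b) {
        if (a == null || b == null) {
            return a == b; // equal only if both are empty
        }
        return Objects.equals(a.data, b.data)
                && sameTree(a.left, b.left)
                && sameTree(a.right, b.right);
    }
}
```

Note that two distinct objects can pass this test, which is the difference between a deep compare and comparing references with ==.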

  22. ADT Tree Operations: Mutators • Mutators: • insertRight(Object o), insertLeft(Object o) • create a new node containing new data • make this new node be the child of the current node • Important: We use these to build trees! • prune() • delete the subtree rooted by the current node
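To see how the cursor-based mutators can build and trim a tree, here is a rough sketch. The class name `CursorTree` is hypothetical, and two behaviors are assumptions made here rather than facts from the book: inserting into an occupied child slot throws an exception, and prune() moves the cursor to the pruned node's parent.

```java
// Hypothetical cursor-based binary tree; behavior in edge cases is assumed.
class CursorTree<T> {
    private static class Node<T> {
        T data;
        Node<T> left, right, parent;

        Node(T data, Node<T> parent) {
            this.data = data;
            this.parent = parent;
        }

        int size() {
            return 1 + (left == null ? 0 : left.size())
                     + (right == null ? 0 : right.size());
        }
    }

    private Node<T> root;
    private Node<T> current; // the cursor

    CursorTree(T rootItem) {
        root = new Node<>(rootItem, null);
        current = root;
    }

    void toRoot() { current = root; }

    boolean hasLeftChild() { return current != null && current.left != null; }

    void toLeftChild() {
        if (!hasLeftChild()) throw new IllegalStateException("no left child");
        current = current.left;
    }

    // Assumption: the slot must be empty; otherwise we refuse to overwrite.
    void insertLeft(T item) {
        if (current == null || current.left != null)
            throw new IllegalStateException("cannot insert left here");
        current.left = new Node<>(item, current);
    }

    void insertRight(T item) {
        if (current == null || current.right != null)
            throw new IllegalStateException("cannot insert right here");
        current.right = new Node<>(item, current);
    }

    // Deletes the subtree rooted at the cursor.
    // Assumption: the cursor then moves to the parent (or the tree becomes empty).
    void prune() {
        if (current == null) return;
        Node<T> p = current.parent;
        if (p == null) {
            root = null;
        } else if (p.left == current) {
            p.left = null;
        } else {
            p.right = null;
        }
        current = p;
    }

    int size() { return root == null ? 0 : root.size(); }
}
```

The parent reference is what lets prune() detach the current node without searching from the root.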

  23. Next: Implementation • Next (in the book) • How to implement Java classes for binary trees • Class for node, another class for BinTree • Interface for both, then two implementations (array and reference) • But for us: • We’ll skip some of this, particularly the array version • We’ll only look at the reference-based implementation • After that: concept of a binary search tree

  24. Understanding Implementations • Let’s review some of the methods on pp. 416-431 • (Done in class, looking at code in book.) • Some topics discussed: • Node class. Parent reference or not? • Are two trees equal? • Traversal strategies: Section 7.3.2 in book • visit() method and callback (also 7.3.2)
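The three standard depth-first traversals, combined with a visit callback, could be sketched as follows. This is not the book's visit() design: the callback is expressed with Java's `java.util.function.Consumer`, and the class name `TravNode` is hypothetical.

```java
import java.util.function.Consumer;

// Hypothetical node class showing traversals that accept a visit callback.
class TravNode<T> {
    T data;
    TravNode<T> left, right;

    TravNode(T data, TravNode<T> left, TravNode<T> right) {
        this.data = data;
        this.left = left;
        this.right = right;
    }

    // Pre-order: visit this node, then the left subtree, then the right.
    void preOrder(Consumer<T> visit) {
        visit.accept(data);
        if (left != null) left.preOrder(visit);
        if (right != null) right.preOrder(visit);
    }

    // In-order: left subtree, this node, right subtree.
    void inOrder(Consumer<T> visit) {
        if (left != null) left.inOrder(visit);
        visit.accept(data);
        if (right != null) right.inOrder(visit);
    }

    // Post-order: left subtree, right subtree, this node.
    void postOrder(Consumer<T> visit) {
        if (left != null) left.postOrder(visit);
        if (right != null) right.postOrder(visit);
        visit.accept(data);
    }
}
```

Passing the action in as a callback keeps the traversal order separate from what is done at each node, so the same three methods can print, count, or collect.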

  25. Binary Search Trees • We often need collections that store items • Maybe a long series of inserts or deletions • We want fast lookup, and often we want to access in sorted order • Lists: O(n) lookup • Could sort them for O(lg n) lookup • Cost to sort is O(n lg n) and we might need to re-sort often as we insert, remove items • Solution: search tree

  26. Binary Search Trees • Associated with each node is a key value that can be compared. • Binary search tree property: • every node in the left subtree has a key that is less than the root’s key, and • every node in the right subtree has a key that is greater than the root’s key.
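The property applies to whole subtrees, not just to a node's immediate children, so a checker has to carry bounds down the tree. A possible sketch, with hypothetical names `BstCheckNode` and `isBst` and integer keys assumed for simplicity:

```java
// Hypothetical node with int keys, used to check the BST property.
class BstCheckNode {
    int key;
    BstCheckNode left, right;

    BstCheckNode(int key, BstCheckNode left, BstCheckNode right) {
        this.key = key;
        this.left = left;
        this.right = right;
    }

    // Every key in this subtree must lie strictly between low and high.
    static boolean isBst(BstCheckNode n, long low, long high) {
        if (n == null) return true;
        if (n.key <= low || n.key >= high) return false;
        return isBst(n.left, low, n.key)     // left keys must be below n.key
            && isBst(n.right, n.key, high);  // right keys must be above n.key
    }

    // Entry point: long bounds sit outside the whole int range.
    static boolean isBst(BstCheckNode root) {
        return isBst(root, Long.MIN_VALUE, Long.MAX_VALUE);
    }
}
```

A tree can look fine at every parent-child pair and still fail this check, for example when a node deep in the left subtree holds a key larger than the root's.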

  27. Example 5 4 8 1 7 11 3 BINARY SEARCH TREE

  28. Counterexample 8 5 11 2 6 10 18 7 4 15 20 NOT A BINARY SEARCH TREE 21

  29. Find and Insert in BST • Find: look for where it should be • If not there, that’s where you insert
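Insert can be sketched as a find that stops at the first missing subtree and attaches the new node there. The class name `BstNode` is hypothetical, and ignoring duplicate keys is an assumption made here; the book may treat duplicates differently.

```java
// Hypothetical BST node; insert() follows the same path that find() would.
class BstNode<K extends Comparable<K>> {
    K key;
    BstNode<K> left, right;

    BstNode(K key) {
        this.key = key;
    }

    // Returns true if newKey was added, false if it was already present
    // (assumption: duplicates are ignored).
    boolean insert(K newKey) {
        int cmp = newKey.compareTo(key);
        if (cmp == 0) return false;
        if (cmp < 0) {
            if (left == null) {
                left = new BstNode<>(newKey); // this is where it should be
                return true;
            }
            return left.insert(newKey);
        }
        if (right == null) {
            right = new BstNode<>(newKey);
            return true;
        }
        return right.insert(newKey);
    }
}
```

Starting from a root holding 5 and inserting 4, 8, 1, 7, 11, 3 in that order places 4 and 8 under the root, 1 under 4, 7 and 11 under 8, and 3 as the right child of 1.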

  30. Recursion and Tree Operations • Recursive code for tree operations is simple, natural, elegant • Example: pseudo-code for Node.find() boolean find(Comparable tgt) { Node next = null; if (this.data matches tgt) return true else if (tgt’s data < this.data) next = this.leftChild else // tgt’s data > this.data next = this.rightChild // next points to left or right subtree if (next == null ) return false // no subtree else return next.find(tgt) // search on }
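The pseudo-code above translates to compilable Java fairly directly. In this sketch the class name `FindNode` and the use of a generic `Comparable` key are choices made here; the field names follow the pseudo-code.

```java
// Hypothetical BST node with a runnable version of the find() pseudo-code.
class FindNode<K extends Comparable<K>> {
    K data;
    FindNode<K> leftChild, rightChild;

    FindNode(K data, FindNode<K> leftChild, FindNode<K> rightChild) {
        this.data = data;
        this.leftChild = leftChild;
        this.rightChild = rightChild;
    }

    boolean find(K tgt) {
        int cmp = tgt.compareTo(data);
        if (cmp == 0) return true;            // this.data matches tgt
        // Choose the only subtree that could contain tgt.
        FindNode<K> next = (cmp < 0) ? leftChild : rightChild;
        if (next == null) return false;       // no subtree to search
        return next.find(tgt);                // search on
    }
}
```

Unlike find() on a plain binary tree, only one subtree is ever searched, so the number of comparisons is bounded by the height of the tree.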

  31. Order in BSTs • How could we traverse a BST so that the nodes are visited in sorted order? • Does one of our traversal strategies work? • A very useful property about BSTs • Consider Java’s TreeSet and TreeMap • A search tree (not a plain BST, but one of its better “cousins”) • In CS2150: AVL trees, Red-Black trees • Guarantee: search times are O(lg n)
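An in-order traversal (left subtree, node, right subtree) visits a BST's keys in ascending order, and iterating over a `java.util.TreeSet` gives the same ordering. The sketch below compares the two; `InOrderSketch` and its helper methods are hypothetical names, and the hand-written BST is unbalanced, unlike the tree behind `TreeSet`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical demo: in-order traversal of a simple BST versus TreeSet iteration.
class InOrderSketch {
    static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    // Plain (unbalanced) BST insert; duplicates are ignored.
    static Node insert(Node root, int key) {
        if (root == null) return new Node(key);
        if (key < root.key) root.left = insert(root.left, key);
        else if (key > root.key) root.right = insert(root.right, key);
        return root;
    }

    // Left subtree, then this node, then right subtree.
    static void inOrder(Node n, List<Integer> out) {
        if (n == null) return;
        inOrder(n.left, out);
        out.add(n.key);
        inOrder(n.right, out);
    }

    static List<Integer> sortedViaBst(int... keys) {
        Node root = null;
        for (int k : keys) root = insert(root, k);
        List<Integer> out = new ArrayList<>();
        inOrder(root, out);
        return out;
    }

    // TreeSet iterates over its elements in ascending order.
    static List<Integer> sortedViaTreeSet(int... keys) {
        TreeSet<Integer> set = new TreeSet<>();
        for (int k : keys) set.add(k);
        return new ArrayList<>(set);
    }
}
```

Both methods return the keys in the same ascending order; the difference is that the library class keeps its tree balanced, which is where the O(lg n) guarantee comes from.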

  32. Deleting from a BST • Removing a node requires • Moving its left and right subtrees • Not hard if one not there • But if both there? • Answer: not too tough, but wait for CS2150 to see! • In CS2110, we’ll not worry about this

  33. Next: Grammars, Trees, Recursion • Languages are governed by a set of rules called a grammar • Is a statement legal ? • Generate or derive a new legal statement • Natural language grammars • Language processing by computers • But, grammars used a lot in computing • Grammar for a programming language • Grammar for valid inputs, messages, data, etc.

  34. Backus-Naur Form • http://en.wikipedia.org/wiki/Backus-Naur_form • BNF is a widely-used notation for describing the grammar or formal syntax of programming languages or data • BNF specifies a grammar as a set of derivation rules of this form: <symbol> ::= <expression with symbols> • Look at website and example there (also on next slide) • How are trees involved here? Is it recursive?

  35. BNF for Postal Address • <postal-address> ::= <name-part> <street-address> <zip-part> • <personal-part> ::= <first-name> | <initial> "." • <name-part> ::= <personal-part> <last-name> [<jr-part>] <EOL> | <personal-part> <name-part> • <street-address> ::= [<apt>] <house-num> <street-name> <EOL> • <zip-part> ::= <town-name> "," <state-code> <ZIP-code> <EOL> Example: Ann Marie G. Jones 123 Main St. Hooville, VA 22901 Where’s the recursion?

  36. Grammars in Language • Rule-based grammars describe • how legal statements can be produced • how to tell if a statement is legal • Study textbook, pp. 389-391, to see rule-based grammar for simple Java-like arithmetic expressions • four rules for expressions, terms, factors, and letter • Study how a (possibly) legal statement is parsed to generate a parse tree

  37. Computing Parse-Tree Example • Expression: a * b + c
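One way to compute such a parse tree is a recursive-descent parser with one method per grammar rule. The textbook's exact rules are not reproduced in these slides, so the grammar in the comments below is an assumption: an expression is terms joined by +, a term is factors joined by *, and a factor is a single letter. The class name `ExprParserSketch` is hypothetical, and the tree is returned as a fully parenthesized string rather than as node objects.

```java
// Hypothetical recursive-descent parser for an ASSUMED grammar:
//   <expression> ::= <term> { "+" <term> }
//   <term>       ::= <factor> { "*" <factor> }
//   <factor>     ::= <letter>
class ExprParserSketch {
    private final String src;
    private int pos = 0;

    private ExprParserSketch(String text) {
        this.src = text.replace(" ", ""); // ignore spaces
    }

    // Returns the parse as a parenthesized string, or throws if text is not legal.
    static String parse(String text) {
        ExprParserSketch p = new ExprParserSketch(text);
        String tree = p.expression();
        if (p.pos != p.src.length())
            throw new IllegalArgumentException("unexpected input at " + p.pos);
        return tree;
    }

    private String expression() {
        String tree = term();
        while (peek() == '+') {
            pos++;
            tree = "(" + tree + " + " + term() + ")";
        }
        return tree;
    }

    private String term() {
        String tree = factor();
        while (peek() == '*') {
            pos++;
            tree = "(" + tree + " * " + factor() + ")";
        }
        return tree;
    }

    private String factor() {
        char c = peek();
        if (!Character.isLetter(c))
            throw new IllegalArgumentException("letter expected at " + pos);
        pos++;
        return String.valueOf(c);
    }

    // Next unread character, or '\0' at end of input.
    private char peek() {
        return pos < src.length() ? src.charAt(pos) : '\0';
    }
}
```

Under this assumed grammar, `ExprParserSketch.parse("a * b + c")` returns `((a * b) + c)`: the multiplication sits lower in the tree because terms are parsed before the + is seen. The methods call one another the same way the rules refer to one another, which is the link between grammars, trees, and recursion.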

  38. Grammar Terms and Concepts • First, this is what’s called a context-free grammar • For CS2110, let’s not worry about what this means! (But in CS2102, you learn this.) • A CFG has • a set of variables (AKA non-terminals) • a set of terminal symbols • a set of productions • a starting symbol

  39. Previous Parse Tree • Terminal symbols: • <operator> could be: + * • <letter> could be: a b c • Production: <factor> ::= <letter> | <number>

  40. Natural Language Parse Tree • Statement: The man bit the dog

  41. How Can We Use Grammars? • Parsing • Is a given statement a valid statement in the language? (Is the statement recognized by the grammar?) • Note this is what the Java compiler does as a first step toward creating an executable form of your program. (Find errors, or build executable.) • Production • Generate a legal statement for this grammar • Demo: generate random statements! • See link on website next to slides

  42. Demo’s Poem-grammar data file { <start> The <object> <verb> tonight } { <object> waves big yellow flowers slugs } { <verb> sigh <adverb> portend like <object> die <adverb> } { <adverb> warily grumpily } • Note: no recursive productions in this example!
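Producing a random statement from this grammar can be sketched as a recursive expansion: pick one alternative for a non-terminal at random, then expand each symbol in it. This is not the demo's actual code. The class name `PoemGrammarSketch` is hypothetical, and because the line breaks of the data file are lost in this transcript, the way the alternatives are split below (for example "big yellow flowers" as one alternative) is a guess.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical generator for the poem grammar; the split into alternatives is assumed.
class PoemGrammarSketch {
    static final Map<String, List<String>> RULES = new HashMap<>();
    static {
        RULES.put("<start>", List.of("The <object> <verb> tonight"));
        RULES.put("<object>", List.of("waves", "big yellow flowers", "slugs"));
        RULES.put("<verb>", List.of("sigh <adverb>", "portend like <object>", "die <adverb>"));
        RULES.put("<adverb>", List.of("warily", "grumpily"));
    }

    // Expands a symbol: terminals are returned as-is, non-terminals are
    // replaced by one randomly chosen alternative, expanded word by word.
    static String expand(String symbol, Random rng) {
        List<String> options = RULES.get(symbol);
        if (options == null) return symbol; // terminal word
        String chosen = options.get(rng.nextInt(options.size()));
        StringBuilder out = new StringBuilder();
        for (String token : chosen.split(" ")) {
            if (out.length() > 0) out.append(' ');
            out.append(expand(token, rng));
        }
        return out.toString();
    }
}
```

Calling `PoemGrammarSketch.expand("<start>", new Random())` yields a sentence such as "The slugs sigh warily tonight". Because no production refers back to itself, the expansion always terminates; a recursive production would need some chance of choosing a non-recursive alternative.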
