1 / 64

Understanding Python Data Collections

Learn different types of collections in Python, including arrays and linked structures. Understand operations, trade-offs, and implementations of various data structures.

weinerm
Download Presentation

Understanding Python Data Collections

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Fundamentals of Python:From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures

  2. Objectives After completing this chapter, you will be able to: • Recognize different categories of collections and the operations on them • Understand the difference between an abstract data type and the concrete data structures used to implement it • Perform basic operations on arrays, such as insertions and removals of items Fundamentals of Python: From First Programs Through Data Structures

  3. Objectives (continued) • Resize an array when it becomes too small or too large • Describe the space/time trade-offs for users of arrays • Perform basic operations, such as traversals, insertions, and removals, on linked structures • Explain the space/time trade-offs of arrays and linked structures in terms of the memory models that underlie these data structures Fundamentals of Python: From First Programs Through Data Structures

  4. Overview of Collections • Collection: Group of items that we want to treat as conceptual unit • Examples: • Lists, strings, stacks, queues, binary search trees, heaps, graphs, dictionaries, sets, and bags • Can be homogeneous or heterogeneous • Lists are heterogeneous in Python • Four main categories: • Linear, hierarchical, graph, and unordered Fundamentals of Python: From First Programs Through Data Structures

  5. Linear Collections • Ordered by position • Everyday examples: • Grocery lists • Stacks of dinner plates • A line of customers waiting at a bank Fundamentals of Python: From First Programs Through Data Structures

  6. Hierarchical Collections • Structure reminiscent of an upside-down tree • D3’s parent is D1; its children are D4, D5, and D6 • Examples: a file directory system, a company’s organizational tree, a book’s table of contents Fundamentals of Python: From First Programs Through Data Structures

  7. Graph Collections • Graph: Collection in which each data item can have many predecessors and many successors • D3’s neighbors are its predecessors and successors • Examples: Maps of airline routes between cities; electrical wiring diagrams for buildings Fundamentals of Python: From First Programs Through Data Structures

  8. Unordered Collections • Items are not in any particular order • One cannot meaningfully speak of an item’s predecessor or successor • Example: Bag of marbles Fundamentals of Python: From First Programs Through Data Structures

  9. Operations on Collections Fundamentals of Python: From First Programs Through Data Structures

  10. Operations on Collections (continued) Fundamentals of Python: From First Programs Through Data Structures

  11. Abstraction and Abstract Data Types • To a user, a collection is an abstraction • In CS, collections are abstract data types (ADTs) • ADT users are concerned with learning its interface • Developers are concerned with implementing their behavior in the most efficient manner possible • In Python, methods are the smallest unit of abstraction, classes are the next in size, and modules are the largest • We will implement ADTs as classes or sets of related classes in modules Fundamentals of Python: From First Programs Through Data Structures

  12. Data Structures for Implementing Collections: Arrays • “Data structure” and “concrete data type”refer to the internal representation of an ADT’s data • The two data structures most often used to implement collections in most programming languages are arrays and linked structures • Different approaches to storing and accessing data in the computer’s memory • Different space/time trade-offs in the algorithms that manipulate the collections Fundamentals of Python: From First Programs Through Data Structures

  13. The Array Data Structure • Array: Underlying data structure of a Python list • More restrictive than Python lists • We’ll define an Arrayclass Fundamentals of Python: From First Programs Through Data Structures

  14. Random Access and Contiguous Memory • Array indexing is a random access operation • Address of an item: base address + offset  Index operation has two steps: Fundamentals of Python: From First Programs Through Data Structures

  15. Static Memory and Dynamic Memory • Arrays in older languages were static • Modern languages support dynamic arrays • To readjust length of an array at run time: • Create an array with a reasonable default size at start-up • When it cannot hold more data, create a new, larger array and transfer the data items from the old array • When the array seems to be wasting memory, decrease its length in a similar manner • These adjustments are automatic with Python lists Fundamentals of Python: From First Programs Through Data Structures

  16. Physical Size and Logical Size • The physical size of an array is its total number of array cells • The logical size of an array is the number of items currently in it • To avoid reading garbage, must track both sizes Fundamentals of Python: From First Programs Through Data Structures

  17. Physical Size and Logical Size (continued) • In general, the logical and physical size tell us important things about the state of the array: • If the logical size is 0, the array is empty • Otherwise, at any given time, the index of the last item in the array is the logical size minus 1. • If the logical size equals the physical size, there is no more room for data in the array Fundamentals of Python: From First Programs Through Data Structures

  18. Operations on Arrays • We now discuss the implementation of several operations on arrays • In our examples, we assume the following data settings: • These operations would be used to define methods for collections that contain arrays Fundamentals of Python: From First Programs Through Data Structures

  19. Increasing the Size of an Array • The resizing process consists of three steps: • Create a new, larger array • Copy the data from the old array to the new array • Reset the old array variable to the new array object • To achieve more reasonable time performance, double array size each time you increase its size: Fundamentals of Python: From First Programs Through Data Structures

  20. Decreasing the Size of an Array • This operation occurs in Python’s listwhen a popresults in memory wasted beyond a threshold • Steps: • Create a new, smaller array • Copy the data from the old array to the new array • Reset the old array variable to the new array object Fundamentals of Python: From First Programs Through Data Structures

  21. Inserting an Item into an Array That Grows • Programmer must do four things: • Check for available space and increase the physical size of the array, if necessary • Shift items from logical end of array to target index position down by one • To open hole for new item at target index • Assign new item to target index position • Increment logical size by one Fundamentals of Python: From First Programs Through Data Structures

  22. Inserting an Item into an Array That Grows (continued) Fundamentals of Python: From First Programs Through Data Structures

  23. Removing an Item from an Array • Steps: • Shift items from target index position to logical end of array up by one • To close hole left by removed item at target index • Decrement logical size by one • Check for wasted space and decrease physical size of the array, if necessary • Time performance for shifting items is linear on average; time performance for removal is linear Fundamentals of Python: From First Programs Through Data Structures

  24. Removing an Item from an Array (continued) Fundamentals of Python: From First Programs Through Data Structures

  25. Complexity Trade-Off: Time, Space, and Arrays • Average-case use of memory for is O(1) • Memory cost of using an array is its load factor Fundamentals of Python: From First Programs Through Data Structures

  26. Two-Dimensional Arrays (Grids) • Sometimes, two-dimensional arrays or grids are more useful than one-dimensional arrays • Suppose we call this grid table; to access an item: Fundamentals of Python: From First Programs Through Data Structures

  27. Processing a Grid • In addition to the double subscript, a grid must recognize two methods that return the number of rows and the number of columns • Examples: getHeightand getWidth • The techniques for manipulating one-dimensional arrays are easily extended to grids: Fundamentals of Python: From First Programs Through Data Structures

  28. initial fill value height width Creating and Initializing a Grid • Let’s assume that there exists a Gridclass: • After a grid has been created, we can reset its cells to any values: Fundamentals of Python: From First Programs Through Data Structures

  29. Defining a Grid Class Fundamentals of Python: From First Programs Through Data Structures

  30. Defining a Grid Class (continued) • The method __getitem__is all that is needed to support the client’s use of the double subscript: Fundamentals of Python: From First Programs Through Data Structures

  31. Ragged Grids and Multidimensional Arrays • In a ragged grid, there are a fixed number of rows, but the number of columns in each row can vary • An array of lists or arrays provides a suitable structure for implementing a ragged grid • Dimensions can be added to grid definition if needed; the only limit is computer’s memory • A three-dimensional array can be visualized as a box filled with smaller boxes stacked in rows and columns • Depth, height, and width; three indexes; three loops Fundamentals of Python: From First Programs Through Data Structures

  32. Linked Structures • After arrays, linked structures are probably the most commonly used data structures in programs • Like an array, a linked structure is a concrete data type that is used to implement many types of collections, including lists • We discuss in detail several characteristics that programmers must keep in mind when using linked structures to implement any type of collection Fundamentals of Python: From First Programs Through Data Structures

  33. An empty link Singly Linked Structures and Doubly Linked Structures • No random access; must traverse list • No shifting of items needed for insertion/removal • Resize at insertion/removal with no memory cost Fundamentals of Python: From First Programs Through Data Structures

  34. Noncontiguous Memory and Nodes • A linked structure decouples logical sequence of items in the structure from any ordering in memory • Noncontiguous memory representation scheme • The basic unit of representation in a linked structure is a node: Fundamentals of Python: From First Programs Through Data Structures

  35. Noncontiguous Memory and Nodes (continued) • Depending on the language, you can set up nodes to use noncontiguous memory in several ways: • Using two parallel arrays Fundamentals of Python: From First Programs Through Data Structures

  36. Noncontiguous Memory and Nodes (continued) • Ways to set up nodes to use noncontiguous memory (continued): • Using pointers (a nullor nil represents the empty link as a pointer value) • Memory allocated from the object heap • Using references to objects (e.g., Python) • In Python, None can mean an empty link • Automatic garbage collection frees programmer from managing the object heap • In the discussion that follows, we use the terms link, pointer, and reference interchangeably Fundamentals of Python: From First Programs Through Data Structures

  37. Defining a Singly Linked Node Class • Node classes are fairly simple • Flexibility and ease of use are critical • Node instance variables are usually referenced without method calls, and constructors allow the user to set a node’s link(s) when the node is created • A singly linked node contains just a data item and a reference to the next node: Fundamentals of Python: From First Programs Through Data Structures

  38. Using the Singly Linked Node Class • Node variables are initialized to Noneor a new Nodeobject Fundamentals of Python: From First Programs Through Data Structures

  39. Using the Singly Linked Node Class (continued) • node1.next = node3 raises AttributeError • Solution: node1 = Node("C", node3), or Node1 = Node("C", None) node1.next = node3 • To guard against exceptions: if nodeVariable != None: <access a field in nodeVariable> • Like arrays, linked structures are processed with loops Fundamentals of Python: From First Programs Through Data Structures

  40. Using the Singly Linked Node Class (continued) • Note that when the data are displayed, they appear in the reverse order of their insertion Fundamentals of Python: From First Programs Through Data Structures

  41. Operations on Singly Linked Structures • Almost all of the operations on arrays are index based • Indexes are an integral part of the array structure • Emulate index-based operations on a linked structure by manipulating links within the structure • We explore how these manipulations are performed in common operations, such as: • Traversals • Insertions • Removals Fundamentals of Python: From First Programs Through Data Structures

  42. Traversal • Traversal: Visit each node without deleting it • Uses a temporary pointer variable • Example: probe = head while probe != None: <use or modify probe.data> probe = probe.next • Noneserves as a sentinel that stops the process • Traversals are linear in time and require no extra memory Fundamentals of Python: From First Programs Through Data Structures

  43. Traversal (continued) Fundamentals of Python: From First Programs Through Data Structures

  44. Searching • Resembles a traversal, but two possible sentinels: • Empty link • Data item that equals the target item • Example: probe = head while probe != None and targetItem != probe.data: probe = probe.next if probe != None: <targetItem has been found> else: <targetItem is not in the linked structure> • On average, it is linear for singly linked structures Fundamentals of Python: From First Programs Through Data Structures

  45. Searching (continued) • Unfortunately, accessing the ith item of a linked structure is also a sequential search operation • We start at the first node and count the number of links until the ith node is reached • Linked structures do not support random access • Can’t use a binary search on a singly linked structure • Solution: Use other types of linked structures Fundamentals of Python: From First Programs Through Data Structures

  46. Replacement • Replacement operations employ traversal pattern • Replacing the ith item assumes 0 <= i < n • Both replacement operations are linear on average Fundamentals of Python: From First Programs Through Data Structures

  47. Inserting at the Beginning • Uses constant time and memory Fundamentals of Python: From First Programs Through Data Structures

  48. Inserting at the End • Inserting an item at the end of an array (appendin a Python list) requires constant time and memory • Unless the array must be resized • Inserting at the end of a singly linked structure must consider two cases: • Linear in time and constant in memory Fundamentals of Python: From First Programs Through Data Structures

  49. Inserting at the End (continued) Fundamentals of Python: From First Programs Through Data Structures

  50. Removing at the Beginning Fundamentals of Python: From First Programs Through Data Structures

More Related