Data Structures Lessons 9, 10, 11, & 12 COP1000
Overview
• Basic Concepts
• Sorting Techniques
• Stacks
• Queues
• Records
• Linked Lists
• Binary Trees
Basic Concepts
Whether using arrays or lists, there has to be some way to:
• Search the data structure
• Sort the data structure
To do this, we use keys…
Keys
• Should be:
  • unique,
  • short,
  • easy to understand,
  • easily recognizable, and
  • have some inherent value**
** There is some debate about this last feature.
Keys cont.
• Can be used in searching for a record and for sorting records within a data structure such as an array or a file.
• Ex. There “should be” only one of each:
  • Social Security Number
  • Driver’s License Number
  • Birth Certificate ID
Sorting
Sorting
• Places the elements in some particular order.
• Sort order can be:
  • Ascending (A-Z; Smallest to Largest) or
  • Descending (Z-A; Largest to Smallest)
Types of Sorts
• Bubble Sort (slow!)
• Selection Sort or Exchange Sort
• Insertion Sort
• Merge Sort
• Quick Sort
• Radix Sort
• Shell Sort (variation of Insertion Sort)
Two Simple Concepts
• Compare
  • A comparison is made between two pieces of data, and the result decides whether the data are moved or not.
• Exchange
  • An exchange occurs each time a piece of data is switched with another piece of data.
  • The Swap
The Swap Routine
Exchanges Two Values
    Private Sub Swap(Array(J), Array(J+1))
        If Array(J) > Array(J+1) Then
            Temp = Array(J)
            Array(J) = Array(J+1)
            Array(J+1) = Temp
        End If
    End Sub
• Use the Call statement to access the Swap Routine.
Bubble Sort
• One of the simplest to understand
• The Concept:
  • Lower numbers “float” to the top of the array and larger numbers “sink” to the bottom of the array.
The Process
• Successively exchanges adjacent pairs of elements in a series of passes, repeating the process until the entire sequence is in sorted order.
• Each pass starts at one end of the array and works toward the other end, exchanging each pair of adjacent elements that is out of order.
• For a sequence of n pieces of data, the first pass makes n - 1 comparisons.
• Each succeeding pass needs to consider one less piece of data than the previous pass.
A Bubble Sort
The first pass over five values (each column shows the array after one more compare):
    Start   1st   2nd   3rd   4th
    390     205   205   205   205
    205     390   182   182   182
    182     182   390    45    45
     45      45    45   390   235
    235     235   235   235   390
Worst Case
• Exchanges = 1/2 n(n - 1) = 1/2 × 5 × (5 - 1) = 10
• Compares = 1/2 n(n - 1) = 1/2 × 5 × (5 - 1) = 10
Best Case
• Exchanges = 0
• Compares = n - 1
The Good and The Bad
• Advantage:
  • If no exchanges are made during the first pass, the sequence is already in sorted order.
• Disadvantage:
  • It is one of the slowest sorting algorithms and is probably used only because its logic is easily understood.
Pseudocode Example
    'Get Array (or List) Input
    For Index = 1 to ListLength
        Input Value into Array(Index)
    Next Index
    'Then Sort
    For I = 1 to ListLength
        For J = 1 to ListLength - 1
            If Array(J) > Array(J+1) Then
                Call Swap(Array(J), Array(J+1))
            End If
        Next J
    Next I
• The call to Swap is now inside the If statement, rather than the If statement being inside the Swap procedure.
The Revised, Better Swap Routine
Exchanges Two Values
    Private Sub Swap(Array(J), Array(J+1))
        Temp = Array(J)
        Array(J) = Array(J+1)
        Array(J+1) = Temp
    End Sub
• Use the Call statement to access the Swap Routine.
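The bubble sort slides above can be sketched in runnable form. This is only an illustration in Python (an assumption; the deck itself uses pseudocode), and the name `bubble_sort` and the compare/exchange counters are hypothetical additions. It also stops early after a pass with no exchanges and shortens each pass by one element, which the pseudocode slide does not do.

```python
def bubble_sort(values):
    """Return (sorted copy, compares, exchanges) for an ascending bubble sort."""
    items = list(values)          # work on a copy so the caller's list is untouched
    compares = 0
    exchanges = 0
    # Each pass considers one less element than the previous pass.
    for last in range(len(items) - 1, 0, -1):
        swapped = False
        for j in range(last):
            compares += 1
            if items[j] > items[j + 1]:
                # The swap: tuple assignment plays the role of the Temp variable.
                items[j], items[j + 1] = items[j + 1], items[j]
                exchanges += 1
                swapped = True
        if not swapped:
            break                 # a pass with no exchanges means the list is sorted
    return items, compares, exchanges
```

For a five-element list in reverse order this sketch performs 10 compares and 10 exchanges, matching the worst-case formula 1/2 n(n - 1); for an already sorted list it performs n - 1 compares and no exchanges.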
The Selection Sort
• A rearrangement of data such that the data are in increasing (or decreasing) sequence.
The Algorithm for a Selection Sort
    For Index = 1 to ListLength - 1
        Find the position of the smallest element in List(Index..ListLength)
        If Index is not the position of the smallest element Then
            Exchange the smallest element with the one at position Index
        End If
    Next Index
The Process
• Selects the smallest (or largest) element from a sequence of elements.
• The values are moved into position by successively exchanging values in a list until the sequence is in sorted order.
• Only desirable property? Records of successively smaller keys are identified one by one, so that output of the sorted sequence can proceed virtually in parallel with the sort itself.
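The Selection Sort
Each column shows the list after one more pass; the last column is the sorted result:
    Start   Pass 1   Pass 2   Pass 3   Pass 4   Sorted
    390      45       45       45       45       45
    205     205      182      182      182      182
    182     182      205      205      205      205
     45     390      390      390      235      235
    235     235      235      235      390      390
• Exchanges = n - 1 = 5 - 1 = 4 (at most)
• Compares = 1/2 n(n - 1) = 1/2 × 5 × (5 - 1) = 10
Note: 390 is not included as it is the last item in the list.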
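As a rough Python sketch of the selection sort traced above (Python and the name `selection_sort` are assumptions, not part of the deck), the function below searches only the unsorted part of the list on each pass and exchanges only when the smallest element is not already in place.

```python
def selection_sort(values):
    """Return (sorted copy, compares, exchanges) for an ascending selection sort."""
    items = list(values)
    compares = 0
    exchanges = 0
    for index in range(len(items) - 1):
        # Find the position of the smallest element in items[index:].
        smallest = index
        for j in range(index + 1, len(items)):
            compares += 1
            if items[j] < items[smallest]:
                smallest = j
        # Exchange only if the smallest element is not already at position index.
        if smallest != index:
            items[index], items[smallest] = items[smallest], items[index]
            exchanges += 1
    return items, compares, exchanges
```

On the slide's data (390, 205, 182, 45, 235) this sketch makes 10 compares and 3 exchanges, because the third pass finds 205 already in position; n - 1 = 4 is the upper bound on exchanges.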
The Good and The Bad
• Advantages
  • Easiest to remember
  • The only desirable property is that records of successively smaller keys are identified one by one, so that output (or processing) of the sorted sequence can proceed virtually in parallel with the sort itself.
• Disadvantages
  • Still slow
The Insertion Sort
• Inserts each element into a sequence of sorted elements so that the resulting sequence is still sorted.
• With arrays, the values from the old array are inserted into a new array.
  • On average, half of the new array has to be compared to place each value.
• With lists, a new list is created from the values of the old list.
The Insertion Sort
The first column is the old list and the last column is the new list:
    Old List                           New List
    390       390     390     390       45
    205       205     205      45      182
    182       182      45     182      205
     45        45     182     205      235
    235       235     235     235      390
• Exchanges = n - 1 = 5 - 1 = 4
• Compares = 1/2 n(n - 1) = 1/2 × 5 × (5 - 1) = 10
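One way to picture the "new list built from the old list" idea is the short Python sketch below. Python and the name `insertion_sort` are assumptions for illustration; the intermediate states it passes through are not claimed to match the columns on the slide.

```python
def insertion_sort(old_list):
    """Build and return a new ascending list from the values of old_list."""
    new_list = []
    for value in old_list:
        # Walk back from the end of the sorted new list to find the slot
        # where value belongs, so the new list stays sorted after the insert.
        position = len(new_list)
        while position > 0 and new_list[position - 1] > value:
            position -= 1
        new_list.insert(position, value)
    return new_list
```

The old list is left unchanged, so a caller can compare the two afterwards.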
Sort Comparison Chart
Searching
Sequential Search
• Searches the list for a specific item.
• If the list is not ordered, the entire list must be searched before concluding that the item is not in the list.
• If the list is ordered, the list is searched only until a value is found that is equal to or larger than the search item.
Searching an Unordered List (Pascal Code)
    function ItemSearch (List : ListType;
                         Item : ComponentType): Boolean;
    var
      Index : Integer;
    begin
      Index := 1;
      List.Items[List.Length+1] := Item;   {sentinel: guarantees the loop stops}
      while List.Items[Index] <> Item do
        Index := Index + 1;
      ItemSearch := Index <> List.Length + 1
    end;
Searching an Ordered List (Pascal Code)
    function SeqSearch (List : ListType;
                        Item : ComponentType): Boolean;
    var
      Index : Integer;
      Stop : Boolean;
    begin
      Index := 1;
      Stop := False;                       {Initialize}
      List.Items[List.Length+1] := Item;
      While Not Stop Do
        {Item is not in List.Items[1]..List.Items[Index-1]}
        If Item > List.Items[Index] then
          Index := Index + 1
        Else
          Stop := True;
      {Item is either found or not there}
      SeqSearch := (Index <> List.Length + 1) and (Item = List.Items[Index])
    end;
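The difference between the two sequential searches can be seen by counting compares. The Python sketch below is an illustration only; the function names and the returned compare counts are hypothetical, and it uses an ordinary loop rather than the sentinel technique of the Pascal code.

```python
def item_search(items, target):
    """Sequential search of an unordered list: return (found, compares)."""
    compares = 0
    for value in items:
        compares += 1
        if value == target:
            return True, compares
    return False, compares        # every element had to be examined


def seq_search_ordered(items, target):
    """Sequential search of an ascending list: return (found, compares)."""
    compares = 0
    for value in items:
        compares += 1
        if value >= target:
            # Either the item itself or the first larger value: stop here.
            return value == target, compares
    return False, compares
```

Searching the ordered list 45, 182, 205, 235, 390 for 200 stops at 205 after three compares, while the unordered version has to look at all five values before giving up.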
The Binary Search
• Processes the list by dividing it in half and then searching only the half that could contain the item.
• List must be sorted.
• Much more efficient than a sequential search.
• In other words, a search of a 1000-element array (or list) would take only about 10 compares, as opposed to up to 1000 using a sequential search.
The Concept
• Divides the List into 3 components
  • List[1..Middle-1]
  • List[Middle]
  • List[Middle+1..Last]
[Figure: a list marked at [First], [Middle], and [Last], with the Item compared against the middle element]
The Recursive Pseudocode Algorithm
    If the sublist is empty Then
        Return with failure
    End If
    Compute the subscript of the middle element.
    If the Target is the middle value Then
        Middle value is target location
        Return with success
    ElseIf the Target is less than the middle value Then
        Search sublist with subscripts First..Middle-1
    Else
        Search sublist with subscripts Middle+1..Last
    End If
Code Example (Pascal Code)
    procedure BinarySearch (var List {input} : IntArray;
                            Target {input} : Integer;
                            First, Last {input} : Integer;
                            var Index {output} : Integer;
                            var Found {output} : Boolean);
    var
      Middle : Integer;
    begin
      Middle := (First + Last) div 2;     {Pascal's integer division}
      if First > Last then
        Found := False
      else if Target = List[Middle] then
        begin                             {Found the item}
          Found := True;
          Index := Middle
        end
      else if Target < List[Middle] then
        BinarySearch (List, Target, First, Middle-1, Index, Found)
      else
        BinarySearch (List, Target, Middle+1, Last, Index, Found)
    end;
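To check the "about 10 compares for 1000 elements" claim, the recursive procedure can be mirrored in Python with a probe counter. This is a sketch under stated assumptions: Python, zero-based subscripts, and the `probes` counter are not in the original, which uses Pascal and returns its result through `Index` and `Found`.

```python
def binary_search(items, target, first, last, probes=0):
    """Search ascending items[first..last]; return (index or None, probes)."""
    if first > last:
        return None, probes               # empty sublist: target is not there
    middle = (first + last) // 2          # integer division, like Pascal's div
    probes += 1                           # one middle element examined
    if target == items[middle]:
        return middle, probes             # found the item
    if target < items[middle]:
        return binary_search(items, target, first, middle - 1, probes)
    return binary_search(items, target, middle + 1, last, probes)
```

A call such as `binary_search(data, 42, 0, len(data) - 1)` searches the whole list. With 1000 sorted values, no search examines more than 10 middle elements, since each probe at least halves the remaining sublist.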
Linked Structures
The Concept
• Each data structure element contains
  • not only the element’s data value but
  • also the addresses of one or more other data elements.
• Examples:
  • Stacks
  • Queues
  • Trees
“I Love Trees…”
Linked List
• Probably the simplest linked structure
• Contains records
• Each element contains the address of the next list element.
• Linked lists are extremely flexible.
  • They make it easy to add new information by creating a new node and inserting it between two existing nodes.
  • It is also easy to delete a node.
Pointers
[Figure: a linked list of three nodes joined by pointers, with the last node pointing to nil]
• A data type whose values are the memory locations of values of other data types.
• The value a pointer points to is considered a Referenced Variable
  • A variable created and accessed not by a name but through a pointer variable; that is, a dynamic variable.
How the Link Works
• Linked Lists
  • A connected group of dynamically allocated records.
• Nodes
  • Records within a linked list.
[Figure: pointer P referencing an instance of Node with key and data fields]
Conceptual View of a Simple Linked List
As silly as it sounds… you always know where your head is…
[Figure: head points to the first node, current points to the current node, and the last node points to nil]
Link Operations
• List Head
  • The first node in a list.
• Inserting at the Head of a List
  • Is more efficient and easier
• Inserting at the End of a List
  • Less efficient because there is no specific pointer to the end of the list.
  • The list must be followed from the head to the last node before the insertion can be performed.
Link Operations cont.
• Deleting a Node
  • Change the Link field of the node’s predecessor so that it points to the node’s successor.
• Traversing a List
  • Processing the nodes in a list, starting with the list head and following the trail of pointers to the last node.
  • Head <> nil is the typical condition for loops that process lists.
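The link operations above might look like the following in Python. This is a minimal sketch, not the deck's own code: the class and method names are hypothetical, and Python's `None` stands in for Pascal's nil.

```python
class Node:
    """One record in the list: a key plus the link to the next node."""
    def __init__(self, key, next_node=None):
        self.key = key
        self.next = next_node             # None plays the role of nil


class LinkedList:
    def __init__(self):
        self.head = None                  # an empty list has no head node

    def insert_at_head(self, key):
        # One step: the new node points at the old head and becomes the head.
        self.head = Node(key, self.head)

    def insert_at_end(self, key):
        if self.head is None:
            self.head = Node(key)
            return
        current = self.head
        while current.next is not None:   # follow the list to its last node
            current = current.next
        current.next = Node(key)

    def delete(self, key):
        """Unlink the first node holding key; return True if one was found."""
        if self.head is None:
            return False
        if self.head.key == key:
            self.head = self.head.next
            return True
        predecessor = self.head
        while predecessor.next is not None:
            if predecessor.next.key == key:
                # The predecessor's link now points to the node's successor.
                predecessor.next = predecessor.next.next
                return True
            predecessor = predecessor.next
        return False

    def traverse(self):
        keys = []
        current = self.head
        while current is not None:        # the "<> nil" loop test
            keys.append(current.key)
            current = current.next
        return keys
```

Inserting at the head takes a single step, while inserting at the end has to walk the whole list first, which is the efficiency difference the slides describe.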
Dynamic Structures
Stacks
• A data structure in which only the top element can be accessed.
• Classic Example:
  • Plates in a buffet line.
  • Customer always takes the top-most plate.
  • Plates are replaced from the top.
• LIFO: Last-In First-Out Structure
  • Last element stored is the first to be removed.
Push & Pop
[Figure: three stacks, each with a head pointer: a stack (4 3 2 1), a popped stack (4 3 2 1), and a pushed stack (5 4 3 2 1)]
• Pushing Onto The Stack
  • Placing a new top element on the stack.
• Popping The Stack
  • Removing the top element of a stack.
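Push and pop can be tried out with a small Python class. This sketch assumes a Python list as the underlying storage (the deck's figure uses a head pointer instead), and the class and method names are hypothetical.

```python
class Stack:
    """Last-In First-Out structure; only the top element is accessible."""
    def __init__(self):
        self._items = []                  # the end of the list is the top

    def push(self, value):
        self._items.append(value)         # place a new top element

    def pop(self):
        if not self._items:
            raise IndexError("pop from an empty stack")
        return self._items.pop()          # remove and return the top element

    def top(self):
        if not self._items:
            raise IndexError("top of an empty stack")
        return self._items[-1]

    def is_empty(self):
        return not self._items
```

Pushing 1, 2, 3, 4 and then popping returns 4, the last value stored; pushing 5 afterwards makes 5 the new top.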
Queues
• A data structure in which elements are inserted at one end and removed from the other end.
• Classic Example:
  • Customers in a theater ticket line, or a list of jobs waiting to be executed.
• FIFO: First-In First-Out Structure
  • First element stored is the first to be removed.
• Variations include Array Queues, Priority Queues, and Schedule Queues.
Enqueue & Dequeue
[Figure: head and tail pointers for an empty queue; after enqueuing an element (dog); after enqueuing another element (cat, dog); after dequeuing an element (cat)]
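The enqueue and dequeue steps in the figure can be followed with a linked queue that keeps both a head and a tail pointer. The Python below is a sketch only; the class names are hypothetical and `None` stands in for nil.

```python
class QueueNode:
    def __init__(self, value):
        self.value = value
        self.next = None


class Queue:
    """First-In First-Out structure: insert at the tail, remove at the head."""
    def __init__(self):
        self.head = None                  # both pointers are nil when empty
        self.tail = None

    def enqueue(self, value):
        node = QueueNode(value)
        if self.tail is None:
            self.head = node              # first element: head and tail agree
        else:
            self.tail.next = node         # link the new node after the old tail
        self.tail = node

    def dequeue(self):
        if self.head is None:
            raise IndexError("dequeue from an empty queue")
        node = self.head
        self.head = node.next
        if self.head is None:
            self.tail = None              # the queue is empty again
        return node.value

    def is_empty(self):
        return self.head is None
```

Enqueuing "dog" and then "cat" and dequeuing once returns "dog" and leaves "cat" as the only element, as in the figure.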
Trees
• Similar to a linked list, except that each element carries with it the addresses of 2 or more other elements, rather than just one.
Binary Trees
• Similar to a linked list, except that each element carries with it the addresses of up to two other elements, rather than just one.
So… why is the tree upside down?
Binary Trees
• Contains at most two subtrees (or two children).
• Each subtree is identified as being either the left subtree or the right subtree of its parent.
• It may be empty (a pointer with no successors).
• Each node in a binary tree can have 0, 1, or 2 successor nodes.
Some Terms
• Root – the node at the top of a binary tree that has at least one node.
• Leaf Node – a node at the bottom of a binary tree; a node with zero successors.
• Left and Right subtrees – the two disjoint binary trees attached to the root of a binary tree.
• Disjoint subtrees – a node cannot be in both the left and the right subtree of the same node.
More Terms
• Parent-child relationship – the relationship between a node and its successors.
• Parent – the predecessor of a node.
• Child – the successor of a node.
• Edge – the line that connects two nodes.
• Siblings – two children of the same parent node.
More Terms
• Ancestors – all predecessors of a node. The root has no ancestors.
• Descendants – all successors of a node.
• Balanced, Minimal Path – the difference in length between any two paths from the root to a leaf is at most 1.
• So… how are these terms used?
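A few of these terms can be made concrete with a small Python sketch. Everything here is an assumption for illustration: the names are hypothetical, and `is_balanced` follows one reading of the "difference between any two paths is at most 1" definition, comparing the depths of the leaves.

```python
class TreeNode:
    """A binary tree node with at most two children (left and right)."""
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left                  # None means an empty subtree
        self.right = right


def is_leaf(node):
    return node.left is None and node.right is None   # zero successors


def descendants(node):
    """Return the keys of all successors of node."""
    found = []
    for child in (node.left, node.right):
        if child is not None:
            found.append(child.key)
            found.extend(descendants(child))
    return found


def leaf_depths(node, depth=0):
    """Return the length of the path from node to each leaf below it."""
    if is_leaf(node):
        return [depth]
    depths = []
    for child in (node.left, node.right):
        if child is not None:
            depths.extend(leaf_depths(child, depth + 1))
    return depths


def is_balanced(root):
    depths = leaf_depths(root)
    return max(depths) - min(depths) <= 1


# The root 50 is the parent of siblings 30 and 70; 20, 40, and 70 are leaves.
root = TreeNode(50, TreeNode(30, TreeNode(20), TreeNode(40)), TreeNode(70))
```

In this example tree, the descendants of the root are 30, 20, 40, and 70, and the tree counts as balanced under the reading above because its leaves sit at depths 2, 2, and 1.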