CS305/503, Spring 2009 Linked Lists, Stacks, and Queues. Michael Barnathan.
On the Exams • The midterm date on the syllabus is March 3. • Does this work for everyone? • We can easily change it. • The exams will be mostly conceptual. • The assignments, labs, and project are already assessing your programming enough. • Exams will cover topics such as arrays, sorting, recursion, and linked lists in the abstract. • Emphasis will be on understanding rather than memorized facts. • The review questions in each lecture are good preparation. • The exams will be open book, but no talking. • There will be a review the class before the exam.
Review: Recursion • What is recursion? • What sorts of problems does recursion solve? • What are the two essential components of a recursive algorithm? • What is a subproblem? • What do we do with the subproblems once we solve them? • What is one potential problem with the divide and conquer approach? • What is memoization and when can it help us? • Why did memoization cut down the complexity of the Fibonacci algorithm so dramatically?
Review: Linked Lists • Quick (20 min.) ungraded lab exercise: • Write a LinkedListNode class with: • An integer value. • A “next” LinkedListNode reference. • Create a linked list using your class with all consecutive integers from 1 to 100. • Now print them out in reverse order. • In O(n) time. • Don’t add a “prev” reference. • Don’t declare any additional data structures. • Hint: You can do it using recursion. We did it on Thursday. (Figure: a linked list 1 → 2 → 3 → … → 100.)
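One possible way to approach the exercise, as a sketch rather than the official lab solution. Only LinkedListNode and its two fields come from the slide; the ReversePrint class, build(), and printReversed() are hypothetical names chosen for illustration, and the PrintStream parameter is an assumption that makes the output easy to redirect.

```java
import java.io.PrintStream;

// The node class described on the slide: an integer value and a "next" reference.
class LinkedListNode {
    int value;
    LinkedListNode next;

    LinkedListNode(int value) {
        this.value = value;
    }
}

// Hypothetical helper class; assumes first <= last when building.
class ReversePrint {
    // Builds the list first -> first+1 -> ... -> last and returns its head.
    static LinkedListNode build(int first, int last) {
        LinkedListNode head = new LinkedListNode(first);
        LinkedListNode tail = head;
        for (int i = first + 1; i <= last; i++) {
            tail.next = new LinkedListNode(i);
            tail = tail.next;
        }
        return head;
    }

    // Recurse to the end of the list first, then print each value as the
    // calls return. Each node is visited once, so this is O(n).
    static void printReversed(LinkedListNode node, PrintStream out) {
        if (node == null) {
            return; // base case: we walked past the last node
        }
        printReversed(node.next, out);
        out.println(node.value);
    }

    public static void main(String[] args) {
        printReversed(build(1, 100), System.out); // prints 100 down to 1
    }
}
```

No “prev” reference and no extra collection is declared; the pending recursive calls remember the nodes for us.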
Here’s what we’ll be learning: • Data Structures: • Linked Lists. • Stacks. • Queues. • Deques. • Priority Queues. • Theory: • Stacks in recursion.
Reminder: Linked Lists. • Insertion: O(1) • Access: O(n) • Updating an element: O(1) • Deleting an element: O(1) • Search: O(n). • Merge: O(1). • Dynamically sized by nature. • Just stick a new node at the end. • Modifications are fast, but sequential node access is the killer. • And you need to access the nodes before performing other operations on them, so the O(1) figures above assume you already hold a reference to the node. • Three main uses: • When search/access is not very important (e.g. logs, backups). • When you’re merging and deleting a lot. • When you need to iterate through the list sequentially anyway.
Recursive Definition of Linked Lists • Just like arrays, a linked list of size n is a linked list of size n-1 plus a node. • This is a pretty common definition for “linear” data structures such as arrays and linked lists. • Note that even though the recursive definitions are the same, arrays and lists still have different properties.
Searching and Sorting on Lists. • Sequential access causes problems in our partition-based algorithms. • You can’t perform binary search efficiently. • Moving to the new middle is linear. • Likewise, don’t use a list for the guessing game. • All of the basic sorting algorithms we’ve learned can be made to work on linked lists. • In general, the constant-time modification speeds the algorithms up, but the search behavior slows them down. • The general runtime ends up the same. • Faster sorting algorithms have problems, however. • We’ll talk about them soon.
Any way to improve this? • Insertion and deletion are constant time. • But reaching the node to delete, or the node to insert after, is linear in the first place. • So it’s really the middle of the list that has problems. We have direct pointers to the ends. • When faced with problems of this type, ask yourself “do we need this much power?” • If the answer is no, restrict your data structures for better performance.
Restricting to the ends. • Stacks, queues, and deques are data structures that restrict insertion, deletion, and access to the end(s) of the structure. • The primary difference between them is which end(s) operations are performed on. • These structures are often built on top of linked lists. • Through encapsulation, we can use a LinkedList as a low-level structure but restrict the high-level operations to the end(s) of the structure.
Stacks • Stacks are like stacks of dishes. • You can only add one to the top of the stack. • You can only take one off of the top of the stack. • You can’t even look at the dishes in the middle. • If you tried to add one in the middle, you’d need to set aside all of the dishes above it, add the new dish, then add the old dishes back on top of it. • Same with removal; if you just yanked a dish out of the middle, you’d get porcelain all over the floor. • You can only operate directly on the “top” element of a stack.
Terminology • Adding an element to the top: “push”. • Removing an element from the top: “pop”. • Most recently inserted element: “top”. • Access the top value without removing it: “peek”. • “push” and “pop” are sometimes used in other data structures as well. • They simply mean “add to/remove from one end”. • Java’s Stack (which extends Vector) and LinkedList have push() and pop(); STL containers use push_back()/pop_back() and, where supported, push_front()/pop_front().
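The terminology above can be sketched as a small class that hides a singly linked list and exposes only the top. IntStack, its Node helper, and isEmpty() are hypothetical names, not classes from the course; this is a minimal illustration, assuming int elements and an exception on an empty stack.

```java
import java.util.NoSuchElementException;

// A minimal sketch: a stack of ints that encapsulates a singly linked list.
// Callers can only reach the top node; the rest of the list is hidden.
class IntStack {
    private static class Node {
        final int value;
        final Node next;

        Node(int value, Node next) {
            this.value = value;
            this.next = next;
        }
    }

    private Node top; // null when the stack is empty

    // push: the new node becomes the top and points at the old top.
    void push(int value) {
        top = new Node(value, top);
    }

    // pop: remove and return the top value.
    int pop() {
        if (top == null) {
            throw new NoSuchElementException("stack is empty");
        }
        int value = top.value;
        top = top.next;
        return value;
    }

    // peek: return the top value without removing it.
    int peek() {
        if (top == null) {
            throw new NoSuchElementException("stack is empty");
        }
        return top.value;
    }

    boolean isEmpty() {
        return top == null;
    }

    public static void main(String[] args) {
        IntStack stack = new IntStack();
        stack.push(42);
        stack.push(50);
        System.out.println(stack.pop());  // 50: last in, first out
        System.out.println(stack.peek()); // 42 is the top again
    }
}
```

Every operation touches only the head of the list, which is why none of them depends on how many elements are stored.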
LIFO • Stacks exhibit what is called “last-in-first-out” (LIFO) behavior. • You add an element at the top of a stack. • If you were to then remove an element, it would be the one you just added – the last element you inserted. • The last element to go “into” the stack is the first element to be taken “out of” the stack. • Tip: You can reverse sequences of things this way. • Sound familiar?
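Example (figure): Push 50 onto a stack whose top is 42, so that 50 becomes the top; then Pop, which removes 50 and leaves 42 on top. Only the top is accessible; the rest of the stack is inaccessible.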
Stacks in Recursion • The system actually maintains function calls in a stack. • That includes all of the parameters of the function. • When you call a recursive function, say the printTo function we discussed last class, you have an argument named “int n”. • When it calls printTo(n-1), you invoke another function with its own copy of “int n”. • And so on. • When you use “n”, you are accessing the top of the stack. • When you call printTo(n-1), you are pushing n-1 onto the stack. • When the function call exits, its “n” is popped from the stack. • So what printTo() is doing is generating a stack of numbers from 1 to n, then outputting the top element right before it’s popped. • That is why we were able to reverse the order by moving the print statement above the recursive call. • We were outputting the top element right after it was pushed. • Since stacks are LIFO, we pushed in descending order, but popped in ascending order.
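The push/pop view of recursion can be tried out with a sketch like the following. The original printTo from the previous class is not shown in these slides, so the base case (stop when n < 1), the second method printDownFrom, the class name, and the PrintStream parameter are all assumptions made for illustration.

```java
import java.io.PrintStream;

class PrintToDemo {
    // Print BELOW the recursive call: each n is printed just before its
    // frame is popped, so the output is ascending (1, 2, ..., n).
    static void printTo(int n, PrintStream out) {
        if (n < 1) {
            return; // assumed base case
        }
        printTo(n - 1, out); // pushes a new frame holding n-1
        out.println(n);      // runs as this frame is about to be popped
    }

    // Print ABOVE the recursive call: each n is printed right after its
    // frame is pushed, so the output is descending (n, n-1, ..., 1).
    static void printDownFrom(int n, PrintStream out) {
        if (n < 1) {
            return; // assumed base case
        }
        out.println(n);
        printDownFrom(n - 1, out);
    }

    public static void main(String[] args) {
        printTo(3, System.out);       // 1 2 3, one per line
        printDownFrom(3, System.out); // 3 2 1, one per line
    }
}
```

The only difference between the two methods is which side of the recursive call the print statement sits on.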
Example: printTo(): Pushing in descending order, popping in ascending order. (Figure: the call stack growing from n at the bottom to 1 at the top. System.out.println() above the recursive call: printing on push. System.out.println() below the recursive call: printing on pop.)
CRUD: Stacks • Push: ? • Pop: ? • Peek: ? • Search: ? • Any ideas? • Assume the underlying representation is a linked list. • The performance is actually the same if you use a vector underneath.
CRUD: Stacks • Push: O(1) • Pop: O(1) • Peek: O(1) • Search: O(n) • Pushing, popping, and peeking are insertion, deletion, and access at the end of a linked list. • To search, you need to pop values one by one, check them, then push them back on. • You can store them in another stack to avoid reversing the order (or, more accurately, to reverse the order twice).
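The search procedure described above (pop, check, set aside on a second stack, push back) might look like this. It uses java.util.ArrayDeque through the Deque interface as the stack; the class name StackSearch and the convention of returning the depth from the top (0 for the top element, -1 if absent) are assumptions for this sketch.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class StackSearch {
    // Returns how far below the top the target sits (0 = top), or -1 if it
    // is not present. The stack is restored to its original order.
    static int search(Deque<Integer> stack, int target) {
        Deque<Integer> aside = new ArrayDeque<>(); // second stack
        int depth = -1;
        while (!stack.isEmpty()) {
            int value = stack.pop();
            aside.push(value); // order is reversed once here
            if (value == target) {
                depth = aside.size() - 1;
                break;
            }
        }
        // Pushing everything back reverses the order a second time,
        // which restores the original stack.
        while (!aside.isEmpty()) {
            stack.push(aside.pop());
        }
        return depth;
    }

    public static void main(String[] args) {
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(1);
        stack.push(2);
        stack.push(3); // 3 is on top
        System.out.println(search(stack, 2)); // 1: one element below the top
        System.out.println(stack.peek());     // 3: the stack is unchanged
    }
}
```

In the worst case every element is popped and pushed once, which is where the O(n) comes from.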
The System Stack and Exceptions • Note that this stack is never declared. • It’s automatically and transparently generated for you by the system. • This is just how it handles function calls. • When a program throws an exception, you get a “trace” of this stack. • It lists every function on the stack and the line in each that the exception passed through. • Exceptions pass up the stack until they are either caught or until they pass main(). • You can catch exceptions at any level at or above the caller. • For example, my EmployeeLoader class threw a FileNotFoundException in its constructor. You caught it in main(). • If the exception is still uncaught when it leaves main(), the Java runtime will catch it, output an error trace, and terminate. • This is called stack unwinding. It happens in C++ too (C has no exceptions).
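A small sketch of an exception travelling up the call stack. This is not the course’s EmployeeLoader; loadEmployees(), startup(), run(), and the class name are hypothetical stand-ins, and the exception is thrown unconditionally to simulate a missing file.

```java
import java.io.FileNotFoundException;

class UnwindDemo {
    // Stand-in for a loader whose file is missing (always throws here).
    static void loadEmployees(String path) throws FileNotFoundException {
        throw new FileNotFoundException(path);
    }

    // Does not catch the exception, so it keeps moving up the stack.
    static void startup(String path) throws FileNotFoundException {
        loadEmployees(path);
    }

    // Catches the exception two levels above where it was thrown.
    static String run(String path) {
        try {
            startup(path);
            return "loaded";
        } catch (FileNotFoundException e) {
            // The trace records the frames that were on the stack when the
            // exception was created; element 0 is the innermost one.
            StackTraceElement[] trace = e.getStackTrace();
            return "caught after unwinding from " + trace[0].getMethodName();
        }
    }

    public static void main(String[] args) {
        System.out.println(run("Employees.csv"));
    }
}
```

Removing the try/catch from run() and declaring the exception on main() instead would let it pass main(), at which point the runtime prints the trace and terminates.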
Queues • Similar to stacks, except you insert at the back and remove at the front. • Think of it as a real queue… waiting on line at a checkout counter, for example. • New people are added to the back of the line. • They leave from the front.
Terminology • Enqueue (pronounced “N Q”): insert into the back of the queue. • Dequeue (pronounced “day Q” or “D Q”): remove from the front of the queue. • Back: whichever end you insert at. • Front: whichever end you remove from. • Back != Front (otherwise it’s a stack).
FIFO • Queues are first-in-first-out (FIFO). • Also referred to as first-come-first-served (FCFS). • The first element inserted into the queue is the first one that will leave. • Example: 200 people waiting on line for Wiis. • The ones who camped at the store the previous night are the ones who will get them first. • The people who showed up later will have to wait for the others.
Example: Enqueue 1, 2, 3. Dequeue thrice. 1 goes in first, comes out first. 2 goes in second, comes out second. 3 goes in third, comes out third. FIFO! (Figure: the queue after each operation, with “front” and “back” labels.)
CRUD: Queues • Enqueue: O(1) • Dequeue: O(1) • Peek: O(1) • Search: O(n) • Since all we’ve done is change the end we add to, the performance remains the same.
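To see why both ends stay O(1), here is a sketch of a queue on a singly linked list that keeps direct references to both the front and the back. IntQueue and its members are hypothetical names; int elements and an exception on an empty queue are assumptions.

```java
import java.util.NoSuchElementException;

// A minimal sketch: a FIFO queue of ints on a singly linked list.
class IntQueue {
    private static class Node {
        final int value;
        Node next;

        Node(int value) {
            this.value = value;
        }
    }

    private Node front; // where elements leave
    private Node back;  // where elements arrive

    // enqueue: link the new node after the current back. O(1).
    void enqueue(int value) {
        Node node = new Node(value);
        if (back == null) {
            front = node; // first element: it is both front and back
        } else {
            back.next = node;
        }
        back = node;
    }

    // dequeue: unlink and return the front value. O(1).
    int dequeue() {
        if (front == null) {
            throw new NoSuchElementException("queue is empty");
        }
        int value = front.value;
        front = front.next;
        if (front == null) {
            back = null; // the queue just became empty
        }
        return value;
    }

    // peek: look at the front value without removing it. O(1).
    int peek() {
        if (front == null) {
            throw new NoSuchElementException("queue is empty");
        }
        return front.value;
    }

    boolean isEmpty() {
        return front == null;
    }

    public static void main(String[] args) {
        IntQueue queue = new IntQueue();
        queue.enqueue(1);
        queue.enqueue(2);
        queue.enqueue(3);
        System.out.println(queue.dequeue()); // 1
        System.out.println(queue.dequeue()); // 2
        System.out.println(queue.dequeue()); // 3
    }
}
```

Removing at the head and adding at the tail is the cheap direction for a singly linked list; removing at the tail would require a walk to find the previous node.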
Deques: • “Double-ended queues”. • Sometimes spelled “dequeue”, but that’s confusing because deletion from a queue is also called that. • Usually pronounced “deck” or “day Q”. • This is just a queue where you can add and remove at both ends. • It’s up to you whether to treat them as stacks or queues in your program. • Because these aren’t necessarily LIFO or FIFO unless you use them that way, they’re not commonly used. • If you want LIFO behavior, you can use a stack. • If you want FIFO behavior, you can use a queue.
Deques: Terminology • Push_front: Add an element to the front end. • Push_back: Add an element to the back end. • Pop_front: Remove the front element. • Pop_back: Remove the back element. • C++ uses these names on std::deque and std::list; Java’s Deque interface (ArrayDeque, LinkedList) calls them addFirst(), addLast(), removeFirst(), and removeLast().
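For reference, the four operations map onto Java’s standard java.util.Deque interface as follows. The DequeDemo class and its demo() method are hypothetical names for this sketch; the Deque methods themselves are part of the standard library.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class DequeDemo {
    static String demo() {
        Deque<Integer> deque = new ArrayDeque<>();
        deque.addLast(2);  // push_back
        deque.addLast(3);  // push_back
        deque.addFirst(1); // push_front, contents are now 1 2 3

        int front = deque.removeFirst(); // pop_front removes 1
        int back = deque.removeLast();   // pop_back removes 3

        // peekFirst looks at the remaining front element without removing it.
        return front + " " + back + " " + deque.peekFirst();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "1 3 2"
    }
}
```

Using only addFirst/removeFirst gives stack behavior, and addLast/removeFirst gives queue behavior.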
CRUD: Deques • Push front/back: O(1) • Pop front/back: O(1) • Peek: O(1) • Search: O(n) • Again, nothing is really changing here.
Priority Queues • Like queues, but some people are more important and get to cut the line. • Imagine you’re waiting for that Wii and Bill Gates walks in. He walks straight up to the cashier, buys it, and leaves. • Your first thought would probably be “wow, even Bill Gates has no confidence in the Xbox 360”. • But your second would probably be “hey, he just cut the whole line!” • Yes, because Bill Gates is more important than you. • Sorry, sorry. But you can be better programmers than he is.
Priority Queues • Humor aside, this is how priority queues work. • Every element has a value and a priority. • The element with the highest priority is always the next one to be removed from the queue. • This is no longer FIFO or LIFO. • The highest priority in is now first out. • Obviously, guaranteeing this requires some work, either on insertion or retrieval.
Priority Queues • Are useful data structures: • Most CPU schedulers use them. • Print queues can use them. • Elevators can use them. • Businesses can use them to model their processes and risks. • Testers can use them to categorize bugs. • They are appropriate whenever certain elements should be prioritized over others.
The Insertion Strategy: • One way to implement a priority queue is to use an array or linked list underneath and insert elements into it sorted by priority. • This guarantees that the element at the front of the queue is the one with highest priority. • This incurs a cost: • For arrays, finding the place to put the element is O(log n), but shifting the elements over is O(n). • For linked lists, insertion is O(1), but finding the right place to insert into is O(n). • Either way, this requires O(n) time per insertion. • On the other hand, it only takes O(1) to dequeue.
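A sketch of the insertion strategy on a linked list: the list is kept sorted with the highest priority at the front, so enqueue walks the list and dequeue just removes the head. The class name, the String payload, the int priority (larger means more important), and the FIFO tie-breaking are all assumptions for illustration.

```java
import java.util.NoSuchElementException;

// Insertion strategy: pay O(n) on enqueue, O(1) on dequeue.
class SortedListPriorityQueue {
    private static class Node {
        final String value;
        final int priority;
        Node next;

        Node(String value, int priority) {
            this.value = value;
            this.priority = priority;
        }
    }

    private Node front; // always the highest-priority element

    void enqueue(String value, int priority) {
        Node node = new Node(value, priority);
        if (front == null || priority > front.priority) {
            node.next = front; // new highest priority: goes to the front
            front = node;
            return;
        }
        // Walk past every element with priority >= the new one, so that
        // equal priorities leave in the order they arrived. This is O(n).
        Node current = front;
        while (current.next != null && current.next.priority >= priority) {
            current = current.next;
        }
        node.next = current.next; // the link itself is O(1)
        current.next = node;
    }

    String dequeue() {
        if (front == null) {
            throw new NoSuchElementException("queue is empty");
        }
        String value = front.value;
        front = front.next;
        return value;
    }

    boolean isEmpty() {
        return front == null;
    }

    public static void main(String[] args) {
        SortedListPriorityQueue queue = new SortedListPriorityQueue();
        queue.enqueue("low", 1);
        queue.enqueue("high", 9);
        queue.enqueue("mid", 5);
        while (!queue.isEmpty()) {
            System.out.println(queue.dequeue()); // high, mid, low
        }
    }
}
```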
The Selection Strategy: • Another approach is to keep the array/list in unsorted order and find the right element when we dequeue. • Insertion then becomes O(1). • But access is then O(n). • You have to search through the array linearly. • Binary search cannot be used here, since the structure is unsorted.
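The selection strategy, sketched with an unsorted ArrayList: enqueue appends, and dequeue scans for the highest priority. As before, the class name, String payload, and int priority (larger means more important) are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;

// Selection strategy: pay O(1) on enqueue, O(n) on dequeue.
class UnsortedPriorityQueue {
    private static class Entry {
        final String value;
        final int priority;

        Entry(String value, int priority) {
            this.value = value;
            this.priority = priority;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    // Just append; no attempt is made to keep the list ordered.
    void enqueue(String value, int priority) {
        entries.add(new Entry(value, priority));
    }

    // Linear scan for the highest priority, then remove that entry.
    // On ties, the earliest-inserted entry wins because we only replace
    // the best candidate on a strictly greater priority.
    String dequeue() {
        if (entries.isEmpty()) {
            throw new NoSuchElementException("queue is empty");
        }
        int best = 0;
        for (int i = 1; i < entries.size(); i++) {
            if (entries.get(i).priority > entries.get(best).priority) {
                best = i;
            }
        }
        return entries.remove(best).value; // remove(int index)
    }

    boolean isEmpty() {
        return entries.isEmpty();
    }

    public static void main(String[] args) {
        UnsortedPriorityQueue queue = new UnsortedPriorityQueue();
        queue.enqueue("low", 1);
        queue.enqueue("high", 9);
        queue.enqueue("mid", 5);
        while (!queue.isEmpty()) {
            System.out.println(queue.dequeue()); // high, mid, low
        }
    }
}
```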
Queues and Sorts • In either case, inserting elements into a priority queue then removing them sorts them by priority. • The insertion and selection strategies are analogous to their respective sorts. • In a selection-based queue, you must find the maximum priority item and return it as if it were at the end of the queue, just as you found the minimum value and swapped it to the end in selection sort. • In an insertion-based queue, you must insert the element in its proper position, shifting elements beyond it over (in an array). • There is another implementation of a priority queue using a data structure called a heap. • And consequently, another sorting algorithm, called heapsort. • We will cover this when we get to heaps.
CRUD: Priority Queues • Enqueue (selection): O(1). • Enqueue (insertion): O(n). • Dequeue (selection): O(n). • Dequeue (insertion): O(1). • So you either pay on insert or access. For now, that’s your tradeoff. • The heap strategy looks like a compromise: • O(log n) for both. • But it’s not much of a compromise: • O(log n) is far better than O(n).
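Heaps come later in the course, but the heap strategy can already be tried through Java’s standard java.util.PriorityQueue, which is heap-based with O(log n) offer() and poll(). By default it removes the smallest element first, so this sketch passes Comparator.reverseOrder() to get highest-priority-first behavior. HeapQueueDemo and drain() are hypothetical names.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

class HeapQueueDemo {
    // Enqueues every priority, then dequeues them all, highest first.
    static int[] drain(int... priorities) {
        PriorityQueue<Integer> heap = new PriorityQueue<>(Comparator.reverseOrder());
        for (int p : priorities) {
            heap.offer(p); // O(log n) per enqueue
        }
        int[] out = new int[priorities.length];
        for (int i = 0; i < out.length; i++) {
            out[i] = heap.poll(); // O(log n) per dequeue
        }
        return out;
    }

    public static void main(String[] args) {
        for (int p : drain(3, 9, 1)) {
            System.out.println(p); // 9, then 3, then 1
        }
    }
}
```

Inserting n elements and removing them all sorts them in O(n log n) total, which is the idea behind heapsort.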
Common Applications • Stacks: • The system stack keeps track of function calls. • Pointers to free spaces on disk and memory in the OS are often accessed like stacks. • You can implement a whole computer using just a finite-state controller and two stacks. • Really. Two stacks can simulate the tape of a Turing machine. • Parsers make use of these. • Particularly a type of parser known as a “pushdown automaton”. • These get used a lot in compilers. • They’re handy for storing and reversing lists of numbers. • Queues: • Used a lot in synchronization of event-driven or multithreaded apps. • Events stream in and get stored in a queue until the app can handle them. • Used to model “arrivals” in general. • Used in almost all CPU scheduling algorithms. • Used for buffering device I/O. • Used for print jobs. • Priority queues: • Used for everything queues are used for when priority is important, and then some.
(Push (Push (Push Pop) Pop) Pop) • This is all we will cover on simple stacks and queues. They are fairly simple structures. • We will come back to priority queues when we learn about heaps. • The lesson: • It is often better to store things, prioritize them, and finish them one at a time than to attend to everything as it demands your attention. • Next class: Mergesort, Shellsort, Quicksort.
Assignment 2: • Using the EmployeeLoader class, write a program that groups employees (read from Employees.csv) by city. • Compute the average salary for each city. • Output the names and average salaries of the 25 cities with the highest average salaries, in descending order of average salary. • You are effectively implementing the following SQL query: • SELECT City, AVG(Salary) FROM Employees GROUP BY City ORDER BY AVG(Salary) DESC LIMIT 25 • Describe the data structures and algorithms you used and why you chose them. • Deadline: Tuesday, February 24.