JETT 2005 Session 5: Algorithms, Efficiency, Hashing and Hashtables
Today’s buzzwords • Algorithm • A strategy to solve a problem • A systematic approach that describes the solution process • Complexity and Efficiency • How does an algorithm scale when the input size grows? • How do two algorithms solving the same problem compare? • Big Oh Notation • A theoretical measure of the execution of an algorithm, in terms of the problem size n – usually the number of items • Hashing function • A function that takes an object and generates a number (or an address) indicating the location where the object should be placed • Hash Table • A data structure that stores items in designated places, using hashing functions to speed up key-based search
Algorithms • Every problem can be solved given enough computing power • But it doesn’t hurt to use as little as you can • Problems do not have to be boring number-crunching • Solutions are almost always elegant • A good algorithm uses fewer computing resources (less memory, less CPU time) • Surprisingly, it may not even take more lines of code!
So – let’s solve a fun problem! How would you take the mouse to the cheese?
An algorithm to solve this problem: 1. First, read in the maze and populate a 2D array of Rooms. 2. Next, create an instance of a stack. 3. Push the Room corresponding to the mouse’s position onto the stack. 4. Peek at the top room of the stack. If this room has not been visited, do the following: 5. Mark it as visited. If the room’s location is the cheese’s location, you are done. If not, find all unvisited rooms accessible from this top room (in a fixed order or a random order – it does not matter). 6. Push all the neighboring unvisited rooms onto the stack. 7. If the top room is already visited, pop it off the stack. Continue with Step 4. 8. If the stack becomes empty, guess what? There is no solution to the problem.
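The steps above can be sketched in Java. This is a minimal illustration, not the original JETT code: the maze is simplified to a grid of booleans (true = open room), and the class and method names are made up for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// A sketch of the stack-based maze search described above.
public class MazeSolver {

    // Returns true if the cheese is reachable from the mouse.
    public static boolean solve(boolean[][] maze,
                                int mouseRow, int mouseCol,
                                int cheeseRow, int cheeseCol) {
        int rows = maze.length, cols = maze[0].length;
        boolean[][] visited = new boolean[rows][cols];
        Deque<int[]> stack = new ArrayDeque<>();
        stack.push(new int[] { mouseRow, mouseCol });   // step 3

        while (!stack.isEmpty()) {                      // step 8: empty => no path
            int[] top = stack.peek();                   // step 4: peek, don't pop yet
            int r = top[0], c = top[1];
            if (visited[r][c]) {                        // step 7: already seen, backtrack
                stack.pop();
                continue;
            }
            visited[r][c] = true;                       // step 5: mark visited
            if (r == cheeseRow && c == cheeseCol) {
                return true;                            // found the cheese
            }
            // step 6: push all open, unvisited neighbors (order is arbitrary)
            int[][] moves = { {-1, 0}, {1, 0}, {0, -1}, {0, 1} };
            for (int[] m : moves) {
                int nr = r + m[0], nc = c + m[1];
                if (nr >= 0 && nr < rows && nc >= 0 && nc < cols
                        && maze[nr][nc] && !visited[nr][nc]) {
                    stack.push(new int[] { nr, nc });
                }
            }
        }
        return false;   // stack ran empty: no solution
    }
}
```

Note that the `visited` check happens at peek time rather than push time, exactly as in steps 4 and 7 above; a room may sit on the stack more than once, but it is explored only once.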
A data structure to solve these problems: Game trees. [Figure: a game tree with the Start position at the root and the goal at a leaf]
Algorithms to solve the problems • Easiest: Depth-first search • Simplest; guaranteed to find a solution if one exists • Easily implemented with a stack, as we saw • Inefficient – a lot of backtracking; will potentially visit all nodes! • Pruning trees • Each node is given a weight based on how close it is to the goal • The idea is to pick the node with the highest weight • Creating the tree is more difficult • May not always find the “best” solution
Complexity and Efficiency of Algorithms • Complexity is not: • How hard the algorithm is to implement • How many lines of code it takes to implement it • Complexity is: • How many operations the algorithm performs to solve the problem • How many resources it takes to solve the problem • A program is more complex (less efficient) if: • It performs more operations • Takes more time and memory • Scales worse as the input size grows
How is Efficiency Measured? • Big Oh Notation: • A theoretical measure of the execution of an algorithm, usually the time or memory needed, given the problem size n, which is usually the number of items. • Informally, f(n) = O(g(n)) means that, for large enough n, f(n) is at most a constant multiple of g(n). • The notation is read, “f of n is big oh of g of n”.
So what does that mean? • Two algorithms with the same efficiency in Big Oh notation scale the same way • There may be a difference in actual running times • For example, Bubblesort and Insertionsort are both O(n²), but Insertionsort is typically faster on an input of a given size • However, as the input size grows, both algorithms show the same growth in runtime • Provides a means to compare algorithms
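One way to see the difference between two O(n²) sorts is to count the comparisons each one makes on the same input. The sketch below is illustrative (the class name `SortCount` is invented for this example): bubble sort always performs exactly n(n−1)/2 comparisons, while insertion sort usually performs fewer, yet both grow quadratically.

```java
// Counts comparisons made by two O(n^2) sorts on the same input.
public class SortCount {

    public static long bubbleSort(int[] a) {
        long comparisons = 0;
        for (int i = 0; i < a.length - 1; i++) {
            for (int j = 0; j < a.length - 1 - i; j++) {
                comparisons++;                          // every pair is compared
                if (a[j] > a[j + 1]) {
                    int t = a[j]; a[j] = a[j + 1]; a[j + 1] = t;
                }
            }
        }
        return comparisons;
    }

    public static long insertionSort(int[] a) {
        long comparisons = 0;
        for (int i = 1; i < a.length; i++) {
            int key = a[i], j = i - 1;
            while (j >= 0) {
                comparisons++;                          // stops early once in place
                if (a[j] <= key) break;
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = key;
        }
        return comparisons;
    }
}
```

Doubling the input size roughly quadruples both counts – that shared growth rate is what the common O(n²) label captures; the gap between the two counts is only a constant factor.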
Hash Table – one of the most efficient data structures for key-based Collections • Start with an array that holds the hash table. • Use a hash function to map a key to some index in the array. This function will generally map several different keys to the same index. • If the desired record is in the location given by the index, then we’re finished; otherwise we must use some method to resolve the collision that may have occurred between two records wanting to go to the same location. • This process is called hashing. To use hashing we must • find good hash functions • determine how to resolve collisions
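The key-to-index step can be sketched in a few lines. This example (the name `HashDemo` is invented here) reuses Java’s own `String.hashCode()` and reduces it modulo the table size – which also shows why different keys can land on the same index:

```java
// A minimal sketch of the hashing step: a hash function maps a key
// to an array index; several keys may collide on the same index.
public class HashDemo {

    // Map a string key to a slot in a table of the given size.
    public static int indexFor(String key, int tableSize) {
        int h = key.hashCode() % tableSize;
        return h < 0 ? h + tableSize : h;   // hashCode() can be negative
    }
}
```

With, say, a 16-slot table there are only 16 possible indexes for an unbounded set of keys, so collisions are not an accident of a bad function – they are unavoidable, which is why the resolution strategies below matter.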
Collision Resolution with Open Addressing • Linear Probing: linear probing starts with the hash address and searches sequentially for the target key or an empty position. The array should be considered circular, so that when the last location is reached, the search proceeds to the first location of the array. [Figure: clustering – colliding keys a b c d e f piling up into one contiguous run of occupied slots]
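A linear-probing table can be sketched as below; this is an assumption-laden illustration (class name `LinearProbeTable` is invented, keys are bare strings with no values) of the circular sequential search just described:

```java
// A sketch of open addressing with linear probing: start at the hash
// address and step forward one slot at a time, wrapping around,
// until the key or an empty position is found.
public class LinearProbeTable {
    private final String[] slots;

    public LinearProbeTable(int capacity) { slots = new String[capacity]; }

    private int home(String key) {
        int h = key.hashCode() % slots.length;
        return h < 0 ? h + slots.length : h;
    }

    // Returns the index the key was stored at, or -1 if the table is full.
    public int put(String key) {
        int h = home(key);
        for (int i = 0; i < slots.length; i++) {
            int idx = (h + i) % slots.length;   // modulo makes the array circular
            if (slots[idx] == null || slots[idx].equals(key)) {
                slots[idx] = key;
                return idx;
            }
        }
        return -1;  // every slot occupied
    }

    // Returns true if the key is present.
    public boolean contains(String key) {
        int h = home(key);
        for (int i = 0; i < slots.length; i++) {
            int idx = (h + i) % slots.length;
            if (slots[idx] == null) return false;   // empty slot ends the probe
            if (slots[idx].equals(key)) return true;
        }
        return false;
    }
}
```

The clustering in the figure falls out of this code directly: any key hashing into an occupied run is pushed to the end of that run, making the run – and future collisions with it – even longer.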
Collision Resolution with Open Addressing (Contd.) • Quadratic Probing: if there is a collision at hash address h, quadratic probing goes to locations h+1, h+4, h+9, …, that is, to locations h + i² for i = 1, 2, … • Other Methods: • Key-dependent increments • Random probing
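The quadratic probe sequence itself is one line of arithmetic; a small sketch (the name `QuadraticProbe` is invented for this example):

```java
// The quadratic probe sequence: from hash address h, the i-th probe
// visits h + i*i, taken modulo the table size m, giving
// h, h+1, h+4, h+9, ... as described above.
public class QuadraticProbe {
    public static int probe(int h, int i, int m) {
        return (h + i * i) % m;
    }
}
```

Spreading later probes farther apart is the point: unlike linear probing, keys that collide at h do not all crawl along the same contiguous run, which reduces the clustering seen earlier.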
Birthday Surprise: Why are Collisions so Likely? • If 24 or more randomly chosen people are in a room, what is the probability that two of them have the same birthday? (Better than one half!) • For hashing, the birthday surprise says that for any problem of reasonable size, collisions will almost certainly occur.
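The birthday probability can be computed directly by multiplying the chances that each successive person avoids all earlier birthdays (a sketch; `Birthday` is an invented name, and it assumes 365 equally likely birthdays):

```java
// Probability that, among k people, at least two share a birthday.
public class Birthday {
    public static double sharedBirthdayProbability(int k) {
        double pAllDistinct = 1.0;
        for (int i = 0; i < k; i++) {
            pAllDistinct *= (365.0 - i) / 365.0;  // person i avoids the i earlier birthdays
        }
        return 1.0 - pAllDistinct;
    }
}
```

The same computation applies to hashing with 365 replaced by the table size: with m slots, the chance of at least one collision passes 1/2 after only about √m keys, long before the table is full.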
Moral of the Story • Data Structures provide the vehicle for problem solving • The algorithm is the route from the source to the destination • Efficiency is the time you take to make the trip! • An interstate is more “efficient” than a state route