390 likes | 403 Views
This lecture provides an overview of social algorithms, sorting and search algorithms, and the importance of analyzing algorithms. It discusses running time, functions and graphs, exponents and logarithms, and other mathematical concepts relevant to algorithm analysis.
E N D
i206: Lecture 6:Math Review, Begin Analysis of Algorithms Marti Hearst Spring 2012
Sorting algorithms Bubble sort Insertion sort Shell sort Merge sort Heapsort Quicksort Radix sort … Search algorithms Linear search Binary search Breadth first search Depth first search … Which Algorithm is Better?What do we mean by “Better”?
Analysis of Algorithms • Characterizing the running times of algorithms and data structure operations • Secondarily, characterizing the space usage of algorithms and data structures
Why Analysis of Algorithms? • To find out • How long an algorithm takes to run • How to compare different algorithms • This is done at a very abstract level • This can be done before code is written • Alternative: Performance analysis • Actually time each operation as the program is running • Specific to the machine and the implementation of the algorithm • Can only be done after code is written
Running Time • In general, running time increases with input size • Running time also affected by: • Hardware environment (processor, clock rate, memory, disk, etc.) • Software environment (operating system, programming language, compiler, interpreter, etc.) (n)
Functions, Graphs of Functions • Function: a rule that • Coverts inputs to outputs in a well-defined way. • This conversion process is often called mapping. • Input space called the domain • Output space called the range
Functions, Graphs of Functions • Function: a rule that • Coverts inputs to outputs in a well-defined way. • This conversion process is often called mapping. • Input space called the domain • Output space called the range • Examples • Mapping of speed of bicycling to calories burned • Domain: values for speed • Range: values for calories • Mapping of people to names • Domain: people • Range: Strings
Example: How many calories does bicycling burn? Miles/Hour vs. KiloCalories/Minute For a 150 lb rider. Green: riding on a gravel road with a mountain bike Blue: paved road riding a touring bicycle Red: racing bicyclist. From Whitt, F.R. & D. G. Wilson. 1982. Bicycling Science (second edition). http://www.frontier.iarc.uaf.edu/~cswingle/misc/exercise.phtml
Functions and Graphs of Functions • Notation • Many different kinds • f(x) is read “f of x” • This means a function called f takes x as an argument • f(x) = y • This means the function f takes x as input and produces y as output. • The rule for the mapping is hidden by the notation.
Here f(x) = 7x A point on this graph can be called (x,y) or (x, f(x)) http://www.sosmath.com/algebra/logs/log1/log1.html
A straight line is defined by the function y = ax + b a and b are constants x is variable Here y = 7x + 0 http://www.sosmath.com/algebra/logs/log1/log1.html
Exponents and Logarithms • Exponents: shorthand for multiplication • Logarithms: shorthand for exponents 23 = 8 can be expressed as log2 8 = 3 or we can say that the 'log with base 2 of 8' = 3. • How we use these? • Difficult computational problems grow exponentially • Logarithms are useful for “squashing” them
Logarithm lets you “grab the exponent” http://www.sosmath.com/algebra/logs/log1/log1.html
The exponential function f with base a is denoted by and x is any real number. Note how much more “quickly the graph grows” than the linear graph of f(x) = x Example: If the base is 2 and x = 4, the function value f(4) will equal 16. A corresponding point on the graph would be (4, 16). http://www.sosmath.com/algebra/logs/log1/log1.html
Logarithmic functions are the inverse of exponential functions. If (4, 16) is a point on the graph of an exponential function, then (16, 4) would be the corresponding point on the graph of the inverse logarithmic function. http://www.sosmath.com/algebra/logs/log1/log1.html
Zipf Distribution(linear and log scale) Illustration by Jacob Nielsen
Other Functions • Quadratic function: • This is a graph of: (2,0) (-2,0) http://www.sosmath.com/algebra/
Intuition behind n(n+1)/2 A visual proof that 1+2+3+...+n = n(n+1)/2 Count the number of dots in T(n) but, instead of summing the numbers 1, 2, 3, … up to n we will find the total using only one multiplication and one division! http://www.maths.surrey.ac.uk/hosted-sites/R.Knott/runsums/triNbProof.html
Intuition behind n(n+1)/2 • A visual proof that 1+2+3+...+n = n(n+1)/2 Notice that we get a rectangle which is has the same number of rows (4) but has one extra column (5) so the rectangle is 4 by 5 it therefore contains 4x5=20 balls but we took two copies of T(4) to get this so we must have 20/2 = 10 balls in T(4), which we can easily check. http://www.maths.surrey.ac.uk/hosted-sites/R.Knott/runsums/triNbProof.html
Intuition behind n(n+1)/2 • A visual proof that 1+2+3+...+n = n(n+1)/2 http://www.maths.surrey.ac.uk/hosted-sites/R.Knott/runsums/triNbProof.html
Intuition behind n(n+1)/2 • T(n) = 1 + 2 + 3 + ... + n = n(n + 1)/2 • The same proof using algebra: http://www.maths.surrey.ac.uk/hosted-sites/R.Knott/runsums/triNbProof.html
The Seven Common Functions • Constant • Linear • Quadratic • Cubic • Exponential • Logarithmic • n-log-n Polynomial
Function Pecking Order • In increasing order Adapted from Goodrich & Tamassia
Definition of Big-Oh A running time is O(g(n)) if there exist constants n0 > 0 and c > 0 such that for all problem sizes n > n0, the running time for a problem of size n is at most c(g(n)). In other words, c(g(n)) is an upper bound on the running time for sufficiently large n. c g(n) Example: the function f(n)=8n–2 is O(n) The running time of algorithm arrayMax is O(n) http://www.cs.dartmouth.edu/~farid/teaching/cs15/cs5/lectures/0519/0519.html
The Crossover Point One function starts out faster for small values of n. But for n > n0, the other function is always faster. Adapted from http://www.cs.sunysb.edu/~algorith/lectures-good/node2.html
More formally • Let f(n) and g(n) be functions mapping nonnegative integers to real numbers. • f(n) is (g(n)) if there exist positive constants n0 and c such that for all n>=n0,f(n) <= c*g(n) • Other ways to say this: f(n) is orderg(n) f(n) is big-Oh ofg(n) f(n) is Oh ofg(n) f(n) O(g(n)) (set notation)
Comparing Running Times Adapted from Goodrich & Tamassia
Analysis Example: Phonebook • Given: • A physical phone book • Organized in alphabetical order • A name you want to look up • An algorithm in which you search through the book sequentially, from first page to last • What is the order of: • The best case running time? • The worst case running time? • The average case running time? • What is: • A better algorithm? • The worst case running time for this algorithm?
Analysis Example (Phonebook) • This better algorithm is called Binary Search • What is its running time? • First you look in the middle of n elements • Then you look in the middle of n/2 = ½*n elements • Then you look in the middle of ½ * ½*n elements • … • Continue until there is only 1 element left • Say you did this m times: ½ * ½ * ½* …*n • Then the number of repetitions is the smallest integer m such that
Analyzing Binary Search • In the worst case, the number of repetitions is the smallest integer m such that • We can rewrite this as follows: Multiply both sides by Take the log of both sides Since m is the worst case time, the algorithm is O(logn)