CS120: Lecture 9 MP Johnson Hunter mpjohnson@gmail.com
Agenda: Algs • Next topic: algs • HW2 due Wed • MT Thursday • No unexcused absences • Next after that: JavaScript • Next week’s lab: JS & advanced HTML • For Tues: read handout on stable marriage • Defs • Complexity ftns • Probs: • Add nums • Mult nums • Gcd • Sorting • Binary search • Stable marriage • Prefix-sum • Poly eval • Towers of Hanoi • Order-of-magn calcs
Hist, motiv • Def: ordered set of precise instructions (executable steps) that defines a finite process • You write programs, which implement some alg • Etym: Persian mathematician al-Khowarizmi • Arabic al-Khuwarizmi → Medieval Latin algorismus → Middle English algorisme → algorithm • Also: the title of his book gave us “algebra”
def • Alg: idea behind program • seq of steps for solving some prob • finite • well defined • each step tractable • (step 3 can’t be “compute Pi exactly” or “find meaning of life”) • alg is a mapping: • A: {instances} → {outputs} • outputs could be bits/decisions: prime or not? or larger structure: position of number; sorted seq • intuition: what stays the same when prog is ported from C to J to Basic to Assembly
To be an alg • Each step must be unambiguous • Can be converted to machine instructions • Each step must be doable • Can’t say “calc Pi exactly” or “print meaning of life” • Must be finite • Description is finite • For each poss. input, must eventually stop
Alg goals • ideally, alg should be: • Correct • efficient • or at least optimal • each can be difficult • sometimes: trade-off between difficulty of each • easy-to-understand alg may be slow • fast alg may be difficult to understand (or to find!) • not really surprising: “obvious” soln often very slow
To be a good alg • Also: want algorithm to be fast • In a precise sense… • Should scale well on different inputs • Shouldn’t take too much storage space • Should be easy to understand/code
Alg v. program • A program implements an alg • Program v. alg v. process • Book analogy: • Story:book :: alg:program • Story can be translated between langs, printed in hardcover or paperback • Alg can be translated between langs, run on Win or Mac • The alg is what stays the same when it’s ported
Algs you already know • Dial your phone number • Combination lock • Cooking • Add numbers in memory: • Fixed-length alg • Addition itself (of bitstrings) • Length depends on bitstring size
Algs you already know 6. Long addition, etc. (base-10 ops): (1) Let rightmost column be current (2) Add the digits in the current col, plus any carry (3) Write the ones digit of the result as the answer for that col (4) If result > 9, carry 1 into the next column to the left (5) Move “current column” one to the left; if no columns remain, write any leftover carry and stop (6) Go to step 2 7. Processor’s alg: (1) Fetch an instruction (2) Decode it (3) Execute it (4) Go to step 1
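The long-addition steps can be sketched in Python as follows. This is a minimal illustration, not code from the lecture: the function name `long_add` and the choice to pass numbers as digit strings are assumptions made for the example.

```python
def long_add(a: str, b: str) -> str:
    """Add two non-negative base-10 numbers given as digit strings.

    Hypothetical helper: mirrors the column-by-column method by hand.
    """
    i, j = len(a) - 1, len(b) - 1      # start at the rightmost column
    carry = 0
    out = []                           # answer digits, rightmost first
    while i >= 0 or j >= 0 or carry:
        da = int(a[i]) if i >= 0 else 0
        db = int(b[j]) if j >= 0 else 0
        total = da + db + carry        # add the digits in the current column
        out.append(str(total % 10))    # write the ones digit for this column
        carry = total // 10            # carry 1 if the column sum exceeds 9
        i -= 1                         # move the current column to the left
        j -= 1
    return "".join(reversed(out)) or "0"


print(long_add("478", "659"))
```

The loop runs once per column, so the number of steps grows with the length of the inputs, which is the sense in which this alg's length "depends on size".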
Problem-solving • Programming can be hard (fixing bugs) • But deciding what to program (finding the alg) can often be harder • Given problem description, must find some (good) alg • No general-purpose method that always works • Trial&error, experience • Would like: an alg to produce alg, but there isn’t one
TSP: Hilary motiv • algs are designed to solve problems, in all cases • should work correctly (quickly) on all possible inputs • eg: say Hilary has to campaign in N cities • maybe all state caps • maybe a subset • Hilary needs to save gas • no Air Force One • needs optimal route to: visit each stop, return to starting place • Washington? Boston? • Hilary can look at the map • use geometric reasoning • intuition based on experience • very sophisticated • hard – see an ML class • Hilary’s intuition may work for small cases • for larger ones, very likely will be highly suboptimal • we want an alg: guaranteed to always return a best route • may not be unique
TSP: Hilary motiv 1. Where does she start? • NYC 2. Now what? • Obvious choice: jump to closest city • Say, Bos 3. Now what? • Again, jump to closest (unvisited) city 4. Thus alg: • Pick arbitrary p0; set i = 1; while (i < n) { set p_i = closest unvisited city to p_(i-1); set i = i + 1 }; finally return to p0
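A minimal Python sketch of this "jump to the closest unvisited city" rule. The representation of cities as (x, y) points, the function names `greedy_tour` and `tour_length`, and the sample coordinates are all assumptions made for illustration.

```python
import math


def tour_length(cities, order):
    """Length of the closed tour that visits cities in `order` and returns home."""
    n = len(order)
    return sum(
        math.dist(cities[order[k]], cities[order[(k + 1) % n]])
        for k in range(n)
    )


def greedy_tour(cities, start=0):
    """Nearest-neighbour tour: always move to the closest unvisited city."""
    order = [start]
    unvisited = set(range(len(cities))) - {start}
    while unvisited:
        last = cities[order[-1]]
        # Ties are broken by the smaller city index so the result is deterministic.
        nxt = min(unvisited, key=lambda c: (math.dist(last, cities[c]), c))
        order.append(nxt)
        unvisited.remove(nxt)
    return order


# Hypothetical coordinates, not real city locations.
cities = [(0, 0), (1, 0), (5, 0), (2, 0)]
order = greedy_tour(cities)
print(order, tour_length(cities, order))
```

Each step only scans the cities that remain, so the whole tour takes on the order of n^2 distance computations.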
TSP: Greedy alg • We’ve stumbled onto the “greedy algorithm” for this prob • many other probs have “greedy algs”: alg does maximal optimization at each step • analogy: short-term self-interest v. long-term self-interest • this greedy alg has much to recommend it: • very easy to understand • very easy to code up • pretty efficient • for each next move, just choose among those remaining • only problem: it’s wrong • i.e., the resulting path won’t always be the shortest one possible
TSP: Greedy alg • consider this case: (sequence of dots) • greedy produces this • clearly optimal is that • eg: • NYC→Bos 215 • NYC→DC 232 4. the problem isn’t limited to cities all in a line • Prob also occurs for T-shaped layouts • Anytime something like this appears as a subgraph • just one example 5. what happened? Too short-sighted • cities that were next to each other were separated on the tour • should have been kept together
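To see the failure concretely, here is a small sketch with cities on a number line. The point set is a hypothetical example chosen so that greedy keeps bouncing between the two sides; the function names are also made up for this illustration. On a line, the best closed tour sweeps to one end and back, so its length is twice the span.

```python
def greedy_line_tour(points, start):
    """Nearest-neighbour tour over points on a number line."""
    order = [start]
    remaining = [p for p in points if p != start]
    while remaining:
        # Closest remaining point; ties go to the smaller coordinate.
        nxt = min(remaining, key=lambda p: (abs(p - order[-1]), p))
        order.append(nxt)
        remaining.remove(nxt)
    return order


def closed_length(order):
    """Total distance travelled, including the return to the start."""
    n = len(order)
    return sum(abs(order[k] - order[(k + 1) % n]) for k in range(n))


points = [0, 1, -2, 5, -10]                      # hypothetical city positions
greedy = greedy_line_tour(points, start=0)
greedy_cost = closed_length(greedy)
optimal_cost = 2 * (max(points) - min(points))   # sweep to one end, then back

print(greedy, greedy_cost, optimal_cost)
```

Greedy zigzags across the start point, separating neighbours that a sweep would have visited together, and ends up with a longer tour than the sweep.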
TSP: Joining pairs • idea: join together near pairs, then do greedy on results • maybe this will work? • Applying this strategy to the line of dots produces the optimal tour • What about this eg: grid of dots • wider than tall • suboptimal again! • Clearly better tours exist: … • This alg didn’t work either • Intuition: writing algorithms = raising children
TSP: Brute force • So what does work? • Let’s slow down • Recall: wanted best ordering • How many orderings? Only so many • Following must work: run through all possible orderings, measure each one • Code (cost[i] = length of the i-th ordering, m = number of orderings): • best = -1; i = 0; while (i < m) { if (best == -1 or cost[i] < cost[best]) best = i; i = i + 1 }; return best
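A runnable sketch of the brute-force idea, assuming cities are (x, y) points. The name `brute_force_tour` is hypothetical. One liberty is taken relative to the slide: since a round trip can be started anywhere, the sketch pins the first city and permutes only the rest, which checks (n-1)! orderings instead of n!.

```python
import itertools
import math


def brute_force_tour(cities):
    """Try every ordering of the cities and keep the shortest closed tour."""
    n = len(cities)
    best_order, best_cost = None, math.inf
    # City 0 is fixed as the start; every ordering of the others is measured.
    for rest in itertools.permutations(range(1, n)):
        order = (0,) + rest
        cost = sum(
            math.dist(cities[order[k]], cities[order[(k + 1) % n]])
            for k in range(n)
        )
        if cost < best_cost:
            best_order, best_cost = order, cost
    return best_order, best_cost


# Hypothetical example: corners of a unit square, listed out of order.
square = [(0, 0), (1, 1), (0, 1), (1, 0)]
print(brute_force_tour(square))
```

This is only practical for a handful of cities, because the number of orderings explodes as n grows.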
TSP: Brute force 1. Correct? • yes, must be 2. Efficient? • NO, very inefficient • number of steps roughly n! • very bad: 5! = 120, 10! = 3,628,800, 20! ≈ 2.4 × 10^18, 33! ≈ 8.7 × 10^36 3. Q: can we do better? • Well, somewhat: basically clever software engineering, using info from special cases, clever coding 4. But no essentially faster alg • There’s no fast alg (so far as anyone knows!) • No proof of its nonexistence, but would be a huge surprise • In other cases, we’ll have better luck
Measuring running time • Table (columns: n, n^2, 2^n; rows: n = 5, 10, 20) • Growth rates, slowest to fastest: • log n • n • n^2 • n^k • 2^n • n! • O(f(n)) – “big O”, Θ(f(n)) – theta • Big-O matters, coeffs don’t
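The growth rates can be tabulated for the three sizes on the slide with a short script. This is an illustrative sketch; `growth_row` is a made-up helper name, and the base-2 log is an assumption (any base differs only by a constant factor).

```python
import math


def growth_row(n):
    """Values of the common complexity functions at input size n."""
    return {
        "n": n,
        "log2 n": math.log2(n),
        "n^2": n ** 2,
        "2^n": 2 ** n,
        "n!": math.factorial(n),
    }


for n in (5, 10, 20):
    row = growth_row(n)
    print(f"{row['n']:>3} {row['log2 n']:>6.2f} {row['n^2']:>5} "
          f"{row['2^n']:>9} {row['n!']:>20}")
```

Doubling n from 10 to 20 multiplies n^2 by 4, but multiplies 2^n by 1024 and n! by far more, which is why the shape of the function matters more than its coefficients.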
Log review • def: log_b(x) = y ⇔ b^y = x • intuit: y is the number of b’s that, multiplied together, yield x • i.e., log is the inverse of the exponentiation function • log2(x) = y ⇔ 2^y = x • 2^10 = 1024 = 1K, so log2(1024) = 10 • 2^8 = 256 … • 2^16 = 64K • 2^20 = 1M • 2^30 = 1G
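The "inverse of exponentiation" view can be checked by counting halvings. A minimal sketch; `log2_floor` is a hypothetical name, and it returns the floor of log2(x) for positive integers.

```python
def log2_floor(x: int) -> int:
    """Number of times x can be halved (integer division) before reaching 1."""
    if x < 1:
        raise ValueError("x must be a positive integer")
    count = 0
    while x > 1:
        x //= 2        # undo one multiplication by 2
        count += 1
    return count


for power in (8, 10, 16, 20, 30):
    print(power, 2 ** power, log2_floor(2 ** power))
```

For exact powers of two the count equals the exponent; for other inputs it rounds down.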
Search • Sequential search • Alg? • Cmplx? • Binary search • Alg? • Analogy: phonebook • Cmplx? • Q: how many times can n be divided by 2? • n | log n: 1024 | …
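One possible answer to the two "Alg?" questions, sketched in Python with a counter so the costs can be compared. The function names and the comparison counters are additions for illustration; binary search assumes the input list is already sorted, like a phonebook.

```python
def sequential_search(items, target):
    """Scan left to right. Returns (index or -1, number of items examined)."""
    examined = 0
    for index, value in enumerate(items):
        examined += 1
        if value == target:
            return index, examined
    return -1, examined


def binary_search(sorted_items, target):
    """Halve the search range each step. Returns (index or -1, number of probes)."""
    lo, hi = 0, len(sorted_items) - 1
    probes = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1
        if sorted_items[mid] == target:
            return mid, probes
        if sorted_items[mid] < target:
            lo = mid + 1       # target can only be in the upper half
        else:
            hi = mid - 1       # target can only be in the lower half
    return -1, probes


data = list(range(1024))
print(sequential_search(data, 1023))
print(binary_search(data, 1023))
```

Sequential search may examine all n items, while binary search needs at most about log2(n) + 1 probes, because the range can only be halved that many times.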