
CS 416 Artificial Intelligence



  1. CS 416 Artificial Intelligence Lecture 6: Informed Searches

  2. A sad day • No more Western Union telegrams • 1844 - First telegram (Morse) “What hath God wrought” • 1858 - First transatlantic telegram from Queen Victoria to President Buchanan • Break, break, break… 1866 working again • A few words per minute • Punctuation cost extra. “stop” was cheaper.

  3. Assignment 1 • Getting Visual Studio • Signing up for Thursday (3-5) or Friday (2-3:30) • Explanation of IsNormal()

  4. A* without admissibility • [figure: a small search graph over nodes A, B, C, D with edge costs and heuristic estimates; the heuristic at C (200) grossly overestimates the remaining cost, so the path through C is never explored]

  5. A* with admissibility • [figure: the same graph, but with an admissible heuristic at C (1 instead of 200), so the path through C is now considered]

  6. Another A* without admissibility • [figure: a similar four-node graph with a different non-admissible heuristic; again, a path is never explored]

  7. Admissible w/o Consistency • [figure: a four-node graph whose heuristic is admissible but not consistent, i.e. it can drop by more than the step cost along an edge]

  8. Meta-foo • What does “meta” mean in AI? • Frequently it means to step back a level from foo • Metareasoning = reasoning about reasoning • These informed search algorithms have pros and cons regarding how they choose to explore new levels • A metalevel learning algorithm may learn how to combine search techniques to suit the application domain

  9. Heuristic Functions • 8-puzzle problem • Average solution depth = 22 • Branching factor ≈ 3 • An exhaustive search to depth 22 would examine about 3^22 ≈ 3.1 × 10^10 states • Only 9!/2 = 181,440 distinct states are reachable, so repeated states account for a factor of roughly 170,000

  10. Heuristics • h1: the number of misplaced tiles • Admissible because at least n moves are required to fix n misplaced tiles • h2: the sum of the distances from each tile to its goal position • No diagonal moves, so use Manhattan distance • As if walking around rectilinear city blocks • Also admissible
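As a concrete illustration (my own encoding, not from the slides), both heuristics for an 8-puzzle state represented as a 9-tuple read row by row, with 0 marking the blank:

    # A minimal sketch of the two 8-puzzle heuristics.
    GOAL = (0, 1, 2, 3, 4, 5, 6, 7, 8)  # assumed goal layout for illustration

    def h1_misplaced(state, goal=GOAL):
        """Number of tiles not in their goal position (blank excluded)."""
        return sum(1 for s, g in zip(state, goal) if s != 0 and s != g)

    def h2_manhattan(state, goal=GOAL):
        """Sum of Manhattan (city-block) distances of each tile from its goal square."""
        total = 0
        for idx, tile in enumerate(state):
            if tile == 0:
                continue
            goal_idx = goal.index(tile)
            total += abs(idx // 3 - goal_idx // 3) + abs(idx % 3 - goal_idx % 3)
        return total

    # Example: a state one move away from the goal (blank swapped with tile 1)
    print(h1_misplaced((1, 0, 2, 3, 4, 5, 6, 7, 8)))   # 1
    print(h2_manhattan((1, 0, 2, 3, 4, 5, 6, 7, 8)))   # 1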

  11. Compare these two heuristics • Effective Branching Factor, b* • If A* generates N nodes to find the goal at depth d • b* = the branching factor such that a uniform tree of depth d contains N+1 nodes (we add one for the root node that wasn’t included in N) • N + 1 = 1 + b* + (b*)^2 + … + (b*)^d

  12. Compare these two heuristics • Effective Branching Factor, b* • b* close to 1 is ideal • because this means the heuristic guided the A* search linearly • If b* were 100, on average, the heuristic had to consider 100 children for each node • Compare heuristics based on their b*
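A small sketch (my own illustration) of solving N + 1 = 1 + b* + (b*)^2 + … + (b*)^d numerically for b*, using simple bisection; for example, a goal found at depth 5 after generating 52 nodes gives b* of roughly 1.92:

    def effective_branching_factor(N, d, tol=1e-6):
        """Solve N + 1 = 1 + b + b^2 + ... + b^d for b by bisection."""
        def total_nodes(b):
            return sum(b ** i for i in range(d + 1))  # 1 + b + ... + b^d
        lo, hi = 1.0, float(N)          # b* lies between 1 and N
        while hi - lo > tol:
            mid = (lo + hi) / 2
            if total_nodes(mid) < N + 1:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    print(round(effective_branching_factor(52, 5), 2))   # about 1.92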

  13. Compare these two heuristics

  14. Compare these two heuristics • h2 is always better than h1 • for any node n, h2(n) >= h1(n) • h2 dominates h1 • Recall that all nodes with f(n) < C* will be expanded • This means all nodes with h(n) + g(n) < C* will be expanded, i.e., all nodes where h(n) < C* - g(n) • Every node h2 expands will also be expanded by h1, and because h1 is smaller, h1 will expand other nodes as well

  15. Inventing admissible heuristic functions • How can you create h(n)? • Simplify the problem by reducing restrictions on actions • e.g., allow 8-puzzle pieces to sit atop one another • Call this a relaxed problem • The cost of an optimal solution to the relaxed problem is an admissible heuristic for the original problem • because the original problem is at least as expensive to solve

  16. Examples of relaxed problems • Original rule: a tile can move from square A to square B if A is horizontally or vertically adjacent to B and B is blank • A tile can move from A to B if A is adjacent to B (overlap allowed) • A tile can move from A to B if B is blank (teleport) • A tile can move from A to B (teleport and overlap) • Solutions to these relaxed problems can be computed without search, so the heuristic is easy to compute

  17. Multiple Heuristics • If multiple admissible heuristics are available: • h(n) = max {h1(n), h2(n), …, hm(n)} • the max of admissible heuristics is itself admissible, and it dominates each component

  18. Use solution to subproblem as heuristic • What is the optimal cost of solving some portion of the original problem? • the subproblem’s optimal solution cost is an admissible heuristic for the original problem

  19. Pattern Databases • Store optimal solutions to subproblems in a database • Use an exhaustive search to solve every permutation of the 1,2,3,4-piece subproblem of the 8-puzzle • While solving the 8-puzzle, look up the optimal cost of solving the 1,2,3,4-piece subproblem and use it as the heuristic
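One way such a database could be built, sketched under my own assumptions about the encoding (tiles outside the 1-2-3-4 pattern abstracted to a wildcard, and every move counted):

    from collections import deque

    GOAL = (0, 1, 2, 3, 4, 5, 6, 7, 8)          # assumed goal layout
    PATTERN = {1, 2, 3, 4}

    def abstract(state):
        """Keep the blank and tiles 1-4; every other tile becomes a wildcard '*'."""
        return tuple(t if t == 0 or t in PATTERN else '*' for t in state)

    def neighbors(state):
        """All states reachable by sliding one adjacent tile into the blank."""
        i = state.index(0)
        r, c = divmod(i, 3)
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < 3 and 0 <= nc < 3:
                j = nr * 3 + nc
                s = list(state)
                s[i], s[j] = s[j], s[i]
                yield tuple(s)

    def build_pattern_db():
        """Breadth-first search outward from the abstracted goal.  Moves are
        reversible, so the recorded depth is the optimal cost of placing the
        blank and tiles 1-4 from any pattern."""
        start = abstract(GOAL)
        db = {start: 0}
        frontier = deque([start])
        while frontier:
            s = frontier.popleft()
            for n in neighbors(s):
                if n not in db:
                    db[n] = db[s] + 1
                    frontier.append(n)
        return db

    DB = build_pattern_db()

    def pdb_heuristic(state):
        """Look up the stored optimal cost for the state's 1-2-3-4 pattern."""
        return DB[abstract(state)]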

  20. Learning • Could also build the pattern database while solving cases of the 8-puzzle • Must keep track of intermediate states and the true final cost of the solution • Inductive learning builds a mapping of state -> cost • Because there are too many permutations of actual states, construct important features to reduce the size of the space

  21. Local Search Algorithms and Optimization Problems

  22. Characterize Techniques • Uninformed Search • Looking for a solution where solution is a path from start to goal • At each intermediate point along a path, we have no prediction of the future value of the path • Informed Search • Again, looking for a path from start to goal • This time we have more insight regarding the value of intermediate solutions

  23. Now change things a bit • What if the path isn’t important, just the goal? • So the goal state is initially unknown • The path to the goal need not be solved • Examples • What quantities of quarters, nickels, and dimes add up to $17.45 while minimizing the total number of coins? • Is the price of Microsoft stock going up tomorrow?

  24. Local Search • Local search does not keep track of previous solutions • Instead it keeps track of current solution (current state) • Uses a method of generating alternative solution candidates • Advantages • Use a small amount of memory (usually constant amount) • They can find reasonable (note we aren’t saying optimal) solutions in infinite search spaces

  25. Optimization Problems • Objective Function • A function with vector inputs and scalar output • the goal is to search through candidate input vectors in order to minimize or maximize the objective function • Example: f(q, d, n) = 1,000,000 if q*0.25 + d*0.1 + n*0.05 != 17.45; otherwise f(q, d, n) = q + n + d • minimize f
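A direct translation of this objective into code (working in cents to avoid floating-point trouble), followed by a brute-force minimization over the small discrete space; this is my own sketch, not the course's implementation:

    def f(q, d, n):
        """Objective: huge penalty unless the coins total $17.45; else count coins."""
        if 25 * q + 10 * d + 5 * n != 1745:
            return 1_000_000
        return q + d + n

    # Brute-force search over the discrete space (slow but simple)
    best = min(((q, d, n) for q in range(70)
                          for d in range(175)
                          for n in range(350)),
               key=lambda v: f(*v))
    print(best, f(*best))   # (69, 2, 0) -> 71 coins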

  26. Search Space • The realm of feasible input vectors • Also called state-space landscape • Usually described by • number of dimensions (3 for our change example) • domain of each dimension (#quarters is discrete from 0 to 69…) • functional relationship between input vector and objective function output • no relationship (chaos or seemingly random) • smoothly varying • discontinuities

  27. Search Space • Looking for global maximum (or minimum)

  28. Hill Climbing • Also called greedy local search • Select a starting point and set current • evaluate(current) • loop do • neighbor = highest-valued successor of current • if evaluate(neighbor) <= evaluate(current) • return current • else current = neighbor
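The loop above, sketched in Python under the assumption that the caller supplies evaluate (higher is better) and successors:

    def hill_climb(start, evaluate, successors):
        current = start
        while True:
            # Pick the highest-valued successor of the current state
            neighbor = max(successors(current), key=evaluate, default=None)
            if neighbor is None or evaluate(neighbor) <= evaluate(current):
                return current          # no uphill move available: local maximum
            current = neighbor

    # Toy example: climb x toward the peak of -(x - 3)^2
    peak = hill_climb(0, evaluate=lambda x: -(x - 3) ** 2,
                      successors=lambda x: [x - 1, x + 1])
    print(peak)   # 3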

  29. Hill climbing gets stuck • Hiking metaphor (you are wearing glasses that limit your vision to 10 feet) • Local maxima • Ridges (in cases when you can’t walk along the ridge) • Plateau • why is this a problem?

  30. Hill Climbing Gadgets • Variants on hill climbing play special roles • stochastic hill climbing • don’t always choose the best successor • first-choice hill climbing • pick the first good successor you find • useful if number of successors is large • random restart • follow steepest ascent from multiple starting states • probability of finding global max increases with number of starts
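Random restart, for instance, can be a thin wrapper around the hill_climb sketch above; random_state here is an assumed helper that returns a fresh random starting point:

    def random_restart_hill_climb(random_state, evaluate, successors, restarts=20):
        """Run hill climbing from several random starts and keep the best result."""
        best = None
        for _ in range(restarts):
            candidate = hill_climb(random_state(), evaluate, successors)
            if best is None or evaluate(candidate) > evaluate(best):
                best = candidate
        return best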

  31. Hill Climbing Usefulness • It Depends • Shape of state space greatly influences hill climbing • local maxima are the Achilles heel • what is cost of evaluation? • what is cost of finding a random starting location?

  32. Simulated Annealing • A term borrowed from metalworking • We want metal molecules to find a stable location relative to neighbors • heating causes metal molecules to jump around and to take on undesirable (high energy) locations • during cooling, molecules reduce their movement and settle into a more stable (low energy) position • annealing is process of heating metal and letting it cool slowly to lock in the stable locations of the molecules

  33. Simulated Annealing • “Be the Ball” • You have a wrinkled sheet of metal • Place a BB on the sheet and what happens? • BB rolls downhill • BB stops at the bottom of a hill (local or global min?) • BB momentum may carry it out of one basin and into another (local or global) • By shaking the metal sheet, you are adding energy (heat) • How hard do you shake?

  34. Our Simulated Annealing Algorithm • “You’re not being the ball, Danny” (Caddy Shack) • Gravity is great because it tells the ball which way is downhill at all times • We don’t have gravity, so how do we find a successor state? • Randomness • AKA Monte Carlo • AKA Stochastic

  35. Algorithm Outline • Select some initial guess of the evaluation function parameters, x • Evaluate the evaluation function: v = E(x) • Compute a random displacement: x' = x + Δx • The Monte Carlo event • Evaluate v' = E(x') • If v' < v, accept the new state: x = x' • Else accept it only with Prob(E, T) • This is the Metropolis step • Repeat with the updated state and temperature

  36. Metropolis Step • We approximate nature’s alignment of molecules by allowing uphill transitions with some probability • Prob(being in energy state E) ~ exp(-E / kT) • the Boltzmann probability distribution • Even when T is small, there is still a chance of being in a high energy state • Prob(transferring from E1 to E2) = exp(-(E2 - E1) / kT) • the Metropolis step • if E2 < E1, this prob() is greater than 1, so the transition is always made • if E2 > E1, we may still transfer to the higher energy state • The rate at which T is decreased, and the amount it is decreased by, is prescribed by an annealing schedule
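Putting the outline and the Metropolis step together, a compact sketch; the geometric cooling schedule and the helper names energy and random_neighbor are my own choices for illustration, not the course's:

    import math
    import random

    def simulated_annealing(start, energy, random_neighbor,
                            t_start=1.0, t_end=1e-3, cooling=0.995):
        current, e_current = start, energy(start)
        t = t_start
        while t > t_end:
            candidate = random_neighbor(current)        # the Monte Carlo event
            e_candidate = energy(candidate)
            delta = e_candidate - e_current
            # Metropolis step: always accept downhill; accept uphill with prob exp(-delta/T)
            if delta < 0 or random.random() < math.exp(-delta / t):
                current, e_current = candidate, e_candidate
            t *= cooling                                # annealing schedule
        return current

    # Example: minimize a bumpy 1-D energy function
    result = simulated_annealing(
        start=0.0,
        energy=lambda x: (x - 2) ** 2 + math.sin(5 * x),
        random_neighbor=lambda x: x + random.uniform(-0.5, 0.5),
    )
    print(round(result, 2))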

  37. What have we got? • Always move downhill if possible • Sometimes go uphill • More likely at the start, when T is high • Optimality guaranteed (in the limit) with a sufficiently slow annealing schedule • No need for a smooth search space • we do not need gradient or smoothness information about nearby successors • Can be a discrete search space • e.g., the traveling salesman problem • More info: Numerical Recipes in C (online), Chapter 10.9

  38. Local Beam Search • Keep more previous states in memory • Simulated annealing just kept one previous state in memory • This search keeps k states in memory • Generate k initial states • if any state is a goal, terminate • else, generate all successors and select the best k • repeat
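A brief sketch of this loop, assuming the caller provides is_goal, successors, and evaluate (higher is better):

    def local_beam_search(initial_states, is_goal, successors, evaluate, max_steps=1000):
        beam = list(initial_states)                      # the k states kept in memory
        for _ in range(max_steps):
            if any(is_goal(s) for s in beam):
                return next(s for s in beam if is_goal(s))
            # Pool the successors of all k states, then keep only the best k
            pool = [n for s in beam for n in successors(s)]
            if not pool:
                break
            beam = sorted(pool, key=evaluate, reverse=True)[:len(beam)]
        return max(beam, key=evaluate)                   # best state found if no goal reached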

  39. Isn’t this steepest ascent in parallel? • No: information is shared between the k search points • Each of the k states generates successors • The best k successors are selected from the combined pool • Some search points may contribute none of the best successors • One search point may contribute all k successors • “Come over here, the grass is greener” (Russell and Norvig) • If k independent searches were executed in parallel, unproductive search points would never be abandoned like this

  40. Beam Search • Premature termination of search paths? • Stochastic beam search • Instead of choosing the best k successors • choose k successors at random, with probability proportional to their value
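The stochastic selection step might look like this sketch, where successors are sampled (with replacement) with probability proportional to their values, assumed non-negative:

    import random

    def stochastic_select(successor_states, evaluate, k):
        """Sample k successors, weighted by their (non-negative) evaluation."""
        weights = [evaluate(s) for s in successor_states]   # fitness-proportional weights
        return random.choices(successor_states, weights=weights, k=k)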
