
Heuristics



Presentation Transcript


  1. Heuristics CPSC 386 Artificial Intelligence Ellen Walker Hiram College

  2. Informed Search Strategies • Also called heuristic search • All are variations of best-first search • The next node to expand is the one “most likely” to lead to a solution • Priority queue, like uniform cost search, but priority is based on additional knowledge of the problem • The priority function for the priority queue is usually called f(n)

  3. Heuristic Function • Heuristic, from the Greek heuriskein, “to find” or “to discover” • Heuristic function, h(n) = estimated cost from the current state to the goal • Therefore, our best estimate of total path cost is f(n) = g(n) + h(n) • Recall, g(n) is the cost from the initial state to the current state
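The f(n) = g(n) + h(n) evaluation above drives A* directly: a priority queue ordered by f, as described on slide 2. A minimal sketch in Python — the names `neighbors` (yielding (successor, step_cost) pairs) and `h` are illustrative assumptions, not from the slides:

```python
import heapq

def a_star(start, goal, neighbors, h):
    """A* search: always expand the frontier node with the lowest
    f(n) = g(n) + h(n), where g is cost so far and h estimates cost to go."""
    frontier = [(h(start), 0, start, [start])]   # (f, g, state, path)
    best_g = {start: 0}                          # cheapest g found per state
    while frontier:
        f, g, state, path = heapq.heappop(frontier)
        if state == goal:
            return path, g
        if g > best_g.get(state, float("inf")):
            continue                             # stale queue entry, skip it
        for nxt, step_cost in neighbors(state):
            g2 = g + step_cost
            if g2 < best_g.get(nxt, float("inf")):
                best_g[nxt] = g2
                heapq.heappush(frontier, (g2 + h(nxt), g2, nxt, path + [nxt]))
    return None, float("inf")
```

With an admissible h (see slide 4), the first time the goal is popped its path cost is optimal.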

  4. In A*, better h means better search • When h = cost to the goal, • Only nodes on correct path are expanded • Optimal solution is found • When h < cost to the goal, • Additional nodes are expanded • Optimal solution is found • When h > cost to the goal • Optimal solution can be overlooked

  5. Pruning the Search Tree • In A* search, if h is too big, it will prevent the node (and its successors, grand-successors, etc.) from ever being expanded • This is called “pruning” (like removing branches from a tree) • Pruning the tree reduces the search below exponential • Only if a good heuristic is available

  6. Costs of A* • Time • The better the heuristic, the less time • Best case: h is perfect, O(d) • Worst admissible case: h is 0, O(b^d), i.e. uniform-cost search (breadth-first when step costs are equal) • Space • All nodes (open and closed list) are saved in case of repetition • This is exponential (O(b^d) or worse) • A* generally runs out of space before it runs out of time

  7. Memory-bounded Heuristic Search • Iterative Deepening A* (IDA*) • Like iterative deepening, but cut off when (g+h) > max, rather than depth > max • At each iteration, the new cutoff is the smallest f-cost that exceeded the cutoff on the previous iteration • Recursive best-first search (RBFS) (see textbook, fig. 4.5) • Simple Memory Bounded A* (SMA*) • Set a max memory bound • If memory is “full”, to add a node drop the worst (g+h) node that’s already stored • Expands newest best leaf, deletes oldest worst leaf
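The IDA* idea on this slide — depth-first search cut off when g+h exceeds a bound, with the next bound taken from the smallest f-cost that exceeded it — can be sketched as follows (function and parameter names are illustrative; `neighbors` yields (successor, step_cost) pairs):

```python
def ida_star(start, goal, neighbors, h):
    """IDA*: depth-first search cut off when g + h exceeds the current bound;
    each new bound is the smallest f-cost that exceeded the previous one."""
    def dfs(path, g, bound):
        state = path[-1]
        f = g + h(state)
        if f > bound:
            return f                  # over the cutoff: candidate next bound
        if state == goal:
            return path
        smallest = float("inf")
        for nxt, step_cost in neighbors(state):
            if nxt in path:           # avoid cycles along the current path
                continue
            result = dfs(path + [nxt], g + step_cost, bound)
            if isinstance(result, list):
                return result         # goal found below this node
            smallest = min(smallest, result)
        return smallest

    bound = h(start)
    while True:
        result = dfs([start], 0, bound)
        if isinstance(result, list):
            return result
        if result == float("inf"):
            return None               # no solution
        bound = result
```

Memory use is only the current depth-first path, which is the point of the technique.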

  8. Backed-up Values • The (real) f-value of any node on an optimal path is the same as the f-value of the solution • Therefore, you can update f of a parent to the best f of its children (this also helps when revisiting a node from a different parent) • If you have to “forget” deeper nodes, their consequences are remembered in the parent • (This concept is used more prominently in adversarial games)

  9. Comparing Heuristic Functions • An admissible heuristic function never overestimates the distance to the goal • The function h = 0 is the least useful admissible function • Given 2 admissible heuristic functions h1 and h2, h1 dominates h2 if h1(n) ≥ h2(n) for every node n • The perfect h function dominates all other admissible heuristic functions • Dominant admissible heuristic functions are better

  10. Combining Heuristic Functions • Every admissible heuristic is <= the actual distance to goal • Therefore, if you have 2 admissible heuristics, the higher value is closer to the goal. • If you have 2 or more heuristics, you can therefore combine them into a better one by taking the maximum value for any state. • Useful when you have a set of heuristics where no one is dominant
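Taking the pointwise maximum, as described above, is a one-liner; a sketch (the name `combine` is mine):

```python
def combine(*heuristics):
    """Pointwise max of admissible heuristics: still admissible (each stays
    <= the true cost to the goal), and it dominates every h in the set."""
    return lambda state: max(h(state) for h in heuristics)
```

This is exactly the case on the slide where no single heuristic dominates: the combined function is at least as informed as each input on every state.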

  11. Finding Heuristic Functions: Relaxed Problems • Remove constraints from the original problem to generate a “relaxed problem” • The cost of the optimal solution to the relaxed problem is an admissible heuristic for the original problem • Because any solution to the original problem also solves the relaxed problem (at a cost ≥ the relaxed solution cost)

  12. 8-puzzle examples • Number of tiles out of place • Relax constraint that tiles must move into empty squares, and that tiles must move into adjacent squares • Manhattan distance to solution • Relax (only) constraint that tiles must move into empty squares
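The two 8-puzzle heuristics above can be written directly; a sketch assuming states are 9-tuples in row-major order with 0 for the blank (that encoding is my assumption, not from the slides):

```python
# Goal configuration assumed for illustration: tiles 1-8 in order, blank last.
GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)

def misplaced(state, goal=GOAL):
    """h1: number of tiles out of place (the blank is not counted)."""
    return sum(1 for s, g in zip(state, goal) if s != 0 and s != g)

def manhattan(state, goal=GOAL):
    """h2: sum of Manhattan (row + column) distances of each tile from its
    goal square; dominates h1 since each misplaced tile contributes >= 1."""
    goal_pos = {tile: (i // 3, i % 3) for i, tile in enumerate(goal)}
    total = 0
    for i, tile in enumerate(state):
        if tile == 0:
            continue
        r, c = i // 3, i % 3
        gr, gc = goal_pos[tile]
        total += abs(r - gr) + abs(c - gc)
    return total
```

Both are admissible because each is the exact optimal cost of the corresponding relaxed problem from the previous slide.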

  13. Finding Heuristic Functions: Subproblems • Consider solving only part of the problem • Example: getting tiles 1, 2, 3, and 4 of the 8-puzzle into place • Again, exact solutions to subproblems are admissible heuristics • Store subproblem solutions in a pattern database, then look up the heuristic • # patterns is much smaller than the state space! • Generate the database by working backwards from the solution • If multiple subproblems apply, take the max • If multiple disjoint subproblems apply, heuristics can be added

  14. Finding Heuristic Functions: Learning • Take experience and learn a function • Each “experience” is a start state and the actual cost of the solution • Learn from “features” of a state that are relevant to a solution, rather than the state itself (helps generalization) • Generate “many” states with a given feature and determine average distance • Combine information from multiple features • h(n) = c1 * x1(n) + c2 * x2(n)… where x1, x2 are features
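The weighted-feature form h(n) = c1·x1(n) + c2·x2(n) + … on this slide is just a dot product of feature values and learned coefficients; a sketch where the weights are simply given (in practice they would be fit from experiences, e.g. by linear regression — the name `learned_h` is mine):

```python
def learned_h(features, weights):
    """h(n) = c1*x1(n) + c2*x2(n) + ...: a weighted sum of state features.
    features: callables extracting a number from a state; weights: the
    learned coefficients c1, c2, ... (fixed here for illustration)."""
    return lambda state: sum(w * x(state) for w, x in zip(weights, features))
```

Note such a learned h is not guaranteed admissible, so A* with it may lose its optimality guarantee.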

  15. Local Search Algorithms • Instead of considering the whole state space, consider only the current state • Limits necessary memory; paths not retained • Amenable to large or continuous (infinite) state spaces where exhaustive algorithms aren’t possible • Local search algorithms can’t backtrack!

  16. Optimization • Given a measure of goodness (of fit) • Find optimal parameters (e.g., correspondences) • That maximize the goodness measure (or minimize a badness measure) • Optimization techniques • Direct (closed-form) • Search (generate-test) • Heuristic search (e.g., hill climbing) • Genetic algorithms

  17. Direct Optimization • The slope of a function at a maximum or minimum is 0 • The function is neither growing nor shrinking • True at global, but also local, extreme points • Find where the slope is zero and you find extrema! • (If you have the equation, use calculus (first derivative = 0), but watch out for “shoulders”)

  18. Hill Climbing • Consider all possible successors as “one step” from the current state on the landscape. • At each iteration, go to • The best successor (steepest ascent) • Any uphill move (first choice) • Any uphill move but steeper is more probable (stochastic) • All variations get stuck at local maxima
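The steepest-ascent and first-choice variants above can be sketched in a few lines (names are illustrative; `neighbors` returns the successor states, `value` is the objective to maximize):

```python
import random

def hill_climb(state, neighbors, value, variant="steepest"):
    """Hill climbing: keep moving to a strictly better successor until none
    exists. 'steepest' takes the best uphill move; 'first' takes any uphill
    move at random. Both stop at local maxima and on plateaus, where no
    successor is strictly better."""
    while True:
        uphill = [s for s in neighbors(state) if value(s) > value(state)]
        if not uphill:
            return state              # local maximum (possibly not global)
        if variant == "steepest":
            state = max(uphill, key=value)
        else:
            state = random.choice(uphill)
```

Random restart (next slide) would just call this from many random initial states and keep the best result.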

  19. Issues in Hill Climbing • Local maxima = no uphill step • Algorithms on previous slide fail (not complete) • Allow “random restart” which is complete, but might take a very long time • Plateau = all steps equal (flat or shoulder) • Must move to equal state to make progress, but no indication of the correct direction • Ridge = narrow path of maxima, but might have to go down to go up (e.g. diagonal ridge in 4-direction space)

  20. Simulated Annealing • Figure 4.14, simulate gradual cooling to low-energy crystalline state • Algorithm is randomized: take a step if random number is less than a value based on both the objective function and the Temperature. • When Temperature is high, chance of going toward a higher value of optimization function J(x) is greater. • Note higher dimension: “perturb parameter vector” vs. “look at next and previous value”.
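The acceptance rule described above — always take uphill moves, take downhill moves with a probability that shrinks as the temperature falls — can be sketched as follows. The exp(delta/T) acceptance probability is the standard Boltzmann form, and the geometric cooling schedule is my assumption (the slide does not specify one):

```python
import math
import random

def simulated_annealing(state, neighbor, value, t0=10.0, cooling=0.95, steps=2000):
    """Maximize value(state): always accept uphill moves; accept a downhill
    move with probability exp(delta / T), so bad moves are common while the
    temperature T is high and rare once it has cooled."""
    t = t0
    best = state
    for _ in range(steps):
        candidate = neighbor(state)   # "perturb parameter vector"
        delta = value(candidate) - value(state)
        if delta > 0 or random.random() < math.exp(delta / t):
            state = candidate
        if value(state) > value(best):
            best = state
        t = max(t * cooling, 1e-12)   # geometric cooling (an assumed schedule)
    return best
```

Early on this behaves like a random walk that can escape local maxima; late in the run it behaves like hill climbing.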

  21. Local Beam Search • Keep track of K local searches at once • At each step, generate all successors and keep the best K • (Localized version of memory-bounded A*) • Stochastic: choose K states at random, but probability of state being chosen is proportional to its goodness

  22. Genetic Algorithm • Quicker but randomized searching for an optimal parameter vector • Operations • Crossover (2 parents -> 2 children) • Mutation (one “bit”) • Basic structure • Create population • Perform crossover & mutation (on fittest) • Keep only fittest children

  23. Example: “Hello, World” • Initial population is 2048 random strings of length 12 • Fitness of an individual is calculated by comparing each letter to its corresponding letter in the target phrase and adding up the differences • Top 10% of population is retained, remaining 90% is created by crossover of top 50% of population with 25% chance of mutation • Crossover: choose a random position and swap substrings • Mutation: choose a random position and replace by a random character Source: http://generation5.org/content/2003/gahelloworld.asp

  24. Crossover and Mutation • Crossover • Parents: “Habxcq, oorld” and “Yellav,adjfd” • Children: “Hablav, adjfd” and “Yelxcq, oorld” • Mutation • Before: “Habxcq, oorld” • After: “Habxrq, oorld”
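The "Hello, World" scheme of slides 23-24 — random strings, fitness as summed character differences, top-10% elitism, crossover among the top 50%, 25% mutation — can be sketched as below. This is a scaled-down version (smaller population and run than the slide's 2048 strings, so it finishes quickly), and the printable-ASCII alphabet is my assumption:

```python
import random

TARGET = "Hello, World"
CHARS = [chr(c) for c in range(32, 127)]   # printable ASCII (an assumption)

def fitness(s):
    """Sum of character-code differences from the target; 0 means a match
    (lower is fitter, matching the slide's 'adding up the differences')."""
    return sum(abs(ord(a) - ord(b)) for a, b in zip(s, TARGET))

def crossover(p1, p2):
    """Choose a random cut position and splice the parents' substrings."""
    cut = random.randrange(len(TARGET))
    return p1[:cut] + p2[cut:]

def mutate(s, rate=0.25):
    """With probability rate, replace one random position by a random character."""
    if random.random() < rate:
        i = random.randrange(len(s))
        s = s[:i] + random.choice(CHARS) + s[i + 1:]
    return s

def evolve(pop_size=200, max_gens=300):
    """Create a random population, then repeat: keep the elite, breed the rest."""
    pop = ["".join(random.choice(CHARS) for _ in TARGET) for _ in range(pop_size)]
    for gen in range(max_gens):
        pop.sort(key=fitness)
        if fitness(pop[0]) == 0:
            return pop[0], gen
        elite = pop[:pop_size // 10]       # top 10% retained unchanged
        parents = pop[:pop_size // 2]      # top 50% may reproduce
        pop = elite + [mutate(crossover(*random.sample(parents, 2)))
                       for _ in range(pop_size - len(elite))]
    return pop[0], max_gens
```

Because the elite survive unchanged, the best fitness never gets worse from one generation to the next, which is the "worse children don't last long" point on the next slide.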

  25. Genetic Algorithm: Why does it work? • Children carry parts of their parents’ data • Only “good” parents can reproduce • Children are at least as “good” as parents? • No, but “worse” children don’t last long • Large population allows many “current points” in search • Can consider several regions (watersheds) at once

  26. Genetic Algorithm: Issues & Pitfalls • Representation • Children (after crossover) should be similar to parent, not random • Binary representation of numbers isn’t good - what happens when you crossover in the middle of a number? • Need “reasonable” breakpoints for crossover (e.g. between R, xcenter and ycenter but not within them) • “Cover” • Population should be large enough to “cover” the range of possibilities • Information shouldn’t be lost too soon • Mutation helps with this issue

  27. Experimenting With Genetic Algorithms • Be sure you have a reasonable “goodness” criterion • Choose a good representation (including methods for crossover and mutation) • Generate a sufficiently random, large enough population • Run the algorithm “long enough” • Find the “winners” among the population • Variations: multiple populations, keeping vs. not keeping parents, “immigration / emigration”, mutation rate, etc.

  28. Summary: Search Techniques • Exhaustive • Depth-first, Breadth First • Uniform cost • Iterative Deepening • Best-first (heuristic) • Greedy • A* • Memory-bounded (beam, mbA*) • Local heuristic • Hill-climbing (steepest, any upward, random restart) • Simulated annealing (stochastic) • Genetic Algorithm (highly parallel, stochastic)
