Hill-Climbing Search and Dynamic Programming: Heuristics
Motivations • If a problem has a polynomial-time solution, traditional programming techniques are used to implement it. • Many AI problems, however, are exponential in nature, e.g., chess and the traveling salesman problem. • Breadth-first and depth-first search take exponential time on these problems. • They take too long to find the optimal (best) solution, except when the problem size is small. • How can we find a reasonable solution in reasonable time for these intractable problems? • Use heuristics, i.e., simple rules of thumb.
Objectives • What is a heuristic? • Heuristic measure for tic-tac-toe • Heuristic search strategy: hill climbing • Dynamic programming • Edit distance algorithm
10 opening rules from Exeter Chess Club • Get your pieces out into the centre quickly. • Race to control the centre. • Move minor pieces out first, not your queen or rooks, which can be attacked and lose time. • Get a firm foothold in the centre and don't give it up. • Move knights straightaway toward the centre. • Move your king to safety by castling on the king's side (which also gets your rook into play). • Complete your development before moving a piece twice or starting an attack. • Keep your queen safe. • Don't grab pawns or attack if you haven't completed development.
What are heuristics? • Informed guesses • Advice • Rules of thumb • Statements that are probably, or only sometimes, true • Sometimes they are just quick and dirty tricks • Sometimes they fail. Other times they allow you to find suboptimal solutions. • For AI problems, heuristics must be implementable on a computer. • If the advice does not provide enough detail to be implementable, then it is not called a heuristic.
Why heuristics? • Formal logical reasoning is not enough for the real world: intuitive reasoning is also needed. • The computational cost of an exact solution is prohibitive, e.g., for the traveling salesman problem or for playing the perfect chess game.
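State space: Brute-force approach (tic-tac-toe) • 1st ply: 9 choices • 2nd ply: 8 choices • Each path from root to leaf is a possible game. • At most 9! = 362,880 possible games. • The count grows factorially, O(n!), which is even faster than exponential O(2^n): combinatorial, not just astronomical.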
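The 9! bound above can be checked with a short sketch that walks the whole game tree. The names `has_winner` and `count_games` are illustrative, not from the slides, and the board encoding (a list of 9 cells, row by row) is an assumption.

```python
from math import factorial

# The eight winning lines on a 3x3 board whose cells are indexed 0..8 row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def has_winner(board):
    """True if some line holds three identical, non-empty marks."""
    return any(board[a] is not None and board[a] == board[b] == board[c]
               for a, b, c in LINES)


def count_games(board=None, player="x"):
    """Count root-to-leaf paths: a game ends on a win or a full board."""
    if board is None:
        board = [None] * 9
    if has_winner(board) or all(cell is not None for cell in board):
        return 1
    total = 0
    for i in range(9):
        if board[i] is None:
            board[i] = player                      # try the move
            total += count_games(board, "o" if player == "x" else "x")
            board[i] = None                        # undo it
    return total


if __name__ == "__main__":
    print("upper bound 9! =", factorial(9))
    print("games actually played out =", count_games())
```

The exhaustive count comes out below 9! because many games stop as soon as one player completes a line, before the board is full.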
State space reduced by symmetry Symmetry operations: rotate and/or flip
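A minimal sketch of the symmetry reduction, assuming a board is stored as a 9-character string (row by row, "." for empty). The helpers `rotate`, `flip`, `canonical`, and `distinct_moves` are hypothetical names introduced for illustration.

```python
def rotate(board):
    """Rotate a 3x3 board 90 degrees clockwise."""
    return "".join(board[3 * (2 - c) + r] for r in range(3) for c in range(3))


def flip(board):
    """Mirror the board left to right."""
    return "".join(board[3 * r + (2 - c)] for r in range(3) for c in range(3))


def canonical(board):
    """Pick one representative among the 8 rotated and/or flipped variants."""
    variants = []
    current = board
    for _ in range(4):
        variants.append(current)
        variants.append(flip(current))
        current = rotate(current)
    return min(variants)


def distinct_moves(board, player):
    """Successor boards for `player`, with symmetric duplicates merged."""
    return {canonical(board[:i] + player + board[i + 1:])
            for i, cell in enumerate(board) if cell == "."}


openings = distinct_moves(".........", "x")          # corner, edge, center
replies_to_center = distinct_moves("....x....", "o")  # corner or edge

if __name__ == "__main__":
    print(len(openings), "distinct openings,",
          len(replies_to_center), "distinct replies to a center opening")
```

Merging symmetric boards this way shrinks the 9 first moves to 3 essentially different ones, which is the kind of saving the slide refers to.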
Heuristically reduced state space • The strongest opening move is the center: it gives x the most winning opportunities. • After symmetry is considered, o has only 2 possible replies. • Use the same heuristic until the game is over.
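One plausible reading of "winning opportunities" is the number of winning lines through a square that the opponent has not blocked. The sketch below scores moves that way; the measure and the names `open_lines` and `best_move` are assumptions, since the original figure is not recoverable from the text.

```python
# The eight winning lines on a 3x3 board whose cells are indexed 0..8 row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def open_lines(board, square, player):
    """Count winning lines through `square` that hold no opponent mark."""
    opponent = "o" if player == "x" else "x"
    return sum(1 for line in LINES
               if square in line and all(board[i] != opponent for i in line))


def best_move(board, player):
    """Greedy choice: the empty square with the most open lines."""
    empty = [i for i, cell in enumerate(board) if cell == "."]
    return max(empty, key=lambda sq: open_lines(board, sq, player))


if __name__ == "__main__":
    start = "." * 9
    for name, sq in (("corner", 0), ("edge", 1), ("center", 4)):
        print(name, open_lines(start, sq, "x"))
    print("heuristic opening move:", best_move(start, "x"))
```

On an empty board this measure prefers the center, consistent with the slide's claim about the strongest opening move.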
Hill-climbing strategy • There are many heuristics for search. • One of the simplest is hill climbing: like climbing a mountain in thick fog. • Gradient ascent: go uphill along the steepest possible path until we can go no farther. • The general strategy is to expand the current state and select the best child for further expansion. • Halt the search when it reaches a state that is better than any of its children. • It is a greedy, short-sighted algorithm. • It keeps no history: it does not remember where it came from. • It sees only the current node and its children. It does not look ahead to the grandchildren in the state space.
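The strategy above can be sketched as a small generic loop. The function name `hill_climb` and the toy objective are illustrative assumptions; note that this version also halts on a plateau, where no child is strictly better.

```python
def hill_climb(start, neighbors, score):
    """Repeatedly move to the best child; stop when no child improves."""
    current = start
    while True:
        children = neighbors(current)          # expand the current state only
        if not children:
            return current
        best = max(children, key=score)        # greedy: best child, no look-ahead
        if score(best) <= score(current):      # current is at least as good
            return current
        current = best                         # no history is kept


def toy_score(x):
    """A single-peaked toy objective with its maximum at x = 3."""
    return -(x - 3) ** 2


def toy_neighbors(x):
    """Children of an integer state: one step left or right."""
    return [x - 1, x + 1]


peak = hill_climb(-10, toy_neighbors, toy_score)

if __name__ == "__main__":
    print("reached", peak)
```

With a single peak, the greedy climb reaches the maximum from any starting point.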
Problems with hill climbing • Major problem: a tendency to become stuck at local maxima. • If the algorithm reaches a local maximum, it halts. • Example (chess): sacrificing a piece to force checkmate later requires a move that temporarily worsens the position. • Example (8-puzzle): in order to move a particular tile to its destination, other tiles that are already in their goal positions have to be moved. Such a move may temporarily worsen the board state.
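A small sketch of the local-maximum problem on a one-dimensional landscape. The `heights` list and the function `climb` are made up for illustration.

```python
# A toy landscape with a local maximum (index 2) and a global one (index 6).
heights = [1, 3, 5, 4, 2, 6, 9, 7]


def climb(heights, start):
    """Hill climbing over list indices; neighbors are the adjacent positions."""
    i = start
    while True:
        neighbors = [j for j in (i - 1, i + 1) if 0 <= j < len(heights)]
        best = max(neighbors, key=lambda j: heights[j])
        if heights[best] <= heights[i]:
            return i                 # no neighbor is higher: halt here
        i = best


if __name__ == "__main__":
    for start in (0, 4):
        end = climb(heights, start)
        print(f"start {start} -> stops at index {end}, height {heights[end]}")
```

Starting on the left slope, the search halts on the lower peak because every step toward the higher one first goes downhill.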
Dynamic Programming • An example: Fibonacci sequence • F(0) = 1; F(1) = 1; F(n) = F(n-1) + F(n-2) • Keep track of the computed values F(n-1) and F(n-2), and reuse them to compute F(n). • Compared with naive recursion, which recomputes the same subproblems many times, DP is more efficient. • Like divide-and-conquer, but the problem is divided into multiple interacting, related subproblems. • Reuse subproblem solutions to get the solution for the larger problem. • Applications: • String matching • Spell checking • Natural language processing and understanding
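A minimal sketch contrasting naive recursion with the bottom-up version, using the slide's convention F(0) = F(1) = 1. The call counter is an illustrative addition to make the repeated work visible.

```python
def fib_naive(n, counter):
    """Plain recursion; counter[0] records how many calls were made."""
    counter[0] += 1
    if n < 2:
        return 1
    return fib_naive(n - 1, counter) + fib_naive(n - 2, counter)


def fib_dp(n):
    """Bottom-up: keep only F(n-2) and F(n-1) and reuse them."""
    if n < 2:
        return 1
    prev, curr = 1, 1
    for _ in range(2, n + 1):
        prev, curr = curr, prev + curr
    return curr


if __name__ == "__main__":
    calls = [0]
    print("naive:", fib_naive(10, calls), "using", calls[0], "calls")
    print("dp:   ", fib_dp(10), "using about 10 additions")
```

The recursive version makes a number of calls that grows about as fast as F(n) itself, while the DP loop does one addition per step.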
Spelling correction • Given a dictionary and a string X, return the words in the dictionary closest to X. • What is the meaning of “closest”? • Edit distance • Weighted edit distance • Example: a Google search for Akureyr returns "Did you mean: Akureyri".
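A rough sketch of "closest word" lookup using unit-cost edit distance. The three-word dictionary is a toy assumption, and a real spell checker would not scan every word like this.

```python
def edit_distance(x, y):
    """Unit-cost edit distance (insert, delete, replace), one row at a time."""
    prev = list(range(len(y) + 1))
    for i, cx in enumerate(x, start=1):
        curr = [i]
        for j, cy in enumerate(y, start=1):
            curr.append(min(prev[j - 1] + (cx != cy),   # replace or match
                            prev[j] + 1,                # delete cx
                            curr[j - 1] + 1))           # insert cy
        prev = curr
    return prev[-1]


def closest_words(query, dictionary):
    """Return every dictionary word at the minimum distance from `query`."""
    distances = {word: edit_distance(query, word) for word in dictionary}
    best = min(distances.values())
    return [word for word in dictionary if distances[word] == best]


toy_dictionary = ["Akureyri", "Akranes", "Reykjavik"]   # hypothetical word list

if __name__ == "__main__":
    print(closest_words("Akureyr", toy_dictionary))
```

A weighted variant would replace the constant costs with per-character costs, for example making adjacent-key substitutions cheaper.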
Sample GenBank Record Saccharomyces cerevisiae TCP1-beta gene 1 gatcctccat atacaacggt atctccacct caggtttaga tctcaacaac gaaccattg 61 ccgacatgag acagttaggt atcgtcgaga gttacaagct aaaacgagca tagtcagct 121 ctgcatctga agccgctgaa gttctactaa gggtggataa catcatccgt caagaccaa 181 gaaccgccaa tagacaacat atgtaacata tttaggatat acctcgaaaa aataaaccg 241 ccacactgtc attattataa ttagaaacag aacgcaaaaa ttatccacta ataattcaa . . . 4921 ttttcagtgt tagattgctc taattctttg agctgttctc tcagctcctc atatttttct 4981 tgccatgact cagattctaa ttttaagcta ttcaatttct ctttgatc Base counts: 1510 a’s, 1074 c’s, 835 g’s, and 1609 t’s
Two types of alignment • Local alignment (common substring) • Global alignment (longest common subsequence)
Local alignment:
AATTGGAC
   ||||
ACATGGAT
Global alignment:
AATTGGAC
|  ||||
ACATGGAT
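Both notions can be computed with small dynamic-programming tables. The sketch below is an illustration with made-up function names, not the slides' own code. Because a subsequence may skip letters, the longest common subsequence can contain more matches than a gap-free alignment shows.

```python
def longest_common_substring(x, y):
    """Longest run of consecutive letters shared by x and y."""
    # run[i][j] = length of the common run ending at x[i-1] and y[j-1]
    run = [[0] * (len(y) + 1) for _ in range(len(x) + 1)]
    best_len, best_end = 0, 0
    for i in range(1, len(x) + 1):
        for j in range(1, len(y) + 1):
            if x[i - 1] == y[j - 1]:
                run[i][j] = run[i - 1][j - 1] + 1
                if run[i][j] > best_len:
                    best_len, best_end = run[i][j], i
    return x[best_end - best_len:best_end]


def lcs_length(x, y):
    """Length of the longest common subsequence (letters may be skipped)."""
    table = [[0] * (len(y) + 1) for _ in range(len(x) + 1)]
    for i in range(1, len(x) + 1):
        for j in range(1, len(y) + 1):
            if x[i - 1] == y[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[-1][-1]


if __name__ == "__main__":
    x, y = "AATTGGAC", "ACATGGAT"
    print("common substring:", longest_common_substring(x, y))
    print("LCS length:", lcs_length(x, y))
```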
Pairwise comparison • Are AATTCGA and ACATCGG similar? • Compute the degree of similarity of two given sequences X and Y. • How do we define similarity? • Two sequences are similar if one can be obtained from the other with a small number of mutations.
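Edit operations • Insert, cost 1 • Delete, cost 1 • Replace, cost 2 • Edit distance: minimum total cost of the edit operations that turn one string into the other
Without gaps (three replacements, cost 6):
AATTGGAC
|  ||||
ACATGGAT
With gaps (one insert, one delete, one replace, cost 4):
A-ATTGGAC
| || |||
ACAT-GGAT
Edit distance = 4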
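To make the costs concrete, this sketch scores a given alignment column by column, assuming "-" marks a gap and using the slide's costs (insert 1, delete 1, replace 2). The function name `alignment_cost` is illustrative.

```python
def alignment_cost(top, bottom, gap="-", indel_cost=1, replace_cost=2):
    """Total cost of an alignment written as two equal-length rows."""
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have the same length")
    cost = 0
    for a, b in zip(top, bottom):
        if a == gap or b == gap:
            cost += indel_cost        # a letter aligned with a gap
        elif a != b:
            cost += replace_cost      # two different letters
    return cost


if __name__ == "__main__":
    print(alignment_cost("AATTGGAC", "ACATGGAT"))
    print(alignment_cost("A-ATTGGAC", "ACAT-GGAT"))
```

The alignment with gaps is cheaper because one insert plus one delete costs the same as a single replacement, yet it lets more of the remaining letters line up.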
3 ways to look at edit distance • distance("intention", "execution") = 8, with insert and delete costing 1 and replace costing 2
Edit distance matrix • Alignment of "intention" with "execution": i→e, n→x, delete t, e=e, insert c, n→u, t=t, i=i, o=o, n=n
Edit distance algorithm • Inputs • X = AATTGGA..., m letters • Y = ACATGGA..., n letters • Initialization • D[0,0] = 0 • D[i,0] = i for i = 1..m • D[0,j] = j for j = 1..n • Loop over i = 1..m and j = 1..n • D[i,j] = min { D[i-1,j-1] + sub(x[i],y[j]), D[i-1,j] + del(x[i]), D[i,j-1] + ins(y[j]) } • Output: D[m,n] • Levenshtein distance: • sub(x,y) = 1 if x != y, and 0 otherwise • del = 1 • ins = 1
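The recurrence above translates almost line for line into a table-filling loop. This is a minimal sketch with constant costs passed as parameters; the parameter names are assumptions, and the first row and column are accumulated from the costs so that they reduce to D[i,0] = i and D[0,j] = j for unit costs.

```python
def edit_distance(x, y, sub_cost=1, del_cost=1, ins_cost=1):
    """Fill the (m+1) x (n+1) table D and return D[m][n]."""
    m, n = len(x), len(y)
    D = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        D[i][0] = D[i - 1][0] + del_cost          # delete the first i letters of x
    for j in range(1, n + 1):
        D[0][j] = D[0][j - 1] + ins_cost          # insert the first j letters of y
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = 0 if x[i - 1] == y[j - 1] else sub_cost
            D[i][j] = min(D[i - 1][j - 1] + sub,      # match or replace
                          D[i - 1][j] + del_cost,     # delete x[i]
                          D[i][j - 1] + ins_cost)     # insert y[j]
    return D[m][n]


if __name__ == "__main__":
    print(edit_distance("intention", "execution"))              # Levenshtein costs
    print(edit_distance("intention", "execution", sub_cost=2))  # replace costs 2
```

The table has (m+1)(n+1) cells and each is filled in constant time, so the running time is O(mn), in contrast to the exponential search over all edit sequences.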
Summary • Greedy heuristics are extremely fast, but they often do not give the optimal solution. • They get stuck at local maxima. • Dynamic programming solves a problem by • solving the smaller partial problems • storing the partial results in a data structure, e.g., a matrix, instead of relying on recursion • building up the next-size solution from these smaller solutions • until the complete solution is reached.