Capítulo 9 Heurísticas em Planejamento

Planejamento em Inteligência Artificial Capítulo 9 Heurísticas em Planejamento Leliane Nunes de Barros MAC5788 IME-USP 2005

Planejamento como Busca Não-determinística:uma visão conceitual O nó urepresenta um conjunto de planos soluções u : conjunto de todas as soluções alcançáveis de u.

Planejadores como instâncias de Abstract-search (I) • Planejamento no espaço de estados: u é uma seqüência de ações. Toda solução alcançável apartir de u contém essa seqüência como prefixo ou sufixo. • Planejamento no espaço de planos: u é um conjunto de ações com vínculos causais, restrições de ordenação e restrições de unificações. Toda solução alcançável apartir de u contém todas as ações e satisfazem as restrições • Algoritmo Graphplan: u é um sub-grafo de um grafo de planejamento, isto é, uma seqüência de conjuntos de ações junto com restrições de pré-condições, efeitos e exclusões mútuas. Toda solução alcançável apartir de u contém as ações em u correspondente aos níveis já resolvidos (da busca no grafo) e pelo menos uma ação de cada nível qua ainda não foi resolvido em u.

Planejadores como instâncias de Abstract-search (II) • Planejamento basead em SAT: u é um conjunto de literais valorados e o restante das cláusulas, cada uma sendo uma disjunção de literais que descrevem ações e estados. Toda solução alcançável apartir de u corresponde a uma atribuição de valores verdade ou falso aos literais não valorados tal que todas as cláusulas restantes sejam satisfeitas. • Planejamento baseado em CSP: u é um conjunto de variáveis CSP e restrições com algumas variáveis com valores já atribuidos. Toda solução alcançável apartir de u inclue essas variáveis valoradas e atribuições para as outras variáveis CSP que satisfazerm as restrições.

Planejadores como instâncias de Abstract-search (III) • Planejamento no espaço de estados: u é um plano parcial • Planejamento no espaço de planos:u é um plano parcial • Algoritmo Graphplan: nem todas as ações em u farão parte do plano solução • Planejamento basead em SAT: nem todas as ações em u farão parte do plano solução • Planejamento baseado em CSP: nem todas as ações em u farão parte do plano solução

Planejadores como instâncias de Abstract-search (III) • Planejamento no espaço de estados: u é um plano parcial • Planejamento no espaço de planos:u é um plano parcial • Algoritmo Graphplan: nem todas as ações em u farão parte do plano solução • Planejamento basead em SAT: nem todas as ações em u farão parte do plano solução • Planejamento baseado em CSP: nem todas as ações em u farão parte do plano solução Disjunctiverefinement approaches:

Planejadores como instâncias de Abstract-search (IV) • Refine: modifica a coleção de ações e/ou restrições associadas com um nó u • Branch: gera um ou mais filhos de u que serão os nós candidatos para ser o próximo nó visitado. Cada filho v representa um sub-conjunto de soluções v  u • Prune: remove do conjunto de nós candidatos {u1,u2, ... ,uk} alguns nós que parecem ser não promissores para a busca (por exemplo, um nó já visitado que não levou a solução) • Planejadores podem: • Variar a ordem em que esses 3 procedimentos são executados • Usar diferentes mecanismos de controle, como IDS ou A* • Como os planejadores que estudamos realizam esses procedimentos (páginas 195 à 197)

Tornando Abstract-search Determinística

Heurística de seleção de nós: resolve o problema da escolha não-determinística • Suponha que para cada estado s, temos uma estimativa h(s) da distância de s para uma solução • h(s) é usada para decidir qual será o próximo nó a ser expandido • quanto mais informativa for h(s) menor será o número de escolhas erradas • h(s) deve ainda ser facilmente computável • Heurísticas são geralmente baseadas mo princípio de relaxação de problemas: para avaliar o quão desejável é um nó consideramos um problema mais simples obtido apartir do original fazendo suposições de simplicação e relaxando restrições

Heurística de seleção de nós • Suppose we’re searching a tree in which each edge (s,s') has a cost c(s,s') • If p is a path, let c(p) = sum of the edge costs • For classical planning, this is the length of p • For every state s, let • g(s) = cost of the path from s0 to s • h*(s) = least cost of all paths from s to goal nodes • f*(s) = g(s) + h*(s) = least cost of all paths • from s0 to goal nodes that go through s • Suppose h(s) is an estimate of h*(s) • Let f(s) = g(s) + h(s) • f(s) is an estimate of f*(s) • h is admissible if for every state s, 0 ≤ h(s) ≤ h*(s) • If h is admissible then f is a lower bound on f*

The A* Algorithm • A* on trees: loop choose the leaf node s such that f(s) is smallest if s is a solution then return it and exit expand it (generate its children) • On graphs, A* is more complicated: additional machinery to deal with multiple paths to the same node • If a solution exists (and certain other conditions are satisfied), then: • If h(s) is admissible, then A* is guaranteed to find an optimal solution • The more “informative” the heuristic is (i.e., the closer it is to h*), the smaller the number of nodes A* expands • If h(s) is within c of being admissible, then A* is guaranteed to find a solution that’s within c of optimal

Heuristic Functions for Planning • *(s,p): minimum distance from state s to a state containing p • *(s,s’): minimum distance from state s to a state containing every p in s’ • i(s,p) andi(s,s’), where i = 0, 1, 2, … • estimates of*(s,p) and*(s,s’) and p  s, • h(s) = 0(s,g), where g is the goal 0 ignora os efeitos negativos das ações

Heurística da relaxação de independência (polinomial no número de proposições e ações) • Given s, can compute 0(s,p), for every proposition p: • From this, can compute h(s) = 0(s,g) = p  g 0(s,p)

Heuristic Forward Search • Note: this is depth-first search; thus admissibility is irrelevant

Heuristic Backward Search • Same comment as on the previous slide

and p  s, • h(s) = 0(s,g) is an inadmissible heuristic • 1: like 0 except that 1(s,g) = maxp  g 0(s,p) • This heuristic is admissible; thus it could be used with A* • It is not very informative

A More Informed Heuristic • Instead of computing the maximum distance to each p in g, compute the maximum distance to each pair {p,q} in g:

More Generally, …

Complexity of Computing the Heuristic • Takes time (nk). • If k = max(|g|, max{|precond(a)| : a is an action}) then computing (s,g) is as hard as solving the entire planning problem

Get Heuristic Values froma Planning Graph • The planning graph provides not only the distance estimates from s0 to every reachable proposition p but also the mutex relations Pi • The Solution extraction procedure selects a set of propositions g in a layer only if no pair of elements in g is mutex. Theorem: if a pair of elements in not mutex in a layer, then it remains nonmutex in the following layers  this is very close to 2(s,g)

Get Heuristic Values froma Planning Graph • Recall how GraphPlan works: loop Graph expansion: extend a “planning graph” forward from the initial stateuntil we have achieved a necessary (but insufficient) condition for plan existence Solution extraction: search backward from the goal, looking for a correct plan if we find one, then return it repeat this takes polynomial time this takes exponential time

Using the Planning Graphto Compute h(s) • In the graph, there are alternatinglayers of ground literals and actions • The number of “action” layersis a lower bound on the numberof actions in the plan • Construct a planning graph,starting at s • g(s,p) = level of the first layer that “possibly achieves” p • g(s,g) is very close to 2(s,g) • 2(s,g) counts each action individually • g(s,g) groups together the independent actions in a layer

The FastForward Planner • Use a heuristic function similar to h(s) = g(s,g) • Someways to improve it (I’ll skip the details) • Don’t want an A*-style search (takes too much memory) • Instead, use a greedy procedure: until we have a solution, do expand the current state s s := the child of s for which h(s) is smallest (i.e., the child we think is closest to a solution) • There are some ways to improve this (I’ll skip the details) • Can’t guarantee how fast it will find a solution,or how good a solution it will find • However, it works pretty well on many problems

FastForward in the AIPS-2000Planning Competition • FastForward did quite well • In the AIPS-2000 competition, all of the planning problems were classical problems • Two tracks: • Fully automated planners • FastForward participated in the fully automated track • It got an “outstanding performance” award • Large variance in how close its plans were to optimal • However, it found them very fast compared with the other automated classical planners

FastForward in the AIPS-2002Planning Competition • Among the automated planners, FastForward was roughly in the middle • LPG (graphplan + local search) did much better, and got a “distinguished performance of the first order” award • One of thing that caused difficulty for FastForward • The problems in the AIPS-2002 went beyond classical planning • Numbers, optimization, time durations • Example: • A domain inspired by the Hubble space telescope(a lot simpler than the real domain, of course) • A satellite needs to take observations of stars • Gather as much data as possiblebefore running out of fuel • Any amount of data gathered is a solution • Thus, FastForward always returned the null plan

Heuristics for Plan-Space Planning • How to select the next flaw to work on?

One Possible Heuristic • Fewest Alternatives First (FAF)

Do Others Work Better? • Sometimes yes, sometimes no • Limits to how good any flaw-selection heuristic can do

Serializing and AND/OR Tree • The search space isan AND/OR tree • Deciding what flaw to work on next = serializing this tree (turning it into a state-space tree) • at each AND branch,choose a child toexpand next, anddelay expandingthe other children Partial plan p Variable Goal Goal g1 Goal g2 … … constraint ordering … Operator o1 Operator on Partial plan p Goal g1 Operator o1 Operator on … Partial plan p1 Partial plan pn Goal g Variable Order Goal g Variable Order … … … … 2 2 constraint tasks constraint tasks

One Serialization

Another Serialization

Why Does This Matter? • Different refinement strategies produce different serializations • the search spaces have different numbers of nodes • In the worst case, the planner will search the entire serialized search space • The smaller the serialization, the more likely that the planner will be efficient • One pretty good heuristic: fewest alternatives first (FAF heuristic) => choose the flaw having the smallest number of resolvers. This will limit the cost of eventual backtracks. Takes O(n) where n is the number of flaws in a partial plan.

How Much Difference Can the Refinement Strategy Make? • Case study: build an AND/OR graph from repeated occurrences of this pattern: • Example: • number of levels k = 3 • branching factor b = 2 • Analysis: • Total number of nodes in the AND/OR graph is n = Q(bk) • How many nodes in the best and worst serializations?

Case Study, Continued • The best serialization contains Q(b2k) nodes • The worst serialization contains Q(2kb2k) nodes • The size differs by an exponential factor, but the best serialization still is exponentially large • To do better, need good node selection, branching, pruning

Resolver selection heuristics • Let  be the set of all plans generated with all the possible reasolvers. Select a plan  that has the smallest set of sug-goal whithout causal links (not very informed heuristic). • A more informative heuristic: construct an AND/OR graph along regression steps defined by -1 down to some fixed level k • Select the plan  that minimizes the wheighted sum of: • The number of actions in this graph that are not in  and • The number of subgoals remaining in its leaves that are not in the initial state

Capítulo 9 Heurísticas em Planejamento