Planning in Dynamic and Partially-unknown environments

Planning in Dynamic and Partially-unknown environments Maxim Likhachev University of Pennsylvania

Dynamic and Partially-known Environments • Planning in • partially-known environments is a repeated process • dynamic environments is also a repeated process ATRV navigating initially-unknown environment planning map and path University of Pennsylvania

Dynamic and Partially-known Environments • Planning in • partially-known environments is a repeated process • dynamic environments is also a repeated process planning in dynamic environments Tartanracing, CMU University of Pennsylvania

Dynamic and Partially-known Environments Other reasons for re-planning? • Planning in • partially-known environments is a repeated process • dynamic environments is also a repeated process planning in dynamic environments Tartanracing, CMU University of Pennsylvania

Dynamic and Partially-known Environments • Need to re-plan fast! • Two ways to help with this requirement • anytime planning – return the best plan possible within T msecs • incremental planning – reuse previous planning efforts planning in dynamic environments Tartanracing, CMU University of Pennsylvania

Dynamic and Partially-known Environments • Need to re-plan fast! • Two ways to help with this requirement • anytime planning • incremental planning this class: incremental version of A* and how it can be used for anytime and incremental planning University of Pennsylvania

Motivation for Incremental Version of A* cost of least-cost paths to sgoal initially • Reuse state values from previous searches cost of least-cost paths to sgoal after the door turns out to be closed University of Pennsylvania

Motivation for Incremental Version of A* cost of least-cost paths to sgoal initially • Reuse state values from previous searches These costs are optimal g-values if search is done backwards cost of least-cost paths to sgoal after the door turns out to be closed University of Pennsylvania

Motivation for Incremental Version of A* cost of least-cost paths to sgoal initially • Reuse state values from previous searches These costs are optimal g-values if search is done backwards Can we reuse these g-values from one search to another? – incremental A* cost of least-cost paths to sgoal after the door turns out to be closed University of Pennsylvania

Motivation for Incremental Version of A* cost of least-cost paths to sgoal initially • Reuse state values from previous searches Would # of changed g-values be very different for forward A*? cost of least-cost paths to sgoal after the door turns out to be closed University of Pennsylvania

Incremental Version of A* initial search by backwards A* initial search by D* Lite • Reuse state values from previous searches second search by D* Lite second search by backwards A* University of Pennsylvania

A* with Reuse of State Values all v-values initially are infinite; ComputePath function while(f(sgoal) > minimum f-value in OPEN ) remove s with the smallest [g(s)+h(s)] from OPEN; insert s into CLOSED; • Alternative view of A* v(s)=g(s); for every successor s’ of s if g(s’) > g(s) + c(s,s’) g(s’) = g(s) + c(s,s’); insert s’ into OPEN; University of Pennsylvania

A* with Reuse of State Values all v-values initially are infinite; v-value – the value of a state during its expansion (infinite if state was never expanded) ComputePath function while(f(sgoal) > minimum f-value in OPEN ) remove s with the smallest [g(s)+h(s)] from OPEN; insert s into CLOSED; • Alternative view of A* v(s)=g(s); for every successor s’ of s if g(s’) > g(s) + c(s,s’) g(s’) = g(s) + c(s,s’); insert s’ into OPEN; University of Pennsylvania

A* with Reuse of State Values all v-values initially are infinite; ComputePath function while(f(sgoal) > minimum f-value in OPEN ) remove s with the smallest [g(s)+h(s)] from OPEN; insert s into CLOSED; • Alternative view of A* v(s)=g(s); for every successor s’ of s if g(s’) > (s) + c(s,s’) g(s’) = (s) + c(s,s’); insert s’ into OPEN; g g • g(s’) = mins’’ pred(s’) v(s’’) + c(s’’,s’) University of Pennsylvania

A* with Reuse of State Values all v-values initially are infinite; ComputePath function while(f(sgoal) > minimum f-value in OPEN ) remove s with the smallest [g(s)+h(s)] from OPEN; insert s into CLOSED; • Alternative view of A* v(s)=g(s); for every successor s’ of s if g(s’) > (s) + c(s,s’) g(s’) = (s) + c(s,s’); insert s’ into OPEN; g g Why? • g(s’) = mins’’ pred(s’) v(s’’) + c(s’’,s’) University of Pennsylvania

A* with Reuse of State Values all v-values initially are infinite; ComputePath function while(f(sgoal) > minimum f-value in OPEN ) remove s with the smallest [g(s)+h(s)] from OPEN; insert s into CLOSED; • Alternative view of A* v(s)=g(s); for every successor s’ of s if g(s’) > (s) + c(s,s’) g(s’) = (s) + c(s,s’); insert s’ into OPEN; g g overconsistent state • g(s’) = mins’’ pred(s’) v(s’’) + c(s’’,s’) • OPEN: a set of stateswith v(s) > g(s) • all other states have v(s) = g(s) consistent state University of Pennsylvania

A* with Reuse of State Values all v-values initially are infinite; ComputePath function while(f(sgoal) > minimum f-value in OPEN ) remove s with the smallest [g(s)+h(s)] from OPEN; insert s into CLOSED; • Alternative view of A* v(s)=g(s); for every successor s’ of s if g(s’) > (s) + c(s,s’) g(s’) = (s) + c(s,s’); insert s’ into OPEN; g g overconsistent state • g(s’) = mins’’ pred(s’) v(s’’) + c(s’’,s’) • OPEN: a set of stateswith v(s) > g(s) • all other states have v(s) = g(s) consistent state Why? University of Pennsylvania

A* with Reuse of State Values all v-values initially are infinite; ComputePath function while(f(sgoal) > minimum f-value in OPEN ) remove s with the smallest [g(s)+h(s)] from OPEN; insert s into CLOSED; • Alternative view of A* v(s)=g(s); for every successor s’ of s if g(s’) > (s) + c(s,s’) g(s’) = (s) + c(s,s’); insert s’ into OPEN; g g • g(s’) = mins’’ pred(s’) v(s’’) + c(s’’,s’) • OPEN: a set of stateswith v(s) > g(s) • all other states have v(s) = g(s) • this A* expands overconsistent states in the order of their f-values University of Pennsylvania

A* with Reuse of State Values initialize OPEN with all overconsistent states; ComputePathwithReuse function while(f(sgoal) > minimum f-value in OPEN ) remove s with the smallest [g(s)+h(s)] from OPEN; insert s into CLOSED; all you need to do to make it reuse old values! • Making A* reuse old values: v(s)=g(s); for every successor s’ of s if g(s’) > (s) + c(s,s’) g(s’) = (s) + c(s,s’); insert s’ into OPEN; g g • g(s’) = mins’’ pred(s’) v(s’’) + c(s’’,s’) • OPEN: a set of stateswith v(s) > g(s) • all other states have v(s) = g(s) • this A* expands overconsistent states in the order of their f-values University of Pennsylvania

A* with Reuse of State Values g=1 v= 1 h=2 g= 3 v= 3 h=1 g=0 v=0 h=3 g= 5 v= h=0 2 S2 S1 1 2 Sstart 1 Sgoal 1 3 S4 S3 g=  v= h=1 g= 2 v= h=2 CLOSED = {} OPEN = {s4,sgoal} next state to expand: s4 • g(s’) = mins’’ pred(s’) v(s’’) + c(s’’,s’) • initially OPEN contains all overconsistent states University of Pennsylvania

A* with Reuse of State Values g=1 v= 1 h=2 g= 3 v= 3 h=1 g=0 v=0 h=3 g= 5 v= h=0 2 S2 S1 1 2 Sstart 1 Sgoal 1 3 S4 S3 g= 2 v=2 h=2 g= 5 v= h=1 CLOSED = {s4} OPEN = {s3,sgoal} next state to expand: sgoal University of Pennsylvania

A* with Reuse of State Values g=1 v= 1 h=2 g= 3 v= 3 h=1 g=0 v=0 h=3 g= 5 v=5 h=0 2 S2 S1 1 2 Sstart 1 Sgoal 1 3 S4 S3 g= 2 v=2 h=2 g= 5 v= h=1 CLOSED = {s4,sgoal} OPEN = {s3} done • after ComputePathwithReuse terminates: • all g-values of states are equal to final A* g-values University of Pennsylvania

A* with Reuse of State Values g=1 v= 1 h=2 g= 3 v= 3 h=1 g=0 v=0 h=3 g= 5 v=5 h=0 2 S2 S1 1 2 Sstart 1 Sgoal 1 3 S4 S3 g= 2 v=2 h=2 g= 5 v= h=1 we can now compute a least-cost path University of Pennsylvania

A* with Reuse of State Values initialize OPEN with all overconsistent states; ComputePathwithReuse function while(f(sgoal) > minimum f-value in OPEN ) remove s with the smallest [g(s)+εh(s)] from OPEN; insert s into CLOSED; the exact same thing as with A* • Making weighted A* reuse old values: v(s)=g(s); just make sure no state is expanded multiple times for every successor s’ of s if g(s’) > g(s) + c(s,s’) g(s’) = g(s) + c(s,s’); if s’ not in CLOSED then insert s’ into OPEN; Why didn’t we do it for A*? University of Pennsylvania

Anytime Repairing A* (ARA*) [Likhachev, Gordon & Thrun ’03] set  to large value; g(sstart) = 0; v-values of all states are set to infinity; OPEN = {sstart}; while ≥ 1 CLOSED = {}; ComputePathwithReuse(); publish current  suboptimal solution; decrease ; initialize OPEN with all overconsistent states; • Efficient series of weighted A* searches with decreasing ε: University of Pennsylvania

ARA* set  to large value; g(sstart) = 0; v-values of all states are set to infinity; OPEN = {sstart}; while ≥ 1 CLOSED = {}; ComputePathwithReuse(); publish current  suboptimal solution; decrease ; initialize OPEN with all overconsistent states; • Efficient series of weighted A* searches with decreasing ε: need to keep track of those University of Pennsylvania

ARA* initialize OPEN with all overconsistent states; ComputePathwithReuse function while(f(sgoal) > minimum f-value in OPEN ) remove s with the smallest [g(s)+εh(s)] from OPEN; insert s into CLOSED; • Efficient series of weighted A* searches with decreasing ε: v(s)=g(s); for every successor s’ of s if g(s’) > g(s) + c(s,s’) g(s’) = g(s) + c(s,s’); if s’ not in CLOSED then insert s’ into OPEN; otherwise insert s’ into INCONS Why? • OPEN U INCONS = all overconsistent states University of Pennsylvania

ARA* set  to large value; g(sstart) = 0; v-values of all states are set to infinity; OPEN = {sstart}; while ≥ 1 CLOSED = {}; INCONS = {}; ComputePathwithReuse(); publish current  suboptimal solution; decrease ; initialize OPEN= OPEN U INCONS; • Efficient series of weighted A* searches with decreasing ε: all overconsistent states (exactly what we need!) University of Pennsylvania

ARA* A series of weighted A* searches ARA* ε =2.5 ε =1.5 ε =1.0 ε =2.5 ε =1.5 ε =1.0 13 expansions solution=11 moves 1 expansion solution=11 moves 9 expansions solution=10 moves 13 expansions solution=11 moves 15 expansions solution=11 moves 20 expansions solution=10 moves University of Pennsylvania

A* with Reuse of State Values • So far, ComputePathwithReuse() could only deal with states whose v(s) ≥ g(s) (overconsistent or consistent) • Edge cost increases may introduce underconsistent states (v(s) < g(s)) g=1 v= 1 h=2 g= 3 v= 3 h=1 g=0 v=0 h=3 g= 5 v=5 h=0 2 S2 S1 1 2 Sstart Sgoal 1 1 3 S4 S3 g= 2 v=2 h=2 g= 5 v= h=1 University of Pennsylvania

A* with Reuse of State Values • So far, ComputePathwithReuse() could only deal with states whose v(s) ≥ g(s) (overconsistent or consistent) • Edge cost increases may introduce underconsistent states (v(s) < g(s)) suppose the robot updates an edge cost g=1 v= 1 h=2 g= 3 v= 3 h=1 g=0 v=0 h=3 g= 5 v=5 h=0 4 2 S2 S1 1 2 Sstart Sgoal 1 1 3 S4 S3 g= 2 v=2 h=2 g= 5 v= h=1 University of Pennsylvania

A* with Reuse of State Values • ComputePathwithReuse invariant: • g(s’) = mins’’ pred(s’) v(s’’) + c(s’’,s’) • Edge cost increases may introduce underconsistent states (v(s) < g(s)) need to update g(s1) g=1 v= 1 h=2 g= 3 v= 3 h=1 g=0 v=0 h=3 g= 5 v=5 h=0 4 S2 S1 1 2 Sstart Sgoal 1 1 3 S4 S3 g= 2 v=2 h=2 g= 5 v= h=1 University of Pennsylvania

A* with Reuse of State Values • ComputePathwithReuse invariant: • g(s’) = mins’’ pred(s’) v(s’’) + c(s’’,s’) • Edge cost increases may introduce underconsistent states (v(s) < g(s)) need to update g(s1) v(s1) < g(s1) g=1 v= 1 h=2 g= 3 v= 3 h=1 5 g=0 v=0 h=3 g= 5 v=5 h=0 4 S2 S1 1 2 Sstart Sgoal 1 1 3 S4 S3 g= 2 v=2 h=2 g= 5 v= h=1 University of Pennsylvania

A* with Reuse of State Values • ComputePathwithReuse invariant: • g(s’) = mins’’ pred(s’) v(s’’) + c(s’’,s’) • Edge cost increases may introduce underconsistent states (v(s) < g(s)) • Fix these by setting v(s) =  g=1 v= 1 h=2 g= 5 v= 3 h=1  g=0 v=0 h=3 g= 5 v=5 h=0 4 S2 S1 1 2 Sstart Sgoal 1 1 3 S4 S3 g= 2 v=2 h=2 g= 5 v= h=1 University of Pennsylvania

A* with Reuse of State Values • ComputePathwithReuse invariant: • g(s’) = mins’’ pred(s’) v(s’’) + c(s’’,s’) • Edge cost increases may introduce underconsistent states (v(s) < g(s)) • Fix these by setting v(s) =  • Makes s overconsistent or consistent v(s) ≥ g(s) g=1 v= 1 h=2 g= 5 v=  h=1 g=0 v=0 h=3 g= 5 v=5 h=0 4 S2 S1 1 2 Sstart Sgoal 1 1 3 S4 S3 g= 2 v=2 h=2 g= 5 v= h=1 University of Pennsylvania

A* with Reuse of State Values • ComputePathwithReuse invariant: • g(s’) = mins’’ pred(s’) v(s’’) + c(s’’,s’) • Edge cost increases may introduce underconsistent states (v(s) < g(s)) • Fix these by setting v(s) =  • Makes s overconsistent or consistent v(s) ≥ g(s) • Propagate the changes g=1 v= 1 h=2 g= 5 v=  h=1 g=0 v=0 h=3 g=  v=5 h=0 update g(sgoal) 4 S2 S1 1 2 Sstart Sgoal 1 1 3 S4 S3 g= 2 v=2 h=2 g= 5 v= h=1 University of Pennsylvania

A* with Reuse of State Values • ComputePathwithReuse invariant: • g(s’) = mins’’ pred(s’) v(s’’) + c(s’’,s’) • Edge cost increases may introduce underconsistent states (v(s) < g(s)) • Fix these by setting v(s) =  • Makes s overconsistent or consistent v(s) ≥ g(s) • Propagate the changes • no more underconsistent states! g=1 v= 1 h=2 g= 5 v=  h=1 g=0 v=0 h=3 g=  v= h=0 4 S2 S1 fix sgoal 1 2 Sstart Sgoal 1 1 3 S4 S3 g= 2 v=2 h=2 g= 5 v= h=1 University of Pennsylvania

A* with Reuse of State Values • ComputePathwithReuse invariant: • g(s’) = mins’’ pred(s’) v(s’’) + c(s’’,s’) • Edge cost increases may introduce underconsistent states (v(s) < g(s)) • Fix these by setting v(s) =  • Makes s overconsistent or consistent v(s) ≥ g(s) • Propagate the changes • no more underconsistent states! g=1 v= 1 h=2 g= 5 v=  h=1 g=0 v=0 h=3 g= 6 v= h=0 4 S2 S1 1 2 Sstart Sgoal 1 1 3 S4 S3 g= 2 v=2 h=2 g= 5 v=5 h=1 expand s3 University of Pennsylvania

A* with Reuse of State Values • ComputePathwithReuse invariant: • g(s’) = mins’’ pred(s’) v(s’’) + c(s’’,s’) • Edge cost increases may introduce underconsistent states (v(s) < g(s)) • Fix these by setting v(s) =  • Makes s overconsistent or consistent v(s) ≥ g(s) • Propagate the changes • no more underconsistent states! g=1 v= 1 h=2 g= 5 v=  h=1 g=0 v=0 h=3 g= 6 v=6 h=0 4 S2 S1 1 2 Sstart Sgoal 1 1 expand sgoal 3 S4 S3 g= 2 v=2 h=2 g= 5 v=5 h=1 University of Pennsylvania

A* with Reuse of State Values • after ComputePathwithReuse terminates: • all g-values of states are equal to final A* g-values • Edge cost increases may introduce underconsistent states (v(s) < g(s)) • Fix these by setting v(s) =  • Makes s overconsistent or consistent v(s) ≥ g(s) • Propagate the changes • we can backtrack an optimal path • (start at sgoal, proceed to pred that minimizes g+c) g=1 v= 1 h=2 g= 5 v=  h=1 g=0 v=0 h=3 g= 6 v=6 h=0 4 S2 S1 1 2 Sstart Sgoal 1 1 3 S4 S3 g= 2 v=2 h=2 g= 5 v=5 h=1 University of Pennsylvania

D* Lite [Koenig & Likhachev ’02] • Optimal re-planning algorithm • Simpler and with nicer theoretical properties version of D* [Stentz ’95] until goal is reached ComputePathwithReuse(); publish optimal path; follow the path until map is updated with new sensor information; update the corresponding edge costs; set sstart to the current state of the agent; //modified to fix underconsistent states University of Pennsylvania

D* Lite [Koenig & Likhachev ’02] • Optimal re-planning algorithm • Simpler and with nicer theoretical properties version of D* [Stentz ’95] until goal is reached ComputePathwithReuse(); publish optimal path; follow the path until map is updated with new sensor information; update the corresponding edge costs; set sstart to the current state of the agent; //modified to fix underconsistent states • Important detail! search is done backwards: • search starts at sgoal, and searhes towards sstart with all edges reversed • This way, root of the search tree remains the same and g-values are more likely to remain • the same in between two calls to ComputePathwithReuse why? why care? University of Pennsylvania

D* Lite • D* & D* Lite vs. A* initial A* search initial D* Lite search second A* search second D* Lite search University of Pennsylvania

Anytime and Incremental Planning • Anytime D* [Likhachev et al. ’05]: • decrease  and update edge costs at the same time • re-compute a path by reusing previous state-values set  to large value; until goal is reached ComputePathwithReuse(); publish  -suboptimal path; follow the path until map is updated with new sensor information; update the corresponding edge costs; set sstart to the current state of the agent; if significant changes were observed increase  or replan from scratch; else decrease ; //modified to fix underconsistent states What for? University of Pennsylvania

Anytime and Incremental Planning Image of GM Urban Challenge Vehicle • Anytime D* in Urban Challenge Tartanracing, CMU huge parking lot movie planner movie University of Pennsylvania

Other Uses of Incremental A* • Whenever planning is a repeated process: • improving a solution (e.g., in anytime planning) • re-planning in dynamic and previously unknown environments • adaptive discretization • many other planning problems can be solved via iterative planning University of Pennsylvania

Planning in Dynamic and Partially-unknown environments