Algorithms for Path-Planning Shuchi Chawla Carnegie Mellon University
A trick-or-treat problem • It’s Halloween… you have between 6pm and 8pm to trick-or-treat • Goal: collect as much candy as possible • In what order should you visit houses?
Path-planning • Informally… the planning and ordering of tasks • Classic instance ― the Traveling Salesman Problem (TSP): find the shortest tour covering all given locations • A natural extension ― Orienteering: cover as many locations as possible by a given deadline • Many variants and applications: delivery & distribution problems; production planning and assembly analysis; robot navigation • Needs to be solved at large scales in practice • Mostly NP-hard ☹ : we cannot find the optimal solution in polynomial time
[Figure: a pickup-and-delivery schedule for Friday, Jan 17 (packages A–D, one marked *priority*, bound for Monroeville, Downtown, Waterfront, Robinson) and a production-planning instance (item X1 with value, deadline, and machine-time requirements; machine Y2 with cost and scheduling constraints)]
Path-planning: A brief history • Studied in Operations Research for the past 20-30 years (also known as Vehicle Routing) • Many variants: multiple vehicles, stochastic demands, pickup-and-delivery • Techniques: Mixed Integer Programming, cutting-plane methods, branch and bound, simulated annealing, genetic algorithms, … • Stochastic planning problems in Robotics and AI: the focus is mostly on exploration and position estimation • Some poly-time-solvable problems studied, e.g. shortest paths, via linear and dynamic programming techniques • My focus: approximation algorithms ― find a solution in polynomial time with value ≥ (1/α)·OPT; for example, a 2-approximation is half as good as OPT, with value ≥ ½·OPT
Approximation Results • A reward vs. time trade-off
[Figure: reward obtained vs. time taken. OPT collects its reward in 2 hrs; a “quota” approximation matches OPT’s reward but may take 4 hrs; a “budget” approximation keeps to 2 hrs but may collect less reward]
Approximation Results • A reward vs. time trade-off
• A budget on time; maximize reward:
• Orienteering: a single deadline on time
• Deadline-TSP: different deadlines on different locations
• TSP with Time-Windows: different time windows for different locations
• A quota on reward; minimize time:
• TSP: visit all locations
• k-TSP: visit k locations
• Min-Excess: visit k locations, but minimize excess
• Optimize a combination of reward and time:
• Prize-Collecting TSP: minimize time plus reward foregone
• Discounted-Reward TSP: maximize reward; reward decreases with time
Approximation Results (this talk)
Results from FOCS’03 (Blum, Karger, Chawla, Lane, Meyerson, Minkoff) and STOC’04 (Bansal, Blum, Chawla, Meyerson); techniques: structural properties with Dynamic Programming, and LP-rounding.
• A budget on time; maximize reward:
• Orienteering: 3 (previously no approximation known)
• Deadline-TSP: 3 log n (previously none)
• TSP with Time-Windows: 3 log² n (previously none)
• A quota on reward; minimize time:
• TSP: 1.5 [Christofides ’76]
• k-TSP: 2+ε [BRV99] [Garg99] [AK00] [CGRT03] …
• Min-Excess: 2+ε (previously none)
• Optimize a combination of reward and time:
• Prize-Collecting TSP: 2 [Goemans Williamson ’92]
• Discounted-Reward TSP: 6.75+ε (previously none)
A road-map • An approximation for Orienteering • Current and future work • Robot navigation & path-planning in a stochastic world • Other research
Back to candy-collection (Orienteering) • The givens: a “map” G of locations and distances, a start location s, rewards on locations, and a deadline D • To find: a path that collects as much reward as possible by deadline D • NP-hard; we want to find an approximation: in poly-time, find a path covering reward ≥ ⅓·OPT by deadline D (a toy instance follows below)
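For concreteness, here is one way an Orienteering instance could be represented; this is a minimal sketch with made-up names and numbers, not data from the talk.

```python
import math

# A toy Orienteering instance: coordinates stand in for the "map" G,
# with candy as the reward at each house.  All values are illustrative.
houses = {"s": (0, 0), "a": (1, 0), "b": (1, 1), "c": (5, 5)}
reward = {"s": 0, "a": 3, "b": 2, "c": 10}
start, deadline = "s", 6.0

def dist(u, v):
    """Euclidean travel time between two houses."""
    (x1, y1), (x2, y2) = houses[u], houses[v]
    return math.hypot(x1 - x2, y1 - y2)
```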
A flawed attempt: the greedy approach • Visit the closest location; continue until time runs out • Assume for simplicity that all locations have equal reward • Local decisions are bad; consider the “big picture”: treat clusters of locations as single entities with large reward (a sketch of the greedy heuristic follows below)
[Figure: on the same map, the greedy path collects little reward in 2 hrs while the optimal 2-hr path collects much more]
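A minimal sketch of the nearest-neighbor heuristic the slide warns against, assuming a symmetric travel-time function `dist(u, v)` (such as the toy one above) and equal unit rewards; the function name is illustrative.

```python
def greedy_orienteering(dist, start, locations, deadline):
    """Nearest-neighbor heuristic: repeatedly move to the closest
    unvisited location until no further move fits in the deadline.
    With unit rewards, the reward collected is len(path) - 1."""
    path, elapsed = [start], 0.0
    unvisited = set(locations) - {start}
    current = start
    while unvisited:
        nxt = min(unvisited, key=lambda v: dist(current, v))
        if elapsed + dist(current, nxt) > deadline:
            break  # the closest remaining location is already out of reach
        elapsed += dist(current, nxt)
        path.append(nxt)
        unvisited.remove(nxt)
        current = nxt
    return path
```

Its first move is decided purely locally, which is exactly the failure mode the next slide exploits: one bad first step can send it away from a distant, reward-rich cluster for good.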
A flawed attempt: the greedy approach • A single mistake, at the very first step, could be disastrous! • Suppose we had twice or thrice the time… Recall: a 2-approx to k-TSP collects the same reward in at most twice the optimal length; we know how to achieve this! • Can we use this to approximate Orienteering?
[Figure: greedy (2 hrs, little reward); an algorithm using twice the optimal length (4 hrs) matches the reward of the optimal 2-hr path]
k-TSP versus Orienteering • k-TSP: visit at least k reward as fast as possible; several approximations known, best: a 2-approx [Garg05] • Orienteering: visit as much reward as possible by time D; no approximation known • Equivalent at optimality; different in approximation
[Figure: reward obtained vs. time taken. OPT reaches full reward at 2 hrs; a k-TSP approximation reaches the same reward by 4 hrs; an Orienteering approximation must stop at 2 hrs with less reward]
Next attempt: Using k-TSP for Orienteering • Suppose there exists a path of length ½D collecting reward OPT′; then a 2-approx to k-TSP gives a path of length ≤ D with reward OPT′ • Is there a good path with these properties? Not necessarily: OPT′ may be far smaller than OPT, a bad trade-off between length and reward! • Delving deeper does not help: algorithms for k-TSP use a “primal-dual” subroutine [Goemans Williamson ’92] and inherently aim for the low-hanging fruit first
[Figure: reward vs. time. OPT collects its reward by 2 hrs; the best half-length (1 hr) path collects only OPT′ ≪ OPT, so the k-TSP approximation built from it also collects only OPT′ by 2 hrs]
Next attempt: Using k-TSP for Orienteering • Suppose there exists a path of length ½D collecting reward OPT′; a 2-approx to k-TSP gives length ≤ D and reward OPT′ • Is there a good path with these properties? Yes, if we didn’t have to start from s • Idea: approximate one half of OPT • Problem: if OPT collects most reward in its second half, reaching that half first already uses up the budget, and the length of our path becomes ≫ D • Need a better approximation than k-TSP: use only as much extra length as is “saved” in the first part
Key insight: Approximate “excess” • Problem: if OPT collects most reward in its second half, the length of our path may be ≫ D • Need a better approximation than k-TSP • Excess of a path = length of the path − length of the shortest path between its endpoints • Given a 2-approx for “excess”: divide OPT into 2 parts, each with equal excess, and approximate the “excess” of the part with the greater reward
Approximating the excess • The Min-Excess problem: given a map G, start s, end t, and a reward quota k, find a path from s to t with reward k that minimizes excess = length − dist(s,t), where dist(s,t) is the shortest distance from s to t
[Figure: paths plotted by length against shortest distance from the start; OPT’s excess is the gap between its length ℓ and dist(s,t). A Min-Excess approximation stays close to OPT’s length, while a k-TSP approximation need not]
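As a sketch of the definition (not code from the talk), the excess of a concrete path is its length minus a shortest-path distance, computable with Dijkstra’s algorithm; the graph is assumed to be an adjacency dict `{u: {v: weight}}`.

```python
import heapq

def shortest_dist(adj, s, t):
    """Dijkstra's algorithm over an adjacency dict {u: {v: weight}}."""
    best = {s: 0.0}
    pq = [(0.0, s)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == t:
            return d
        if d > best.get(u, float("inf")):
            continue  # stale queue entry
        for v, w in adj.get(u, {}).items():
            nd = d + w
            if nd < best.get(v, float("inf")):
                best[v] = nd
                heapq.heappush(pq, (nd, v))
    return float("inf")

def excess(adj, path):
    """excess(path) = length(path) - dist(start, end)."""
    length = sum(adj[u][v] for u, v in zip(path, path[1:]))
    return length - shortest_dist(adj, path[0], path[-1])
```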
Approximating the excess: large excess • The easy case: dist(s,t) ≪ excess • Here k-TSP gives a good approximation to Min-Excess: when the excess dominates, it is nearly the whole length of the path, so approximating the length approximates the excess
[Figure: in the length vs. distance plot, the k-TSP approximation is also a good Min-Excess approximation when the excess is large]
Approximating the excess: small excess • The hard case: dist(s,t) > excess, or excess ≈ 0, so OPT visits locations roughly in order of increasing distance from the start • Key insight: if OPT visits locations in exact order of increasing distance, we can find it exactly using Dynamic Programming! • For every location v and reward value k, find and store the best path from the start to v collecting reward k • To compute the entry for (v,k), try extending each stored path that ends at some earlier location u with reward k−1, and pick the one with the smallest excess (a sketch follows below)
[Figure: a monotone OPT path; DP table entries are indexed by (location, reward), e.g. the k=4 entry at u extends to the k=5 entry at v]
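A minimal sketch of that dynamic program, under the slide’s assumptions: unit rewards, a metric `d(u, v)`, and an optimum that visits locations in order of increasing distance from the start. The function name and interface are illustrative.

```python
def min_excess_monotone(d, s, t, locations, quota):
    """Exact DP when the optimal path visits locations in order of
    increasing distance from s.  best[v][k] = length of the shortest
    s -> ... -> v path collecting k unit rewards, built by extending
    paths that end at locations earlier in the monotone order."""
    order = sorted(set(locations) | {t}, key=lambda v: d(s, v))
    INF = float("inf")
    best = {v: [INF] * (quota + 1) for v in order}
    for i, v in enumerate(order):
        best[v][1] = d(s, v)            # the path that goes straight to v
        for u in order[:i]:             # u precedes v in the order
            for k in range(2, quota + 1):
                cand = best[u][k - 1] + d(u, v)
                if cand < best[v][k]:
                    best[v][k] = cand
    # excess of the best quota-reward path from s to t
    return best[t][quota] - d(s, t)
```

The table has O(nk) entries and each is filled by scanning earlier locations, so the sketch runs in O(n²k) time.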
Approximating the excess: combining them • The large excess case (excess ≫ shortest distance): use k-TSP • The small excess case (excess ≈ 0, OPT is “monotone”): use Dynamic Programming • The intermediate case: OPT decomposes, in order of increasing distance from the start, into alternating “monotone” and “wiggly” segments; patch the segments together with the dynamic program, applying k-TSP to the wiggly ones • Gives a (2+ε)-approximation for Min-Excess [FOCS’03]
An algorithm for Orienteering • Goal: construct a path from s to t that has length ≤ D and collects the maximum possible reward • Excess of a path from u to v: ε(u,v) = ℓ(u,v) − d(u,v) • Given a 3-approximation to Min-Excess: 1. divide OPT into 3 “equal-reward” parts (hypothetically) 2. approximate the part with the smallest excess, whose excess is at most ⅓(ε₁+ε₂+ε₃) • Then the excess of ALG is at most ε₁+ε₂+ε₃, the excess of OPT, so ALG meets the deadline D, and the reward of ALG is ⅓ the reward of OPT: a 3-approximation to Orienteering • In general, an r-approximation for Min-Excess gives a ⌈r⌉-approximation for Orienteering • Open: given an r-approx for Min-Excess, can we get an r-approx to Orienteering? (a sketch of the outer search follows below)
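The outer loop of such a reduction can be phrased as a search over endpoints and reward quotas against a black-box Min-Excess routine. This is a hypothetical sketch of that interface; `min_excess_path(s, t, k)` is an assumed black box, not a routine from the papers.

```python
def orienteering_via_min_excess(min_excess_path, d, s, locations, D):
    """Try every endpoint t and quota k; keep the highest-quota path
    returned by the (approximate) Min-Excess black box whose total
    length still meets the deadline D."""
    best_path, best_reward = [s], 0
    for t in locations:
        for k in range(1, len(locations) + 1):
            path = min_excess_path(s, t, k)   # approx min-excess, reward k
            if path is None:
                continue
            length = sum(d(u, v) for u, v in zip(path, path[1:]))
            if length <= D and k > best_reward:
                best_path, best_reward = path, k
    return best_path
```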
Open problems • Better approximations: vehicle capacities, precedence constraints • Faster approximations • How hard are they? Hardness-of-approximation results for path-planning: only small-constant-factor hardness is known • Stochastic models: the map changes over time; requests arrive stochastically; Robot Navigation
An application: Robot Navigation • Robot’s task: deliver packages to various locations, collect samples, etc. • Planning with uncertainty: the robot may run out of battery power, may crash into a wall, or may face an unforeseen obstacle that causes delay • Goal: perform as many tasks as possible before failure occurs; perform all tasks as fast as possible in expectation
A simple model for uncertainty • At each step, fail with probability γ • Goal: maximize the expected reward collected before failure • A crude heuristic: the expected number of steps until failure is 1/γ, so set deadline = 1/γ and apply the Orienteering algorithm; this provides no guarantee on reward • Better formulation: “exponential discounting” • The probability that the robot is alive at time t is 2^(−t) (say, for γ = ½); thus, if the robot visits reward p at time t, the expected reward collected is 2^(−t)·p • Maximize this “discounted reward” • Can be solved using techniques from AI if reward is collected every time the robot visits a location • The one-time-reward case: Discounted-Reward TSP, with a (6.75+ε)-approximation [FOCS’03]
[Figure: reward collected over time under a fixed deadline (Orienteering) vs. exponential discounting]
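A small sketch of that objective under the slide’s model: the robot survives each unit of time with probability γ, so reward p reached at time t contributes γ^t·p in expectation (γ = ½ gives the 2^(−t) discounting above). Names and interface are illustrative.

```python
def discounted_reward(path, reward, d, gamma=0.5):
    """Expected one-time reward collected along `path` when the robot
    is still alive at time t with probability gamma ** t."""
    total, t = 0.0, 0.0
    for u, v in zip(path, path[1:]):
        t += d(u, v)                        # arrival time at v
        total += (gamma ** t) * reward[v]   # reward discounted by survival
    return total
```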
A more general model: MDPs • Typically modeled as a Markov Decision Process • The current position of the robot is summarized as a “state” • Several actions are available in each state • Each action results in a new state with some probability
[Figure: an MDP diagram; edges are labeled with transition probabilities]
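To make the model concrete, here is a toy MDP encoding in the same spirit (states, actions, and a distribution over next states). The states, actions, and probabilities are made up for illustration, not read off the slide’s figure.

```python
import random

# Toy MDP: state -> action -> list of (next_state, probability).
mdp = {
    "hallway": {"forward": [("lab", 0.8), ("hallway", 0.2)],
                "turn":    [("office", 0.5), ("hallway", 0.5)]},
    "lab":     {"deliver": [("office", 1.0)]},
    "office":  {},  # terminal in this toy example
}

def step(state, action):
    """Sample a successor state according to the action's distribution."""
    r, cum = random.random(), 0.0
    for nxt, p in mdp[state][action]:
        cum += p
        if r <= cum:
            return nxt
    return mdp[state][action][-1][0]  # guard against float rounding
```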
Stochastic path-planning • Measure reward/time in expectation • Stochastic-TSP: find a strategy that takes the minimum possible time, in expectation, to visit all nodes • Harder than the deterministic case: these problems are likely to be PSPACE-hard, and they generalize the deterministic “directed” problems • My focus: approximations for Stochastic-TSP (joint work with Blum, Kleinberg and McMahan), and general techniques for transforming deterministic-case algorithms into their stochastic versions
To summarize… • Path-planning: a wide variety of highly applicable problems • Little was known in terms of approximations; provided the first approximations for some of them • Important future directions: closing the gap between hardness and approximation; stochastic models
General research interests
• Path-planning
• Graph partitioning and clustering [ICML’01, FOCS’02, SODA’05]: clustering with qualitative information; binary classification using the min-cut algorithm; sparsest cut, i.e. finding bottlenecks
• Online algorithms [SODA’02, APPROX’03, SPAA’03]: scheduling, routing, data structures for search
• Database privacy [TCC’05]: goal: protect the privacy of individuals while preserving macroscopic properties; technique: “sanitization” via perturbation and summarization
• Game theory [EC’03, EC’04]: profit-maximizing auctions in the Bayesian setting; location games