Algorithms for Path-Planning
Shuchi Chawla, Carnegie Mellon University
A trick-or-treat problem
• It's Halloween… you have from 6pm to 8pm to trick-or-treat
• Goal: collect as much candy as possible before the deadline
• In what order should you visit the houses?
Path-planning
• Informally: the planning and ordering of tasks
• Classic instance ― the Traveling Salesman Problem (TSP): find the shortest tour covering all given locations
• A natural extension ― Orienteering: cover as many locations as possible by a given deadline
• Many variants and applications
Path-planning
• Informally: planning and ordering of tasks
• Classic instance ― Traveling Salesman Problem; a natural extension ― Orienteering
• Many variants, applications:
  • Delivery & distribution problems
  • Production planning, assembly analysis
  • Robot navigation
• Needs to be solved at large scales in practice
• Mostly NP-hard ☹ : cannot find the optimal solution in "polynomial time"
[Figure: a pickup-and-delivery schedule for Friday, Jan 17, with hourly slots from 9:00 to 4:00 covering package pickups at the office and deliveries of packages A–D to Robinson, Monroeville, Downtown (*priority*), and Waterfront]
[Figure: a production-planning example: item X1 (value $15; 100 units due March 15; 5 hrs on machine Y1 and 7 hrs on Y2; < 1 hr gap between Y1 and Y2) and machine Y2 (cost $2/hr; > 30 mins between X1 and X2)]
Path-planning: A brief history
• Studied in Operations Research for the past 20–30 years (also known as Vehicle Routing)
  • Many variants: multiple vehicles, stochastic demands, pickup-and-delivery
  • Techniques from Mixed Integer Programming: cutting-plane methods, branch and bound, simulated annealing, genetic algorithms, …
• Stochastic planning problems in Robotics and AI
  • Focus mostly on exploration, position estimation
  • Some poly-time-solvable problems studied, e.g. shortest paths
  • Linear and Dynamic Programming techniques
• My focus: approximation algorithms
  • Find a solution in polynomial time with value ≥ (1/α)·OPT, for some α ≥ 1
  • e.g., a 2-approximation is half as good as OPT: value ≥ ½·OPT
Approximation Results
• A reward vs. time trade-off
[Figure: reward-vs-time plot; OPT collects full reward in 2 hrs; a "quota" approximation collects the same reward but takes 4 hrs; a "budget" approximation keeps the 2-hr budget but collects less reward]
Approximation Results
• A reward vs. time trade-off
• A budget on time; maximize reward:
  • Orienteering: single deadline on time
  • Deadline-TSP: different deadlines on different locations
  • TSP with Time-Windows: different time windows for different locations
• A quota on reward; minimize time:
  • TSP: visit all locations
  • k-TSP: visit k locations
  • Min-Excess: visit k locations, but minimize excess
• Optimize a combination of reward and time:
  • Prize-Collecting TSP: minimize time plus reward foregone
  • Discounted-Reward TSP: maximize reward; reward decreases with time
Approximation Results
• A budget on time; maximize reward (this talk):
  • Orienteering: 3
  • Deadline-TSP: 3 log n
  • TSP with Time-Windows: 3 log² n
  [FOCS'03: Blum, Karger, Chawla, Lane, Meyerson, Minkoff; STOC'04: Bansal, Blum, Chawla, Meyerson]
  Techniques: structural properties with Dynamic Programming; LP-rounding
• A quota on reward; minimize time:
  • TSP: 1.5 [Christofides '76]
  • k-TSP: 2+ε [BRV99] [Garg99] [AK00] [CGRT03] …
  • Min-Excess: 2+ε (this talk)
• Optimize a combination of reward and time:
  • Prize-Collecting TSP: 2 [Goemans Williamson '92]
  • Discounted-Reward TSP: 6.75+ε (this talk)
A road-map
• An approximation for Orienteering
• Current and future work
• Robot navigation & path-planning in a stochastic world
• Other research
Back to candy-collection (Orienteering)
• The givens: a "map" G of locations and distances, a start location s, rewards on locations, and a deadline D
• To find: a path that collects as much reward as possible by deadline D
• NP-hard; we want to find an approximation: in poly-time, find a path covering reward ≥ ⅓·OPT by deadline D
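To make the objective concrete: on a toy map the optimum can be computed by brute force over visiting orders, which also shows why polynomial-time methods matter (there are exponentially many orders). This is an illustrative sketch; the map, names, and rewards are made up.

```python
from itertools import permutations

def orienteering_opt(dist, reward, start, deadline):
    """Exact Orienteering optimum by brute force: try every ordered
    subset of locations and keep the feasible path with most reward.
    Exponential time -- usable only on tiny instances."""
    others = [v for v in dist if v != start]
    best_reward, best_path = reward.get(start, 0), [start]
    for r in range(1, len(others) + 1):
        for order in permutations(others, r):
            t, prev, feasible = 0.0, start, True
            for v in order:
                t += dist[prev][v]
                prev = v
                if t > deadline:
                    feasible = False
                    break
            if feasible:
                rew = reward.get(start, 0) + sum(reward[v] for v in order)
                if rew > best_reward:
                    best_reward, best_path = rew, [start, *order]
    return best_path, best_reward
```

On a four-location map with deadline 3, the far-away high-reward location is simply unreachable, and the optimum is the small nearby cluster.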
A flawed attempt: the greedy approach
• Visit the closest location; continue until time runs out
• Assume for simplicity that all locations have equal reward
• Local decisions are bad; consider the "big picture": treat clusters of locations as single entities with large reward
[Figure: greedy path vs. optimal path, both with a 2-hr deadline; greedy collects far less reward]
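The greedy heuristic above is easy to state in code, and easy to fool. In the hypothetical map below (all rewards equal, deadline 3), greedy burns its budget on the nearest house and misses the richer cluster that the optimum visits; all names and distances are made up for illustration.

```python
def greedy_orienteering(dist, reward, start, deadline):
    """Nearest-neighbor heuristic for Orienteering: repeatedly move to
    the closest unvisited location that still fits in the deadline.
    Illustrative sketch -- the slide shows this can be far from OPT."""
    path, t = [start], 0.0
    collected = reward.get(start, 0)
    visited = {start}
    while True:
        here = path[-1]
        options = [(dist[here][v], v) for v in dist
                   if v not in visited and t + dist[here][v] <= deadline]
        if not options:
            return path, collected
        d, v = min(options)            # greedy: nearest feasible location
        t += d
        visited.add(v)
        path.append(v)
        collected += reward[v]
```

With `a` a nearby dead end and `b`, `c` a cluster reachable as s→b→c in time 3, greedy goes to `a` first and then cannot reach the cluster, collecting reward 1 instead of the optimal 2.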
A flawed attempt: the greedy approach
• A single mistake at the very first step could be disastrous!
• Suppose we had twice or thrice the time… can we use this to approximate Orienteering?
• Recall the 2-approx to k-TSP: we know how to collect a target reward using at most twice the optimal length!
[Figure: greedy path (2 hrs), optimal path (2 hrs), and an algorithm using twice the optimal length (4 hrs)]
k-TSP versus Orienteering
• k-TSP: visit at least k reward as fast as possible; several approximations known, best a 2-approx [Garg05]
• Orienteering: visit as much reward as possible by time D; no approximation previously known
• The two are equivalent at optimality, but different in approximation
[Figure: reward-vs-time plot comparing OPT (2 hrs), a k-TSP approximation (full reward in 4 hrs), and an Orienteering approximation (less reward by 2 hrs)]
Next attempt: Using k-TSP for Orienteering
• If there exists a path of length ½D collecting reward ½·OPT, a 2-approx to k-TSP gives a path of length ≤ D with reward ½·OPT
• Is there a good path with these properties? In general, no: a bad trade-off between length and reward! A path of half the length may collect far less than half the reward
• Delving deeper does not help: algorithms for k-TSP use a "primal-dual" subroutine [Goemans Williamson '92] and inherently aim for the low-hanging fruit first
[Figure: reward-vs-time plot showing OPT, OPT', and the k-TSP and Orienteering approximations at 1, 2, and 4 hrs]
Next attempt: Using k-TSP for Orienteering
• If there exists a path of length ½D collecting reward ½·OPT, a 2-approx to k-TSP gives a path of length ≤ D with reward ½·OPT
• Is there a good path with these properties? Yes, if we didn't have to start from s: take one half of OPT
• Idea: approximate one half of OPT, using only as much extra length as was "saved" in the first part
• Problem: if OPT collects most of its reward in its second half, the length of the path becomes ≫ D
• Need a better approximation than k-TSP
Key insight: Approximate the "excess"
• Problem: if OPT collects most of its reward in its second half, the length of the path becomes ≫ D; we need a better approximation than k-TSP
• Excess of a path = length of the path − length of the shortest path between its endpoints
• Given a 2-approx for "excess": divide OPT into 2 parts with equal excess, and approximate the excess on the part with maximum reward
Approximating the excess
• The Min-Excess problem: given a map G, start s, end t, and a reward quota k, find a path from s to t with reward ≥ k that minimizes excess = length − dist(s,t), the shortest distance from s to t
[Figure: paths from s to t; OPT has length ℓ = dist(s,t) + excess; a Min-Excess approximation stays much closer to the shortest s–t path than a k-TSP approximation]
Approximating the excess: the large-excess case
• The easy case: dist(s,t) ≪ excess
• Here k-TSP already gives a good approximation to Min-Excess
[Figure: s–t paths showing OPT, the k-TSP approximation, and the Min-Excess approximation when the excess dominates dist(s,t)]
Approximating the excess: the small-excess case
• The hard case: dist(s,t) > excess, or excess ≈ 0; OPT visits locations roughly in order of increasing distance from the start
• Key insight: if OPT visits locations exactly in order of increasing distance, we can find it exactly using Dynamic Programming!
• For every location v and reward value k, find and store the best path from the start to v with reward k
• To find the best path for (v, k): try all paths that end at some earlier location u and collect reward k−1, and pick the one with the smallest excess
[Figure: a monotone path visiting u then v, with DP entries for reward values k = 4 and k = 5]
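The dynamic program on this slide can be sketched as follows for the monotone case, under the simplifying assumption that a path may only move to locations strictly farther from the start; the table `best[v][k]` stores the shortest monotone path from s to v collecting reward k. Names and data layout are illustrative, not the paper's exact formulation.

```python
def monotone_min_excess(dist_from_s, dist, reward, quota):
    """DP for the small-excess ('monotone') case: assume the path visits
    locations in order of increasing distance from the start s.
    best[v][k] = shortest monotone path from s to v with reward k;
    answer = min over endpoints v of best[v][k] - dist(s,v), k >= quota."""
    INF = float('inf')
    nodes = sorted(dist_from_s, key=dist_from_s.get)  # increasing distance
    s = nodes[0]
    best = {v: {} for v in nodes}
    best[s][reward.get(s, 0)] = 0.0
    for i, v in enumerate(nodes):
        if v == s:
            continue
        for u in nodes[:i]:                 # only predecessors closer to s
            for k, length in best[u].items():
                k2 = k + reward.get(v, 0)
                cand = length + dist[u][v]
                if cand < best[v].get(k2, INF):
                    best[v][k2] = cand
    ans = INF
    for v in nodes:
        for k, length in best[v].items():
            if k >= quota:
                ans = min(ans, length - dist_from_s[v])
    return ans
```

On a line s–a–b with unit edges and unit rewards, the monotone path s→a→b collects quota 2 with excess 0, as the DP confirms.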
Approximating the excess: combining the cases
• The large-excess case (excess ≫ shortest distance): use k-TSP
• The small-excess case (excess ≈ 0; OPT is "monotone"): use Dynamic Programming
• The intermediate case: OPT alternates between "wiggly" and "monotone" segments, in order of increasing distance from the start; patch the segments together using the dynamic program
• Gives a (2+ε)-approximation for Min-Excess [FOCS'03]
An algorithm for Orienteering
• Goal: construct a path from s to t that has length ≤ D and collects as much reward as possible
• Excess of the path from u to v: ε(u,v) = ℓ(u,v) − d(u,v)
• Given a 3-approximation to Min-Excess:
  1. Divide OPT into 3 "equal-reward" parts (hypothetically), with excesses ε₁, ε₂, ε₃
  2. Approximate the part with the smallest excess
• Since the smallest excess is at most ⅓(ε₁+ε₂+ε₃), the excess of ALG is ≤ ε₁+ε₂+ε₃, so ALG still meets the deadline D; the reward of ALG is ⅓ of the reward of OPT
• This gives a 3-approximation to Orienteering; in general, an r-approximation for Min-Excess gives a ⌈r⌉-approximation for Orienteering
• Open: given an r-approx for Min-Excess, can we get an r-approx to Orienteering?
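The counting argument on this slide can be sanity-checked numerically: whichever of the three equal-reward parts has the smallest excess, a 3-approximation to that part's excess costs at most the total excess of OPT, so the stitched path still meets the deadline D. The segment lengths below are made up.

```python
def excess(length, shortest):
    """epsilon(u, v) = l(u, v) - d(u, v), as defined on the slide."""
    return length - shortest

# Hypothetical excesses of the three equal-reward parts of OPT:
eps = [excess(1.4, 1.0), excess(0.6, 0.5), excess(1.1, 0.8)]
# The smallest excess is at most the average of the three, so tripling
# it never exceeds the total excess eps[0] + eps[1] + eps[2]:
assert 3 * min(eps) <= sum(eps)
```

This is exactly why approximating the cheapest-excess part suffices: the extra length it introduces fits inside the excess budget OPT already spent.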
Open problems
• Better approximations; how hard are these problems? Only small-constant-factor hardness of approximation is known for path-planning
• Richer variants: vehicle capacities, precedence constraints
• Faster approximations
• Stochastic models: the map changes over time; requests arrive stochastically
• Robot navigation
An application: Robot Navigation
• The robot's task: deliver packages to various locations, collect samples, etc.
• Planning with uncertainty: the robot may run out of battery power, crash into a wall, or face an unforeseen obstacle that causes delay
• Goal: perform as many tasks as possible before failure occurs; perform all tasks as fast as possible in expectation
A simple model for uncertainty
• At each step, fail with probability γ
• Goal: maximize the expected reward collected before failure
• A crude heuristic: the expected number of steps until failure is 1/γ, so set deadline = 1/γ and apply the Orienteering algorithm; this provides no guarantee on the reward
• Better formulation: "exponential discounting". With γ = ½, the probability that the robot is alive at time t is 2⁻ᵗ; thus, if the robot visits reward p at time t, the expected reward collected is 2⁻ᵗ·p. Maximize this "discounted reward"
• Can be solved using techniques from AI if reward is collected every time the robot visits a location
• The one-time-reward case is Discounted-Reward TSP: 6.8-approximation [FOCS'03]
[Figure: reward-vs-time curves for a fixed deadline (Orienteering) vs. discounted reward]
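The discounted objective is easy to compute for a fixed path; this sketch (illustrative names) accumulates `gamma**t` times the reward at each arrival time t, with gamma = ½ as on the slide, and one-time rewards per location.

```python
def discounted_reward(path, dist, reward, gamma=0.5):
    """Expected reward of a path when the robot survives each unit of
    time with probability gamma: reward p reached at time t is worth
    gamma**t * p in expectation (reward collected once per location)."""
    t, total, prev = 0.0, 0.0, None
    for v in path:
        if prev is not None:
            t += dist[prev][v]      # arrival time at v
        total += (gamma ** t) * reward.get(v, 0)
        prev = v
    return total
```

For example, reward 4 reached at time 1 is worth 0.5¹ · 4 = 2 in expectation.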
A more general model: MDPs
• Robot navigation under uncertainty is typically modeled as a Markov Decision Process
• The current position of the robot is summarized as a "state"
• Several actions are available in each state
• Each action results in a new state with some probability
[Figure: MDP diagram; each state's actions lead to successor states with probabilities such as 0.2, 0.5, 0.3, …]
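For the MDP model sketched here, expected discounted reward can be optimized by textbook value iteration; this illustrates the standard AI machinery the slides allude to, not the Stochastic-TSP algorithms themselves, and the two-state example is made up.

```python
def value_iteration(P, R, gamma=0.5, iters=100):
    """Value iteration on a small MDP.
    P[s][a] = list of (probability, next_state) pairs for action a;
    R[s][a] = immediate expected reward of taking action a in state s.
    Returns V[s] = optimal expected discounted reward from state s."""
    V = {s: 0.0 for s in P}
    for _ in range(iters):
        V = {s: max(R[s][a] + gamma * sum(p * V[s2] for p, s2 in P[s][a])
                    for a in P[s])
             for s in P}
    return V
```

In a two-state MDP where 'start' can move to an absorbing zero-reward 'done' state for reward 1, the optimal value of 'start' is exactly 1.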
Stochastic path-planning
• Measure reward/time in expectation
• Stochastic-TSP: find a strategy that visits all nodes in the minimum possible expected time
• Harder than the deterministic case: these problems are likely to be PSPACE-hard, and they generalize deterministic "directed" problems
• My focus: approximations for Stochastic-TSP (joint work with Blum, Kleinberg, and McMahan), and general techniques for transforming deterministic-case algorithms into their stochastic versions
To summarize…
• Path-planning: a wide variety of highly applicable problems
• Little was known in terms of approximations; this work provides the first approximations for several of them
• Important future directions: closing the gap between hardness and approximation; stochastic models
General research interests
• Path-planning
• Graph partitioning and clustering [ICML'01, FOCS'02, SODA'05]: clustering with qualitative information; binary classification using the mincut algorithm; sparsest cut (finding bottlenecks)
• Online algorithms [SODA'02, APPROX'03, SPAA'03]: scheduling, routing, data structures for search
• Database privacy [TCC'05]: goal: protect the privacy of individuals while preserving macroscopic properties; technique: "sanitization" via perturbation and summarization
• Game theory [EC'03, EC'04]: profit-maximizing auctions in the Bayesian setting; location games