
Intelligent agent strategies for dynamic vehicle routing problems

This paper explores intelligent agent strategies for solving dynamic vehicle routing problems, specifically the dynamic pickup and delivery problem with time windows and full truckloads. It considers the interaction of vehicle and shipper intelligence in making decisions, and suggests future research using approximate dynamic programming.


Presentation Transcript


  1. Intelligent agent strategies for dynamic vehicle routing problems. Martijn Mes, Assistant Professor, Department Operational Methods for Production and Logistics, School of Management and Governance

  2. Structure: (1) Problem introduction: DPDPTW, multi-agent system, decisions and difficulties; (2) Vehicle intelligence: opportunity valuation; (3) Shipper intelligence: threshold policies; (4) Combination: interaction of vehicle and shipper strategies; (5) Future research: approximate dynamic programming

  3. Part 1 Problem introduction Introduction Vehicle intelligence Shipper intelligence Combination Future

  4. Problem setting • Transportation network: network of nodes and arcs • Transportation jobs between the nodes: full truckload, dynamic arrival, time-window restrictions • Vehicles to transport these loads • Resulting problem: Dynamic Pickup and Delivery Problem with time windows, full truckloads, and stochastic job arrivals • Decisions: allocating and scheduling jobs

  5. Solution approach: Multi-Agent System (MAS). An agent is "…a computer system that is capable of independent (autonomous) action on behalf of its user or owner" (Wooldridge, 2002)

  6. MAS as we use it… • All vehicles are represented by vehicle agents • Vehicles decide upon their actions and maintain their own schedules • An auctioneer (the shipper agent) starts an auction for each new incoming job • Vehicles bid on these jobs based on their current status and schedule • The auctioneer evaluates all bids and determines the winner • The winning vehicle receives the new job
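The auction loop described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slide does not say how bids are computed or which auction rule is used, so the class `VehicleAgent`, the fixed `cost_per_job` bid, and the lowest-bid-wins rule are all hypothetical stand-ins.

```python
from dataclasses import dataclass, field


@dataclass
class VehicleAgent:
    """Hypothetical vehicle agent that keeps its own schedule."""
    name: str
    cost_per_job: float  # assumed stand-in for a real bid calculation
    schedule: list = field(default_factory=list)

    def bid(self, job: str) -> float:
        # A real agent would price the job from its current status and schedule.
        return self.cost_per_job


def run_auction(job, vehicles):
    """Collect one bid per vehicle and award the job to the lowest bidder (assumed rule)."""
    bids = {v.name: v.bid(job) for v in vehicles}
    winner = min(vehicles, key=lambda v: bids[v.name])
    winner.schedule.append(job)  # the winning vehicle receives the new job
    return winner.name, bids[winner.name]


vehicles = [VehicleAgent("truck_a", 20.0), VehicleAgent("truck_b", 10.0)]
winner, price = run_auction("Enschede-Utrecht", vehicles)
print(winner, price)
```

The shipper agent only sees the bids; everything a vehicle knows about its own schedule stays inside the agent, which is the point of the decentralized design.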

  7. Auction example (figure): a shipper wants to send a package and asks for an auction; vehicles submit bids (20 and 10 in the example) and the winner is announced. Map nodes: Groningen, Zwolle, Amsterdam, Enschede, Utrecht, Rotterdam, Eindhoven

  8. Decisions involved • Shipper: assignment decision (assign the order to which vehicle?) • Vehicle: pricing decision (accept the order for which price?); scheduling and routing decision (when to pick up and deliver the new load?) • This decision is supported by means of an auction…

  9. Will it work?

  10. Difficulties • Allocation is not optimal due to: wrong estimation of the real impact of order insertion; regrettable allocations (at the time of allocation this seems the best option, but later, due to uncertainties and new order arrivals, we regret it) • Therefore we propose some local corrections to improve the allocation in terms of individual benefits and overall logistics performance • We can do something at: job announcement, bid calculation, bid evaluation

  11. To overcome the difficulties, we need some kind of look-ahead in: bid pricing and bid evaluation

  12. Two options: • Vehicle: opportunity valuation: take into account future job arrivals, for pricing, scheduling, and waiting decisions • Shipper: dynamic threshold policy: take into account price fluctuations due to new order arrivals, for delaying (and breaking) commitments

  13. Possible applications • Internal logistics: MAS control of AGVs within an underground logistics system at Amsterdam Airport Schiphol; MAS control of AGVs at an industrial bakery in the Netherlands • External logistics: shippers with private fleets; collaborative carriers; multiple carriers and shippers participating in transportation procurement auctions

  14. Part 2 • Vehicles: opportunity valuation: take into account future job arrivals, for pricing, scheduling, and waiting decisions

  15. Importance of opportunity valuation (1): pricing decisions. Suppose the travel time and travel costs Enschede-Amsterdam and Enschede-Groningen are equal; do you also bid the same price? • Possibly we also find an order on the route Enschede-Utrecht • Possibly it is better to be in Amsterdam than in Groningen two hours from now • A longer job will cover your fixed costs for a longer period; however, multiple small jobs can result in higher profits, depending on the auction mechanism • Take into account the opportunities of a schedule! (Figure: map of Groningen, Amsterdam, Enschede, Utrecht, Rotterdam, and Eindhoven, shaded by transport intensity from high to low.)

  16. Importance of opportunity valuation (2): routing and scheduling decisions. Suppose you have won orders on the routes Utrecht-Amsterdam and Rotterdam-Eindhoven and are located in Enschede. • Routing: in which order to visit the cities? • Scheduling: when to pick up and deliver? Create a large gap between delivery in Amsterdam and pickup in Rotterdam? • Take into account the opportunities of a schedule! (Figure: map shaded by transport intensity from high to low, with schedule times 8:00, 10:00, 11:00, 19:00, and 20:00.)

  17. Importance of opportunity valuation (3): operational decisions. After delivering the load in Amsterdam at 11:00 you have to decide what to do: • Drive directly to Rotterdam (and wait there for ±6 hours) • Wait in Amsterdam until you win a new order that you can do before the order Rotterdam-Eindhoven; if you have not received such an order before 17:00, move empty towards Rotterdam • Move empty towards Utrecht in anticipation of an order Utrecht-Rotterdam; if you do not receive such an order before 17:00, move empty towards Rotterdam • Take into account the opportunities of a schedule! (Figure: map shaded by transport intensity from high to low, with schedule times 11:00, 19:00, and 20:00.)

  18. Opportunity valuation • All these questions and decisions have one thing in common: we weigh all possible ‘gaps’ between loaded moves against each other and against a certain ‘end state’ • Reason: new jobs are either inserted between two jobs or added to the end • So if we could value these periods, we are done…

  19. Opportunity valuation • So we derive 3 value functions: • End-value Ve(i,s,t) = expected revenue during a finite horizon t, after arrival at schedule destination i a time s from now • Gap-value Vg(i,j,s,t) = expected revenue during a period t in a gap with start node i, end node j, and time s until arrival at i • Flexible gap-value Vg(i,j,s,t) = same, but now t denotes the maximum gap length (gap elasticity) • Calculate using auction data & SDP

  20. Value functions • Vehicle schedule: jobs with origin, destination, pickup and delivery times • Example schedule (figure): Job 1 (A to B, time 0-2), Gap 1 (time 2-6), Job 2 (C to A, time 6-8), Job 3 (A to D, time 8-10), Gap 2 (time 10-14), Job 4 (B to D, time 14-16), End (time 16-T), with value components Cd(J1), Vg(B,C,4,2), Cd(J2), Cd(J3), Vg(D,B,4,10), Cd(J4), Ve(D,16,T-16) • Value of a schedule: - direct costs Cd(Jl) for all jobs l + gap-value Vg(i,j,σ,t) for all gaps with start node i, end node j, length σ, and time-to-go t + end-value Ve(i,t) of a schedule destination i with time-to-go t • Also listed on the slide: waiting, empty moves, new job insertions
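As a rough illustration of the "value of a schedule" expression on this slide (minus direct costs, plus gap-values, plus the end-value), the following Python sketch adds up the three components. The function name `schedule_value` and all numbers are made up for illustration; in the presentation the gap-values and end-value would come from the value functions Vg and Ve rather than from hard-coded constants.

```python
def schedule_value(direct_costs, gap_values, end_value):
    """Value of a schedule = -sum of Cd(Jl) + sum of Vg over gaps + Ve at the end."""
    return -sum(direct_costs) + sum(gap_values) + end_value


# Hypothetical numbers for the four-job example schedule on the slide.
direct_costs = [5.0, 4.0, 6.0, 3.0]  # Cd(J1), Cd(J2), Cd(J3), Cd(J4)
gap_values = [2.0, 1.5]              # Vg(B,C,4,2), Vg(D,B,4,10)
end_value = 7.0                      # Ve(D,16,T-16)

value = schedule_value(direct_costs, gap_values, end_value)
print(value)
```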

  21. SDP illustration: End-value (figure: time-space diagram over nodes A, B, C, D and time steps 0-8, showing jobs J1 and J2, the end state, and the move types full move, empty move, and waiting / pro-active move)

  22. Using the value functions • Pricing and scheduling: • Current schedule (figure): Job 1 (A to B, time 0-2), Gap 1 (time 2-12, value Vg(B,C,10,2)), Job 2 (C to B, time 12-14), End (time 14-T) • Schedule after inserting Job 3 (C to D, time 6-8): Job 1, Gap 2 (time 2-6, value Vg(B,C,4,2)), Job 3, Gap 3 (time 8-12, value Vg(D,C,4,8)), Job 2, End • Price = Cd(Job3) + Vg(B,C,10,2) - Vg(B,C,4,2) - Vg(D,C,4,8) • Scheduling = choose the pickup time of the new job that results in the lowest bid price
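The pricing rule on this slide (direct cost of the new job, plus the value of the gap that is lost, minus the values of the gaps that are created) and the scheduling rule (pick the pickup time with the lowest bid price) can be sketched as follows. The helper names `bid_price` and `best_pickup_time` and all numbers are hypothetical; the real gap-values would be looked up from Vg.

```python
def bid_price(direct_cost, old_gap_value, new_gap_values):
    """Price = Cd(new job) + Vg(old gap) - sum of Vg over the gaps created by the insertion."""
    return direct_cost + old_gap_value - sum(new_gap_values)


def best_pickup_time(options):
    """Return the pickup time with the lowest bid price.

    `options` maps a candidate pickup time to the tuple
    (direct_cost, old_gap_value, new_gap_values) for that insertion.
    """
    prices = {t: bid_price(*args) for t, args in options.items()}
    best = min(prices, key=prices.get)
    return best, prices[best]


# Hypothetical values for two candidate pickup times of the new job.
options = {
    4: (8.0, 5.0, [1.0, 2.5]),  # price 8 + 5 - 3.5
    6: (8.0, 5.0, [2.0, 2.0]),  # price 8 + 5 - 4.0
}
pickup, price = best_pickup_time(options)
print(pickup, price)
```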

  23. Weigh gap-values with end-values (figure: schedule with Job 1 from A to B at time 0-2 followed by End until time 2+T, compared with adding Gap 1 and a Job 2 from X to Y; further labels on the slide: g, h, t)

  24. Weigh gap-values with end-values (figure, continued: Job 1 from A to B, Gap 1, Job 2 from X to D, End, on a time axis from 0 to 2+T; annotations on the slide: zero gap length, optimal gap length, slightly longer, gap equal to empty travel time from B to C; further labels: g, h, t)

  25. Illustration of flexible gap-values (figure: gap-value against gap elasticity for different start nodes with an unattractive end node) • Gap-value might be positive for unattractive end nodes; such a job will serve as a ‘backup’ for arriving at an unattractive node • Gap-value is zero for origin equal to destination • Gap-value increases with increasing elasticity, because the probability that the empty move will be replaced by a loaded one increases • Elasticity should be high enough for an empty move

  26. Results • Opportunity valuation increases the logistic performance (in terms of profits, capacity utilization, and delivery reliability) with respect to: the system-wide performance (savings of 10%); individual benefits (the profit of one ‘smart’ player is higher than the total profit of his 9 competitors) • Explanations: gaps are effectively created to avoid empty moves; unattractive jobs are scheduled later (increasing the probability of combining such a job with another job); ‘smart’ carriers tend to select only the most profitable jobs • More information: Mes, M.R.K., M.C. van der Heijden, and P. Schuur (2008). Look-ahead strategies for dynamic pickup and delivery problems. OR Spectrum.

  27. Part 3 • Shipper: dynamic threshold policy: take into account price fluctuations due to new order arrivals, for delaying and breaking commitments

  28. Dynamic threshold policy (1) • Shipper has to do some bid evaluation: accept the best bid or not • To support this decision we use a threshold policy • If the best bid is below a certain threshold price it is accepted; otherwise either the auction stays open (continuous auctions) or a new auction is started some time later (repeated auctions) • This threshold price is given by the expected price after rejecting the best bid • Literature: optimal auctions & optimal stopping
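To make the idea "threshold = expected price after rejecting the best bid" concrete, here is a minimal optimal-stopping sketch for the repeated-auction case. It rests on simplifying assumptions that are not in the slides: the best bid in each round is drawn independently from the same discrete distribution, and a fixed penalty is paid if the job is never assigned. The function names `thresholds` and `accept` are hypothetical.

```python
def thresholds(bid_values, bid_probs, rounds, penalty):
    """Backward recursion for threshold prices.

    result[k] is the threshold in a round with k further auction rounds left,
    i.e. the expected price paid if the current best bid is rejected.
    Assumes i.i.d. best bids per round (a simplification).
    """
    expected_cost = penalty  # no rounds left: rejecting means paying the penalty
    result = []
    for _ in range(rounds):
        result.append(expected_cost)
        # Expected price one round earlier: accept a bid only if it beats the threshold.
        expected_cost = sum(
            p * min(b, expected_cost) for b, p in zip(bid_values, bid_probs)
        )
    return result


def accept(best_bid, threshold):
    """Accept the best bid only if it is below the threshold price."""
    return best_bid < threshold


levels = thresholds([10.0, 20.0, 30.0], [0.25, 0.5, 0.25], rounds=3, penalty=40.0)
print(levels)
```

With more rounds left the threshold drops, so the shipper becomes pickier when there is still time before the latest pickup; the time-dependent means, variances, and correlated bids mentioned in the presentation are deliberately left out of this sketch.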

  29. Dynamic threshold policy (2) • The threshold prices are given by a threshold function Vt(σ,t,o,d): time σ until latest pickup time; travel distance t; origin and destination regions o, d • We use SDP to calculate this function • Important aspects: time-dependent mean bid prices; time-dependent variances in bid prices; correlated bids; censored observations w.r.t. the penalty costs

  30. Breaking commitments • Besides delaying commitments (by the use of reserve/threshold prices) it is also possible to break commitments • The decommitment policy: • Vehicles are allowed to decommit from an agreement against certain penalties • Vehicles decommit whenever the expected profit for a new job is higher than the profit for an old job minus the decommitment penalty • These penalties are set by the shipper and reflect the extra costs a shipper expects to incur when re-auctioning a job later (so there is some equivalence between both policies)

  31. Results • If only one player uses the proposed policies, his costs per job are 20-30% lower than those of players who do not use the policies • The two policies are complementary; however, the combination requires a lot of computation time • If we use the proposed policies for only 1% of the jobs, the total costs are reduced by more than 1% • If more jobs are auctioned in a ‘clever’ way, learning becomes more difficult • More information: Mes, M.R.K., M.C. van der Heijden, and P.C. Schuur (2008). Dynamic threshold policy for delaying and breaking commitments in transportation auctions. Transportation Research Part C.

  32. Part 4 Interaction of vehicle and shipper strategies

  33. Interaction of vehicle and shipper strategies: back to MAS • Vehicle: opportunity valuation: take into account future job arrivals, for pricing, scheduling, and waiting decisions • Shipper: dynamic threshold policy: take into account price fluctuations due to new order arrivals, for delaying (and breaking) commitments

  34. Approach

  35. Problems • Each player has to incorporate: opponents’ behavior (i.e. a carrier takes into account whether a shipper uses threshold prices); competitors’ behavior (i.e. a carrier takes into account whether other carriers value opportunities) • Players have to learn this… • Learning problems: long learning phase; increasing bid prices; fluctuations in bid prices • Luckily, these problems can be fixed…

  36. Some results: relative savings of various policies compared to a myopic insertion strategy (shown as a figure on the slide)

  37. Conclusions (1/2) • Savings of 10-20% with the combination of policies • Each policy has its own benefits, e.g. opportunity valuation → unbalanced networks; dynamic threshold policy → long time-windows and low job arrival rate • Savings are 52% of the savings from a MIP approach • However, our savings are achieved without ‘significant’ additional computation time • But still, the difference in performance gives rise to further research…

  38. Part 5 Future research

  39. Disadvantages of our SDP approach • It is difficult to add all kinds of model details (e.g. driver regulations and time-dependent travel times) • With increasing problem sizes, the time needed to calculate the value functions increases drastically • In highly dynamic environments we might use outdated or even wrong value functions (and we might never discover this discrepancy)

  40. Possible solution • Learn the value functions instead of calculating them: avoid difficult modeling issues (e.g. modeling opponents’ behavior); avoid using wrong value functions • To learn the value functions: ADP / RL (temporal difference learning) • To speed up the learning process and to reduce computation time: value function approximation (piecewise linear functions, KNN, CMACs); use SDP as starting point
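The temporal difference learning mentioned on this slide can be illustrated with the standard tabular TD(0) update, V(s) ← V(s) + α (r + γ V(s') − V(s)). The sketch below is not the presenter's implementation: the state encoding as (node, time) pairs, the step size, and the starting values (standing in for values taken from an SDP solution) are all assumptions made for the example.

```python
def td0_update(values, state, reward, next_state, alpha=0.1, gamma=1.0):
    """One tabular TD(0) step: move V(state) towards reward + gamma * V(next_state)."""
    target = reward + gamma * values.get(next_state, 0.0)
    old = values.get(state, 0.0)
    values[state] = old + alpha * (target - old)
    return values[state]


# Hypothetical starting values, e.g. taken from an SDP solution as a warm start.
values = {("Amsterdam", 0): 10.0, ("Rotterdam", 1): 4.0}

# Observed transition: revenue 8 while moving from Amsterdam (t=0) to Rotterdam (t=1).
updated = td0_update(
    values, ("Amsterdam", 0), reward=8.0, next_state=("Rotterdam", 1), alpha=0.5
)
print(updated)
```

Because each update only needs an observed transition, the value estimates can keep adapting when the environment changes, which addresses the concern about outdated value functions raised on the previous slide.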

  41. Questions? Martijn Mes, University of Twente, School of Management and Governance, Operational Methods for Production and Logistics. Phone: +31-534894062. Email: m.r.k.mes@utwente.nl. Web: http://mb.utwente.nl/ompl/staff/Mes/
