This paper explores intelligent agent strategies for solving dynamic vehicle routing problems, specifically the dynamic pickup and delivery problem with time windows and full truckloads. It considers the interaction of vehicle and shipper intelligence in making decisions, and suggests future research using approximate dynamic programming.
Intelligent agent strategies for dynamic vehicle routing problems
Martijn Mes, Assistant Professor
Department Operational Methods for Production and Logistics
School of Management and Governance
Structure
1. Problem introduction:
  - DPDPTW
  - Multi-agent system
  - Decisions & difficulties
2. Vehicle intelligence: Opportunity valuation
3. Shipper intelligence: Threshold policies
4. Combination: Interaction of vehicle and shipper strategies
5. Future research: Approximate Dynamic Programming
Part 1 Problem introduction Introduction Vehicle intelligence Shipper intelligence Combination Future
Problem setting
• Transportation network:
  • Network of nodes and arcs
  • Transportation jobs between the nodes
    • Full truckload
    • Dynamic arrival
    • Time-window restrictions
  • Vehicles to transport these loads
• Dynamic Pickup and Delivery Problem with time-windows, full truckloads and stochastic job arrivals
• Decisions: allocating and scheduling jobs
Solution approach: Multi-Agent System (MAS) …a computer system that is capable of independent (autonomous) action on behalf of its user or owner (Wooldridge, 2002)
MAS as we use it…
• All vehicles are represented by vehicle agents
• Vehicles decide upon their actions and maintain their own schedules
• An auctioneer (the shipper agent) starts an auction for each new incoming job
• Vehicles bid on these jobs based on their current status and schedule
• The auctioneer evaluates all bids and determines the winner
• The winning vehicle receives the new job
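The auction steps listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the `VehicleAgent` class, its `bid` method, the cost-based bid rule, and the lowest-bid-wins rule are all hypothetical choices made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class VehicleAgent:
    """Hypothetical vehicle agent that maintains its own schedule."""
    name: str
    cost_per_km: float
    schedule: list = field(default_factory=list)

    def bid(self, job_km: float, empty_km: float) -> float:
        # Assumed bid rule: cost of the loaded move plus the empty move to the pickup.
        return self.cost_per_km * (job_km + empty_km)


def run_auction(job, vehicles, empty_km):
    """Auctioneer: collect one bid per vehicle and award the job to the lowest bid."""
    bids = {v.name: v.bid(job["km"], empty_km[v.name]) for v in vehicles}
    winner_name = min(bids, key=bids.get)
    winner = next(v for v in vehicles if v.name == winner_name)
    winner.schedule.append(job["id"])  # the winning vehicle receives the new job
    return winner_name, bids


vehicles = [VehicleAgent("v1", 1.0), VehicleAgent("v2", 1.0)]
job = {"id": "job-1", "km": 10.0}
winner, bids = run_auction(job, vehicles, empty_km={"v1": 10.0, "v2": 0.0})
print(winner, bids)
```

In this toy setting the vehicle that is already at the pickup location submits the lower bid and wins the job.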
Auction example (figure: a shipper wants to send a package and asks for an auction; the vehicles submit bids, 20 and 10 in the example; the winner is announced. Map nodes: Groningen, Zwolle, Amsterdam, Enschede, Utrecht, Rotterdam, Eindhoven)
Decisions involved
• Shipper:
  • Assignment decision: assign order to which vehicle? (this decision is supported by means of an auction…)
• Vehicle:
  • Pricing decision: accept order for which price?
  • Scheduling and routing decision: when to pick up and deliver the new load?
Will it work?
Difficulties
• Allocation not optimal due to…
  • wrong estimation of the real impact of order insertion
  • regrettable allocations: at the time of allocation this seems the best option; later (due to uncertainties and new order arrivals) we regret this allocation
• Therefore we propose some local corrections to improve the allocation in terms of…
  • individual benefits
  • overall logistics performance
• We can do something at:
  • Job announcement
  • Bid calculation
  • Bid evaluation
To overcome the difficulties, we need some kind of look-ahead in:
  - bid pricing
  - bid evaluation
Two options:
• Vehicle: opportunity valuation
  • take into account future job arrivals
  • for pricing, scheduling, and waiting decisions
• Shipper: dynamic threshold policy
  • take into account price fluctuations due to new order arrivals
  • for delaying (and breaking) commitments
Possible applications
• Internal logistics
  • MAS control of AGVs within an underground logistics system at Amsterdam Airport Schiphol
  • MAS control of AGVs at an industrial bakery in the Netherlands
• External logistics
  • Shippers with private fleets
  • Collaborative carriers
  • Multiple carriers and shippers participating in transportation procurement auctions
Part 2
• Vehicles: opportunity valuation
  • take into account future job arrivals
  • for pricing, scheduling, and waiting decisions
Importance of opportunity valuation (1): Pricing decisions
(Figure: map of Groningen, Amsterdam, Enschede, Utrecht, Rotterdam and Eindhoven, shaded by transport intensity from low to high)
• Suppose the travel time and travel costs Enschede-Amsterdam and Enschede-Groningen are equal. Should you also bid the same price?
• Possibly we also find an order on the route Enschede-Utrecht
• Possibly it is better to be in Amsterdam than in Groningen two hours from now
• A longer job will cover your fixed costs for a longer period; however, multiple small jobs can result in higher profits, depending on the auction mechanism
Take into account the opportunities of a schedule!
Importance of opportunity valuation (2): Routing and scheduling decisions
(Figure: map of Groningen, Amsterdam, Enschede, Utrecht, Rotterdam and Eindhoven, shaded by transport intensity from low to high, with the times 8:00, 10:00, 11:00, 19:00 and 20:00 marked along the route)
• Suppose you have won orders on the routes Utrecht-Amsterdam and Rotterdam-Eindhoven and are located in Enschede
• Routing: in which order to visit the cities?
• Scheduling: when to pick up and deliver? Create a large gap between delivery in Amsterdam and pickup in Rotterdam?
Take into account the opportunities of a schedule!
Importance of opportunity valuation (3): Operational decisions
(Figure: map of Groningen, Amsterdam, Enschede, Utrecht, Rotterdam and Eindhoven, shaded by transport intensity from low to high; Amsterdam 11:00, Rotterdam 19:00, Eindhoven 20:00)
After delivering the load in Amsterdam at 11:00 you have to decide what to do:
• Drive directly to Rotterdam (and wait there for ±6 hours)
• Wait in Amsterdam until you win a new order that you can carry out before the order Rotterdam-Eindhoven; if you have not received such an order before 17:00, move empty towards Rotterdam
• Move empty towards Utrecht in anticipation of an order Utrecht-Rotterdam; if you have not received such an order before 17:00, move empty towards Rotterdam
Take into account the opportunities of a schedule!
Opportunity valuation
• All these questions/decisions have in common:
  • We weigh all possible ‘gaps’ between loaded moves against each other and against a certain ‘end state’
  • Reason: new jobs are inserted either between 2 jobs or added to the end
• So if we could value these periods we are done…
Opportunity valuation
• So we derive 3 value functions:
  • End-value Ve(i,s,t) = expected revenue during a finite horizon t, after arrival at schedule destination i a time s from now
  • Gap-value Vg(i,j,s,t) = expected revenue during a period t in a gap with starting node i, end-node j, and time s until arrival at i
  • Flexible gap-value Vg(i,j,s,t) = same, but now t denotes the maximum gap-length (gap elasticity)
• Calculate using auction data & SDP
Value functions
• Vehicle schedule: jobs with origin, destination, pickup and delivery times
• Value of a schedule:
  - Direct costs Cd(Jl) for all jobs l
    • Waiting
    • Empty moves
    • New job insertions
  + Gap-value Vg(i,j,σ,t) for all gaps with start-node i, end-node j, length σ and time-to-go t
  + End-value Ve(i,t) of a schedule destination i with time-to-go t
Example schedule (figure):
  Job 1: A to B, time 0 to 2, Cd(J1)
  Gap 1: B to C, time 2 to 6, Vg(B,C,4,2)
  Job 2: C to A, time 6 to 8, Cd(J2)
  Job 3: A to D, time 8 to 10, Cd(J3)
  Gap 2: D to B, time 10 to 14, Vg(D,B,4,10)
  Job 4: B to D, time 14 to 16, Cd(J4)
  End: at D, time 16 to T, Ve(D,16,T-16)
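The value of a schedule, as composed above, is the sum of the gap-values and the end-value minus the direct costs of the jobs. A minimal Python sketch of that sum follows; the function name `schedule_value` and all numbers are hypothetical and only illustrate the sign convention.

```python
def schedule_value(direct_costs, gap_values, end_value):
    """Schedule value = - sum of direct costs + sum of gap-values + end-value."""
    return -sum(direct_costs) + sum(gap_values) + end_value


# Hypothetical numbers for the example schedule with four jobs and two gaps.
direct_costs = [5.0, 4.0, 6.0, 3.0]   # Cd(J1) .. Cd(J4), assumed values
gap_values = [2.5, 1.5]               # Vg(B,C,4,2), Vg(D,B,4,10), assumed values
end_value = 20.0                      # Ve(D,16,T-16), assumed value

value = schedule_value(direct_costs, gap_values, end_value)
print(value)
```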
SDP illustration: End-value (figure: time-space diagram over the nodes A to D and the periods 0 to 8, showing full moves, empty moves, and waiting / pro-active moves for a schedule with jobs J1 and J2 followed by the end state)
Using the value functions
• Pricing and scheduling:
  Schedule before insertion (figure): Job 1 (A to B, time 0 to 2), Gap 1 (B to C, time 2 to 12, Vg(B,C,10,2)), Job 2 (C to B, time 12 to 14), End (time 14 to T)
  Schedule after inserting Job 3 (figure): Job 1 (A to B, time 0 to 2), Gap 2 (B to C, time 2 to 6, Vg(B,C,4,2)), Job 3 (C to D, time 6 to 8, Cd(Job3)), Gap 3 (D to C, time 8 to 12, Vg(D,C,4,8)), Job 2 (C to B, time 12 to 14), End (time 14 to T)
• Price = Cd(Job3) + Vg(B,C,10,2) - Vg(B,C,4,2) - Vg(D,C,4,8)
• Scheduling = choose the pickup time of the new job that results in the lowest bid price
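The pricing and scheduling rule above can be sketched as follows: the bid price is the direct cost of the new job plus the value of the gap that is given up, minus the values of the gaps that are created, and the pickup time with the lowest price is chosen. The lookup table `GAP_VALUE`, its numbers, and the function names are assumptions made for this illustration; in the presentation the gap-values come from auction data and SDP.

```python
# Hypothetical gap-value table: (start node, end node, length, time-to-go) -> value.
GAP_VALUE = {
    ("B", "C", 10, 2): 6.0,   # original gap, lost when the job is inserted
    ("B", "C", 4, 2): 1.0,    # gaps created when the pickup is at time 6
    ("D", "C", 4, 8): 2.0,
    ("B", "C", 6, 2): 2.0,    # gaps created when the pickup is at time 8
    ("D", "C", 2, 10): 0.5,
}


def bid_price(direct_cost, old_gap, new_gaps):
    """Price = direct cost + value of the old gap - values of the new gaps."""
    return direct_cost + GAP_VALUE[old_gap] - sum(GAP_VALUE[g] for g in new_gaps)


def best_insertion(direct_cost, old_gap, candidates):
    """Pick the pickup time whose resulting gaps give the lowest bid price."""
    prices = {t: bid_price(direct_cost, old_gap, gaps) for t, gaps in candidates.items()}
    best_time = min(prices, key=prices.get)
    return best_time, prices[best_time]


candidates = {
    6: [("B", "C", 4, 2), ("D", "C", 4, 8)],
    8: [("B", "C", 6, 2), ("D", "C", 2, 10)],
}
pickup, price = best_insertion(8.0, ("B", "C", 10, 2), candidates)
print(pickup, price)
```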
Weigh gap-values with end-values (figure: a schedule with Job 1 from A to B between time 0 and 2, followed by the end state up to time 2+T, compared with a schedule in which Gap 1 and a Job 2 are placed after Job 1. Gap lengths marked in the figure: zero gap length, optimal gap length, slightly longer gap, and gap equal to the empty travel time from B to C)
Illustration flexible gap-values
(Figure: gap-value as a function of gap elasticity, for different start-nodes with an unattractive end-node, in a gap between Job 1 and Job 2)
• Gap-value might be positive for unattractive end-nodes; such a job will serve as a ‘backup’ for arriving at an unattractive node
• Gap-value is zero when the origin is equal to the destination
• Gap-value increases with increasing elasticity, because the probability that the empty move will be replaced by a loaded one increases
• Elasticity should be high enough for an empty move
Results
• Opportunity valuation increases the logistic performance (in terms of profits, capacity utilization and delivery reliability) with respect to:
  • the system-wide performance = savings of 10%
  • individual benefits = profit of one ‘smart’ player higher than the total profit of his 9 competitors
• Explanations:
  • gaps are effectively created to avoid empty moves
  • unattractive jobs are scheduled later (increasing the probability of combining this job with another job)
  • ‘smart’ carriers tend to select only the most profitable jobs
• More information: Mes, M.R.K., M.C. van der Heijden, and P. Schuur (2008). Look-ahead strategies for dynamic pickup and delivery problems. OR Spectrum.
Part 3
• Shipper: dynamic threshold policy
  • take into account price fluctuations due to new order arrivals
  • for delaying and breaking commitments
Dynamic threshold policy (1)
• Shipper has to do some bid evaluation:
  • accept best bid or not
• To support this decision we use a threshold policy
• If the best bid is below a certain threshold price it is accepted, otherwise
  • the auction stays open (continuous auctions)
  • a new auction will be started some time period later (repeated auctions)
• This threshold price is given by the expected price after rejecting the best bid
• Literature: Optimal auctions & Optimal stopping
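The threshold rule above can be sketched as a small optimal-stopping recursion for repeated auctions. This is a simplified illustration, not the authors' model: it assumes the best bid in each round is drawn independently from the same discrete distribution and that an unassigned job incurs a fixed penalty after the last round, whereas the presentation considers time-dependent and correlated bids. The names `thresholds` and `accept_best_bid` are hypothetical.

```python
def thresholds(bid_outcomes, penalty, rounds):
    """Threshold per auction round = expected price after rejecting the best bid.

    bid_outcomes: list of (best_bid, probability) pairs, assumed i.i.d. per round.
    penalty: assumed cost if the job is still unassigned after the last round.
    """
    value = penalty          # rejecting in the last round costs the penalty
    result = []
    for _ in range(rounds):  # work backwards from the last round
        result.append(value)
        # Expected price one round earlier: accept the bid if it beats the threshold.
        value = sum(p * min(bid, value) for bid, p in bid_outcomes)
    return list(reversed(result))


def accept_best_bid(bids, threshold):
    """Return the best bid if it is below the threshold, else None (keep waiting)."""
    best = min(bids)
    return best if best < threshold else None


levels = thresholds([(10.0, 0.5), (20.0, 0.5)], penalty=30.0, rounds=3)
print(levels)
print(accept_best_bid([14.0, 18.0], levels[1]))
```

The thresholds rise as the deadline approaches, so the shipper is more selective early on and accepts higher prices when little time is left.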
Dynamic threshold policy (2)
• The threshold prices are given by a threshold function Vt(σ,t,o,d)
  • time σ until latest pickup time
  • travel distance t
  • origin and destination regions o, d
• We use SDP to calculate this function
• Important aspects
  • Time-dependent mean bid prices
  • Time-dependent variances in bid prices
  • Correlated bids
  • Censored observations w.r.t. the penalty costs
Breaking commitments
• Besides delaying commitments (by the use of reserve/threshold prices) it is also possible to break commitments
• The decommitment policy:
  • Vehicles are allowed to decommit from an agreement against certain penalties
  • Vehicles decommit whenever the expected profit of a new job, minus the decommitment penalty, is higher than the profit of the old job
  • These penalties are set by the shipper and reflect the extra costs a shipper expects to incur when re-auctioning a job later (so there is some equivalence between both policies)
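The decommitment rule can be written as a one-line comparison. The sketch below reads the rule as "new profit minus penalty exceeds old profit"; the function name and the numbers are hypothetical.

```python
def should_decommit(new_profit, old_profit, penalty):
    """Decommit when the new job still pays more after the penalty is deducted."""
    return new_profit - penalty > old_profit


print(should_decommit(new_profit=15.0, old_profit=10.0, penalty=3.0))
print(should_decommit(new_profit=12.0, old_profit=10.0, penalty=3.0))
```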
Results
• If only one player uses the proposed policies, his costs per job are 20-30% lower than those of players who do not use the policies.
• The two policies are complementary; however, the combination requires a lot of computation time.
• If we use the proposed policies for only 1% of the jobs, the total costs are reduced by more than 1%.
• If more jobs are auctioned in a ‘clever’ way, learning becomes more difficult.
• More information: Mes, M.R.K., M.C. van der Heijden, and P.C. Schuur (2008). Dynamic threshold policy for delaying and breaking commitments in transportation auctions. Transportation Research Part C.
Part 4 Interaction of vehicle and shipper strategies
Interaction of vehicle and shipper strategies: back to MAS
• Vehicle: opportunity valuation
  • take into account future job arrivals
  • for pricing, scheduling, and waiting decisions
• Shipper: dynamic threshold policy
  • take into account price fluctuations due to new order arrivals
  • for delaying (and breaking) commitments
Approach
Problems
• Each player has to incorporate:
  • opponents’ behavior (i.e. a carrier takes into account whether a shipper uses threshold prices)
  • competitors’ behavior (i.e. a carrier takes into account whether other carriers value opportunities)
• Players have to learn this…
• Learning problems:
  • Long learning phase
  • Increasing bid prices
  • Fluctuations in bid prices
• Luckily, these problems can be fixed…
Some results: relative savings of various policies compared to a myopic insertion strategy
Conclusions (1/2)
• Savings of 10-20% with combination of policies
• Each policy has its own benefits, e.g.
  • Opportunity valuation → unbalanced networks
  • Dynamic threshold policy → long time-windows and low job arrival rate
• Savings are 52% of savings from MIP approach
• However, our savings are achieved without ‘significant’ additional computation time
• But still, the difference in performance gives rise to further research…
Part 5 Future research
Disadvantages of our SDP approach
• Difficult to add all kinds of model details (e.g. driver regulations and time-dependent travel times).
• With increasing problem sizes, the time needed to calculate the value functions increases drastically.
• In highly dynamic environments we might use outdated or even wrong value functions (and we might never discover this discrepancy).
Possible solution
• Learn the value functions instead of calculating them
  • Avoid difficult modeling issues (e.g. modeling opponents’ behavior)
  • Avoid using wrong value functions
• To learn the value functions:
  • ADP / RL (temporal difference learning)
• To speed up the learning process and to reduce computation time:
  • Value function approximation (piecewise linear functions, KNN, CMACs)
  • Use SDP as starting point
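Learning a value function by temporal difference learning, as proposed above, can be illustrated with a tabular TD(0) update. This is a generic textbook sketch, not the authors' design: the state (a node name), the reward, the step size `alpha` and the discount `gamma` are assumptions, and the initial table entry stands in for a value that could be taken from SDP.

```python
from collections import defaultdict


def td0_update(values, state, reward, next_state, alpha=0.1, gamma=1.0):
    """One TD(0) step: move V(state) towards reward + gamma * V(next_state)."""
    target = reward + gamma * values[next_state]
    values[state] += alpha * (target - values[state])
    return values[state]


# Lookup-table value function; unseen states start at 0.0.
values = defaultdict(float)
values["Amsterdam"] = 5.0   # assumed starting estimate, e.g. taken from SDP

# Observed transition: a job from Enschede to Amsterdam with revenue 10.
updated = td0_update(values, "Enschede", reward=10.0, next_state="Amsterdam", alpha=0.5)
print(updated)
```

Replacing the lookup table by a function approximator (for example piecewise linear functions) would change only how `values` is stored and updated, not the form of the TD target.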
Questions?
Martijn Mes
University of Twente
School of Management and Governance
Operational Methods for Production and Logistics
Phone: +31-534894062
Email: m.r.k.mes@utwente.nl
Web: http://mb.utwente.nl/ompl/staff/Mes/