Extending Algorithms for Mobile Robot Patrolling in the Presence of Adversaries to More Realistic Settings
Nicola Basilico, Nicola Gatti, Thomas Rossi, Sofia Ceppi, and Francesco Amigoni
{basilico,ngatti,ceppi,amigoni}@elet.polimi.it, thomas.rossi@mail.polimi.it
Outline
• Background
  • State of the art
  • Basic model
  • Solving algorithm
• Contributions
  • Modeling intruder’s movements
  • Modeling intruder’s visibility limitations
  • Complexity reduction techniques
• Experimental results
• Conclusions and Future Works
Part 1: Background
Related Works
• The patrolling strategy problem:
  • The patrolling strategy drives the robot in the patrolling task
  • Problem: given an environment, compute the best patrolling strategy
• Approaches:
  • Not considering a model of the adversary (the intruder)
    • Frequency/coverage-based approaches
  • Explicitly considering a model of the adversary (the intruder)
    • It can provide better strategies (Amigoni et al., IAT 2008)
• Model of the adversary:
  • Without preferences (Agmon et al., AAMAS 2008, perimeter-like environments)
  • With preferences (Paruchuri et al., AAMAS 2008, fully connected environments)
Patrolling Setting
• Time is discretized in turns
• Grid map composed of free cells (white), obstacle cells (black), and targets (green circles)
• Targets: cells with some value for both players
• Patroller:
  • Equipped with sensors to detect intrusions in the patrolling setting
  • It can move between adjacent vertices in one time unit
• Intruder:
  • It observes the patroller while remaining hidden outside the environment
  • It can decide to enter the environment at any turn
  • For each target T, the intruder must spend d_T turns to successfully attack the target
  • When attempting to attack target T at time t, the intruder can be detected during [t, t+d_T)
Patrolling Strategy
• Patrolling strategy: it specifies the next move of the patroller at each turn
• Randomized strategy: a probability distribution over the next move; it can be the only effective strategy against an observing intruder
• Objective: finding the optimal randomized patrolling strategy while considering a model of the adversary (the intruder)
• Strongest intruder: a rational agent that knows the patrolling strategy and considers it when deciding its action
• Approach: studying the interactions between patroller and intruder agents within a game-theoretical framework
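To make the idea of a randomized strategy concrete, the sketch below draws the patroller's next cell from a per-cell probability distribution. The three-cell `strategy` table, the function names, and the fixed seed are illustrative assumptions, not values taken from the slides.

```python
import random

# Hypothetical strategy: strategy[i][j] is the probability of moving to cell j
# when the patroller is currently in cell i (each row sums to 1).
strategy = {
    1: {2: 0.5, 3: 0.5},
    2: {1: 1.0},
    3: {1: 0.3, 2: 0.7},
}

def next_move(strategy, current, rng=random):
    """Draw the next cell from the distribution attached to `current`."""
    cells = list(strategy[current])
    weights = [strategy[current][c] for c in cells]
    return rng.choices(cells, weights=weights, k=1)[0]

def walk(strategy, start, turns, seed=0):
    """Simulate `turns` moves of the patroller, returning the visited cells."""
    rng = random.Random(seed)
    route = [start]
    for _ in range(turns):
        route.append(next_move(strategy, route[-1], rng))
    return route
```

Calling `walk(strategy, 1, 5)` returns six cells (the start plus five moves); an observer who knows `strategy` still cannot predict the exact route, which is the point of randomizing.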
The Patrolling Game
[Figure: grid map with numbered cells and one turn of the game tree; the patroller (P) chooses among actions such as move(7), move(10), move(12), and the intruder (I) chooses among wait, enter(1), ..., enter(13)]
Game Outcomes
• At turn k the intruder enters cell T when the patroller is in cell G: enter-when(T,G)
  • If the patroller does not sense cell T in the interval [k, k+d_T), the intruder wins
  • Otherwise the intruder is captured and the patroller wins
• The intruder never enters: stay-out
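The outcome rule for enter-when(T,G) can be checked against a concrete patroller trajectory. The sketch below assumes that the patroller senses only the cell it occupies at each turn; the function name and the outcome labels are hypothetical.

```python
def outcome(sensed_cells, k, target, d):
    """Outcome of an intruder entering `target` at turn k.

    sensed_cells[t] is the cell sensed by the patroller at turn t
    (assumption: only its current cell). The intruder needs d turns,
    so it can be detected at turns k, k+1, ..., k+d-1.
    """
    if len(sensed_cells) < k + d:
        raise ValueError("trajectory does not cover the whole attack window")
    window = sensed_cells[k:k + d]
    return "intruder-captured" if target in window else "intruder-wins"
```

For example, with the trajectory `[1, 2, 3, 4]`, target 3, and d = 2, entering at turn 1 leads to capture, while entering at turn 0 succeeds because the patroller reaches cell 3 only at turn 2.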
Solving the Game
• The patrolling problem can be modeled as a leader-follower game
  • Two players
  • The leader commits to a strategy
  • The follower observes such commitment and acts as a best responder
• Patroller’s strategy: A = {α_{i,j}}, where α_{i,j} is the probability of doing move(j) when i is the current node
• Intruder’s strategy: enter-when(T,G), i.e., enter target T when the patroller is in cell G
• The optimal A can be derived by computing the equilibrium of the leader-follower game, resorting to a bilevel optimization problem (Conitzer and Sandholm, 2006)
Solving the Game
• If the intruder’s action is to attack target T, the patroller’s expected utility is computed as: EU_p = P(intrusion_T) * X_T + (1 - P(intrusion_T)) * X_0
• P(intrusion_T) depends on:
  • the attacked target
  • the position of the patroller
  • the patrolling strategy
• For every intruder’s action a_i: find A'_i such that EU_p is maximum, subject to a_i being a best response to A'_i
• Among the resulting pairs (A'_1, a_1), (A'_2, a_2), ..., (A'_n, a_n), the one with maximum EU_p is the leader-follower equilibrium (A*, a*), which gives the optimal patrolling strategy
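The expected-utility expression can be evaluated numerically once P(intrusion_T) is known. The sketch below is one possible way to obtain it, under assumptions that are not stated on the slide: the strategy is Markovian (the next move depends only on the current cell), the patroller senses only its current cell, and the attack window covers d turns including the entry turn. The utility values in the usage note are placeholders.

```python
def intrusion_probability(alpha, start, target, d):
    """Probability that a patroller in `start` never visits `target`
    during d consecutive turns, the current turn included.

    alpha[i][j] is the probability of moving from cell i to cell j.
    """
    if start == target:
        return 0.0
    dist = {start: 1.0}              # distribution over undetected runs
    for _ in range(d - 1):           # d turns means d - 1 moves
        nxt = {}
        for cell, p in dist.items():
            for j, a in alpha[cell].items():
                if j != target:      # mass reaching the target is a capture
                    nxt[j] = nxt.get(j, 0.0) + p * a
        dist = nxt
    return sum(dist.values())

def patroller_expected_utility(p_intrusion, x_t, x_0):
    """P(intrusion_T) * X_T + (1 - P(intrusion_T)) * X_0."""
    return p_intrusion * x_t + (1.0 - p_intrusion) * x_0

# Hypothetical corridor of three cells: 1 - 2 - 3.
alpha = {1: {2: 1.0}, 2: {1: 0.5, 3: 0.5}, 3: {2: 1.0}}
```

With this corridor, `intrusion_probability(alpha, 1, 3, 3)` is 0.5: the patroller must move to cell 2 and then reaches cell 3 in time only half of the time. Plugging in placeholder utilities, `patroller_expected_utility(0.5, -10.0, 1.0)` gives -4.5.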
Part 2: Contributions
Objective
• The basic model is general, but it makes many simplifying assumptions
  • E.g., the intruder can directly enter any target
• We introduce two different extensions in order to model a more realistic patrolling scenario
• We refine the intruder’s model by considering aspects that are not addressed in the game-theoretical patrolling literature
• We experimentally evaluate the computational complexity of the extended model and provide techniques to reduce it
Intruder’s Movements
• Basic model assumption: the intruder can directly enter any target T
  • The intruder’s strategy is represented as: enter-when(T,C)
• Extension: the environment can be entered only through access areas
• As soon as the patroller is in C, the intruder:
  • Enters from an access area
  • Follows a path P from the access area to a target T, and then stays there for d_T turns to complete the intrusion attempt
• Now the intruder’s strategy is represented as: enter-when(P,C)
• The intrusion probability of an intruder’s strategy has to be computed in a different way with respect to the basic model
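One way to picture how the intrusion probability changes for enter-when(P,C) is to track the patroller's position distribution turn by turn while the intruder walks along P, removing the probability mass that meets the intruder. This is a sketch under assumptions of its own: the intruder advances one cell per turn, the arrival turn counts as the first of the d turns spent on the target, and detection happens only when both agents occupy the same cell at the same turn. All names are hypothetical.

```python
def step(alpha, dist):
    """Advance a distribution over cells by one patroller move."""
    nxt = {}
    for cell, p in dist.items():
        for j, a in alpha[cell].items():
            nxt[j] = nxt.get(j, 0.0) + p * a
    return nxt

def path_intrusion_probability(alpha, patroller_cell, path, d):
    """Probability that the intruder completes enter-when(path, patroller_cell).

    path is the list of cells from the access area to the target (last cell).
    """
    # Intruder position at each turn: walk the path, then wait on the target.
    positions = list(path) + [path[-1]] * (d - 1)
    dist = {patroller_cell: 1.0}
    for turn, intruder_cell in enumerate(positions):
        if turn > 0:
            dist = step(alpha, dist)
        dist.pop(intruder_cell, None)  # same cell, same turn: detected
    return sum(dist.values())

# Hypothetical corridor of three cells: 1 - 2 - 3.
alpha = {1: {2: 1.0}, 2: {1: 0.5, 3: 0.5}, 3: {2: 1.0}}
```

When the path consists of the target alone, e.g. `path_intrusion_probability(alpha, 1, [3], 3)`, the computation collapses to the basic model, where the intruder appears directly on the target.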
Reduction
• We can reduce the computational burden by discarding the players’ dominated actions
  • An action a is dominated by an action b if the player prefers to undertake b independently of the opponent’s strategy
• Patroller’s actions reduction:
  • Smaller setting, fewer variables
  • Forcing the patroller to cover shortest paths between targets
• Intruder’s actions reduction:
  • Fewer optimization problems, fewer constraints for each optimization problem
Reduction
• Identify the minimal set of paths that a rational intruder would consider in its enter-when(P,C) actions
  • Obviously, enter-when(P1, *) dominates enter-when(P2, *)
  • P3 is not dominated: there can be a patrolling strategy such that P3 is better than P1
• We select all irreducible paths, i.e., those paths that do not strictly contain any other path
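The selection of irreducible paths can be sketched as a filter over candidate paths. The sketch assumes that "strictly contains" is read as strict inclusion between the sets of visited cells, and that all candidate paths passed in lead to the same target; both readings are assumptions, and the function name is hypothetical.

```python
def irreducible_paths(paths):
    """Keep the paths whose set of cells does not strictly contain
    the set of cells of another candidate path."""
    cell_sets = [frozenset(p) for p in paths]
    keep = []
    for i, path in enumerate(paths):
        contains_other = any(
            other < cell_sets[i]          # proper subset test
            for j, other in enumerate(cell_sets)
            if j != i
        )
        if not contains_other:
            keep.append(path)
    return keep
```

For instance, `irreducible_paths([[1, 2, 5], [1, 2, 3, 5], [4, 5]])` drops the second path, which makes a detour through cell 3 while visiting every cell of the first one.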
Reduction
• Identify the minimal set of cells {C} that a rational intruder would consider in its enter-when(P,C) actions
[Figure: cells C, C1, and C2 on the map]
  • Obviously, enter-when(P1, C) is dominated by stay-out
  • enter-when(P1, C1) is dominated by enter-when(P1, C2): from C2 the patroller should always cover a longer distance to reach the target within d_T turns than from C1
• For every irreducible path we find the set {C} by resorting to a tree-based search technique
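The comparison between C1 and C2 relies on how far the patroller is from the target. The helper below computes those distances with a breadth-first search on a grid map; it is only a building block, not the tree-based search mentioned on the slide. The grid encoding ('#' for obstacles) and 4-connected moves are assumptions.

```python
from collections import deque

def distances_to(grid, target):
    """Shortest number of moves from every reachable free cell to `target`.

    grid is a list of equal-length strings where '#' marks an obstacle;
    cells are (row, col) pairs and moves are 4-connected.
    """
    rows, cols = len(grid), len(grid[0])
    dist = {target: 0}
    queue = deque([target])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] != '#' and (nr, nc) not in dist:
                dist[(nr, nc)] = dist[(r, c)] + 1
                queue.append((nr, nc))
    return dist
```

Given two candidate cells, comparing `dist[c1]` and `dist[c2]` tells which one leaves the patroller farther from the target when the intruder enters.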
Intruder’s Limited Observation Capabilities
• Basic model: the intruder can observe the patroller and derive a correct belief on the patrolling strategy
• Limited visibility: when acting, the intruder has limited knowledge about the current position of the patroller
  • H_i is the set of hidden cells when entering from access area i
  • Actions enter-when(T,G) cannot be performed if G is a hidden cell belonging to H_i
• We introduce a state of the game s = <G,O> where:
  • G is the last cell where the intruder saw the patroller
  • O is the number of turns elapsed since that last observation
  • Example: s = <G,3>
• The intruder can compute a probability distribution over the patroller’s position using the strategy it knows
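The probability distribution over the patroller's position in a state s = <G,O> can be sketched by propagating the known strategy for O turns from G. The sketch assumes a Markovian strategy table `alpha` and, for simplicity, does not condition on the fact that the patroller was not seen in the meantime; names and the example corridor are hypothetical.

```python
def belief(alpha, last_seen, turns_since):
    """Distribution over the patroller's cell after `turns_since` moves,
    starting from the cell where it was last observed.

    alpha[i][j] is the probability of moving from cell i to cell j.
    """
    dist = {last_seen: 1.0}
    for _ in range(turns_since):
        nxt = {}
        for cell, p in dist.items():
            for j, a in alpha[cell].items():
                nxt[j] = nxt.get(j, 0.0) + p * a
        dist = nxt
    return dist

# Hypothetical corridor of three cells: 1 - 2 - 3.
alpha = {1: {2: 1.0}, 2: {1: 0.5, 3: 0.5}, 3: {2: 1.0}}
```

In this corridor, `belief(alpha, 1, 2)` places probability 0.5 on cell 1 and 0.5 on cell 3, while `belief(alpha, 1, 0)` is certain about cell 1.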
Intruder’s Limited Visibility
• Now the intruder’s strategy is represented as: enter-when(T,s), where s is a state
• To determine the non-dominated actions we have to compute the minimal set of states {s}
  • Compute the minimal set of cells {c} to consider as patroller positions (as in the previous case)
  • For every c in {c}:
    • If c is not hidden, then s = <c,0> has to be considered
    • If c is hidden, we consider a cell c’ from which the patroller can reach c without passing through any non-hidden cell
    • We consider the state s = <c’,k> such that the probability for the patroller of being in c, starting from c’, is maximum after k turns since it disappeared
    • We can find it by resorting to Markov chain properties; in the example, s = <c’,3>
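The choice of k in s = <c’,k> can be illustrated by a brute-force search over a bounded horizon: propagate the strategy from c’, keep only the runs that stay inside hidden cells, and record the turn at which the mass on c peaks. The bounded horizon `max_turns` replaces the Markov chain properties mentioned on the slide and is an assumption, as are all names and the example corridor.

```python
def best_elapsed_turns(alpha, hidden, start, cell, max_turns):
    """Return (k, p): the number of turns k <= max_turns after which the
    patroller, last seen in `start`, is most likely to be in `cell`
    having moved only through hidden cells, and that probability p."""
    dist = {start: 1.0}
    best_k, best_p = 0, dist.get(cell, 0.0)
    for k in range(1, max_turns + 1):
        nxt = {}
        for i, p in dist.items():
            for j, a in alpha[i].items():
                if j in hidden:      # runs through visible cells are dropped
                    nxt[j] = nxt.get(j, 0.0) + p * a
        dist = nxt
        if dist.get(cell, 0.0) > best_p:
            best_k, best_p = k, dist[cell]
    return best_k, best_p

# Hypothetical corridor 1 - 2 - 3 - 4 where cells 3 and 4 are hidden.
alpha = {1: {2: 1.0}, 2: {1: 0.5, 3: 0.5}, 3: {2: 0.5, 4: 0.5}, 4: {3: 1.0}}
hidden = {3, 4}
```

In this corridor, with the patroller last seen in cell 2, `best_elapsed_turns(alpha, hidden, 2, 4, 5)` selects k = 2 with probability 0.25, so the state to consider for the hidden cell 4 would be <2,2>.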
Experimental Results
[Figure: two plots; the recoverable axis labels are "Optimization problems" and "Total time (seconds)"]
Conclusions and Future Works
• Conclusions:
  • We presented a game-theoretical model to find the best patrolling strategy in a patrolling setting, together with some extensions to capture more realistic situations
• Future Works:
  • Further extensions to refine the model of the patroller
  • Real / simulated robot implementation
  • Multi-patroller scenarios