An Overview of Robot Behavior Control with insight into AI-based and algorithm-based approaches
Agenda • What is this talk going to cover? • What is behavior? • Behavior control • Basic control strategies • Advantages and disadvantages of these strategies • Hybrid strategies • Behavior-based control • Deliberation-based control • Hybrid strategies • Final remarks
What is behavior? • every robot has a goal • how to accomplish this goal? • good readings from sensors and good control of movement do not suffice • we need proper decision-making
What is behavior? • what to do on sensor input • how to coordinate with teammates • navigation • exploration • etc. • movement control • sensor control
What is behavior? Behavior control • Input: • sensory data • history of behavior • information from teammates • information about the opponent • Output: • what to do next • where to go • where to look • what to send to teammates • We will look at different methods for making decisions and for carrying them out
What is behavior? High-level behavioral algorithms • Most prominent problems • navigation to a point, with obstacles • exploring unknown terrain • task allocation • Much research has been done in this area • We will review some of the results
What is behavior? Low-level basic algorithms • Typical problems • how to walk • how to read sensor input • how to evaluate visual sensory input • These are problems which we won’t discuss
Behavior control: basic strategies • Two main approaches to behavior control: • Behavior-based control (reactive) • “the world is its own best model” • simple actions as reactions to the environment • complex behaviors emerge from simple ones • stateless • no communication between teammates, only observation • biologically inspired • emerging from the AI community • Deliberation-based control • careful planning of actions • maintaining state and synchronizing it with the environment • complex behaviors planned in advance • communication with teammates • prediction of the opponent’s behavior • emerging from the algorithmic community
Behavior control: discussion on behavior-based control • Advantages: • Simple controller, suitable for low-performance architectures • Easy implementation leading to rapid development • Easy to test and debug • Should adapt well to changing environmental conditions • Fast reaction time, well suited for dynamically changing situations (e.g. robot soccer) • Provable low-level properties (collision avoidance etc.) • Disadvantages: • Emergent behavior is impossible to predict • No provable properties about emergent behavior • Not well suited to less dynamic situations where goals are achieved over the long term (e.g. UGV navigation)
Behavior control: discussion on deliberation-based control • Advantages: • Possibility to plan long-term behavior in advance • Complex behaviors are precisely defined and provable • Can take advantage of communication with teammates • Possibility of learning and thus predicting the moves of the opponent • Disadvantages: • High hardware requirements (computationally intensive algorithms) • Possible loss of synchronization between internal state and environment • Problems are hard to solve and implement • Can react too slowly in very dynamic situations (e.g. robot soccer)
Behavior control: hybrid strategies • The two basic strategies can be combined into hybrid ones • Basic behavior is controlled by behavior-based strategies (low-level) • Deliberation-based methods define a high-level strategy • Advantages of both strategies can be combined
Behavior-based control: agenda • Typical methods: • Simple state machines and how to define them • Potential fields method • Formation control
Behavior-based control: simple state machines • Most popular method of behavior control in dynamical systems • Used by GermanTeam 2002 and later • Sample definition: [Figure: goalie state machine with the nodes goalie, goalie-before-kickoff, goalie-playing, go-to-ball, return-to-goal, kick, position-inside-goal, go-to-point, stand]
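A minimal sketch of how such a goalie state machine could look in code. The state names come from the slide; the `World` fields, the distance thresholds, and the transition conditions are assumptions made for illustration and are not taken from the GermanTeam implementation.

```python
from dataclasses import dataclass

# Hypothetical distance thresholds in metres; not taken from the slides.
KICK_RANGE = 0.2
BALL_NEAR = 1.0


@dataclass
class World:
    """Illustrative sensor summary; the field names are assumptions."""
    kickoff_done: bool
    ball_distance: float
    inside_goal: bool


def next_state(state: str, world: World) -> str:
    """Return the goalie's next state given the current state and sensors."""
    if state == "goalie-before-kickoff":
        # Wait until the game starts, then enter the playing behavior.
        return "goalie-playing" if world.kickoff_done else state
    # Every playing state re-evaluates the same prioritized conditions.
    if world.ball_distance < KICK_RANGE:
        return "kick"
    if world.ball_distance < BALL_NEAR:
        return "go-to-ball"
    if not world.inside_goal:
        return "return-to-goal"
    return "position-inside-goal"
```

Calling `next_state` once per control cycle yields a purely reactive controller: apart from the kickoff flag, the decision depends only on the current sensor summary.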
Behavior-based control: XABSL for defining behavior rules • Instead of defining behavioral aspects of software in plain code, meta-languages are used • Software engineering offers UML, Petri nets, high-level scripting etc. for modeling behavior • XABSL (Extensible Agent Behavior Specification Language) was defined by the GermanTeam • Syntax based on XML • Defines a state automaton • Language constructs typical for a structured language (if, conditions) • Constructs for easy operation on the state automaton (transitions) • Basic behaviors like “go-to-ball” are defined in a low-level language
Behavior-based control: XABSL for defining behavior rules • XABSL is transformed into intermediate code, which is executed on the AIBO by a low-level virtual machine • The AIBO behaves according to the definitions given in XABSL, acting as a state automaton
Behavior-based control: decisions in state machines • Sometimes a decision between several behavior options must be made • These decisions are based on evaluating utility functions for the possible options • The utility functions can be influenced by non-determinism
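The utility-based choice between options can be sketched as follows. The option names, the uniform noise model, and the function name `choose_option` are illustrative assumptions; the slides do not specify how the non-determinism enters.

```python
import random


def choose_option(utilities, noise=0.0, rng=None):
    """Pick the option with the highest (optionally perturbed) utility.

    utilities: dict mapping option name -> utility value.
    noise: half-width of a uniform perturbation added to every utility
           (an assumed way of modeling non-determinism).
    """
    rng = rng or random.Random()
    noisy = {name: u + rng.uniform(-noise, noise) for name, u in utilities.items()}
    return max(noisy, key=noisy.get)
```

With `noise=0.0` the choice is deterministic. A small positive `noise` makes close options interchangeable, which can make the robot harder for an opponent to predict.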
Behavior-based control: potential fields method • Objects either attract or repel the robot • These forces constitute the potential field • Forces in the field are summed as vectors, so that one obtains the resultant force • The resultant force indicates the movement direction of the robot; optionally, its strength determines the movement speed • Advantages: • Smooth movement • Elegant solution, very easy to describe
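A minimal sketch of the force summation, assuming a constant-magnitude attraction toward the ball and a repulsion from each opponent that grows as the robot gets closer and vanishes beyond an influence radius. The gains, the radius, and the specific force laws are assumptions; many other choices are in use.

```python
import math


def attractive(robot, goal, gain=1.0):
    """Force of fixed magnitude `gain` pointing from the robot to the goal."""
    dx, dy = goal[0] - robot[0], goal[1] - robot[1]
    d = math.hypot(dx, dy)
    if d == 0.0:
        return (0.0, 0.0)
    return (gain * dx / d, gain * dy / d)


def repulsive(robot, obstacle, gain=1.0, radius=2.0):
    """Force pushing the robot away from an obstacle within `radius`."""
    dx, dy = robot[0] - obstacle[0], robot[1] - obstacle[1]
    d = math.hypot(dx, dy)
    if d == 0.0 or d >= radius:
        return (0.0, 0.0)
    magnitude = gain * (1.0 / d - 1.0 / radius)
    return (magnitude * dx / d, magnitude * dy / d)


def resultant(robot, ball, opponents):
    """Vector sum of the ball's attraction and all opponents' repulsion."""
    fx, fy = attractive(robot, ball)
    for opponent in opponents:
        rx, ry = repulsive(robot, opponent)
        fx += rx
        fy += ry
    return (fx, fy)
```

For example, `resultant((0.0, 0.0), (4.0, 0.0), [(1.0, 1.0)])` points mostly toward the ball but is deflected away from the opponent that sits above the direct path.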
Behavior-based control: potential fields method [Figure: robot, ball, and opponent; the attractive force induced by the ball and the repulsive force induced by the opponent robot sum to the calculated resultant force, which gives the direction of movement]
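Behavior-based control: potential fields method – problems • Local minima [Figure: two robot and ball configurations illustrating local minima]
Behavior-based control: potential fields method – problems • No passage between close objects [Figure: robot and ball separated by closely spaced objects]
Behavior-based control: potential fields method – problems • Oscillations [Figure: two robot and ball configurations illustrating oscillations]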
Behavior-based control: formation control • Formation control is important for terrain traversal, soccer … • Four robots travel in a predefined formation [Figure: column, line, and diamond formations] • Each robot computes its own position and the positions of the others • A robot’s formation position is calculated based on • leader position • neighbor position • unit-center position
Behavior-based control: formation control • The robot tries to maintain formation by staying inside the dead zone • Inside the dead zone no additional formation maintenance is performed • Inside the controlled zone the speed vector points into the dead zone and depends linearly on the distance from the dead zone • If obstacles occur, avoidance gains priority • As soon as the obstacle has been circumvented, the robot tries to get back into formation • Can be realized using a potential field, with the dead zone attracting and obstacles repelling [Figure: dead zone surrounded by the controlled zone]
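The zone rule can be sketched as a speed command toward the robot's formation position. The circular zones, the radii, and the saturation at `max_speed` outside the controlled zone are assumptions added for illustration.

```python
import math


def formation_velocity(robot, target, dead_radius=0.5, control_radius=2.0,
                       max_speed=1.0):
    """Velocity that steers the robot back toward its formation position.

    Inside the dead zone: no correction. Inside the controlled zone: speed
    grows linearly with the distance from the dead zone. Farther out the
    speed saturates at max_speed (an assumption of this sketch).
    """
    dx, dy = target[0] - robot[0], target[1] - robot[1]
    d = math.hypot(dx, dy)
    if d <= dead_radius:
        return (0.0, 0.0)
    ramp = (d - dead_radius) / (control_radius - dead_radius)
    speed = max_speed * min(1.0, ramp)
    return (speed * dx / d, speed * dy / d)
```

Obstacle avoidance is not part of this sketch; in a combined controller it would take priority over the value returned here.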
Deliberation-based control: agenda • Typical methods: • Case-based reasoning • Hidden Markov Models • Algorithmic approaches • task allocation • navigation to a point, with obstacles
Deliberation-based control: case-based reasoning • During soccer play similar situations can occur quite often • Case-based reasoning allows a player to store the behavior of opponents and use it when a similar situation occurs again • Sample: [Figure: robot with ball, goal, and opponent]
Deliberation-based control: case-based reasoning • Advantages: • Opponent behavior can be analyzed and the player can adapt to its strategies • Disadvantages: • If the opponent uses similar techniques, then the two CBR instances fight against each other, returning improper forecasts • No provable results • Highly memory- and computation-intensive • A learning process is needed • May be advantageous against simple opponents, but has no provable properties and fails against “intelligent” opponents
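A minimal sketch of the store-and-retrieve cycle of case-based reasoning, assuming that a situation is encoded as a numeric feature vector and that similarity is plain Euclidean distance. The class name, the encoding, and the `max_distance` cutoff are hypothetical.

```python
import math


class CaseBase:
    """Stores (situation, observed opponent action) pairs."""

    def __init__(self):
        self.cases = []

    def store(self, situation, action):
        """Remember what the opponent did in the given situation."""
        self.cases.append((tuple(situation), action))

    def predict(self, situation, max_distance=1.0):
        """Return the action of the most similar stored case, or None
        if the case base is empty or nothing is similar enough."""
        best = min(self.cases, key=lambda c: math.dist(c[0], situation),
                   default=None)
        if best is None or math.dist(best[0], situation) > max_distance:
            return None
        return best[1]
```

The linear scan over all cases also illustrates the memory and computation cost listed above: both grow with every stored situation.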
Deliberation-based control: Hidden Markov Model method • As in CBR, the goal is to predict the behavior of the opponent • The HMM method: • assume that the opponent has a state machine and uses a set of common behaviors, like go-to-ball, intercept-ball … • for each behavior we define a model, which is a state machine with probabilities for the transitions from state to state • for every possible observation the model contains the probability that it occurs in a certain state • we cannot directly observe the state of the opponent, so we instantiate the HMM behavior models and check whether their execution matches the observations • thus we obtain probabilities that the opponent is in a certain state of a certain behavior
Deliberation-based control: Hidden Markov Model method • Observations are • Distance of robot to ball • Robot ball manipulation • Distance of robot to goal … • The most interesting question is about the value of [formula not preserved] • Knowing the probability, we can derive some information about the future behavior of the opponent
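One standard way to match behavior models against observations is the forward algorithm, sketched below. The two toy models, their probabilities, and the discretized observations "far" and "near" are invented for illustration, and a uniform prior over behaviors is assumed.

```python
def forward_likelihood(model, observations):
    """Probability that the HMM `model` generates `observations`."""
    init, trans, emit = model["init"], model["trans"], model["emit"]
    states = list(init)
    # alpha[s]: probability of the observations so far, ending in state s.
    alpha = {s: init[s] * emit[s][observations[0]] for s in states}
    for obs in observations[1:]:
        alpha = {s: emit[s][obs] * sum(alpha[p] * trans[p][s] for p in states)
                 for s in states}
    return sum(alpha.values())


def behavior_posterior(models, observations):
    """Posterior over behaviors, assuming a uniform prior."""
    likelihoods = {name: forward_likelihood(m, observations)
                   for name, m in models.items()}
    total = sum(likelihoods.values())
    if total == 0.0:
        raise ValueError("no model explains the observations")
    return {name: value / total for name, value in likelihoods.items()}


# Hypothetical behavior models with made-up probabilities.
MODELS = {
    "go-to-ball": {
        "init": {"approach": 1.0, "at-ball": 0.0},
        "trans": {"approach": {"approach": 0.5, "at-ball": 0.5},
                  "at-ball": {"approach": 0.0, "at-ball": 1.0}},
        "emit": {"approach": {"far": 0.9, "near": 0.1},
                 "at-ball": {"far": 0.1, "near": 0.9}},
    },
    "guard-goal": {
        "init": {"guard": 1.0},
        "trans": {"guard": {"guard": 1.0}},
        "emit": {"guard": {"far": 0.8, "near": 0.2}},
    },
}
```

Calling `behavior_posterior(MODELS, ["far", "near"])` ranks go-to-ball above guard-goal, because an opponent whose distance to the ball shrinks fits the first model better.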
Deliberation-based methods: task allocation • Task allocation is important when coordination of robots is needed • In robot soccer, task allocation is mostly reduced to role assignment (first forward, supporting forward, defender) • There is a lot of research on multiprocessor task scheduling and similar assignment problems, which can often be translated to multi-robot scenarios • Models utilized for robot task scheduling: • Robots are heterogeneous • Tasks require specific skills • Tasks appear online • Communication is expensive and thus must be minimized • Computation power is scarce
Deliberation-based methods: task allocation • We look for efficient, online and distributed approximations for task allocation • Taxonomy of task allocation problems: • ST-SR – single-task robots, single-robot tasks • ST-MR – single-task robots, multi-robot tasks • MT-SR – multi-task robots, single-robot tasks • MT-MR – multi-task robots, multi-robot tasks
Deliberation-based methods: task allocation – ST-SR setting • Model • Set M of workers, s.t. |M| = m • Set N of jobs, s.t. |N| = n, with a weight w_j for each job j • A skill rating (utility function), which defines the fitness of a worker for a job • We want to find an assignment of workers to jobs such that the sum over all assigned pairs of the combination of utility and job weight is maximized • The centralized integer linear program is solvable by the Hungarian method in O(mn^2) time, but needs about n^2 messages to be exchanged • Distributed auction mechanisms achieve the same task with only O(n) messages
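To make the objective concrete, the sketch below finds an optimal ST-SR assignment by exhaustive search. This is not the Hungarian method: it is exponential and only usable on tiny instances. It also assumes that utility and job weight are combined by multiplication, which the slide does not specify.

```python
from itertools import permutations


def best_assignment(utility, weights):
    """Exhaustively find the assignment maximizing sum of w_j * u_ij.

    utility[i][j]: fitness of worker i for job j.
    weights[j]: weight of job j.
    Returns (best value, dict mapping worker index -> job index).
    """
    m, n = len(utility), len(weights)
    if m < n:
        raise ValueError("this sketch assumes at least as many workers as jobs")
    best_value, best_workers = float("-inf"), ()
    # workers[j] is the worker assigned to job j; each worker is used at most once.
    for workers in permutations(range(m), n):
        value = sum(weights[j] * utility[i][j] for j, i in enumerate(workers))
        if value > best_value:
            best_value, best_workers = value, workers
    return best_value, {i: j for j, i in enumerate(best_workers)}
```

For `utility = [[0.9, 0.8], [0.8, 0.1]]` with unit weights, the optimum gives job 1 to worker 0 and job 0 to worker 1, even though worker 0 has the single highest rating for job 0.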
Deliberation-based methods: task allocation – online ST-SR setting • The previous model assumed an offline setting • In reality the online version is much more likely to occur • BLE algorithm: • If any robot is unassigned, find the robot-task pair with the highest combination of utility and weight • Assign this robot to this task • Repeat • This greedy strategy is 2-competitive with respect to the optimal offline algorithm
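The greedy loop can be sketched as below. Scoring a pair by weight times utility is an assumption, and the sketch runs centrally on a fixed task set, whereas the slide describes an online, distributed setting.

```python
def greedy_assign(utility, weights):
    """Repeatedly assign the best remaining robot-task pair.

    utility[i][j]: fitness of robot i for task j.
    weights[j]: weight of task j.
    Returns a dict mapping robot index -> task index.
    """
    free_robots = set(range(len(utility)))
    open_tasks = set(range(len(weights)))
    assignment = {}
    while free_robots and open_tasks:
        # Pick the unassigned pair with the highest weighted utility.
        robot, task = max(
            ((i, j) for i in free_robots for j in open_tasks),
            key=lambda pair: weights[pair[1]] * utility[pair[0]][pair[1]],
        )
        assignment[robot] = task
        free_robots.remove(robot)
        open_tasks.remove(task)
    return assignment
```

With `utility = [[0.9, 0.8], [0.8, 0.1]]` and unit weights, the greedy result pairs robot 0 with task 0 and robot 1 with task 1 for a total of 1.0, while the best possible total is 1.6. That stays within the factor of 2 stated above.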
Deliberation-based methods: task allocation – ST-MR setting • Also known as coalition formation • Now each job might require a specific skill which is possessed only by some robots • Transforming the coalition formation problem to the set partitioning problem (SPP): • Let E be the set of all tasks and robots • Let F be the family of all robot-task pairs • u(f), where f is a set from F, is the utility of the robot-task pair • SPP • Finite set E • Family F of subsets of E • Utility function u: F → R+ • Find a maximum-utility family X of elements of F, s.t. X is a partition of E
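A brute-force sketch of SPP for very small instances. The dictionary encoding of the family F and the example robots and tasks are assumptions; the search is exponential in the worst case and is not one of the heuristics used in practice.

```python
def best_partition(elements, family):
    """Find a maximum-utility partition of `elements` using sets from `family`.

    family: dict mapping frozenset (a member of F) -> utility.
    Returns (total utility, list of chosen sets), or None if no
    partition of `elements` exists.
    """
    def solve(remaining):
        if not remaining:
            return (0.0, [])
        # The smallest uncovered element must be covered by exactly one set.
        pivot = min(remaining)
        best = None
        for subset, value in family.items():
            if pivot in subset and subset <= remaining:
                rest = solve(remaining - subset)
                if rest is not None and (best is None or value + rest[0] > best[0]):
                    best = (value + rest[0], [subset] + rest[1])
        return best

    return solve(frozenset(elements))
```

For example, with robots `r1`, `r2` and tasks `t1`, `t2`, a family containing `{r1, t1}`, `{r2, t2}`, `{r2, t1}` and `{r1, t2}` admits two partitions, and the function returns the one with the larger summed utility.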
Deliberation-based methods: task allocation – ST-MR setting • SPP is NP-hard (its decision version is NP-complete) • But there are heuristics and approximations which give good practical results • Unfortunately these methods do not have a guaranteed approximation ratio; they only report how far the constructed solution is from the optimum for a particular problem instance
Deliberation-based methods: navigation to a point • Model: • The robot should get from a source position to a target position traveling the smallest possible distance • There are obstacles with unknown position and size • Different assumptions about the abilities of sensors may be made • Visual sensors • Touch sensors • Important measures: • Ratio of the distance obtained by the algorithm to the optimum • Distance taking into account the sizes of obstacles
Deliberation-based methods: navigation – D* algorithm • Model • Finite undirected graph G(V,E), most often a grid • Edge blocking • Which edges are blocked is unknown to the algorithm • Blocked edges cannot be traversed • Blocked edges can be detected only from adjacent vertices • D* algorithm • Assume that all unknown terrain contains no blocked edges • Find the shortest path • Try to follow this path • On encountering a blocked edge, update the terrain map and calculate a new path
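The plan, move, and replan loop above can be sketched on a small grid. Note the simplification: this sketch recomputes the path from scratch with breadth-first search at every step, whereas D* proper repairs the previous search result incrementally. The grid encoding and the function names are assumptions.

```python
from collections import deque


def neighbors(v, size):
    """4-connected neighbors of vertex v inside a size x size grid."""
    x, y = v
    for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
        if 0 <= nxt[0] < size and 0 <= nxt[1] < size:
            yield nxt


def shortest_path(start, goal, size, blocked):
    """BFS shortest path that avoids the edges in `blocked`, or None."""
    prev = {start: None}
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        if cur == goal:
            path = []
            while cur is not None:
                path.append(cur)
                cur = prev[cur]
            return path[::-1]
        for nxt in neighbors(cur, size):
            if nxt not in prev and frozenset((cur, nxt)) not in blocked:
                prev[nxt] = cur
                queue.append(nxt)
    return None


def navigate(start, goal, size, true_blocked):
    """Walk to the goal, assuming unknown edges are free and replanning
    whenever newly sensed blocked edges change the map."""
    known = set()
    pos, travelled = start, [start]
    while pos != goal:
        # Blocked edges are detectable only at adjacent vertices.
        for nxt in neighbors(pos, size):
            edge = frozenset((pos, nxt))
            if edge in true_blocked:
                known.add(edge)
        path = shortest_path(pos, goal, size, known)
        if path is None:
            return None  # goal unreachable given what is known
        pos = path[1]
        travelled.append(pos)
    return travelled
```

The returned list is the route actually travelled. Comparing its length with the shortest path on the fully known map gives the competitive ratio for that instance.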
Deliberation-based methods: navigation – D* algorithm • Sample edge-blocked graph [Figure: graph with blocked edges and vertices labeled S and E]
Deliberation-based methods: navigation – D* algorithm • Performance of D* • Lower bound on competitive ratio • Upper bound on competitive ratio • Lower bound construction
Hybrid strategies • Two layers of execution • The lower layer runs reactive behavior-based methods • The upper layer runs deliberative methods • The lower layer ensures fast reactions, obstacle avoidance etc. and can basically function without the help of the upper layer • The upper layer provides additional support to the lower layer by analyzing the situation (e.g. with case-based reasoning) and giving “hints” to the lower layer • The hints only support the behavior-based methods, i.e. they can be (partially) ignored • A hint can be modeled as a slight influence on the utility function of executing an option
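A hint as a slight influence on option utilities can be sketched as an additive bonus. The function names, the hint scale, and the `weight` value are illustrative assumptions.

```python
def apply_hints(utilities, hints, weight=0.1):
    """Return utilities nudged by the upper layer's hints.

    utilities: option name -> utility computed by the reactive layer.
    hints: option name -> hint strength from the deliberative layer.
    weight: how strongly a hint may shift a utility; kept small so the
            reactive layer can still override the hint.
    """
    return {name: u + weight * hints.get(name, 0.0)
            for name, u in utilities.items()}


def choose(utilities):
    """Pick the option with the highest utility."""
    return max(utilities, key=utilities.get)
```

With utilities `{"shoot": 0.50, "pass": 0.45}`, a hint of strength 1.0 for `pass` tips the decision. With `{"shoot": 0.9, "pass": 0.3}` the same hint is effectively ignored, and with no hints at all the lower layer decides exactly as before.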
Final remarks • What you should remember • Two basic strategies for behavior control • No clear indication which one is best • Much research in both areas, with deliberative methods having stricter proofs and behavior-based methods having more practical realizations • Today’s results aren’t great • Practical realizations more often use simpler methods – there is a gap between the theoretical results and their implementation
Thank you for your attention! Jaroslaw Kutylowski Heinz Nixdorf Institut & Institut für Informatik Universität Paderborn Fürstenallee 11 33102 Paderborn Tel.: 0 52 51/60 64 68 Fax: 0 52 51/62 64 82 E-Mail: jarekk@upb.de http://www.upb.de/cs/ag-madh