Multi-Agent Systems and Distributed AI
2nd Semester, Friday 13:00-15:00
Lab course on Thursdays
dr. ir. M. Maris, MAS & DI
Course material
• Book: A Concise Introduction to Multiagent Systems and Distributed AI, by dr. Nikos Vlassis
• Handout on Bayesian networks
Schedule
First block:
• 9 Feb: Introduction and Rational agents (ch. 1 and 2)
• 16 Feb: Strategic games (ch. 3)
• 23 Feb: Multi-agent coordination (ch. 4)
• 2 Mar: Common knowledge (ch. 5)
• 9 Mar: Multiagent learning (ch. 8)
• 16 Mar: Bayesian networks and applications to multi-agent systems (handouts)
• 23 Mar: Communication and Mechanism design (ch. 6 and 7)
Second block (from 13 April): Student presentations
Website: http://staff.science.uva.nl/~mmaris/MAS.html
Lab Course
Multi-agent Systems (MAS) & Distributed AI: Introduction
Agents, what are these?
• Intelligent devices: robots, heating systems, mobile phones
• Intelligent software: searchbots, expert systems, help functions ...
... as in Microsoft Word
Basic Agent Types
• Reflex (react only to input stimuli)
• Proactive (have internal state)
• Proactive and social (can choose to communicate with other agents when necessary to achieve their goals)
Reflex agent [Figure: the AGENT receives sensor input from the Environment and returns action output to it.]
Proactive agent [Figure: the AGENT (reasoning, planning) receives sensor input from the Environment and returns action output to it.]
Proactive and social agents (MAS) [Figure: three AGENTs (planning, reasoning), each exchanging sensor input and action output with the Environment.]
Characteristics of MAS
• Agent design
• Environment
• Perception
• Control
• Knowledge
• Communication
Agent design
Agents are versatile:
• Hardware differences (robots).
• Software differences (softbots).
So: heterogeneous vs. homogeneous agents. Agent heterogeneity can affect perception, decision making, etc.
Definition of an agent
There are many definitions of an intelligent agent. Here the following is adopted: an intelligent agent is a rational decision maker who is able to perceive some external environment and act autonomously upon it.
Important points:
• Rationality means optimizing a performance measure.
• Autonomy means using perception to guide behavior.
Therefore, we are mostly interested in computational agents.
Environment
• In traditional AI, an agent's environment is assumed to be static.
• In a MAS, the presence of multiple agents makes the environment dynamic, because all of them interact with it.
• This complicates the mathematical analysis of algorithms.
• The question arises: what should be treated as environment and what as agent?
Centralized versus Distributed systems [Figure: two panels, Centralized and Distributed, each showing devices and a host computer.]
Multiagent system [Figure: agents A1 to A6 and a host computer.]
Perception
In a MAS, sensor data is distributed:
• Spatially: data appear at different locations.
• Temporally: data arrive at different times.
• Semantically: data require different interpretations.
The world state is only partially observable by each agent. The problem of combining perceptions is called sensor fusion.
Control
• In a MAS, control is distributed.
• Each agent has to choose an action (more or less) by itself.
• Game theory studies distributed decision making.
• In a cooperative MAS the agents must coordinate their actions.
• Coordination ensures that individual actions result in good joint actions.
Knowledge
• Each agent in a MAS can possess certain knowledge.
• Each agent should know something about the knowledge of the others.
• A fact is common knowledge if all agents know it, all agents know that they all know it, etc.
• Methods for coordination often exploit this property.
Communication
• Interaction is often associated with some form of communication.
• Coordinating and negotiating agents may use communication.
• What language should the agents speak?
• What protocols should be used for message transmission?
Agent applications
• Internet (e-commerce)
• Robotics: multi-robot localization and mapping
• Games: agents with human-like behaviors
• Robot soccer: realistic testbed for testing theories and algorithms
• Social sciences: simulating social phenomena
• Traffic control: tracking cars with many cameras
Challenging issues (to mention a few)
• Decompose problems into subtasks.
• Deal with distributed perception.
• Implement decentralized control and coordination.
• Multi-agent planning and learning.
• Represent knowledge.
• Develop communication languages and protocols.
• Enable agents to negotiate.
• Enable organizational structures, e.g., teams.
• Ensure stable system behavior.
Coffee break
Example of cooperating agents
How do they work together? They share a "world model".
Real-world example of a MAS using a world model: robot soccer
How to act? The agents need to observe the environment and the other agents, and must then select an action. How do they do that?
Decision theory
• Decision theory deals with the problem of optimal action selection.
• At each time t the agent has to decide on its current action $a_t$.
• There is an environment or world that is outside the agent, and that is affected by $a_t$.
• In principle, an optimal decision should depend on two things:
  • The past: what the agent did before time t.
  • The future: what is going to happen next.
How many perceptions are needed?
To behave rationally at any time step t, an agent must map its complete history of perceptions o and actions a to an optimal new action $a_t$:
$\pi(a_0, o_1, a_1, o_2, a_2, \ldots, o_t) = a_t$
The function $\pi$ is called the policy of the agent.
Reflex agents
A reflex agent just maps its current perception $o_t$ to a new action $a_t$, ignoring the past:
$\pi(o_t) = a_t$
Such a policy is called reactive or memoryless. How successful can such a reflex agent be?
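The difference between a history-based policy and a reactive policy can be sketched in a few lines of Python. This is only an illustration: the grid-cell observations, the action names, and both decision rules are assumptions made up for the example, not part of the course material.

```python
from typing import List, Tuple

Observation = Tuple[int, int]   # hypothetical: the agent perceives its grid cell
Action = str

def reactive_policy(o_t: Observation) -> Action:
    """Memoryless policy pi(o_t) = a_t: only the current perception is used."""
    x, _y = o_t
    return "Right" if x < 4 else "Up"

def history_policy(history: List) -> Action:
    """Policy pi(a_0, o_1, a_1, ..., o_t) = a_t over the interleaved history."""
    observations = history[1::2]          # o_1, o_2, ..., o_t sit at odd indices
    o_t = observations[-1]
    # Toy rule (assumed): a repeated perception suggests the agent is stuck, so try Up.
    if o_t in observations[:-1]:
        return "Up"
    return reactive_policy(o_t)

# Same current perception, different decisions once the past is taken into account.
print(reactive_policy((1, 1)))
print(history_policy(["Right", (1, 1), "Right", (1, 1)]))
```

The reactive policy cannot notice that it has seen (1, 1) twice; the history-based one can, at the price of an input that grows with every time step.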
Discrete vs. continuous worlds
A discrete world consists of a finite number of states, e.g., $S = \{(1, 1), (1, 2), \ldots, (4, 3)\}$.
A continuous world can have infinitely many states, e.g., for a translating-rotating robot on the plane, $S = \mathbb{R}^3$ (two position coordinates and one orientation angle).
Observable vs. partially observable world
A world is called observable to an agent if the current perception $o_t$ of the agent provides complete information about the current world state $s_t$:
$s_t = o_t$
In a partially observable world the current perception $o_t$ provides only partial information about the world state, in the form of a probability distribution over all world states:
$o_t \rightarrow P(s \mid o_t)$ with $\sum_{s \in S} P(s \mid o_t) = 1$
Reflex agents in an observable world
Recall that a reflex agent maps perceptions $o_t$ to actions $a_t$: $\pi(o_t) = a_t$.
Also recall that in an observable world, each perception $o_t$ fully reveals the world state: $s_t = o_t$.
Thus, a reflex agent in an observable world has a reactive policy of the form:
$\pi(s_t) = a_t$
It is a direct mapping from world states to actions. How good can such a policy be?
The Markov property
• A reflex agent in an observable world uses a policy of the form $\pi(s_t) = a_t$.
• This seems bad: the past and the future are both ignored!
• However, in many cases we don't need the past: the current state $s_t$ summarizes all important information about the past.
• A world state that retains all relevant information about the past is said to be Markov, or to have the Markov property.
OK, but what about the future?
Actions and transition models
At each time step the agent can choose an action a from a discrete set A of actions, e.g., $A = \{Up, Down, Left, Right\}$.
A transition model specifies how the world changes when an action is executed.
Deterministic vs. stochastic world
In a deterministic world, the transition model maps a state-action pair to a single new state:
$(s_i, a) \rightarrow s_j$
In a stochastic world, the transition model maps a state-action pair to a probability distribution over states:
$(s_i, a) \rightarrow P(s' \mid s_i, a)$ with $\sum_{s' \in S} P(s' \mid s_i, a) = 1$
The probability that an agent ends up in state s' after executing action a in state s is $P(s' \mid s, a)$.
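The two kinds of transition model can be written down as lookup tables. The sketch below uses a hypothetical two-state world with made-up actions ("go", "stay") and made-up probabilities; only the structure (one next state versus a distribution that sums to 1) comes from the slide.

```python
import random

# Hypothetical two-state world, used only for illustration.
STATES = ["s1", "s2"]

# Deterministic model: (state, action) -> a single next state.
DETERMINISTIC = {
    ("s1", "go"): "s2",
    ("s1", "stay"): "s1",
    ("s2", "go"): "s1",
    ("s2", "stay"): "s2",
}

# Stochastic model: (state, action) -> probability distribution P(s' | s, a).
STOCHASTIC = {
    ("s1", "go"): {"s2": 0.8, "s1": 0.2},
    ("s1", "stay"): {"s1": 1.0},
    ("s2", "go"): {"s1": 0.8, "s2": 0.2},
    ("s2", "stay"): {"s2": 1.0},
}

def is_valid_model(model, tol=1e-9):
    """Check that every distribution P(. | s, a) sums to 1."""
    return all(abs(sum(dist.values()) - 1.0) < tol for dist in model.values())

def sample_next_state(model, s, a, rng=random):
    """Draw a next state s' from P(s' | s, a)."""
    dist = model[(s, a)]
    return rng.choices(list(dist.keys()), weights=list(dist.values()), k=1)[0]

print(DETERMINISTIC[("s1", "go")])                 # always the same successor
print(sample_next_state(STOCHASTIC, "s1", "go"))   # successor varies between runs
```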
Goals and planning in a deterministic world
• A goal is a desired state of the world, e.g., $s_{GOAL} = (4, 3)$.
• Planning is a search through the state space for an optimal path to the goal, using a deterministic transition model.
Classical graph search algorithms can be used, e.g., Dijkstra's algorithm. What about planning in a stochastic world?
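As a minimal sketch of planning by graph search, the code below runs Dijkstra's algorithm on the 4 x 3 grid of states (1, 1) to (4, 3). Several things are assumptions made for the example: every action costs 1, there are no obstacles, and moving off the grid leaves the state unchanged.

```python
import heapq

ACTIONS = {"Up": (0, 1), "Down": (0, -1), "Left": (-1, 0), "Right": (1, 0)}
WIDTH, HEIGHT = 4, 3  # states (1, 1) ... (4, 3)

def step(state, action):
    """Deterministic transition model; leaving the grid keeps the state (assumed)."""
    x, y = state
    dx, dy = ACTIONS[action]
    nx, ny = x + dx, y + dy
    return (nx, ny) if 1 <= nx <= WIDTH and 1 <= ny <= HEIGHT else state

def dijkstra_plan(start, goal):
    """Return a shortest list of actions from start to goal (unit cost per action)."""
    frontier = [(0, start, [])]          # (cost so far, state, actions taken)
    best_cost = {start: 0}
    while frontier:
        cost, state, plan = heapq.heappop(frontier)
        if state == goal:
            return plan
        if cost > best_cost.get(state, float("inf")):
            continue                     # stale queue entry
        for action in ACTIONS:
            nxt = step(state, action)
            new_cost = cost + 1
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                heapq.heappush(frontier, (new_cost, nxt, plan + [action]))
    return None                          # goal unreachable

print(dijkstra_plan((1, 1), (4, 3)))
```

With unit costs a breadth-first search would find the same plan; Dijkstra's algorithm becomes necessary once actions have different costs.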
Goals and planning in a stochastic world
• In a stochastic world, classical graph search cannot be applied directly, because state transitions are non-deterministic.
• Note, however, that the agent still tries to achieve a goal. Hence, we can say that it prefers being in the goal state.
• More generally, we can assign a preference value to each state. These values are also known as utility values.
From goals to utilities
The utility of a state s is a number U(s) that expresses the desirability of being in state s for a specific agent.
If U(s) > U(s') for an agent, then that agent prefers s to s'.
Decision making in a stochastic world
Assume that each state s has a utility value U(s), and that the world is stochastic with transition model $P(s' \mid s, a)$, or equivalently $P(s_{t+1} \mid s_t, a_t)$. How can the agent choose an optimal action?
In any state s the agent must choose the action that maximizes expected utility:
$a^* = \arg\max_a \sum_{s'} P(s' \mid s, a)\, U(s')$
In other words: to see how good an action is, multiply the utility of each possible next state by the probability of reaching it, and sum over all states.
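The maximum-expected-utility rule translates almost literally into code. The following is a small sketch; the states, the two actions "risky" and "safe", and all numbers are hypothetical values chosen only to make the arithmetic easy to follow.

```python
def expected_utility(dist, U):
    """Sum over s' of P(s' | s, a) * U(s') for one fixed state-action pair."""
    return sum(p * U[s_next] for s_next, p in dist.items())

def best_action(P, U, s):
    """a* = argmax_a sum_{s'} P(s' | s, a) U(s')."""
    return max(P[s], key=lambda a: expected_utility(P[s][a], U))

# Hypothetical utilities and transition model for a single decision in state "s0".
U = {"s1": 1.0, "s2": 0.0, "s3": 0.5}
P = {
    "s0": {
        "risky": {"s1": 0.8, "s2": 0.2},   # 0.8 * 1.0 + 0.2 * 0.0 = 0.8
        "safe": {"s3": 1.0},               # 1.0 * 0.5 = 0.5
    }
}

for a, dist in P["s0"].items():
    print(a, expected_utility(dist, U))
print("best:", best_action(P, U, "s0"))
```

Here "risky" wins because its expected utility (0.8) exceeds that of "safe" (0.5), even though "risky" may end in the worst state.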
Example
Action set: $A = \{Up, Down, Left, Right\}$. Goal state: s = (4, 3).
With probability 0.8 each action succeeds, but with probability 0.2 the agent moves perpendicularly to the intended direction. Bumping into a wall leaves the position unchanged.
What action should the agent take at (3, 1)?
Action values and optimal policies
For each state s and each possible action a we can compute an action value or Q-value
$Q(s, a) = \sum_{s'} P(s' \mid s, a)\, U(s')$
that measures the "goodness" of action a in state s for a particular agent. So, for each $a \in A$, we compute a Q-value, and we can then select the action that results in the highest Q-value.
We define
$U^*(s) = \max_a Q(s, a)$ for all $s \in S$
(i.e., for every s we compute the maximum utility).
Given the set of optimal utilities $U^*(s)$, the greedy policy is
$\pi^*(s) = \arg\max_a \sum_{s'} P(s' \mid s, a)\, U^*(s') = \arg\max_a Q(s, a)$
where Q is evaluated with $U^*$. This is called an optimal policy for the agent (select the best action given all possible actions and states).
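The Q-value computation can be sketched on the 4 x 3 grid from the example. Two assumptions are made here and should be read as such: the 0.2 probability of a perpendicular move is split evenly (0.1 to each side), and the utilities are a hypothetical stand-in (negative Manhattan distance to the goal), not the optimal utilities $U^*$ of the course example. The printed action therefore illustrates the mechanics only and is not the answer to the question posed for (3, 1).

```python
ACTIONS = {"Up": (0, 1), "Down": (0, -1), "Left": (-1, 0), "Right": (1, 0)}
PERPENDICULAR = {"Up": ("Left", "Right"), "Down": ("Left", "Right"),
                 "Left": ("Up", "Down"), "Right": ("Up", "Down")}
WIDTH, HEIGHT, GOAL = 4, 3, (4, 3)
STATES = [(x, y) for x in range(1, WIDTH + 1) for y in range(1, HEIGHT + 1)]

def move(s, a):
    """Intended effect of action a; bumping into a wall leaves the position unchanged."""
    x, y = s[0] + ACTIONS[a][0], s[1] + ACTIONS[a][1]
    return (x, y) if 1 <= x <= WIDTH and 1 <= y <= HEIGHT else s

def transition(s, a):
    """P(s' | s, a) as a dict: 0.8 intended, 0.1 per perpendicular side (assumed split)."""
    dist = {}
    side1, side2 = PERPENDICULAR[a]
    for direction, p in [(a, 0.8), (side1, 0.1), (side2, 0.1)]:
        s_next = move(s, direction)
        dist[s_next] = dist.get(s_next, 0.0) + p
    return dist

# Hypothetical utilities: negative Manhattan distance to the goal (NOT U* from the slides).
U = {s: -(abs(GOAL[0] - s[0]) + abs(GOAL[1] - s[1])) for s in STATES}

def q_value(s, a, U):
    """Q(s, a) = sum_{s'} P(s' | s, a) U(s')."""
    return sum(p * U[s_next] for s_next, p in transition(s, a).items())

def greedy_action(s, U):
    """pi(s) = argmax_a Q(s, a)."""
    return max(ACTIONS, key=lambda a: q_value(s, a, U))

q_at_31 = {a: q_value((3, 1), a, U) for a in ACTIONS}
print(q_at_31)
print("greedy action at (3, 1):", greedy_action((3, 1), U))
```

Replacing `U` with utilities obtained for the actual problem would turn `greedy_action` into the greedy policy described above; the rest of the code would stay the same.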
Recap
• What is a memoryless policy?
• When is a world called observable?
• How do you understand the Markov property?
• What does it mean to say that state s has utility 4?
• How would you take actions in a stochastic world?
• What is an optimal policy?