
Learning in Multiagent Systems


Presentation Transcript


  1. Learning in Multiagent Systems. Prepared by: Jarosław Szymczak. Based on: "Fundamentals of Multiagent Systems with NetLogo Examples" by José M. Vidal

  2. Scenarios of learning • cooperative learning – e.g. each agent has its own map and, together with the other agents, aggregates a global view • competitive learning – e.g. each selfish agent tries to maximize its own utility by learning about the behaviors and weaknesses of the other agents • agents learn because: • they don't know everything about the environment • they don't know how the other agents behave

  3. The Machine Learning Problem • The goal of machine learning research is the development of algorithms that increase the ability of an agent to match a set of inputs to their corresponding outputs (Mitchell, 1997) • The input could be, e.g., a set of photos depicting people, and the output the set {man, woman}; the machine learning algorithm then has to learn to label the photos correctly

  4. The Machine Learning Problem • The input set is usually divided into a training set and a testing set; training and testing can also be interleaved • (figure: graphical representation of the machine learning problem)
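To make the training/testing split concrete, here is a toy sketch in Python: a nearest-centroid learner is fit on a training set and scored on a held-out testing set. The synthetic data, the learner, and the 70/30 split are illustrative assumptions, not anything prescribed by the slides.

```python
# Minimal sketch of the supervised-learning loop: learn a mapping from
# inputs to labels on a training set, then measure generalization on a
# held-out testing set.
import random

def nearest_centroid_fit(train):
    """Compute the mean input value for each label."""
    sums, counts = {}, {}
    for x, label in train:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def predict(centroids, x):
    """Assign x to the label whose centroid is closest."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))

# Toy 1-D data: label 'A' clusters near 0, label 'B' near 10.
data = [(random.gauss(0, 1), "A") for _ in range(50)] + \
       [(random.gauss(10, 1), "B") for _ in range(50)]
random.shuffle(data)

train, test = data[:70], data[70:]   # training/testing split
centroids = nearest_centroid_fit(train)
accuracy = sum(predict(centroids, x) == y for x, y in test) / len(test)
print(f"test accuracy: {accuracy:.2f}")
```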

  5. The Machine Learning Problem • Inductive bias – some learning algorithms appear to perform better than others in certain domains (e.g. two algorithms can both learn to classify + and − examples perfectly, yet still have learned different functions) • No free lunch theorem – averaged over all possible learning problems, no learning algorithm outperforms all others • In the multiagent scenario some fundamental assumptions of machine learning are violated: the input is no longer fixed, it keeps changing because the other agents are also learning

  6. Cooperative learning • Suppose we are given two robots that can communicate and share their knowledge (their capabilities, knowledge about the terrain, etc.) • Sharing information is easy if the robots are identical; if not, we need to somehow model their capabilities to decide which information would be useful to the other robot • Most systems that share learned knowledge among agents, such as (Stone, 2000), simply assume that all agents have the same capabilities
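A minimal sketch of this kind of knowledge sharing, assuming identical robots that record grid cells as free or obstacle; the coordinates, labels, and conflict rule below are illustrative, not from the slides.

```python
# Two agents each hold a partial map of the terrain and merge their
# observations into a shared global view. This direct merging only
# works because both agents are assumed to sense the world identically.

def merge_maps(*maps):
    """Union the agents' partial maps; keep agreeing observations and
    flag conflicting ones for re-exploration."""
    merged = {}
    for m in maps:
        for cell, value in m.items():
            if cell in merged and merged[cell] != value:
                merged[cell] = "conflict"   # disagreement: revisit later
            else:
                merged[cell] = value
    return merged

robot_a = {(0, 0): "free", (0, 1): "obstacle"}
robot_b = {(0, 1): "obstacle", (1, 1): "free"}
print(merge_maps(robot_a, robot_b))
# {(0, 0): 'free', (0, 1): 'obstacle', (1, 1): 'free'}
```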

  7. Repeated games • Nash equilibrium – I choose what is best for me given what you are doing, and you choose what is best for you given what I am doing • In repeated games two players face each other repeatedly in a game such as the prisoner's dilemma • The Nash equilibrium is based on the assumption of perfectly rational players; in learning in games the assumption is instead that agents use some kind of learning algorithm, and the theory determines the equilibrium strategies that the various learning mechanisms arrive at and maps these equilibria to the standard solution concepts, where possible
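The definition above can be checked mechanically: a profile is a Nash equilibrium when each player's action is a best response to the other's. A small sketch for the prisoner's dilemma, using the usual textbook payoff numbers (an assumption here):

```python
# payoffs[row_action][col_action] = (row payoff, column payoff)
# C = cooperate, D = defect
payoffs = {
    "C": {"C": (3, 3), "D": (0, 5)},
    "D": {"C": (5, 0), "D": (1, 1)},
}

def is_nash(row, col):
    """Check that each player's action is a best response to the other's."""
    row_ok = all(payoffs[r][col][0] <= payoffs[row][col][0] for r in payoffs)
    col_ok = all(payoffs[row][c][1] <= payoffs[row][col][1] for c in payoffs[row])
    return row_ok and col_ok

for r in payoffs:
    for c in payoffs:
        print(r, c, is_nash(r, c))   # only (D, D) is a Nash equilibrium
```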

  8. Fictitious play • Each agent remembers everything the other agents have done and best-responds to the empirical distribution of the opponents' past play.
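A minimal sketch of fictitious play in Python for matching pennies: each player keeps counts of the opponent's past actions and best-responds to the resulting empirical mixed strategy. The game choice, the initial fictitious counts, and the deterministic tie-breaking are assumptions.

```python
from collections import Counter

# payoff to the row player; matching pennies is zero-sum
payoff = {("H", "H"): 1, ("H", "T"): -1, ("T", "H"): -1, ("T", "T"): 1}
actions = ["H", "T"]

def best_response(opp_counts, payoff_fn):
    """Best respond to the opponent's empirical action frequencies."""
    total = sum(opp_counts.values())
    def expected(a):
        return sum(opp_counts[o] / total * payoff_fn(a, o) for o in actions)
    return max(actions, key=expected)

# each player starts with one fictitious observation of each opponent action
counts1, counts2 = Counter(H=1, T=1), Counter(H=1, T=1)
for t in range(10):
    a1 = best_response(counts1, lambda a, o: payoff[(a, o)])
    a2 = best_response(counts2, lambda a, o: -payoff[(o, a)])
    counts1[a2] += 1   # player 1 records what player 2 just played
    counts2[a1] += 1   # player 2 records what player 1 just played
    print(t, a1, a2)
```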

  9. Fictitious play • Let’s have a look at some theorems: • (Nash Equilibrium is Attractor to Fictitious Play). If s is a strict Nash equilibrium and it is played at time t then it will be played at all times greater than t (Fudenberg and Kreps, 1990). • (Fictitious Play Converges to Nash). If fictitious play converges to a pure strategy then that strategy must be a Nash equilibrium (Fudenberg and Kreps, 1990).

  10. Fictitious play • Infinite cycle problem – fictitious play can get stuck in an infinite cycle; this can be avoided by adding randomness. The classic example is a pure coordination game in which two players perpetually miscoordinate, alternating between (A, B) and (B, A) as each best-responds to the other's history.
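A minimal sketch of such a cycle, in the spirit of the Fudenberg and Kreps example; the coordination payoffs and the initial fictitious counts (chosen so that the cycle starts immediately) are assumptions.

```python
# Coordination game: payoff 1 for matching, 0 for miscoordinating.
actions = ["A", "B"]
def coord_payoff(a, o):
    return 1 if a == o else 0

def best_response(counts):
    # matching the opponent's most frequent action is the best response here
    return max(actions, key=lambda a: counts[a])

c1 = {"A": 1.0, "B": 1.5}   # player 1's counts of player 2's actions
c2 = {"A": 1.5, "B": 1.0}   # player 2's counts of player 1's actions
for t in range(8):
    a1, a2 = best_response(c1), best_response(c2)
    print(t, (a1, a2), "payoff:", coord_payoff(a1, a2))
    c1[a2] += 1
    c2[a1] += 1
# Output alternates (B, A), (A, B), ... with payoff 0 every round.
# Best-responding with a little randomness lets the players eventually
# coordinate and escape the cycle.
```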

  11. Replicator dynamics • This model assumes that the fraction of agents playing a particular strategy will grow in proportion to how well that strategy performs in the population. A homogeneous population of agents is assumed; the agents are randomly paired to play a symmetric game (same strategies and payoffs for both). The model is inspired by biological evolution. Let φt(s) be the number of agents using strategy s at time t, ut(s) the expected utility of an agent playing strategy s at time t, and u(s,s′) the utility an agent playing s receives against an agent playing s′. A standard discrete-time formulation is then:

  θt(s) = φt(s) / Σs′ φt(s′)   (fraction of the population playing s)
  ut(s) = Σs′ θt(s′) u(s,s′)
  φt+1(s) = φt(s) (1 + ut(s))
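A minimal simulation sketch of this update rule in Python for the Hawk-Dove game; the payoff parameters V = 2, C = 3, the initial shares, and the horizon are illustrative assumptions.

```python
# Discrete-time replicator dynamics: the share of agents playing a
# strategy grows in proportion to how well it performs against the
# current population mix. Hawk-Dove payoffs with V = 2, C = 3.
strategies = ["hawk", "dove"]
V, C = 2.0, 3.0
u = {("hawk", "hawk"): (V - C) / 2, ("hawk", "dove"): V,
     ("dove", "hawk"): 0.0,         ("dove", "dove"): V / 2}

theta = {"hawk": 0.1, "dove": 0.9}   # initial population shares
for t in range(100):
    # expected utility of each strategy against a randomly drawn opponent
    ut = {s: sum(theta[o] * u[(s, o)] for o in strategies) for s in strategies}
    # replicator update, then renormalize so the shares sum to 1
    theta = {s: theta[s] * (1 + ut[s]) for s in strategies}
    total = sum(theta.values())
    theta = {s: theta[s] / total for s in strategies}

print(theta)   # the hawk share settles near V/C = 2/3, the mixed equilibrium
```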

  12. Replicator dynamics • Let's have a look at some theorems: • (Nash Equilibrium is a Steady State). Every Nash equilibrium is a steady state for the replicator dynamics (Fudenberg and Levine, 1998). • (Stable Steady State is a Nash Equilibrium). A stable steady state of the replicator dynamics is a Nash equilibrium. A stable steady state is one that, after suffering from a small perturbation, is pushed back to the same steady state by the system's dynamics (Fudenberg and Levine, 1998). • (Asymptotically Stable is Trembling-Hand Nash). An asymptotically stable steady state corresponds to a Nash equilibrium that is trembling-hand perfect and isolated. That is, the stable steady states are a refinement on Nash equilibria – only a few Nash equilibria are stable steady states (Bomze, 1986).

  13. Evolutionarily stable strategy • An ESS is an equilibrium strategy that can overcome the presence of a small number of invaders. That is, if the equilibrium strategy profile is ω and a small fraction ε of invaders start playing ω′, then ESS requires that the existing population get a higher payoff against the new mixture εω′+(1−ε)ω than the invaders do. • (ESS is Steady State of Replicator Dynamics). An ESS is an asymptotically stable steady state of the replicator dynamics. However, the converse need not be true – a stable state in the replicator dynamics does not need to be an ESS (Taylor and Jonker, 1978).
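A small numeric check of this condition for the Hawk-Dove game, where (under the assumed payoffs V = 2, C = 3) the mixed strategy playing hawk with probability V/C is the known ESS:

```python
# ESS condition: the incumbent strategy w must earn strictly more than
# an invader w' against the post-invasion mixture eps*w' + (1-eps)*w.
V, C = 2.0, 3.0
u = {("H", "H"): (V - C) / 2, ("H", "D"): V,
     ("D", "H"): 0.0,         ("D", "D"): V / 2}

def mixed_u(p, q):
    """Expected payoff of playing hawk with prob. p against prob. q."""
    return sum(pa * qb * u[(a, b)]
               for a, pa in (("H", p), ("D", 1 - p))
               for b, qb in (("H", q), ("D", 1 - q)))

w, eps = V / C, 0.01                      # incumbent ESS, invasion size
for w_inv in (0.0, 1.0):                  # pure-dove and pure-hawk invaders
    mix = eps * w_inv + (1 - eps) * w     # post-invasion population strategy
    print(w_inv, mixed_u(w, mix) > mixed_u(w_inv, mix))   # True: invaders lose
```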

  14. Replicator dynamics

  15. AWESOME algorithm • The name stands for: Adapt When Everybody is Stationary, Otherwise Move to Equilibrium • The idea (Conitzer and Sandholm): play a precomputed Nash equilibrium strategy by default; when all the other agents appear to be playing stationary strategies, adapt by best-responding to their estimated strategies; when that hypothesis is rejected, go back to the equilibrium strategy.
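A highly simplified structural sketch of that loop; the epoch-based stationarity test, the thresholds, and the prisoner's-dilemma setting below are placeholders, not the algorithm's actual tests or schedules.

```python
# Structural sketch only: play a precomputed equilibrium by default,
# best-respond when everybody looks stationary, fall back otherwise.
ACTIONS = ["C", "D"]
EQUILIBRIUM_ACTION = "D"   # assumed precomputed equilibrium (PD: defect)

def looks_stationary(history):
    """Placeholder test: opponent's cooperation frequency is roughly
    the same in both halves of the observed epoch."""
    if len(history) < 10:
        return False
    half = len(history) // 2
    freq = lambda h: h.count("C") / len(h)
    return abs(freq(history[:half]) - freq(history[half:])) < 0.1

def best_response(history):
    """Placeholder: in the prisoner's dilemma, defecting is the best
    response to any fixed opponent strategy."""
    return "D"

def awesome_step(opponent_history):
    if looks_stationary(opponent_history):
        return best_response(opponent_history)   # adapt
    return EQUILIBRIUM_ACTION                    # otherwise move to equilibrium

print(awesome_step(["C"] * 20))   # stationary opponent -> best response "D"
```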

  16. Stochastic games • COMING SOON :) (THIS AUTUMN)
