Lecture about Agents that Learn • 3rd April 2000 • INT4/2I1235
Agenda • Introduction • Centralized learning vs decentralized learning • Credit Assignment Problem • Learning and Activity Coordination • Learning about and from other agents • Learning and Communication • Summary
Introduction • Today's topic • Who is the lecturer • Why do we have this lecture
Today's topic • How do agents learn? • What are the benefits of learning agents? • Learning in isolation, or in cooperation?
Who is the lecturer • Johan Kummeneje • Doctoral Student • RoboCup, Social Decisions, and Java
Why do we have this lecture • Beats me… You tell me. • Take 2 minutes to think about why this is interesting, and then I will ask 2 or 3 of you what you think.
Agenda • Introduction • Centralized learning vs decentralized learning • Credit Assignment Problem • Learning and Activity Coordination • Learning about and from other agents • Learning and Communication • Summary
Centralized vs Decentralized • Introduction • The Degree of Decentralization • Interaction-specific features • Involvement-specific features • Goal-specific features • The learning method • The learning feedback
Introduction • Learning process => planning, inference, decision steps etc. • Centralized learning or isolated learning • Decentralized learning or interactive learning
The Degree of Decentralization • Distributedness • Parallelism
Interaction-specific features • Level of interaction (from "simple" observation to complex negotiations and dialogues) • Persistence of interaction (short to long) • Frequency (low to high) • Pattern (unstructured to hierarchical) • Variability (fixed to dynamic)
Involvement-specific features • Relevance to the learning process • Role in the learning process • Generalist vs Specialist
Goal-specific features • Improvement (Individual vs Social) • Conflict vs Compatible Goals
The learning method • Rote learning ("korvstoppning", Swedish for cramming) • Instruction and advice • Examples and practice (learning by doing, Baden-Powell) • Analogy • Discovery. Learning effort increases from top to bottom.
The learning feedback • Supervised (the feedback specifies the best action) • Reinforcement (the feedback is a utility of the action, to be maximized) • Unsupervised (no explicit feedback)
Agenda • Introduction • Centralized learning vs decentralized learning • Credit Assignment Problem • Learning and Activity Coordination • Learning about and from other agents • Learning and Communication • Summary
Credit Assignment Problem • Inter-agent CAP (how to distribute credit among the agents involved) • Intra-agent CAP (how to distribute credit among the different actions performed within a single agent)
Agenda • Introduction • Centralized learning vs decentralized learning • Credit Assignment Problem • Learning and Activity Coordination • Learning about and from other agents • Learning and Communication • Summary
Learning and Activity Coordination • Introduction • Reinforcement Learning • Q-Learning and Learning Classifier Systems • Isolated, Concurrent Reinforcement Learners • Interactive Reinforcement Learning of Coordination • ACE and AGE
Introduction • Activity coordination • Adaptation to differences in the coordination process • Effective use of opportunities and avoidance of pitfalls.
Reinforcement Learning • Optimize the feedback (reinforcement) • Modeled as a Markov decision process <S, A, P, r>, where P: S x S x A -> [0,1] gives the transition probabilities and r is the reward function
Q-Learning • On receiving feedback, update the Q-value: • Q(s,a) <- (1-β)Q(s,a) + β(r + γ max_a' Q(s',a')) • where β is a small constant called the learning rate and γ is the discount factor
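The update rule above can be sketched as a few lines of Python (the tabular dictionary representation and the parameter values are my own illustration, not from the lecture):

```python
def q_update(Q, s, a, r, s_next, actions, beta=0.1, gamma=0.9):
    """One tabular Q-learning step:
    Q(s,a) <- (1-beta)*Q(s,a) + beta*(r + gamma * max_a' Q(s',a'))."""
    best_next = max(Q.get((s_next, a2), 0.0) for a2 in actions)
    Q[(s, a)] = (1 - beta) * Q.get((s, a), 0.0) + beta * (r + gamma * best_next)
    return Q[(s, a)]

# Starting from an empty table, one rewarded step yields Q(s,a) = beta * r:
Q = {}
q_update(Q, "s0", "go", 1.0, "s1", ["go", "stay"])  # -> 0.1
```

Because unseen state-action pairs default to 0, the first update simply moves the estimate a fraction β toward the observed return.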
Learning Classifier Systems • A classifier is a (condition, action) pair • Each classifier has a strength S(c,a) at a given time • At each timestep a classifier is chosen from the match set (the classifiers whose conditions match the environment) • Feedback is received and the strength S is modified accordingly.
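A minimal sketch of the classifier cycle just described (the dictionary representation, the deterministic selection, and the update rate are assumptions of mine):

```python
def match_set(classifiers, state):
    # the match set: classifiers whose condition matches the current state
    return [c for c in classifiers if c["cond"](state)]

def select(ms):
    # deterministic variant for this sketch: pick the strongest matching classifier
    return max(ms, key=lambda c: c["strength"])

def reinforce(c, reward, rate=0.2):
    # move the classifier's strength S toward the received feedback
    c["strength"] += rate * (reward - c["strength"])

classifiers = [
    {"cond": lambda s: s > 0, "action": "push", "strength": 0.5},
    {"cond": lambda s: True,  "action": "wait", "strength": 0.3},
]
chosen = select(match_set(classifiers, 1))  # both match; "push" is stronger
reinforce(chosen, reward=1.0)               # strength 0.5 -> 0.6
```

Real classifier systems select stochastically in proportion to strength; the deterministic `max` keeps the sketch short.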
Isolated, Concurrent Reinforcement Learners • Agent coupling • Agent relationships • Feedback timing • Optimal behaviour combinations • CIRL • No modelling of other agents • In cooperative situations, complementary policies can be developed • Adapts to similar situations.
Interactive Reinforcement Learning of Coordination • Eliminates incompatible actions • Agents can observe the set of actions the other agents are considering • Two alternatives are ACE and AGE
Action Estimation Algorithm (ACE) • Each agent computes its set of performable actions • For each of these actions the agent estimates the goal relevance E(S) • For every action with goal relevance above a threshold, the agent calculates and announces a bid with a risk factor α and a noise term β: • B(S) = (α + β)E(S) • Incompatible actions are removed; the action with the highest bid is then executed • The feedback increases the probability that successful actions are performed again in the future.
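The bidding step of ACE can be sketched as follows (the function names, the uniform noise model, and the parameter values are illustrative assumptions; the slide only fixes the form B(S) = (α + β)E(S)):

```python
import random

def ace_bids(relevance, threshold, risk=0.1, noise=0.0, rng=random):
    # relevance: {action: goal-relevance estimate E(S)}
    # an agent bids only for actions whose goal relevance exceeds the threshold
    return {a: (risk + rng.uniform(0, noise)) * e
            for a, e in relevance.items() if e > threshold}

def choose_action(bids, compatible):
    # remove incompatible actions, then execute the highest remaining bid
    feasible = {a: b for a, b in bids.items() if a in compatible}
    return max(feasible, key=feasible.get)

bids = ace_bids({"pass": 1.0, "shoot": 0.5, "wait": 0.1}, threshold=0.2)
best = choose_action(bids, compatible={"pass", "shoot"})  # -> "pass"
```

With `noise=0` the example is deterministic: "wait" falls below the threshold, and "pass" wins on the largest bid.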
Action Group Estimation Algorithm (AGE) • The applicable actions of all agents are collected into all possible activity contexts, in which all actions are mutually compatible. • Using the same bidding strategy as ACE, the activity context with the highest sum of bids is chosen for execution. • Credit assignment depends on the actions performed and their relevance. • Requires more computational effort than ACE.
Agenda • Introduction • Centralized learning vs decentralized learning • Credit Assignment Problem • Learning and Activity Coordination • Learning about and from other agents • Learning and Communication • Summary
Learning about and from other agents • Introduction • Learning Organizational Roles • Learning in Market Environments
Introduction • Learning to improve individual performance • At the expense of other agents • Anticipatory agents, RMM
Learning Organizational Roles • Agents learn roles in order to better complement each other. • Each agent can take on roles from a given set (one at a time), and the choice is to pick the most appropriate role (minimizing cost). • The role utility is a function f(U, P, C, Potential)
Learning in Market Environments • Agents buy and sell information from each other. • 0-level agents do not model other agents • 1-level agents model other agents as 0-level agents • 2-level agents model other agents as 1-level agents
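The nested-modelling hierarchy can be illustrated with a toy recursive bidder (the undercutting rule and the prices are purely hypothetical; the lecture only defines the modelling levels):

```python
def bid(level, base_price=10.0, margin=1.0):
    # a 0-level agent models nobody and simply quotes a fixed price;
    # a k-level agent predicts the bid of a (k-1)-level opponent and undercuts it
    if level == 0:
        return base_price
    return bid(level - 1, base_price, margin) - margin

prices = [bid(k) for k in range(3)]  # deeper opponent modelling -> lower bids
```

Each extra level of modelling lets an agent react to the predicted behaviour one level down, which is the essence of the 0/1/2-level distinction above.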
Agenda • Introduction • Centralized learning vs decentralized learning • Credit Assignment Problem • Learning and Activity Coordination • Learning about and from other agents • Learning and Communication • Summary
Learning and Communication • Introduction • Reducing Communication by Learning • Improving Learning by Communication
Introduction • Learning to communicate • Communicating as learning • What to communicate? • When to communicate? • With whom to communicate? • How to communicate?
Reducing Communication by Learning • Learning about the abilities of other agents • Learning which agents to ask, instead of broadcasting • Exploiting problem similarities
Improving Learning by Communication • Communicating beliefs and pieces of information • Explanation • Ontologies • Finding out complex relationships between different agents and actions.
Agenda • Introduction • Centralized learning vs decentralized learning • Credit Assignment Problem • Learning and Activity Coordination • Learning about and from other agents • Learning and Communication • Summary
Summary • We have seen the focus shift from isolated (individual, centralized) learning to a more diverse range of learning approaches. • Besides standard (older) ML methods, several new ML algorithms have been proposed. • Agents learn to improve communication and cooperation.
Further reading • Peter Stone, PhD thesis • Weiss (course material), chapter 6 • Russell and Norvig, Artificial Intelligence: A Modern Approach