Learning through Interactive Behavior Specifications. Tolga Konik (CSLI, Stanford University), Douglas Pearson (Three Penny Software), John Laird (University of Michigan)
Goal • Automatically generate cognitive agents • Reduce the cost of agent development • Reduce the expertise required to develop agents.
Domains • Autonomous cognitive agents • Dynamic virtual worlds • Real-time decisions based on knowledge and sensed data • Soar agent architecture
Learning by Observation • Approach: • Observe expert behavior • Learn to replicate it • Why? • We may want human-like agents • In complex domains, imitating humans may be easier than learning from scratch
Bottleneck in pure Learning by Observation • PROBLEM: • You cannot observe the internal reasoning of the expert • SOLUTION: • Ask the expert for additional information • Goal annotations • Use additional knowledge sources • Task & domain knowledge
Learning by Observation [Diagram: the expert acts in the environment through an interface, producing actions and receiving percepts; the learner observes this behavior together with goal annotations and additional task knowledge, and builds the agent.]
Learning by Observation [Diagram as above; the learning-by-observation approach is described in ILP 2004 and a forthcoming Machine Learning Journal article.]
Learning by Observation: Critic Mode [Diagram: the agent acts in the environment through the interface while the expert, acting as a critic, gives feedback to the learner.]
One Body, Two Minds [Diagram: the expert and the agent both control a single body in the environment through the interface.] • How and when to switch control • How the expert and the agent program communicate
Redux: Diagrammatic Behavior Specification [Diagram: the expert specifies behavior in Redux; the learner uses these specifications to build the agent that acts in the environment.]
Redux • Visual rule editing • Diagrammatic Behavior Specification
Goal Hierarchy [Diagram: a map with rooms r1–r4, doors d1–d6, and items i3 and i4, next to a goal hierarchy containing Get-item(Item), Get-item-in-room(Item), Get-item-different-room(Item), Goto-next-room, Go-to-door(D), Go-to(Door), and Go-through(Door).] • Task-performance knowledge is represented with a hierarchy of durative goals.
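A hierarchy of durative goals like the one in the diagram can be sketched as a simple recursive data structure. The sketch below is illustrative only: the goal names come from the slide, but the class, its fields, and the exact parent/child nesting are assumptions rather than the system's actual representation.

```python
from dataclasses import dataclass, field

@dataclass
class Goal:
    """A durative goal: once selected it stays active, with its parameter
    bindings, until a termination condition learned for it is satisfied."""
    name: str
    params: tuple = ()
    subgoals: list["Goal"] = field(default_factory=list)

# Goal hierarchy from the diagram; the nesting used here is one plausible reading.
go_to = Goal("Go-to", ("Door",))
go_through = Goal("Go-through", ("Door",))
go_to_door = Goal("Go-to-door", ("D",), [go_to, go_through])
goto_next_room = Goal("Goto-next-room", (), [go_to_door])
get_item_in_room = Goal("Get-item-in-room", ("Item",))
get_item_diff_room = Goal("Get-item-different-room", ("Item",),
                          [goto_next_room, get_item_in_room])
get_item = Goal("Get-item", ("Item",), [get_item_in_room, get_item_diff_room])
```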
Goal Hierarchy [Animation frame: the hierarchy is instantiated for item i3; Get-item(i3) is active with the binding Item=i3, and Get-item-in-room(i3) is instantiated.]
Goal Hierarchy [Animation frame: Get-item-different-room(i3) is instantiated under Get-item(i3); with the binding Door=d1, Go-to(Door) becomes Go-to(d1).]
Goal Hierarchy [Animation frame: under Goto-next-room, the binding Door=d1 instantiates Go-through(d1).]
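Read as a sequence, the frames above correspond roughly to the goal stack below, with parameter bindings flowing from parent goals to their subgoals. This trace rendering is a hypothetical illustration, not Redux output.

```python
# Hypothetical goal-stack trace for the item-fetching example (Item=i3, Door=d1).
trace = [
    ["Get-item(Item=i3)"],
    ["Get-item(Item=i3)", "Get-item-different-room(Item=i3)", "Go-to(Door=d1)"],
    ["Get-item(Item=i3)", "Get-item-different-room(Item=i3)", "Go-through(Door=d1)"],
]
for step, stack in enumerate(trace, 1):
    print(f"frame {step}: " + " > ".join(stack))
```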
Behavior Specification • Expert draws an initial abstract situation • Expert creates a scenario by selecting actions
Goal Specification • Goals are explicitly selected • The agent contributes based on the current situation, the current goal, and its knowledge
Goal Hierarchy • Learning-by-observation perspective: captures the unobservable mental reasoning of the expert • Learning perspective: biases the hypothesis space; the "learn an agent" problem is reduced to "learn goal selection and termination" • Mixed-initiative perspective: information exchange between the expert and the agent
Relevant Knowledge Specification [Diagram: an example decision involving the goal Prepare food.] • Expert can mark the objects that are important in a decision
Rich Behavior Trace • Undesired actions and goals specified by the expert • Actions and goals of the approximately learned agent program that the expert rejected [Diagram: an example rejected goal, Watch TV.]
Rich Behavior Trace • Hypothetical actions and goals • Situation history: a tree structure of possible behaviors
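The "tree structure of possible behaviors" can be pictured as a situation-history tree whose branches include the observed behavior, hypothetical alternatives, and expert-rejected decisions. The node type and field names below are assumptions made for illustration, not the system's actual data structures.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class SituationNode:
    """A snapshot of a (possibly hypothetical) situation plus the decision taken
    there; rejected decisions stay in the tree and serve as negative examples."""
    facts: frozenset                       # relational facts holding in this situation
    decision: Optional[str] = None         # e.g. "select Go-to(d1)"
    accepted: bool = True                  # False if the expert rejected the decision
    children: list["SituationNode"] = field(default_factory=list)

    def branch(self, facts, decision, accepted=True):
        node = SituationNode(frozenset(facts), decision, accepted)
        self.children.append(node)
        return node

# One observed decision and one rejected alternative from the same situation.
root = SituationNode(frozenset({("in-room", "agent", "r1"), ("wants", "agent", "i3")}))
root.branch({("in-room", "agent", "r1")}, "select Go-to(d1)", accepted=True)
root.branch({("in-room", "agent", "r1")}, "select Go-to(d5)", accepted=False)
```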
Relational Learning by Observation • Input: relational situations; goal and action selections and rejections; additional annotations (e.g., important objects); background knowledge • Output: a rule-based agent program • Learns goal/action selection and termination, generalizing over multiple examples • Uses Inductive Logic Programming to combine rich knowledge structures
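Concretely, one decision example handed to the learner might look like the relational facts below. The predicate names, constants, and the split into background, annotations, and positive/negative decisions are assumptions for this sketch, not the system's actual schema.

```python
# One decision example encoded as relational facts (all names are illustrative).
background = [
    ("in-room", "agent", "r1"),            # observed situation
    ("connects", "d1", "r1", "r2"),
    ("connects", "d5", "r1", "r4"),
    ("in-room", "i3", "r2"),
]
annotations = [("wants", "agent", "i3")]    # expert-provided goal annotation
positives = [("select", "Go-to", "d1")]     # decision the expert made or accepted
negatives = [("select", "Go-to", "d5")]     # alternative the expert rejected
```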
Relational Learning by Observation Find the common structures in the decision examples
Relational Learning by Observation Learn relations between what the agent wants, perceives, and knows: "Select a door in the current room which leads to a room that contains the item the agent wants to get"
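The quoted rule can be written as an executable condition over a relational situation of the kind sketched above. The function name and fact encoding are assumptions; an ILP learner would induce the equivalent first-order clause rather than hand-written code.

```python
def candidate_doors(facts, agent="agent"):
    """Doors in the agent's current room that lead to the room containing
    the item the agent currently wants to get (the learned selection condition)."""
    f = set(facts)
    room = next(x[2] for x in f if x[0] == "in-room" and x[1] == agent)
    item = next(x[2] for x in f if x[0] == "wants" and x[1] == agent)
    for x in f:
        if x[0] == "connects" and x[2] == room and ("in-room", item, x[3]) in f:
            yield x[1]          # a door satisfying the rule

facts = [
    ("in-room", "agent", "r1"), ("wants", "agent", "i3"),
    ("connects", "d1", "r1", "r2"), ("connects", "d5", "r1", "r4"),
    ("in-room", "i3", "r2"),
]
print(list(candidate_doors(facts)))    # ['d1']
```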
Summary • Diagrammatic behavior specification approach to extract rich behavior knowledge • Interactive behavior specification as a communication medium between the agents (explicit goals and assumed situation) • Relational learning by observation to combine multiple complex knowledge sources
Future Work • Improve the mixed-initiative interaction of the interface • Explore domain-independent diagrammatic interface features • Allow the expert to enter context-sensitive knowledge
Mixed-Initiative Perspective • Interactive behavior specification • Diagrammatic representation of behavior as a communication medium between the agents • Explicit goals and desired behavior facilitate interaction between the agents