Architectural Patterns for Agents Jacques Robin
Agents’ Internal Architectures • Reflex agent (purely reactive) • Automata agent (reactive with state) • Goal-based agent • Planning agent • Hybrid, reflex-planning agent • Utility-based agent (decision-theoretic) • Layered agent • Adaptive agent (learning agent) • Cognitive agent • Deliberative agent
Reflex Agent (diagram): Environment → Sensors → Percepts → Rules → Action A(t) = h(P(t)) → Effectors → Environment
Remember … (diagram): the generic agent, with Sensors feeding Percept Interpretation I = f(P), Reasoning and Action Choice A = g(I,O) guided by Goals O, and Effectors acting on the Environment
So? (diagram): overlaying both architectures, the reflex agent's rules A(t) = h(P(t)) collapse Percept Interpretation I = f(P), the Goals O and Action Choice A = g(I,O) into a single direct percept-to-action mapping
Reflex Agent • Principle: • Use rules (or functions, procedures) that associate percepts directly to actions • e.g., IF speed > 60 THEN fine • e.g., IF front car's stop light switches on THEN brake • Execute the first rule whose left-hand side matches the current percepts • Wumpus World example: • IF visualPerception = glitter THEN action = pick • see(glitter) → do(pick) (logical representation) • Pros: • Condition-action rules are a clear, modular, efficient representation • Cons: • Lack of memory prevents use in partially observable, sequential, or non-episodic environments • e.g., in the Wumpus World a reflex agent cannot remember which path it has followed, when to go back out of the cavern, or where exactly the dangerous caverns are located
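To make the condition-action idea concrete, here is a minimal Python sketch (not from the original slides) of a reflex agent for the Wumpus World; the percept encoding and the specific rules are illustrative assumptions:

```python
# Minimal sketch of a reflex agent; percept encoding and rules are
# illustrative assumptions, not the slides' exact rule base.

def reflex_agent(percept):
    """Map the current percept directly to an action via condition-action rules."""
    # Each rule is (condition over the current percept only, action).
    rules = [
        (lambda p: "glitter" in p, "pick"),
        (lambda p: "stench" in p,  "turn_left"),   # avoid walking into the wumpus
        (lambda p: "breeze" in p,  "turn_left"),   # avoid walking into a pit
        (lambda p: True,           "forward"),     # default rule
    ]
    # Execute the first rule whose left-hand side matches the current percept.
    for condition, action in rules:
        if condition(percept):
            return action

print(reflex_agent({"glitter"}))   # -> pick
print(reflex_agent({"breeze"}))    # -> turn_left
```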
Automata Agent (diagram): Sensors → Percept Interpretation Rules: percepts(t) ∧ model(t) → model'(t) → (Past and) Current Environment Model; Model Update Rules: model(t-1) ∧ model(t) ∧ model'(t) → model''(t); Goals; Action Choice Rules: model''(t) → action(t), action(t) ∧ model''(t) → model(t+1) → Effectors → Environment
Automata Agent • Rules associate percepts to actions indirectly, through the incremental construction of an environment model (the agent's internal state) • Action choice based on: • current percepts + previous percepts + previous actions + encapsulated knowledge of the initial environment state • Overcomes the reflex agent's limitations in partially observable, sequential and non-episodic environments: • Can integrate past and present perception to build a rich representation from partial observations • Can distinguish between distinct environment states that are indistinguishable by instantaneous sensor signals • Limitations: • No explicit representation of the agent's preferred environment states • For agents that must change goals many times to perform well, the automata architecture is not scalable (combinatorial explosion of rules) • (A Python sketch follows the rule examples below)
Automata Agent Rule Examples • Rules percept(t) ∧ model(t) → model'(t) • IF visualPercept at time T is glitter AND location of agent at time T is (X,Y) THEN location of gold at time T is (X,Y) • ∀X,Y,T see(glitter,T) ∧ loc(agent,X,Y,T) → loc(gold,X,Y,T). • Rules model'(t) → model''(t) • IF agent is with gold at time T AND location of agent at time T is (X,Y) THEN location of gold at time T is (X,Y) • ∀X,Y,T withGold(T) ∧ loc(agent,X,Y,T) → loc(gold,X,Y,T).
Automata Agent Rule Examples • Rules model(t) → action(t) • IF location of agent at time T = (X,Y) AND location of gold at time T = (X,Y) THEN choose action pick at time T • ∀X,Y,T loc(agent,X,Y,T) ∧ loc(gold,X,Y,T) → do(pick,T) • Rules action(t) ∧ model(t) → model(t+1) • IF chosen action at time T was pick THEN agent is with gold at time T+1 • ∀T done(pick,T) → withGold(T+1).
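Putting the rule examples together, a minimal Python sketch of the automata agent's perceive-update-act cycle; the dictionary-based environment model and helper names are illustrative assumptions:

```python
# Minimal sketch of an automata (model-based) agent; the model
# representation and helper names are illustrative assumptions.

class AutomataAgent:
    def __init__(self):
        # Internal state: (past and) current environment model.
        self.model = {"agent_loc": (1, 1), "gold_loc": None, "with_gold": False}

    def interpret_percept(self, percept):
        # percept(t) /\ model(t) -> model'(t)
        if "glitter" in percept:
            self.model["gold_loc"] = self.model["agent_loc"]

    def update_model_after_action(self, action):
        # action(t) /\ model(t) -> model(t+1)
        if action == "pick":
            self.model["with_gold"] = True

    def choose_action(self):
        # model(t) -> action(t)
        if (not self.model["with_gold"]
                and self.model["gold_loc"] == self.model["agent_loc"]):
            return "pick"
        return "forward"

    def step(self, percept):
        self.interpret_percept(percept)
        action = self.choose_action()
        self.update_model_after_action(action)
        return action

agent = AutomataAgent()
print(agent.step({"glitter"}))  # -> pick
print(agent.step(set()))        # -> forward (gold already held)
```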
(Explicit) Goal-Based Agent (diagram): Sensors → Percept Interpretation Rules: percept(t) ∧ model(t) → model'(t) → (Past and) Current Environment Model; Model Update Rules: model(t-1) ∧ model(t) ∧ model'(t) → model''(t); Goal Update Rules: model''(t) ∧ goals(t-1) → goals'(t) → Goals; Action Choice Rules: model''(t) ∧ goals'(t) → action(t), action(t) ∧ model''(t) → model(t+1) → Effectors → Environment
(Explicit) Goal-Based Agent • Principle: explicit and dynamically alterable goals • Pros: • More flexible and autonomous than the automata agent • Adapts its strategy to situation patterns summarized in its goals • Limitations: • When the current goal cannot be reached by the effect of a single action, it is unable to plan a sequence of actions • Does not make long-term plans • Does not handle multiple, potentially conflicting active goals • (A Python sketch follows the rule examples below)
Goal-Based Agent Rule Examples • Rule model(t) ∧ goal(t) → action(t) • IF goal of agent at time T is to return to (1,1) AND agent is in (X,Y) at time T AND orientation of agent is 90° at time T AND (X,Y+1) is safe at time T AND (X,Y+1) has not been visited until time T AND (X-1,Y) is safe at time T AND (X-1,Y) was visited before time T THEN choose action turn left at time T • ∀X,Y,T (∃N goal(T,loc(agent,1,1,T+N)) ∧ loc(agent,X,Y,T) ∧ orientation(agent,90,T) ∧ safe(loc(X,Y+1),T) ∧ ¬∃M loc(agent,X,Y+1,T-M) ∧ safe(loc(X-1,Y),T) ∧ ∃K loc(agent,X-1,Y,T-K)) → do(turn(left),T)
Goal-Based Agent Rule Examples • Rule model(t) ∧ goal(t) → action(t) • IF goal of agent at time T is to find gold AND agent is in (X,Y) at time T AND orientation of agent is 90° at time T AND (X,Y+1) is safe at time T AND (X,Y+1) has not been visited until time T AND (X-1,Y) is safe at time T AND (X-1,Y) was visited before time T THEN choose action forward at time T • ∀X,Y,T (∃N goal(T,withGold(T+N)) ∧ loc(agent,X,Y,T) ∧ orientation(agent,90,T) ∧ safe(loc(X,Y+1),T) ∧ ¬∃M loc(agent,X,Y+1,T-M) ∧ safe(loc(X-1,Y),T) ∧ ∃K loc(agent,X-1,Y,T-K)) → do(forward,T)
Goal-Based Agent Rule Examples • Rule model(t) ∧ goal(t) → goal'(t) // If the agent reached its goal to hold the gold, then its new goal shall be to go back to (1,1) • IF goal of agent at time T-1 was to find gold AND agent is with gold at time T THEN goal of agent at time T+1 is to be in location (1,1) • ∀T (∃N goal(agent,T-1,withGold(T+N)) ∧ withGold(T) → ∃M goal(agent,T,loc(agent,1,1,T+M))).
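A minimal Python sketch of the goal-based agent, combining a goal update rule with goal-conditioned action choice; the goal encoding and the simplified action choice are illustrative assumptions:

```python
# Minimal sketch of an explicit goal-based agent; goal names and map
# details are illustrative assumptions mirroring the rule examples above.

class GoalBasedAgent:
    def __init__(self):
        self.model = {"agent_loc": (1, 1), "with_gold": False}
        self.goal = "find_gold"                  # explicit, dynamically alterable goal

    def update_goal(self):
        # model(t) /\ goal(t) -> goal'(t): once the gold is held,
        # the new goal is to return to (1, 1).
        if self.goal == "find_gold" and self.model["with_gold"]:
            self.goal = "return_to_1_1"

    def choose_action(self, safe_unvisited_ahead):
        # model(t) /\ goal(t) -> action(t)
        if self.goal == "find_gold":
            return "forward" if safe_unvisited_ahead else "turn_left"
        # return_to_1_1: leave once back at the start, otherwise keep moving
        return "climb" if self.model["agent_loc"] == (1, 1) else "forward"

agent = GoalBasedAgent()
agent.model["with_gold"] = True
agent.update_goal()
print(agent.goal)                    # -> return_to_1_1
print(agent.choose_action(True))     # -> climb (already at (1, 1))
```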
Planning Agent (diagram): Sensors → Percept Interpretation Rules: percept(t) ∧ model(t) → model'(t) → (Past and) Current Environment Model; Model Update Rules: model(t-1) ∧ model(t) ∧ model'(t) → model''(t); Goal Update Rules: model''(t) ∧ goals(t-1) → goals'(t) → Goals; Prediction of Future Environments Rules: model''(t) → model(t+n), model''(t) ∧ action(t) → model(t+1) → Hypothetical Future Environment Models; Action Choice Rules: model(t+n) = result([action1(t),...,actionN(t+n)]) ∧ model(t+n) ⊨ goal(t) → do(action1(t)) → Effectors → Environment
Planning Agent • Percepts and actions associated very indirectly through: • Past and current environment model • Past and current explicit goals • Prediction of the future environments resulting from the different possible action sequences to execute • Rule chaining is needed to build an action sequence from rules that capture the immediate consequences of a single action • Pros: • Foresight allows choosing more relevant and safer actions in sequential environments • Cons: there is little point in building elaborate long-term plans in: • Highly non-deterministic environments (too many possibilities to consider) • Largely non-observable environments (not enough knowledge available before acting) • Asynchronous concurrent environments (only cheap reasoning can reach a conclusion under time pressure)
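A minimal Python sketch of the planning principle: chaining the one-step prediction rule into full action sequences with breadth-first search; the 4x4 grid world and the action effects are illustrative assumptions:

```python
# Minimal sketch of planning by rule chaining: the one-step prediction
# model''(t) /\ action(t) -> model(t+1) is chained into an action sequence
# that reaches the goal. The grid world is an illustrative assumption.
from collections import deque

def predict(state, action):
    """Immediate consequence of a single action on a hypothetical (x, y) state."""
    x, y = state
    moves = {"up": (x, y + 1), "down": (x, y - 1),
             "left": (x - 1, y), "right": (x + 1, y)}
    nx, ny = moves[action]
    return (nx, ny) if 1 <= nx <= 4 and 1 <= ny <= 4 else (x, y)

def plan(start, goal):
    """Breadth-first chaining of predictions until a state satisfying the goal."""
    frontier, seen = deque([(start, [])]), {start}
    while frontier:
        state, actions = frontier.popleft()
        if state == goal:
            return actions
        for action in ("up", "down", "left", "right"):
            nxt = predict(state, action)
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, actions + [action]))
    return None

print(plan((1, 1), (3, 2)))  # -> ['up', 'right', 'right']
```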
Hybrid Reflex-Planning Agent (diagram): Sensors and Effectors shared, under synchronization, by two threads: a Reflex Thread (Reflex Rules mapping Percepts to Actions) and a Planning Thread (Percept Interpretation, Current Model Update, Future Environments Prediction, Goal Update and Action Choice over a current, past and future environment model and Goals)
Hybrid Reflex-Planning Agent • Pros: • Takes advantage of all the time and knowledge available to choose the best possible action (within the limits of its prior knowledge and percepts) • Sophisticated yet robust • Cons: • Costly to develop • The same knowledge is encoded in different forms in each component • Global behavior coherence is harder to guarantee • Analysis and debugging are hard due to synchronization issues • Not that many environments feature large variations in the reasoning time available across different perception-reasoning-action cycles
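A minimal Python sketch of the hybrid idea: the reflex rule always provides a fallback action, and a slower planning thread overrides it only if it finishes within the cycle's time budget; the thread setup, timings and action names are illustrative assumptions:

```python
# Minimal sketch of a hybrid reflex-planning cycle; timings, thread usage
# and the planner stub are illustrative assumptions.
import threading
import time

def reflex_action(percept):
    return "pick" if "glitter" in percept else "forward"

def plan_action(percept, result):
    time.sleep(0.05)                       # stand-in for expensive deliberation
    result["action"] = "planned_move"      # hypothetical planner output

def act(percept, budget=0.1):
    result = {"action": reflex_action(percept)}   # fallback is ready immediately
    planner = threading.Thread(target=plan_action, args=(percept, result))
    planner.start()
    planner.join(timeout=budget)           # synchronization point of the cycle
    return result["action"]

print(act({"breeze"}))                 # planner finishes in time -> planned_move
print(act({"breeze"}, budget=0.01))    # budget too tight -> forward (reflex)
```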
Layered Agents • Many sensors/effectors are too fine-grained to reason about goals directly in terms of the data/commands they provide • Such cases require a layered agent that decomposes its reasoning into multiple abstraction layers • Each layer represents the percepts, environment model, goals, and actions at a different level of detail • Abstraction can consist of: • Discretizing, approximating, clustering, or classifying data from prior layers along temporal, spatial, functional, or social dimensions • Detailing can consist of: • Decomposing higher-level actions into lower-level ones along temporal, spatial, functional, or social dimensions • (Diagram: Perceive in Detail → Abstract → Decide Abstractly → Detail → Act in Detail)
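A minimal Python sketch of a two-layer agent: the lower layer abstracts raw sensor data, the upper layer decides abstractly, and the decision is decomposed back into low-level effector commands; the sensor readings, thresholds and command names are illustrative assumptions:

```python
# Minimal sketch of a layered agent; readings, thresholds and command
# names are illustrative assumptions.

def abstract(raw_distances_cm):
    """Layer 0 -> Layer 1: cluster fine-grained range readings into a symbol."""
    return "obstacle_ahead" if min(raw_distances_cm) < 30 else "path_clear"

def decide_abstractly(situation):
    """Layer 1: reason over abstract percepts and goals."""
    return "avoid" if situation == "obstacle_ahead" else "advance"

def detail(abstract_action):
    """Layer 1 -> Layer 0: decompose the abstract action into motor commands."""
    plans = {"avoid": ["stop", "turn_left_30deg", "forward_10cm"],
             "advance": ["forward_10cm"]}
    return plans[abstract_action]

readings = [120, 25, 80]                       # perceive in detail
print(detail(decide_abstractly(abstract(readings))))
# -> ['stop', 'turn_left_30deg', 'forward_10cm']  (act in detail)
```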
Layered Automata Agent (diagram): Environment → Sensors → Percept Interpretation (Layer 0, Layer 1, Layer 2) → Environment Model and Environment Model Update (per layer) → Action Choice and Execution Control (Layer 2, Layer 1, Layer 0) → Effectors → Environment
Abstraction Layer Examples (figure)
Utility-Based Agent • Principle: • Goals only express boolean agent preferences among environment states • A utility function u allows expressing finer-grained agent preferences • u can be defined on a variety of domains and ranges: • actions, i.e., u: action → ℝ (or [0,1]) • action sequences, i.e., u: [action1, ..., actionN] → ℝ (or [0,1]) • environment states, i.e., u: environmentStateModel → ℝ (or [0,1]) • environment state sequences, i.e., u: [state1, ..., stateN] → ℝ (or [0,1]) • (environment state, action) pairs, i.e., u: environmentStateModel x action → ℝ (or [0,1]) • (environment state, action) pair sequences, i.e., u: [(action1,state1), ..., (actionN,stateN)] → ℝ (or [0,1]) • Pros: • Allows solving optimization problems that aim to find the best solution • Allows trading off among multiple conflicting goals with distinct probabilities of being reached • Cons: • Currently available methods to compute (even approximately) argmax(u) do not scale up to large or diverse environments
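A minimal Python sketch of utility-based action choice over (environment state, action) pairs, picking argmax(u); the states, actions and utility values are illustrative assumptions:

```python
# Minimal sketch of utility-based action choice; the utility table is an
# illustrative assumption.

def utility(state, action):
    """u: environmentStateModel x action -> R (illustrative values)."""
    table = {
        ("gold_here", "pick"):      1.0,
        ("gold_here", "forward"):   0.2,
        ("breeze",    "forward"):  -0.8,   # risk of a pit
        ("breeze",    "turn_left"): 0.1,
    }
    return table.get((state, action), 0.0)

def choose_action(state, actions=("pick", "forward", "turn_left")):
    # argmax(u) over the available actions in the current state
    return max(actions, key=lambda a: utility(state, a))

print(choose_action("gold_here"))  # -> pick
print(choose_action("breeze"))     # -> turn_left
```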
Utility-Based Reflex Agent (diagram): Environment → Sensors → Percept Interpretation Rules: percept → actions; Goals; Action Choice: Utility Function u: actions → ℝ → Effectors → Environment
Utility-Based Planning Agent (diagram): Sensors → Percept Interpretation Rules: percept(t) ∧ model(t) → model'(t) → Past & Current Environment Model; Model Update Rules: model'(t) → model''(t); Future Environment Prediction Rules: model''(t) ∧ action(t) → model(t+1), model''(t) → model(t+1) → Hypothesized Future Environment Models; Utility Function u: model(t+n) → ℝ; Action Choice → Effectors → Environment
Adaptive Agent (diagram): Environment → Sensors → Acting Component (Reflex, Automata, Goal-Based, Planning, Utility-Based, or Hybrid) → Effectors, monitored by a Performance Analysis Component that feeds a Learning Component and a New Problem Generation Component • Learn rules or functions: • percept(t) → action(t) • percept(t) ∧ model(t) → model'(t) • model(t) → model'(t) • model(t-1) → model(t) • model(t) → action(t) • action(t) → model(t+1) • model(t) ∧ goal(t) → action(t) • goal(t) ∧ model(t) → goal'(t) • utility(action) = value • utility(model) = value
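A minimal Python sketch of the adaptive loop, here learning utility(action) = value from a reward signal produced by the performance analysis component; the bandit-style update rule and the reward values are illustrative assumptions:

```python
# Minimal sketch of an adaptive agent learning utility(action) = value;
# the update rule and rewards are illustrative assumptions.
import random

utilities = {"forward": 0.0, "turn_left": 0.0, "pick": 0.0}  # learned u(action)
counts = {a: 0 for a in utilities}

def act():
    # Acting component: mostly exploit current estimates, sometimes explore.
    if random.random() < 0.1:
        return random.choice(list(utilities))
    return max(utilities, key=utilities.get)

def learn(action, reward):
    # Learning component: incremental average of observed performance.
    counts[action] += 1
    utilities[action] += (reward - utilities[action]) / counts[action]

for _ in range(200):
    a = act()
    reward = 1.0 if a == "pick" else 0.0   # performance analysis (illustrative)
    learn(a, reward)

print(max(utilities, key=utilities.get))   # -> pick (with high probability)
```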
Simulated Environments • Environment simulators: • Often themselves internally follow an agent architecture • Should be able to simulate a large class of environments that can be specialized by setting many configurable parameters, either manually or randomly within a manually selected range • e.g., configure a generic Wumpus World simulator to generate world instances with a square-shaped cavern, a static wumpus and a single gold nugget, where the cavern size, pit numbers and locations, and wumpus and gold locations are randomly picked • Environment simulator processing cycle (sketched below): • Compute the percept of each agent in the current environment • Send these percepts to the corresponding agents • Receive the action chosen by each agent • Update the environment to reflect the cumulative consequences of all these actions
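A minimal Python sketch of that processing cycle; the world state, percept generation rule and update rule are illustrative assumptions:

```python
# Minimal sketch of the simulator processing cycle; world state and rules
# are illustrative assumptions.

def simulate(agents, world, steps=3):
    for _ in range(steps):
        # 1-2. Compute and send the percept of each agent in the current environment.
        percepts = {name: {"glitter": world["gold_loc"] == loc}
                    for name, loc in world["agent_locs"].items()}
        # 3. Receive the action chosen by each agent.
        actions = {name: agents[name](percepts[name]) for name in agents}
        # 4. Update the environment with the cumulative consequences of all actions.
        for name, action in actions.items():
            if action == "pick" and percepts[name]["glitter"]:
                world["gold_loc"] = None
        print(actions, world["gold_loc"])

world = {"agent_locs": {"a1": (2, 3)}, "gold_loc": (2, 3)}
agents = {"a1": lambda p: "pick" if p["glitter"] else "forward"}
simulate(agents, world)
```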
Environment Simulator Architecture (diagram): Agent Client 1 ... Agent Client N exchange percepts and actions over a network with the Environment Simulation Server, which holds the Simulated Environment Model, Environment Update Rules: model(t-1) ∧ action(t) → model(t), model(t-1) → model(t), Percept Generation Rules: model(t) → percept(t), and a Simulation Visualization GUI
KBA Architectures • Is a KBA (Knowledge-Based Agent) a: Reflex Agent? Automata Agent? Goal-Based Agent? Planning Agent? Hybrid Agent? Utility-Based Agent? Adaptive Agent? Layered Agent? • It can be any of them! • Is there any constraint between the reasoning performed by the inference engine and the agent architecture? • An adaptive agent requires an analogical or inductive inference engine
Non-Adaptive KBA (diagram): Environment → Sensors → Tell → Volatile Knowledge Base (VKB): facts, objects, constraints, logical formulas or probabilities representing the environment instance in the current agent execution; Persistent Knowledge Base (PKB): rules, classes, logical formulas or probabilities representing generic laws about the environment class; both queried via Ask / Tell / Retract by an Inference Engine for Deduction, Abduction, Inheritance, Belief Revision, Belief Update, Planning, Constraint Solving or Optimization (Non-Monotonic Engine) → Effectors
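A minimal Python sketch of the Tell / Ask / Retract cycle between the knowledge bases and the inference engine; the rule format and the naive forward-chaining engine are illustrative stand-ins for the reasoning services listed above:

```python
# Minimal sketch of a Tell/Ask/Retract interface over a PKB and a VKB;
# the rule format and forward-chaining engine are illustrative assumptions.

class KnowledgeBase:
    def __init__(self, rules):
        self.pkb = rules          # persistent KB: generic rules (premises -> conclusion)
        self.vkb = set()          # volatile KB: facts about the current environment instance

    def tell(self, fact):
        self.vkb.add(fact)

    def retract(self, fact):
        self.vkb.discard(fact)

    def ask(self, query):
        # Naive forward chaining over the PKB until no new fact is derived.
        derived = set(self.vkb)
        changed = True
        while changed:
            changed = False
            for premises, conclusion in self.pkb:
                if set(premises) <= derived and conclusion not in derived:
                    derived.add(conclusion)
                    changed = True
        return query in derived

kb = KnowledgeBase(rules=[(("see_glitter", "at_gold_square"), "do_pick")])
kb.tell("see_glitter")
kb.tell("at_gold_square")
print(kb.ask("do_pick"))   # -> True
kb.retract("see_glitter")
print(kb.ask("do_pick"))   # -> False
```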
Analogical KBA (diagram): Environment → Sensors → Tell → Volatile Knowledge Base (VKB): facts, objects, constraints, logical formulas or probabilities representing the environment instance in the current agent execution; Persistent Knowledge Base (PKB): facts, objects, constraints, logical formulas or probabilities representing environment instances from past agent executions, structured by a similarity measure; both accessed via Ask / Tell / Retract by an Inference Engine for Analogy → Effectors
Remember the Planning Agent? (diagram): recap of the planning agent architecture shown earlier (percept interpretation, model update, goal update, prediction of future environments, and action choice rules between Sensors and Effectors)
How would a knowledge-based planning agent then look? (diagram): a single Inference Engine mediating between Sensors and Effectors, operating over PKBs for Percept Interpretation, Environment Model Update, Goals Update, Prediction of Future Environments and the Acting Strategy, and over VKBs for the Past and Current Environment Models, the Goals and the Hypothetical Future Environment Models
Alternative Planning KBA Architecture (diagram): the same PKBs and VKBs, but with a dedicated inference engine per reasoning step: Inference Engine 1 for Percept Interpretation, Inference Engine 2 for Environment Model Update, Inference Engine 3 for Goals Update, Inference Engine 4 for Prediction of Future Environments, and Inference Engine 5 for the Acting Strategy
Why Use Multiple Inference Engines? (diagram): each step calls for a different reasoning task: Percept Interpretation uses Abduction (Inference Engine 1), Environment Model Update uses Belief Update (Inference Engine 2), Goals Update uses Deduction (Inference Engine 3), Prediction of Future Environments uses Constraint Solving (Inference Engine 4), and the Acting Strategy uses Optimization (Inference Engine 5)
Off-Line Inductive Agent: Training Phase (diagram): an Inductive Inference Engine performs Hypothesis Formation (Tell / Retract) on the Intentional Knowledge Base (IKB): rules, classes or logical formulas representing generic laws about the environment class, and Hypothesis Verification (Ask) against the Data, Examples or Case Base: facts, objects, constraints or logical formulas encoding a representative sample of environment entities; a Performance Inference Engine handles any reasoning task except analogy and induction
Off-Line Inductive Agent: Usage Phase (diagram): identical to the non-adaptive KBA, except that the Persistent Knowledge Base (PKB) of generic laws about the environment class is now the one inductively learned during the training phase; the Inference Engine for Deduction, Abduction, Inheritance, Belief Revision, Belief Update, Planning, Constraint Solving or Optimization again mediates, via Ask / Tell / Retract, between Sensors, the Volatile Knowledge Base (VKB) describing the current environment instance, and Effectors