Cognitive Architecture ICARUS: Mental Simulation and Learning

Mental Simulation and Learning in the ICARUS Architecture
Pat Langley
School of Computing and Informatics, Arizona State University, Tempe, Arizona USA
Thanks to D. Choi, G. Cleveland, A. Danielescu, N. Li, D. and D. Stracuzzi for their contributions. This talk reports research partly funded by a grant from the Office of Naval Research, which is not responsible for its contents.

Cognitive Architectures
A cognitive architecture (Newell, 1990) is the infrastructure for an intelligent system that is constant across domains:
• the memories that store domain-specific content
• the system’s representation and organization of knowledge
• the mechanisms that use this knowledge in performance
• the processes that learn this knowledge from experience
An architecture typically comes with a programming language that eases construction of knowledge-based systems. Research in this area incorporates many ideas from psychology about the nature of human thinking.

The ICARUS Architecture
ICARUS (Langley, 2006) is a computational theory of the human cognitive architecture that posits:
• Short-term memories are distinct from long-term stores
• Memories contain modular elements cast as symbolic structures
• Long-term structures are accessed through pattern matching
• Cognition occurs in retrieval/selection/action cycles
• Learning involves monotonic addition of elements to memory
• Learning is incremental and interleaved with performance
It shares these assumptions with other cognitive architectures like Soar (Laird et al., 1987) and ACT-R (Anderson, 1993).

Goals for ICARUS
Our main objectives in developing ICARUS are to produce:
• a computational theory of higher-level cognition in humans
• that is qualitatively consistent with results from psychology
• that exhibits as many distinct cognitive functions as possible
Although quantitative fits to specific results are desirable, they can distract from achieving broad theoretical coverage.

Distinctive Features of ICARUS
However, ICARUS also makes assumptions that distinguish it from these architectures:
• Cognition is grounded in perception and action
• Categories and skills are separate cognitive entities
• Short-term elements are instances of long-term structures
• Skills and concepts are organized in a hierarchical manner
• Inference and execution are more basic than problem solving
Some of these tenets also appear in Bonasso et al.’s (2003) 3T, Freed’s (1998) APEX, and Sun et al.’s (2001) CLARION.

Cascaded Integration in ICARUS
Like other unified cognitive architectures, ICARUS incorporates a number of distinct modules:
• learning
• problem solving
• skill execution
• conceptual inference
ICARUS adopts a cascaded approach to integration in which lower-level modules produce results for higher-level ones.

Structure and Use of Conceptual Memory ICARUS organizes conceptual memory in a hierarchical manner. Conceptual inference occurs from the bottom up, starting from percepts to produce high-level beliefs about the current state.

ICARUS Concepts for In-City Driving
((in-rightmost-lane ?self ?clane)
 :percepts  ((self ?self) (segment ?seg)
             (line ?clane segment ?seg))
 :relations ((driving-well-in-segment ?self ?seg ?clane)
             (last-lane ?clane)
             (not (lane-to-right ?clane ?anylane))))
((driving-well-in-segment ?self ?seg ?lane)
 :percepts  ((self ?self) (segment ?seg)
             (line ?lane segment ?seg))
 :relations ((in-segment ?self ?seg)
             (in-lane ?self ?lane)
             (aligned-with-lane-in-segment ?self ?seg ?lane)
             (centered-in-lane ?self ?seg ?lane)
             (steering-wheel-straight ?self)))
((in-lane ?self ?lane)
 :percepts  ((self ?self segment ?seg)
             (line ?lane segment ?seg dist ?dist))
 :tests     ((> ?dist -10) (<= ?dist 0)))
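
The bottom-up inference described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the ICARUS implementation: the function names, the dictionary format for percepts, and the higher-level concept `in-last-lane` are hypothetical. Only the numeric tests on `dist` are taken from the `in-lane` concept.

```python
def infer_in_lane(percepts):
    """Primitive concept: apply the in-lane tests to each line percept."""
    me = percepts["self"]
    return {("in-lane", me, line["id"])
            for line in percepts["lines"]
            if -10 < line["dist"] <= 0}  # (> ?dist -10) and (<= ?dist 0)


def infer_in_last_lane(beliefs):
    """Hypothetical higher-level concept defined only over existing beliefs."""
    last_lanes = {b[1] for b in beliefs if b[0] == "last-lane"}
    return {("in-last-lane", b[1], b[2])
            for b in beliefs
            if b[0] == "in-lane" and b[2] in last_lanes}


def infer_beliefs(percepts, given_beliefs):
    """Bottom-up pass: primitive concepts first, then concepts built on them."""
    beliefs = set(given_beliefs) | infer_in_lane(percepts)
    beliefs |= infer_in_last_lane(beliefs)
    return beliefs


percepts = {"self": "me",
            "lines": [{"id": "g599", "dist": -4.0},
                      {"id": "g601", "dist": 6.0}]}
beliefs = infer_beliefs(percepts, {("last-lane", "g599")})
```

Here the percept for line g599 passes the distance tests, so the agent first infers the primitive belief and then the belief that depends on it.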

Representing Short-Term Beliefs/Goals
(current-street me A)
(current-segment me g550)
(lane-to-right g599 g601)
(first-lane g599)
(last-lane g599)
(last-lane g601)
(at-speed-for-u-turn me)
(slow-for-right-turn me)
(steering-wheel-not-straight me)
(centered-in-lane me g550 g599)
(in-lane me g599)
(in-segment me g550)
(on-right-side-in-segment me)
(intersection-behind g550 g522)
(building-on-left g288)
(building-on-left g425)
(building-on-left g427)
(building-on-left g429)
(building-on-left g431)
(building-on-left g433)
(building-on-right g287)
(building-on-right g279)
(increasing-direction me)
(buildings-on-right g287 g279)

Skill Execution in ICARUS Skill execution occurs from the top down, starting from goals to find applicable paths through the skill hierarchy. This process repeats on each cycle to produce goal-directed but reactive behavior, biased toward continuing initiated skills.

ICARUS Skills for In-City Driving
((in-rightmost-lane ?self ?line)
 :percepts ((self ?self) (line ?line))
 :start    ((last-lane ?line))
 :subgoals ((driving-well-in-segment ?self ?seg ?line)))
((driving-well-in-segment ?self ?seg ?line)
 :percepts ((segment ?seg) (line ?line) (self ?self))
 :start    ((steering-wheel-straight ?self))
 :subgoals ((in-segment ?self ?seg)
            (centered-in-lane ?self ?seg ?line)
            (aligned-with-lane-in-segment ?self ?seg ?line)
            (steering-wheel-straight ?self)))
((in-segment ?self ?endsg)
 :percepts ((self ?self speed ?speed)
            (intersection ?int cross ?cross)
            (segment ?endsg street ?cross angle ?angle))
 :start    ((in-intersection-for-right-turn ?self ?int))
 :actions  ((steer 1)))
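
The top-down search for an applicable path through the skill hierarchy can be sketched in Python. This is a simplified illustration, not the architecture's code: goals and beliefs are reduced to predicate names without variables, the `SKILLS` table and `find_path` function are hypothetical, and the bias toward continuing initiated skills is omitted.

```python
# Hypothetical, variable-free version of the driving skills shown above.
SKILLS = {
    "in-rightmost-lane": {
        "start": ["last-lane"],
        "subgoals": ["driving-well-in-segment"],
    },
    "driving-well-in-segment": {
        "start": ["steering-wheel-straight"],
        "subgoals": ["in-segment", "centered-in-lane"],
    },
    "in-segment": {
        "start": ["in-intersection-for-right-turn"],
        "actions": ["(steer 1)"],
    },
}


def find_path(goal, beliefs, skills):
    """Return (path, actions) for an applicable path below goal, else None."""
    if goal in beliefs:
        return None  # goal already satisfied, nothing to execute
    skill = skills.get(goal)
    if skill is None or not all(c in beliefs for c in skill["start"]):
        return None  # no skill for this goal, or start conditions unmet
    if "actions" in skill:
        return [goal], skill["actions"]  # primitive skill ends the path
    for subgoal in skill["subgoals"]:
        if subgoal in beliefs:
            continue  # skip subgoals that already hold
        found = find_path(subgoal, beliefs, skills)
        if found is None:
            return None  # first unsatisfied subgoal is not achievable now
        path, actions = found
        return [goal] + path, actions
    return None


beliefs = {"last-lane", "steering-wheel-straight",
           "in-intersection-for-right-turn"}
result = find_path("in-rightmost-lane", beliefs, SKILLS)
```

Calling `find_path` again on each cycle with updated beliefs would give behavior that is goal-directed yet reactive, since the selected path can change whenever the beliefs do.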

Execution and Problem Solving in ICARUS
[Figure: flow diagram with elements labeled Problem, Skill Hierarchy, Primitive Skills, Reactive Execution, Problem Solving, and Executed plan, with an "impasse?" decision that has yes and no branches.]
Problem solving involves means-ends analysis that chains backward over skills and concept definitions, executing skills whenever they become applicable.

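ICARUS Learns Skills from Problem Solving
[Figure: the same flow diagram with elements labeled Problem, Skill Hierarchy, Primitive Skills, Reactive Execution, Problem Solving, and Executed plan, an "impasse?" decision with yes and no branches, and an added Skill Learning element.]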

Learning from Problem Solutions
ICARUS incorporates a mechanism for learning new skills that:
• operates whenever problem solving overcomes an impasse
• incorporates only information available from the goal stack
• generalizes beyond the specific objects concerned
• depends on whether chaining involved skills or concepts
• supports cumulative learning and within-problem transfer
This skill creation process is fully interleaved with means-ends analysis and execution. Learned skills carry out forward execution in the environment rather than backward chaining in the mind.

ICARUS Summary
ICARUS is a unified theory of the cognitive architecture that:
• includes hierarchical memories for concepts and skills;
• interleaves conceptual inference with reactive execution;
• resorts to problem solving when it lacks routine skills;
• learns such skills from successful resolution of impasses.
We have developed ICARUS agents for a variety of simulated physical environments, including urban driving. However, the architecture has a number of limitations that we must address to improve its coverage of human intelligence.

Limitations of ICARUS’ Learning Abilities
ICARUS provides a plausible account for learning hierarchical skills from successful problem solving. Recent work (Li et al., in press) has adapted this mechanism to learn from worked-out problem solutions by:
• storing states that arise in each step of the given solution
• using means-ends analysis to explain why each step occurred
• acquiring a new skill for each subproblem explained this way
However, ICARUS cannot learn from mistakes, such as those that result from unexpected goal interactions.

Goal-Driven Execution: A Recipe for Disaster
ICARUS incorporates a goal memory that contains a prioritized set of top-level goals. On each cycle, the architecture notes the most important goal not satisfied by its current beliefs. This goal determines which path through the skill hierarchy ICARUS selects for execution. As a result, the system ignores already satisfied goals while working on this objective. However, unforeseen interactions among goals can produce undesirable outcomes.
• For instance, suddenly changing lanes to avoid a stalled vehicle can lead to collision with another one.
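
The goal-selection step can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `select_goal`, the list-based goal memory, and the goal names are assumptions, not part of ICARUS.

```python
def select_goal(goals, beliefs):
    """Return the highest-priority goal not satisfied by beliefs, or None.

    goals is assumed to be ordered from most to least important.
    """
    for goal in goals:
        if goal not in beliefs:
            return goal
    return None


# Hypothetical goal memory: collision avoidance outranks obstacle avoidance.
goals = ["no-collision", "clear-of-stalled-vehicle"]
beliefs = {"no-collision"}
focus = select_goal(goals, beliefs)
```

Because "no-collision" already holds, execution focuses entirely on the lower-priority goal, and nothing in this step checks whether pursuing it will undo the satisfied one.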

Learning from Goal Violations
An extended ICARUS that learns from unforeseen events might:
• Encounter a situation in which pursuing goal A leads it to violate previously satisfied goal B.
• Use counterfactual reasoning to identify what it could have done differently to avoid the error.
• Analyze the alternative to acquire a specialized skill indexed by goals A and B.
• In future runs, prefer the specialized skill during execution, leading it to avoid the error.
Implementing this approach requires three basic extensions to the ICARUS architecture.

An Episodic Belief Memory
Before it can analyze the reasons why an error occurred, ICARUS must encode its previous experience. We have introduced an episodic belief memory (Stracuzzi et al., in press) that:
• retains all beliefs inferred on earlier cognitive cycles; and
• annotates beliefs with time stamps specifying when they held.
These let the architecture reconstruct states that the agent has encountered recently. The current implementation has no mechanisms for forgetting or retrieval, but we plan to add these in the future.
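
One way to picture a time-stamped belief store is the Python sketch below. It is an assumption-laden illustration rather than the mechanism reported by Stracuzzi et al.: the class `EpisodicBeliefMemory`, its methods, and the interval encoding are hypothetical, and `record` is assumed to be called once per cycle with increasing cycle numbers.

```python
class EpisodicBeliefMemory:
    """Stores every belief with the inclusive cycle intervals in which it held."""

    def __init__(self):
        self._intervals = {}  # belief -> list of [start_cycle, end_cycle]

    def record(self, cycle, beliefs):
        """Record the beliefs inferred on one cycle."""
        for belief in beliefs:
            spans = self._intervals.setdefault(belief, [])
            if spans and spans[-1][1] == cycle - 1:
                spans[-1][1] = cycle  # belief persisted from the last cycle
            else:
                spans.append([cycle, cycle])  # belief newly (re)appeared

    def intervals(self, belief):
        """Return the time stamps for one belief as (start, end) tuples."""
        return [tuple(span) for span in self._intervals.get(belief, [])]

    def state_at(self, cycle):
        """Reconstruct the set of beliefs that held on a past cycle."""
        return {belief for belief, spans in self._intervals.items()
                if any(start <= cycle <= end for start, end in spans)}


in_lane = ("in-lane", "me", "g599")
slow = ("slow-for-right-turn", "me")
straight = ("steering-wheel-straight", "me")

memory = EpisodicBeliefMemory()
memory.record(1, {in_lane, slow})
memory.record(2, {in_lane})
memory.record(3, {in_lane, straight})
```

Nothing is ever deleted here, which matches the stated lack of a forgetting mechanism.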

Learning from Counterfactual Reasoning
Before it can learn what it should have done differently, ICARUS must identify an alternative behavioral trajectory. We have developed a counterfactual reasoning capability that:
• works backward from the violated goal to consider the agent’s choices at each step;
• carries out repeated forward search to find a path that would have avoided the goal violation; and
• analyzes this path to create a new skill that takes both goals into account.
Because analysis starts with the conjoined goal, it produces a new skill with a specialized head and preconditions.

A Trace of Counterfactual Reasoning
[Figure: search trace showing two failed attempts and one successful attempt. Node labels include avoid-obstacles, on-left-side, on-right-side, lane-aligned-straight, crossing-into-left-lane-straight, crossing-into-left-lane, crossing-into-right-lane, wheels-straight, and throttle-special-value.]

A Specificity Bias for Skill Execution
For ICARUS to benefit from skills learned by its counterfactual reasoning, it must prefer them over ones that caused errors. We have altered the architecture’s execution module to prefer:
• skills with more specific heads that match top-level goals
• skills with more specific conditions that match the state
These lead ICARUS to mask skills indexed by single goals with ones that handle goal interactions. This in turn lets the system improve its ability to avoid errors in an incremental, cumulative manner.
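
The specificity preference can be sketched in Python as a ranking over applicable skills. This is only one plausible reading, under assumptions: specificity is measured by counting head goals and conditions, and the skill names, fields, and `choose_skill` function are hypothetical.

```python
def choose_skill(skills, active_goals, beliefs):
    """Pick the applicable skill with the most specific head, then conditions."""
    applicable = [s for s in skills
                  if set(s["head"]) <= active_goals
                  and set(s["conditions"]) <= beliefs]
    if not applicable:
        return None
    # More head goals matched wins; ties go to more matched conditions.
    return max(applicable,
               key=lambda s: (len(s["head"]), len(s["conditions"])))


# Hypothetical skills: one indexed by a single goal, one learned for two goals.
general = {"name": "swerve-left",
           "head": ["avoid-obstacles"],
           "conditions": ["obstacle-ahead"]}
learned = {"name": "swerve-left-when-clear",
           "head": ["avoid-obstacles", "avoid-collisions"],
           "conditions": ["obstacle-ahead", "left-lane-clear"]}

goals = {"avoid-obstacles", "avoid-collisions"}
chosen = choose_skill([general, learned], goals,
                      {"obstacle-ahead", "left-lane-clear"})
```

When both skills apply, the learned one masks the general one; when its extra condition does not hold, the general skill remains available.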

Related Work on Error-Driven Learning
Our approach to learning from execution errors differs from, but bears similarities to:
• Learning search-control rules by discrimination in SAGE (Langley, 1985)
• Analytical learning from failure in Soar (Laird et al., 1986) and Prodigy (Minton, 1988)
• Ohlsson’s (1996) theory of learning from constraint violations
• Mueller and Dyer’s (1985) model of learning by daydreaming
The latter comes closest to our use of counterfactual reasoning, but it was not cast within a unified cognitive architecture.

Research Plans: Reasoning about Others
We designed ICARUS to model intelligent behavior in embodied agents, but our work to date has treated them in isolation. The framework can deal with other independent agents, but only by viewing them as other objects in the environment. Yet people can reason more deeply about the goals and actions of others, then use their inferences to make decisions. Adding this capability to ICARUS will require extending its representation, performance processes, and learning methods.

An Extended Representation
For ICARUS to reason about other agents’ mental states, it must first represent and store them. We plan to introduce modal predicates like belief, goal, and intention to qualify literals, as in:
• (goal me (in-left-lane me segment16))
• (belief me (goal driver2 (in-right-lane driver2 segment16)))
• (belief me (belief driver2 (in-right-lane me segment16)))
• (goal me (belief driver2 (goal me (in-left-lane me segment16))))
This scheme eliminates the need for separate goal and belief memories, so a single ‘working memory’ will suffice. We can also include time stamps with each substructure to indicate its temporal scope.
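
The embedded modal literals above map naturally onto nested tuples. The Python sketch below shows one such encoding; the helper names `modal_depth` and `attributed` are hypothetical, and time stamps are left out.

```python
MODALS = {"belief", "goal", "intention"}

# The four example literals, stored in a single working memory.
working_memory = {
    ("goal", "me", ("in-left-lane", "me", "segment16")),
    ("belief", "me", ("goal", "driver2", ("in-right-lane", "driver2", "segment16"))),
    ("belief", "me", ("belief", "driver2", ("in-right-lane", "me", "segment16"))),
    ("goal", "me", ("belief", "driver2", ("goal", "me", ("in-left-lane", "me", "segment16")))),
}


def modal_depth(literal):
    """Count how many modal predicates wrap the innermost literal."""
    depth = 0
    while literal[0] in MODALS:
        depth += 1
        literal = literal[2]
    return depth


def attributed(memory, *prefix):
    """Return the contents of literals whose leading (modal, agent) pairs match."""
    results = set()
    for literal in memory:
        content = literal
        for modal, agent in prefix:
            if content[0] == modal and content[1] == agent:
                content = content[2]
            else:
                break
        else:
            results.add(content)
    return results


driver2_goals = attributed(working_memory, ("belief", "me"), ("goal", "driver2"))
```

A query such as `attributed(working_memory, ("belief", "me"), ("goal", "driver2"))` retrieves what the agent believes the other driver wants, without needing a separate goal memory.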

A Flexible Inference Mechanism
The current ICARUS inference process is both deductive and exhaustive, making it implausible and ineffective. The revised architecture will carry out hill climbing through a space of possible worlds (truth assignments on ground literals). Each step will involve changing an existing literal’s truth value or generating an entirely new literal.
• ICARUS will guide its inferential choices either by posterior probabilities or by expected values.
• The system will also take into account recency of elements matched by consequents or antecedents.
This approach is influenced by Polyscheme, Markov logic, and theories of spreading activation.
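
Hill climbing over truth assignments can be sketched in Python as follows. This is a toy under explicit assumptions, not the planned mechanism: the score is the summed weight of satisfied ground rules, loosely in the spirit of Markov logic and standing in for posterior probability or expected value; recency is ignored; and all names, rules, and weights are hypothetical.

```python
def score(world, rules):
    """Sum the weights of rules not violated by this truth assignment."""
    total = 0.0
    for weight, antecedents, consequent in rules:
        violated = (all(world.get(a, False) for a in antecedents)
                    and not world.get(consequent, False))
        if not violated:
            total += weight
    return total


def hill_climb(world, rules, evidence, max_steps=100):
    """Greedily flip or add one non-evidence literal per step while score rises."""
    world = dict(world)
    candidates = {lit for _, ants, cons in rules
                  for lit in (*ants, cons)} - set(evidence)
    for _ in range(max_steps):
        best, best_score = None, score(world, rules)
        for lit in sorted(candidates):
            neighbor = dict(world)
            neighbor[lit] = not world.get(lit, False)  # flip, or add as true
            neighbor_score = score(neighbor, rules)
            if neighbor_score > best_score:
                best, best_score = neighbor, neighbor_score
        if best is None:
            return world  # local maximum reached
        world = best
    return world


i_see = ("sees", "me", "truck")
they_see = ("sees", "driver2", "truck")
they_believe = ("belief", "driver2", "truck-ahead")

rules = [(2.0, [i_see], they_see),        # others see what I see (default)
         (1.0, [they_see], they_believe)]  # seeing leads to believing
final = hill_climb({i_see: True}, rules, evidence={i_see})
```

Starting from the single observed literal, each step adds the literal that most improves the score, so the agent ends with a default conclusion about the other driver's belief without enumerating every possible inference.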

Default Reasoning and Revisions
Given basic inference rules, these changes should let ICARUS make abductive leaps about others’ mental states. The agent’s initial statements about others’ beliefs will be the same as those for the agent. But additional information can lead the system to revise these assumptions nonmonotonically when needed.
• E.g., we assume that others can see what we see, then alter these beliefs if we note evidence otherwise.
This explains why making inferences about others often takes extra time and effort.

Learning to Reason about Others
Reasoning about others comes more easily to the experienced than to children and novices. We can explain this with a mechanism that learns inference rules from empirical regularities among beliefs by:
• Generating new structures based on co-occurrences of literals in working memory; and
• Updating probabilities associated with antecedents and rules based on later co-occurrences.
When these specialized rules drive inference, they mask more basic ones, reducing the need for later revisions. This causes more direct inferences about others’ mental states, thus reaching conclusions with less time and effort.
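
The idea of learning rules from co-occurrences can be sketched in Python with simple counts. This is a hedged illustration, not the planned mechanism: the class `RuleLearner`, the single-antecedent rule form, and the relative-frequency estimate are all assumptions.

```python
from collections import Counter
from itertools import permutations


class RuleLearner:
    """Tracks how often pairs of literals co-occur in working memory."""

    def __init__(self):
        self.antecedent_counts = Counter()  # literal -> snapshots containing it
        self.pair_counts = Counter()        # (antecedent, consequent) -> count

    def observe(self, literals):
        """Update counts from one snapshot of working memory."""
        literals = set(literals)
        self.antecedent_counts.update(literals)
        self.pair_counts.update(permutations(sorted(literals), 2))

    def probability(self, antecedent, consequent):
        """Estimate P(consequent | antecedent) from co-occurrence counts."""
        seen = self.antecedent_counts[antecedent]
        if seen == 0:
            return 0.0
        return self.pair_counts[(antecedent, consequent)] / seen


signal = ("signaling-left", "driver2")
wants_left = ("goal", "driver2", ("in-left-lane", "driver2"))

learner = RuleLearner()
learner.observe({signal, wants_left})
learner.observe({signal, wants_left})
learner.observe({signal})
```

After these three snapshots, the candidate rule from `signal` to `wants_left` has an estimated probability of two thirds, and later snapshots would keep adjusting it.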

Summary of Planned Research
To provide ICARUS with the capability to reason about others’ mental states, we plan to:
• Extend its representation to support embedded modal literals;
• Alter inference to hill climb through possible worlds guided by recencies and probabilities;
• Combine default reasoning about others with nonmonotonic revision when appropriate; and
• Acquire specialized inference rules from experience to reduce the need for such belief revision.
We will implement these extensions to ICARUS and test them in urban driving and other settings.

End of Presentation
