SOAR
CIS 479/579
Bruce R. Maxim
UM-Dearborn

Behavior Modeling
• Production rules can be used to model problem-solving behavior
• Short-term memory holds working memory (WM) assertions
• Long-term memory holds production rules
• Derived from protocol analysis (e.g., studying transcripts of recorded “think out loud” problem-solving sessions)

Behavior Modeling
• State of knowledge
    • what the subject knows
• Problem behavior graph
    • trace of subject moving through states of knowledge
• Problem solving
    • search through network of problem states

SOAR
• Starts with an initial situation and moves toward the goal state
• Establishes a preference net to determine rankings of various choices
    • absolute preferences connect states to acceptable, rejected, best, and worst nodes
    • relative preferences connect states to better, worse, and indifferent nodes

SOAR
• Preference labels and links are translated into dominance relations among states
• Dominance relations are used to select the next current state
• If the goal state is not found, the process begins again
• SOAR does no conflict resolution; it fires all applicable rules
• Rules generally focus on replacing the current operator being used to discover states

SOAR Algorithm
To determine the preferred state using the SOAR automatic preference analyzer
• Collect all states that are labeled acceptable
• Discard all acceptable states that are also labeled rejected
• Determine the dominance relations as follows:
    • State A dominates state B if there is a better link from A to B but no better link from B to A
    • State A dominates state B if there is a worse link from B to A but no worse link from A to B
    • A state labeled best, and not dominated by any other state, dominates all other states
    • A state labeled worst, which does not dominate any other state, is dominated by all other states

SOAR Algorithm
• Discard all dominated states
• Select the next state from among those remaining as follows:
    • If only one other state remains, select it.
    • Otherwise, if no states remain, select the current state, unless it is marked rejected
    • Otherwise, if all states are connected by indifferent links,
        • If the current state is one of the remaining states, keep it.
        • Otherwise, select a state at random
    • Otherwise, announce an impasse

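The preference analysis steps above can be sketched in Python. This is a minimal illustration, not SOAR's actual implementation: the function name, the dictionary and pair encodings, and the use of `None` for an impasse are all assumptions made for the example. Two further assumptions are marked in the comments: best and worst labels are judged against the dominance derived from better and worse links only, and a rejected current state with nothing else remaining is treated as an impasse.

```python
import random
from itertools import combinations


def select_next_state(current, labels, better=(), worse=(), indifferent=(), rng=random):
    """Return the next state, or None to signal an impasse.

    labels:      dict mapping state -> set of absolute labels
    better:      (a, b) pairs, meaning a "better" link from a to b
    worse:       (a, b) pairs, meaning a "worse" link from a to b
    indifferent: (a, b) pairs, treated as symmetric
    """
    better, worse = set(better), set(worse)
    indiff = {frozenset(pair) for pair in indifferent}

    # Collect acceptable states, discarding those also labeled rejected.
    candidates = {s for s, tags in labels.items()
                  if "acceptable" in tags and "rejected" not in tags}

    # Dominance derived from better and worse links.
    dominates = set()
    for a in candidates:
        for b in candidates - {a}:
            if (a, b) in better and (b, a) not in better:
                dominates.add((a, b))
            if (b, a) in worse and (a, b) not in worse:
                dominates.add((a, b))

    # Assumption: best/worst are judged against link-derived dominance only.
    dominated = {b for _, b in dominates}
    dominators = {a for a, _ in dominates}
    for s in candidates:
        if "best" in labels[s] and s not in dominated:
            dominates |= {(s, other) for other in candidates - {s}}
        if "worst" in labels[s] and s not in dominators:
            dominates |= {(other, s) for other in candidates - {s}}

    # Discard all dominated states.
    remaining = candidates - {b for _, b in dominates}

    if len(remaining) == 1:
        return next(iter(remaining))
    if not remaining:
        # Assumption: a rejected current state with nothing left is an impasse.
        return None if "rejected" in labels.get(current, ()) else current
    if all(frozenset(pair) in indiff for pair in combinations(remaining, 2)):
        return current if current in remaining else rng.choice(sorted(remaining))
    return None  # impasse


labels = {"A": {"acceptable"}, "B": {"acceptable"}, "C": {"acceptable", "rejected"}}
print(select_next_state("S0", labels, better=[("A", "B")]))  # A
```

In the usage line, C is discarded as rejected and A dominates B through the better link, so only A remains. Calling the function with the same labels and no links returns `None`, because A and B are neither ordered nor indifferent.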
Hello World Rule
#################################################
# From Chapter 2 of Soar 8 Tutorial 2
### This rule writes "Hello World" and halts.
sp {hello-world
   (state <s> ^type state)
-->
   (write |Hello World|)
   (halt)
}

SOAR Syntax
• sp = soar production
• Body of rule delimited by { }
• Rule name is hello-world
• --> separates the if and then parts of a rule statement
• Conditions and actions delimited by ( )
• Identifiers = a..z, 0..9, #, -, *
• Variables = <identifier>

Working Memory
• Contains the dynamic information for SOAR entities
    • Sensor data
    • Calculations
    • Current operators
    • Goals
• Collections of WM elements describing the same entity are called objects

How does hello-world work?
• The first condition always starts with the word state
• <s> is a variable, so it can match any identifier
• So the interpretation of (state <s> ^type state) is “if I exist”
• Every state has the ^type state attribute, so the condition matches and the actions are carried out

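To make the matching of (state <s> ^type state) concrete, here is a small Python sketch that matches rule conditions against working memory elements stored as (identifier, attribute, value) triples. It is only an illustration of variable binding, not how SOAR matches rules internally. The function names and the identifiers S1 and I1 are hypothetical, and the sketch ignores the extra requirement that the leading word state restricts <s> to state identifiers.

```python
def is_variable(term):
    """Variables are written <name>, as in the rules on the slides."""
    return term.startswith("<") and term.endswith(">")


def match_condition(condition, wme, bindings):
    """Match one (id, attribute, value) pattern against one WM element.

    Returns the extended bindings, or None if the element does not match.
    """
    extended = dict(bindings)
    for pattern, actual in zip(condition, wme):
        if is_variable(pattern):
            # A variable must keep the same value everywhere in the rule.
            if extended.setdefault(pattern, actual) != actual:
                return None
        elif pattern != actual:
            return None
    return extended


def match_rule(conditions, working_memory):
    """Return every consistent set of variable bindings for the conditions."""
    results = [{}]
    for condition in conditions:
        next_results = []
        for bindings in results:
            for wme in sorted(working_memory):
                extended = match_condition(condition, wme, bindings)
                if extended is not None:
                    next_results.append(extended)
        results = next_results
    return results


# Hypothetical working memory: a single state S1 with an io object I1.
working_memory = {("S1", "type", "state"), ("S1", "io", "I1")}
hello_world_conditions = [("<s>", "type", "state")]
print(match_rule(hello_world_conditions, working_memory))  # [{'<s>': 'S1'}]
```

Because the bindings are rebuilt from scratch on every call to `match_rule`, a variable such as <s> has no meaning outside the rule it appears in.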
SOAR Operator
• Can do things in either the real world or in the mind of the SOAR agent
• Operators have two basic parts
    • Proposal rule used to determine when an operator can be applied
    • Application rule that actually performs the work for the operator

Operator Proposal Rule
#################################################
# From Chapter 3 of Soar 8 Tutorial 2
#
### This operator writes "Hello World" and halts.
sp {propose*hello-world
   (state <s> ^type state)
-->
   (<s> ^operator <o> +)
   (<o> ^name hello-world)
}

Syntax
(<s> ^operator <o> +)
• This action creates a preference for the new operator in WM
• The + marks it as an acceptable preference
(<o> ^name hello-world)
• This action creates a working memory element that holds the name of the operator
• Variable scope is limited to a single rule

Operator Application Rule
#################################################
# From Chapter 3 of Soar 8 Tutorial
#
### This operator writes "Hello World" and halts.
sp {apply*hello-world
   (state <s> ^operator <o>)
   (<o> ^name hello-world)
-->
   (write |Hello World|)
   (halt)
}

English Versions
• Propose*hello-world
    If I exist, then propose the hello-world operator
• Apply*hello-world
    If the hello-world operator is selected, then write “Hello World” and stop

SOAR Execution Cycle
1. Propose operators (rules)
2. Select an operator (decision procedure)
3. Apply the operator (rules)
4. Go to step 1

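The cycle above can be sketched as a plain Python loop. This is a simplified illustration under stated assumptions: rules are modeled as ordinary Python functions, the state is a dictionary, and `run_agent`, the `halted` flag, and the trivial selection function are hypothetical names invented for the example rather than part of SOAR.

```python
def run_agent(state, propose_rules, apply_rules, select, max_cycles=10):
    """Repeat propose -> select -> apply until halted or nothing is proposed."""
    cycles = 0
    while cycles < max_cycles and not state.get("halted"):
        # Step 1: every proposal rule may suggest operators.
        proposed = [op for rule in propose_rules for op in rule(state)]
        if not proposed:
            break
        # Step 2: the decision procedure picks one operator.
        operator = select(proposed)
        # Step 3: application rules carry out the selected operator.
        for rule in apply_rules:
            rule(state, operator)
        cycles += 1  # Step 4: go back to step 1.
    return cycles


def propose_hello_world(state):
    """If I exist, propose the hello-world operator."""
    return [{"name": "hello-world"}] if state.get("type") == "state" else []


def apply_hello_world(state, operator):
    """If hello-world is selected, write "Hello World" and stop."""
    if operator["name"] == "hello-world":
        state["output"].append("Hello World")
        state["halted"] = True


state = {"type": "state", "output": []}
cycles = run_agent(state, [propose_hello_world], [apply_hello_world],
                   select=lambda operators: operators[0])
print(cycles, state["output"])  # 1 ['Hello World']
```

Selection here simply takes the first proposal. A fuller sketch would replace that lambda with a preference-based decision procedure.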
Eater Move-North
############################ Move-north operator
# From Chapter 2 of Soar 8 Tutorial 2
#
# Propose*move-north:
# If I exist, then propose the move-north
# operator.
sp {propose*move-north
   (state <s> ^type state)
-->
   (<s> ^operator <o> +)
   (<o> ^name move-north)}

Eater Move-North
# Apply*move-north:
# If the move-north operator is selected, then
# generate an output command to
# move north.
sp {apply*move-north
   (state <s> ^operator <o>
              ^io <io>)
   (<io> ^output-link <ol>)
   (<o> ^name move-north)
-->
   (<ol> ^move <move>)
   (<move> ^direction north)}

Syntax
• ^io <io>
    • This adds a condition that the state has an io attribute, whose value is bound to <io>
    • The rule uses <io> to get the output link <ol>, which allows the Eater to move (but not to move continuously)

Fixing Eater Move-North
############################ Move-north operator
# From Chapter 3 of Soar 8 Tutorial 2
# Corrected so operator applies more than once.
# Propose*move-north:
# If I am at some location, then propose the
# move-north operator.
sp {propose*move-north
   (state <s> ^io.input-link.eater <e>)
   (<e> ^x <x> ^y <y>)
-->
   (<s> ^operator <o> +)
   (<o> ^name move-north)
}

Explanation
• The key change is eliminating the test for ^type state and focusing on non-persistent object attributes (the eater's x and y change after each move, so the proposal retracts and fires again)
• This sets the stage for a second instance of the operator to be selected
• We also need to create a second apply rule that removes the previous move command from WM

Application Rule 1
# Apply*move-north:
# If the move-north operator is selected, then
# generate an output command to
# move north.
sp {apply*move-north
   (state <s> ^operator.name move-north
              ^io.output-link <ol>)
-->
   (<ol> ^move.direction north)}

Application Rule 2
# Apply*move-north*remove-move
# If the move-north operator successfully performs a move
# command, then remove the command from the
# output-link
sp {apply*move-north*remove-move
   (state <s> ^operator.name move-north
              ^io.output-link <ol>)
   (<ol> ^move <move>)
   (<move> ^status complete)
-->
   (<ol> ^move <move> -)}

Multiple Rule Processing
• In SOAR both the operator proposal and application phases can be expanded
• Multiple rules fire and retract in parallel until the system reaches equilibrium or quiescence
• Consider the Move-to-Food eater (it can sense walls, food, and bonus food)

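Firing rules in parallel until quiescence can be sketched as a fixed-point loop. The Python below is a deliberately simplified illustration: it models only rule firing (additions to working memory), not retraction, and the function names and the attributes food-direction and food-available are hypothetical choices made for the example.

```python
def run_to_quiescence(working_memory, rules):
    """Fire all rules in waves until a wave adds nothing new.

    Returns the final working memory and the number of waves that changed it.
    """
    wm = set(working_memory)
    waves = 0
    while True:
        additions = set()
        for rule in rules:
            # Every rule sees the same snapshot, which models parallel firing.
            additions |= rule(frozenset(wm))
        if additions <= wm:
            return wm, waves  # quiescence: nothing new was produced
        wm |= additions
        waves += 1


def note_food_direction(wm):
    """If an adjacent cell holds food, record its direction."""
    return {("eater", "food-direction", attr)
            for (ident, attr, value) in wm
            if ident == "my-location" and value in ("normalfood", "bonusfood")}


def note_food_available(wm):
    """If any food direction is recorded, record that food is available."""
    if any(attr == "food-direction" for (_ident, attr, _value) in wm):
        return {("eater", "food-available", "yes")}
    return set()


start = {("my-location", "north", "bonusfood"), ("my-location", "south", "wall")}
final, waves = run_to_quiescence(start, [note_food_direction, note_food_available])
print(waves)  # 2
```

The second rule can only fire after the first has added its result, so two waves are needed before a third wave produces nothing new.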
Move-to-Food
• Propose move-to-food
    If there is normal food in an adjacent cell, then propose a move in that direction
• Propose move-to-food-bonus-food
    If there is bonus food in an adjacent cell, then propose a move in that direction and allow operator selection to be random

Move-to-Food
• Apply move-to-food
    If the move-to-food operator is selected, then generate an output command
• Apply move-to-food-remove-move
    If the move-to-food operator is selected and there is a completed move on <ol>, then remove the move command

Move to Food
# From Chapter 4 of Soar 8 Tutorial 2
# Propose*move-to-food
# If there is normalfood or bonusfood in an adjacent cell,
# propose move-to-food in the direction of that
# cell and indicate that this operator can be
# selected randomly (the = preference).
sp {propose*move-to-food
   (state <s> ^io.input-link.my-location.<dir>.content
                 << normalfood bonusfood >>)
-->
   (<s> ^operator <o> + =)
   (<o> ^name move-to-food
        ^direction <dir>)}

Move to Food
# Apply*move-to-food
# If the move-to-food operator for a direction is
# selected, generate an output command to move in that
# direction.
sp {apply*move-to-food
   (state <s> ^io.output-link <ol>
              ^operator <o>)
   (<o> ^name move-to-food
        ^direction <dir>)
-->
   (<ol> ^move.direction <dir>)}

Move to Food
# Apply*move-to-food*remove-move:
# If the move-to-food operator is selected,
# and there is a completed move command on the output
# link, then remove that command.
sp {apply*move-to-food*remove-move
   (state <s> ^io.output-link <ol>
              ^operator.name move-to-food)
   (<ol> ^move <move>)
   (<move> ^status complete)
-->
   (<ol> ^move <move> -)}

Generalized Move Operator
• Check for adjacent square
• Do not propose moving into a wall
• Look for normal or bonus food
• Avoid other eaters
• Note the use of list notation to simplify rule syntax << .. list .. >>

Advanced Move Operator
• Avoids thrashing (moving back and forth between the same two cells)
• Has a preference for moving to bonus food when available
• The jump operator code can be merged with the advanced move operator to allow an eater to jump over a cell that does not contain a wall

Jump Operator - 1
• There are only two differences between the proposal for the move operator and the jump operator.
• The first difference is that the name of the operator should be jump, not move.
• The second is that the jump operator needs to test that a cell two moves away in a direction does not contain a wall.

Jump Operator - 2
• This is easy to add because every cell has the same four directional pointers
• So the desired cell can be tested via the direction augmentation on the adjacent cell: instead of just <dir>, use <dir>.<dir>, which tests two steps in the same direction because the same directional pointer must match both uses of <dir>.

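The <dir>.<dir> test can be illustrated outside SOAR with cells that carry directional pointers. This Python sketch is an assumption-laden analogy rather than the Eaters data structure itself: cells are dictionaries, `jump_directions` and `link_row` are hypothetical helper names, and the map is a single row of four cells.

```python
DIRECTIONS = ("north", "south", "east", "west")


def jump_directions(cell):
    """Return directions where the cell two steps away is not a wall.

    Following the same pointer twice mirrors the <dir>.<dir> path test.
    """
    found = []
    for direction in DIRECTIONS:
        adjacent = cell.get(direction)
        target = adjacent.get(direction) if adjacent else None
        if target is not None and target["content"] != "wall":
            found.append(direction)
    return found


def link_row(contents):
    """Build a row of cells linked by east and west pointers."""
    cells = [{"content": content} for content in contents]
    for left, right in zip(cells, cells[1:]):
        left["east"], right["west"] = right, left
    return cells


row = link_row(["empty", "wall", "bonusfood", "wall"])
print(jump_directions(row[0]))  # ['east']
print(jump_directions(row[1]))  # []
```

From the first cell, two steps east lands on bonusfood, so a jump east qualifies. From the second cell, two steps east is a wall and two steps west runs off the row, so nothing qualifies.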
Jump and Move
• There are many strategies for selecting jump operators.
• One simple strategy is to prefer operators that jump into cells with bonusfood over operators that move into empty cells, while rejecting operators that jump into empty cells.
• It is possible to generalize some of the rules for move to cover both jump and move.

Jump and Move
• To aid in rule selection, it is possible to translate the different names and contents into numbers that correspond to the number of points the eater will get.
• A rule could then compare the numbers and create better preferences for operators with higher numbers.

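The numeric comparison described above might look like the following Python sketch. The point values and the jump cost are assumptions chosen for illustration (they are not stated on the slides), and the function names are hypothetical. With these numbers, the sketch reproduces the simple strategy of preferring a jump onto bonusfood over a move into an empty cell while rejecting a jump into an empty cell.

```python
# Assumed values for illustration only; the slides do not give the numbers.
CONTENT_POINTS = {"bonusfood": 10, "normalfood": 5, "empty": 0}
OPERATOR_COST = {"move": 0, "jump": 5}


def operator_value(operator):
    """Translate an operator's name and target content into a number."""
    return CONTENT_POINTS[operator["content"]] - OPERATOR_COST[operator["name"]]


def better_preferences(operators):
    """Return (i, j) index pairs meaning operator i is better than operator j."""
    values = [operator_value(op) for op in operators]
    return [(i, j)
            for i in range(len(values))
            for j in range(len(values))
            if values[i] > values[j]]


def rejected(operators):
    """Assumption: operators that would lose points are rejected."""
    return [i for i, op in enumerate(operators) if operator_value(op) < 0]


operators = [
    {"name": "jump", "direction": "east", "content": "bonusfood"},
    {"name": "move", "direction": "north", "content": "empty"},
    {"name": "jump", "direction": "west", "content": "empty"},
]
print(better_preferences(operators))  # [(0, 1), (0, 2), (1, 2)]
print(rejected(operators))            # [2]
```

Operators with equal values produce no better pair in either direction, which would leave the choice between them to indifferent preferences.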