Means-ends analysis

Northwestern University
CS 395 Behavior-Based Robotics
Ian Horswill

Review
Robot operates in an environment
• State space S
• Set of possible motor outputs A
• Dynamics (physics) that determines how the environment changes state
  • Continuous dynamics (continuous-time actions): f = dS/dt: S × A → S, f = d²S/dt²: S × A → S, etc.
  • Discrete dynamics (atomic/ballistic actions, discrete time): δ: S × A → S

Review
Want to construct a policy to make the robot do the right thing
• p: S → A
• Complete environment-robot system evolves
  • Continuous case: curve through state space, ds/dt = f(s, p(s))
  • Discrete case: system evolves through a series of states (δ is the discrete dynamics)
    • s0
    • s1 = δ(s0, p(s0))
    • s2 = δ(s1, p(s1)) = δ(δ(s0, p(s0)), p(δ(s0, p(s0))))
    • Etc.
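The discrete case can be illustrated with a minimal sketch. The 1-D world, the `delta` dynamics and the `policy` below are hypothetical stand-ins chosen for illustration; only the update rule s(t+1) = δ(s(t), p(s(t))) comes from the slide.

```python
def rollout(delta, policy, s0, steps):
    """Return [s0, s1, ..., s_steps] where s_{t+1} = delta(s_t, policy(s_t))."""
    states = [s0]
    for _ in range(steps):
        s = states[-1]
        states.append(delta(s, policy(s)))
    return states


# Hypothetical 1-D world: the state is an integer position,
# and the actions are -1, 0 or +1.
def delta(s, a):
    return s + a


def policy(s):
    # Step toward position 3, then stay put.
    return (3 > s) - (3 < s)


trajectory = rollout(delta, policy, 0, 5)
```

Here the policy closes the loop: the action at each step depends only on the current state, so the whole trajectory is determined by s0.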

Error feedback control
• Goal state sg
• Control action computed from error
  • ds/dt = f(s − sg)
  • d²s/dt² = f(s − sg)
• Linear feedback control
  • f is a linear operator
  • ds/dt = k(s − sg)
    • P control (proportional control)
    • k is a gain you multiply by
    • k is a matrix when s is a vector
  • d²s/dt² = kp(s − sg) + kd ds/dt + ki ∫(s − sg) dt
    • PID control
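As a rough illustration of P control, the sketch below integrates ds/dt = k(s − sg) with forward Euler steps. The step size, step count and gain values are arbitrary assumptions. With the sign convention used on the slide, the gain k has to be negative for the error to shrink.

```python
def simulate_p_control(s0, s_goal, k, dt=0.01, steps=1000):
    """Forward-Euler integration of ds/dt = k * (s - s_goal)."""
    s = s0
    for _ in range(steps):
        s += dt * k * (s - s_goal)
    return s


# Negative gain: the error decays toward zero.
final = simulate_p_control(s0=0.0, s_goal=1.0, k=-2.0)

# Positive gain: the error grows instead.
diverged = simulate_p_control(s0=0.0, s_goal=1.0, k=2.0, steps=100)
```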

Behavior-based control (“Bottom-up”)
Combine policies by running them in parallel
• Behavior = policy + trigger
• Bottom-up integration of behaviors
  • Map several behaviors to a single composite behavior (or composite policy)
• Several different composition operators
  • Behavior-or (prioritization/subsumption)
  • Behavior-+ (motor schemas/potential fields)
  • Behavior-max
  • Weighted voting
  • Etc.
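Two of the composition operators can be sketched as follows. This is not GRL: the `Behavior` tuple, the dictionary-valued state and the scalar motor output are simplifying assumptions made for the example.

```python
from typing import Callable, NamedTuple


class Behavior(NamedTuple):
    trigger: Callable[[dict], bool]   # is the behavior active in this state?
    policy: Callable[[dict], float]   # motor output when it is active


def behavior_or(behaviors):
    """Prioritization: the first triggered behavior in the list wins."""
    def composite(state):
        for b in behaviors:
            if b.trigger(state):
                return b.policy(state)
        return None   # no behavior is active
    return composite


def behavior_sum(behaviors):
    """Behavior-+: add up the outputs of every triggered behavior."""
    def composite(state):
        return sum(b.policy(state) for b in behaviors if b.trigger(state))
    return composite


# Hypothetical behaviors: back away from close obstacles, otherwise drift forward.
avoid = Behavior(lambda s: s["obstacle_distance"] < 1.0, lambda s: -1.0)
wander = Behavior(lambda s: True, lambda s: 0.5)

prioritized = behavior_or([avoid, wander])
summed = behavior_sum([avoid, wander])
```

With an obstacle at distance 0.5, `prioritized` returns the output of `avoid` alone, while `summed` blends both outputs.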

Plan-based control (“Top-down”)
Combine policies by running them serially
• Behaviors → atomic actions
  • Still policy + activation level
  • Externally triggered
  • Self-terminating
• Combine behaviors using serial controllers (plans)
  • Finite state machines
  • Individual states can
    • Trigger actions
    • Wait for them to terminate
    • Wait for other external conditions
    • Etc.
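A serial controller of this kind might look like the sketch below, where each plan step is triggered by the controller and then polled until it reports termination. The `(name, act, done)` step format, the `world` dictionary and the tick limit are assumptions made for illustration.

```python
def run_plan(steps, world, max_ticks=100):
    """Run self-terminating behaviors one after another.

    Each step is (name, act, done): act(world) advances the world by one
    tick, and done(world) reports whether the behavior has terminated.
    """
    log = []
    for name, act, done in steps:
        ticks = 0
        while not done(world):            # wait for the behavior to terminate
            if ticks >= max_ticks:
                raise RuntimeError(f"step {name!r} did not terminate")
            act(world)
            ticks += 1
        log.append((name, ticks))
    return log


# Hypothetical two-step plan: drive to x = 4, then grab.
world = {"x": 0, "holding": False}
plan = [
    ("drive-to-4", lambda w: w.update(x=w["x"] + 1), lambda w: w["x"] >= 4),
    ("grab", lambda w: w.update(holding=True), lambda w: w["holding"]),
]
log = run_plan(plan, world)
```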

Planning-based control (“Top-down”)
Combine policies “non-deterministically”
• Idea: “guess the action that will ultimately work”
  • i.e. guess the one that leads to the goal
• Problem: this doesn’t help much
  • Don’t know which action(s) will ultimately work
  • If you guess wrong, you’re screwed
• Solution: simulation
  • Run actions in simulation
  • Search through possible sequences of actions (plans) to find one that works and remember it
  • Execute the successful plan in the real world
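One way to realize "search in simulation" is a breadth-first search over action sequences, sketched below. The slide does not name a search strategy, so breadth-first search is an assumption here, as are the toy counter domain and the depth limit.

```python
from collections import deque


def plan_by_simulation(start, is_goal, actions, simulate, max_depth=10):
    """Breadth-first search over action sequences using a simulated model.

    Returns the first plan found (a list of actions) or None.
    """
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, plan = frontier.popleft()
        if is_goal(state):
            return plan                   # remember the plan that worked
        if len(plan) >= max_depth:
            continue
        for a in actions:
            nxt = simulate(state, a)      # run the action in simulation only
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, plan + [a]))
    return None


# Hypothetical domain: the state is a number, the goal is to reach 10.
def simulate(state, action):
    return state + 1 if action == "inc" else state * 2


plan = plan_by_simulation(1, lambda s: s == 10, ["inc", "double"], simulate)
```

Only the returned plan would then be executed on the real robot; wrong guesses cost simulation time rather than real-world mistakes.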

Logic-based representations of the state space
• Represent states using propositions (true/false statements)
  • Find a set of propositions that let you distinguish all the states you care about
  • State = truth of each proposition
• Advantage: partial state descriptions
  • P ∧ Q is the set of states in which both P and Q are true
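The idea that a partial description denotes a set of states can be made concrete with a small sketch. The three propositions P, Q and R are placeholders, and enumerating every state is only feasible for tiny examples like this one.

```python
from itertools import product

PROPS = ("P", "Q", "R")

# A complete state assigns a truth value to every proposition.
ALL_STATES = [dict(zip(PROPS, values))
              for values in product([False, True], repeat=len(PROPS))]


def states_satisfying(partial):
    """All complete states that agree with a partial state description."""
    return [s for s in ALL_STATES
            if all(s[p] == v for p, v in partial.items())]


# "P and Q" says nothing about R, so it denotes two complete states.
p_and_q = states_satisfying({"P": True, "Q": True})
```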

Means-ends analysis
• Pair goals up with actions
  • For each proposition, keep track of the actions that can make it true
  • For each action, write the precondition (partial state description) for being able to run it
• To solve the goal P ∧ Q
  • Look up the action A that achieves P
  • Recursively solve precondition(A)
  • Run A
  • Recursively solve Q without “clobbering” P

GPS (Newell and Simon)
• “General Problem Solver”
• Used means-ends analysis
• Assumed priority ordering on propositions
• Algorithm:
    GPS(goal)
      while goal not yet true
        p = highest priority unsatisfied subgoal
            (subgoal = proposition inside of goal)
        a = action to solve p
        GPS(precondition(a))
        do a
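The GPS loop above can be sketched in a few lines. This is a toy reading of the slide's pseudocode, not Newell and Simon's program: the tea-making domain, the one-action-per-proposition `achievers` table, and the iteration and recursion caps are all assumptions added so the example terminates.

```python
def gps(goal, state, achievers, priority, depth=0, max_depth=20):
    """Means-ends analysis over a mutable set of true propositions.

    achievers maps each proposition to one action, given as a dict with
    'name', 'pre', 'add' and 'delete' entries. Returns the actions executed.
    """
    if depth > max_depth:
        raise RuntimeError("subgoal recursion too deep")
    trace = []
    for _ in range(100):                      # arbitrary cap on iterations
        unsatisfied = goal - state
        if not unsatisfied:
            return trace
        p = max(unsatisfied, key=lambda q: priority[q])
        a = achievers[p]
        trace += gps(a["pre"], state, achievers, priority, depth + 1, max_depth)
        state.difference_update(a["delete"])  # "do a"
        state.update(a["add"])
        trace.append(a["name"])
    raise RuntimeError("goal not reached")


# Hypothetical domain.
achievers = {
    "kettle-full": {"name": "fill-kettle", "pre": set(),
                    "add": {"kettle-full"}, "delete": set()},
    "water-hot": {"name": "boil-water", "pre": {"kettle-full"},
                  "add": {"water-hot"}, "delete": set()},
    "have-tea": {"name": "brew", "pre": {"water-hot", "have-teabag"},
                 "add": {"have-tea"}, "delete": {"water-hot"}},
}
priority = {"have-tea": 3, "water-hot": 2, "kettle-full": 1, "have-teabag": 0}

state = {"have-teabag"}
trace = gps({"have-tea"}, state, achievers, priority)
```

Note that nothing here protects already-achieved subgoals from being clobbered, which is why the caps are needed in general.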

The STRIPS representation
Define actions in terms of
• Add list: propositions the action makes true
• Delete list: propositions the action may make false
• Precondition list: propositions that must be true in order to run the action
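A STRIPS-style action can be modeled as three sets of propositions, as in the sketch below. The `Action` class, its field names and the pick-up example are illustrative assumptions; only the three lists and their meaning come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    preconditions: frozenset
    add_list: frozenset
    delete_list: frozenset

    def applicable(self, state):
        """The action can run when every precondition holds in the state."""
        return self.preconditions <= state

    def apply(self, state):
        """Successor state: remove the delete list, then add the add list."""
        if not self.applicable(state):
            raise ValueError(f"{self.name}: preconditions not satisfied")
        return (state - self.delete_list) | self.add_list


# Hypothetical action.
pick_up = Action(
    name="pick-up-block",
    preconditions=frozenset({"hand-empty", "block-on-table"}),
    add_list=frozenset({"holding-block"}),
    delete_list=frozenset({"hand-empty", "block-on-table"}),
)

before = frozenset({"hand-empty", "block-on-table"})
after = pick_up.apply(before)
```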

Planning with STRIPS
• Goal = set of propositions to make true
• Algorithm:
    STRIPS(initial, goal)
      for each subgoal p in goal not in initial
        for each action a with p in its add list
          try the plan:
            STRIPS(initial, precondition(a))
            a
            STRIPS(initial - delete_list(a) + add_list(a), goal)
          if both recursive calls worked, we win
          else, try another action, or another subgoal
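The recursive planner above can be sketched as follows. Two details are assumptions that go beyond the slide: the state produced by the precondition sub-plan is threaded into the next step (the slide writes `initial` there as a simplification), and a depth limit stops runaway recursion. The door domain is hypothetical.

```python
def strips(state, goal, actions, depth=8):
    """Return (plan, final_state), or None if no plan is found.

    state and goal are frozensets of propositions; each action is a dict
    with 'name', 'pre', 'add' and 'delete' entries.
    """
    if goal <= state:
        return [], state
    if depth == 0:
        return None
    for p in sorted(goal - state):            # each unsatisfied subgoal
        for a in actions:
            if p not in a["add"]:
                continue
            first = strips(state, a["pre"], actions, depth - 1)
            if first is None:
                continue                      # try another action
            prefix, mid = first
            after = (mid - a["delete"]) | a["add"]
            rest = strips(after, goal, actions, depth - 1)
            if rest is not None:              # both recursive calls worked
                suffix, final = rest
                return prefix + [a["name"]] + suffix, final
    return None


def act(name, pre, add, delete=()):
    return {"name": name, "pre": frozenset(pre),
            "add": frozenset(add), "delete": frozenset(delete)}


# Hypothetical domain: get into a locked room.
actions = [
    act("pick-up-key", [], ["have-key"]),
    act("unlock-door", ["have-key"], ["door-unlocked"]),
    act("open-door", ["door-unlocked"], ["door-open"]),
    act("enter", ["door-open"], ["in-room"], ["outside"]),
]

result = strips(frozenset({"outside"}), frozenset({"in-room"}), actions)
```

Because the remaining goal is re-solved from the state after each action, a clobbered subgoal is simply planned for again rather than protected.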

Reactive planning in fully reactive systems
• Collection of behaviors
  • Each behavior achieves some goal(s)
  • Each behavior has some precondition(s)
• Higher level system drives some goal signal
• Goal signals
  • Activate behaviors that achieve them
  • Drive goal signals of preconditions
• Examples: GAPPS (Kaelbling 90), Behavior Networks (Maes 90)

GRL example
• Extended STRIPS representation
• Operators are just behaviors that activate themselves when they can achieve a goal.
(define-operator name motor-vector
  (achieves add-list-signals …)
  (clobbers delete-list-signals …)
  (preconditions precondition-signals ...)
  (serial-preconditions precondition-signals ...)
  (required-resources names ...)
  (priority number))

Computing activation-levels
• A behavior is runnable if all its preconditions are satisfied
• It is desirable if
  • It satisfies a maintenance goal
  • It satisfies some unsatisfied goal of achievement
• It is unconflicted if
  • It doesn’t clobber a satisfied goal or a maintenance goal, and
  • None of its resources are required by desirable operators of higher priority
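The three conditions can be sketched outside GRL as plain set operations. This is an assumption-laden model, not the GRL compiler's logic: operators are dictionaries, the world is the set of currently true propositions, and the two operators are loosely based on the example operators later in the deck.

```python
def runnable(op, world):
    return op["preconditions"] <= world


def desirable(op, world, achieve, maintain):
    unsatisfied = achieve - world
    return bool(op["achieves"] & (maintain | unsatisfied))


def unconflicted(op, ops, world, achieve, maintain):
    # Must not clobber a satisfied goal or a maintenance goal.
    protected = (achieve & world) | maintain
    if op["clobbers"] & protected:
        return False
    # No desirable operator of higher priority may need our resources.
    return not any(
        other is not op
        and other["priority"] > op["priority"]
        and other["resources"] & op["resources"]
        and desirable(other, world, achieve, maintain)
        for other in ops
    )


def active(op, ops, world, achieve, maintain):
    return (runnable(op, world)
            and desirable(op, world, achieve, maintain)
            and unconflicted(op, ops, world, achieve, maintain))


# Hypothetical operators sharing resource "a".
make_happy = {"name": "make-happy", "achieves": {"happy"}, "clobbers": set(),
              "preconditions": {"content"}, "resources": {"a"}, "priority": 0}
make_happy2 = {"name": "make-happy2", "achieves": {"happy2"}, "clobbers": set(),
               "preconditions": {"not-depressed", "content"},
               "resources": {"a", "b"}, "priority": 5}
ops = [make_happy, make_happy2]

world = {"content", "not-depressed"}
achieve = {"happy", "happy2"}
running = [op["name"] for op in ops if active(op, ops, world, achieve, set())]
```

Both operators are runnable and desirable here, but `make-happy` loses resource `a` to the higher-priority `make-happy2`.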

Compile-time property lists
(define-signal-property (add-list x) '())
(define-signal-property (delete-list x) '())
(define-signal-property (preconditions x) '())
(define-signal-property (serial-preconditions x) '())
(define-signal-property (priority x) 0)
(define-signal-property (required-resources x) '())

Making the behavior
(letrec ((the-behavior (behavior (run? the-behavior)
                                 motor-vector-signal)))
  the-behavior)
(define-signal (run? x)
  (and (desirable? x)
       (runnable? x)
       (not (conflicted? x))))

Computing desirable?
(define-signal-modality (desirable? x)
  (define adds (add-list x))
  (signal-expression (or (apply or (unsatisfied-goal adds))
                         (apply or (maintain-goal adds)))
                     (drives (goal (runnable? x)))
                     (operator x)))

Computing runnable?
(define-signal-modality (runnable? x)
  (signal-expression
   (parallel-and (apply parallel-and (preconditions x))
                 (apply serial-and (serial-preconditions x)))))

Gatherer functions
• Accumulators can be declared with a gatherer
  • The gatherer is called by the compiler with a list of all signals being compiled
  • It returns the signals that should be used as inputs to the accumulator
• Gatherers are called after signal expansion
  • They’re only passed the list of primitive signals into which calls to signal procedures have been expanded

Computing conflicted?
(define-signal-procedure (conflicted? x)
  (define my-priority (priority x))
  (let ((high-priority-clobbered-goals
         (filter (lambda (g)
                   (>= (priority g) my-priority))
                 (delete-list x))))
    (signal-expression
     (apply or
            (accumulate or (make-conflict-set-gatherer x))
            (satisfied-goal ,high-priority-clobbered-goals)))))

Computing conflicted?
;;; A signal is conflicted if some other higher priority signal needs
;;; one of its resources or if it clobbers a goal we’ve already achieved.
;;; Since conflicted? already checked for clobbering, we only need to
;;; worry about finding operators that need our resources.
(define (make-conflict-set-gatherer me)
  (lambda (ignore signal-list)
    (define resources (required-resources me))
    (define (desired-resource? r)
      (memq r resources))
    (define (steals-my-resource? op)
      (any desired-resource? (required-resources op)))
    …))

Computing conflicted?
(define (make-conflict-set-gatherer me)
  (lambda (ignore signal-list)
    …
    (define my-priority (priority me))
    (define (higher-priority? op)
      (define pri (priority op))
      (cond ((= pri my-priority)
             (error "Conflict detected between actions of equal priority" me op))
            (else (> pri my-priority))))
    …))

Computing conflicted?
(define (make-conflict-set-gatherer me)
  (lambda (ignore signal-list)
    …
    (define (conflictor? desirable?-sig)
      (define op (operator desirable?-sig))
      (and (not (eq? op me))
           (steals-my-resource? op)
           (higher-priority? op)))
    ;; Now really compute the list of inputs
    (filter conflictor? signal-list)))

A pathetic example
(define-operator make-happy 'running-make-happy
  (achieves happy?)
  (preconditions (maintain content?))
  (requires-resources a))
(define-operator make-happy2 'running-make-happy2
  (achieves happy2?)
  (serial-preconditions not-depressed? content?)
  (requires-resources a b)
  (priority 5))
(define-operator make-content 'running-make-content
  (achieves content?)
  (clobbers happy?)
  (priority 0))

A pathetic example
(define-operator make-not-depressed 'running-make-not-depressed
  (achieves not-depressed?))
(define-signal doit
  (behavior-or make-happy
               make-happy2
               make-content
               make-not-depressed))

Compiled code
> (compile doit)
(begin
  (define (run)
    (update-grl-time!)
    (let* ((desirable?-make-happy #f)
           …)
      (while #t
        (update-grl-time!)
        (before-signal-update)
        (set! desirable?-make-happy
              (and want-happy? (not happy?)))
        (set! desirable?-make-happy2
              (or (and want-happy2? (not happy2?))
                  stay-happy2?))
        (set! make-happy-activation-level
              (and desirable?-make-happy
                   content?
                   (not desirable?-make-happy2)))
        (set! make-happy2-activation-level
              (and desirable?-make-happy2
                   not-depressed?
                   content?
                   (not #f)))
        (set! make-content-activation-level
              (and (or (and desirable?-make-happy2
                            not-depressed?
                            (not content?))
                       desirable?-make-happy)
                   (not (and want-happy? happy?))))
        (set! doit-activation-level
              (or make-happy-activation-level
                  make-happy2-activation-level
                  make-content-activation-level
                  (and desirable?-make-happy2
                       (not not-depressed?)
                       (not #f))))
        (set! doit-motor-vector
              (cond (make-happy-activation-level 'running-make-happy)
                    (make-happy2-activation-level 'running-make-happy2)
                    (make-content-activation-level 'running-make-content)
                    (else 'running-make-not-depressed)))
        (after-signal-update)))))
