Means-ends analysis

Northwestern University, CS 395 Behavior-Based Robotics. Ian Horswill.



  1. Means-ends analysis
     Northwestern University, CS 395 Behavior-Based Robotics
     Ian Horswill

  2. Review
     Robot operates in an environment
     • State space $S$
     • Set of possible motor outputs $A$
     • Dynamics (physics) that determines how the environment changes state
       • Continuous dynamics (continuous-time actions): $f = dS/dt: S \times A \to S$, $f = d^2S/dt^2: S \times A \to S$, etc.
       • Discrete dynamics (atomic/ballistic actions, discrete time): $\delta: S \times A \to S$

  3. Review
     Want to construct a policy to make the robot do the right thing
     • $p: S \to A$
     • Complete environment-robot system evolves
       • Continuous case: curve through state space, $ds/dt = f(s, p(s))$
       • Discrete case: system evolves through a series of states
         • $s_0$
         • $s_1 = \delta(s_0, p(s_0))$
         • $s_2 = \delta(s_1, p(s_1)) = \delta(\delta(s_0, p(s_0)), p(\delta(s_0, p(s_0))))$
         • Etc.
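
The discrete case can be traced in a few lines of code. The sketch below is only an illustration: the names `rollout`, `delta` (the transition function), and `policy`, and the toy 1-D world, are assumptions and do not come from the slides.

```python
def rollout(delta, policy, s0, steps):
    """Return [s0, s1, ..., s_steps], where s_{t+1} = delta(s_t, policy(s_t))."""
    states = [s0]
    for _ in range(steps):
        s = states[-1]
        states.append(delta(s, policy(s)))
    return states


# Hypothetical 1-D world: the state is an integer position and the
# actions are -1, 0, or +1 (move left, stay, move right).
def delta(s, a):
    return s + a


def policy(s):
    """Walk toward position 5 and stay there."""
    if s < 5:
        return 1
    if s > 5:
        return -1
    return 0


trajectory = rollout(delta, policy, s0=2, steps=5)
print(trajectory)  # [2, 3, 4, 5, 5, 5]
```

Once the policy is fixed, the whole trajectory is determined by the initial state, which is why the closed-loop system can be treated as a single dynamical system.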

  4. Error feedback control
     • Goal state $s_g$
     • Control action computed from error
       • $ds/dt = f(s - s_g)$
       • $d^2s/dt^2 = f(s - s_g)$
     • Linear feedback control
       • $f$ is a linear operator
       • $ds/dt = k(s - s_g)$
         • P control (proportional control)
         • $k$ is a gain you multiply by
         • $k$ is a matrix when $s$ is a vector
       • $d^2s/dt^2 = k_p(s - s_g) + k_d\,ds/dt + k_i \int (s - s_g)\,dt$
         • PID control
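
A rough numerical sketch of P control for a scalar state, using forward-Euler integration. The function name, the step size `dt`, and the gain values are assumptions chosen for illustration. With the error written as $s - s_g$, the gain $k$ has to be negative for the state to converge to the goal.

```python
def simulate_p_control(s0, s_goal, k, dt, steps):
    """Integrate ds/dt = k * (s - s_goal) with forward Euler; return all states."""
    s = s0
    history = [s]
    for _ in range(steps):
        ds_dt = k * (s - s_goal)  # control action computed from the error
        s += ds_dt * dt
        history.append(s)
    return history


# Negative gain: the error shrinks by a factor (1 + k*dt) = 0.9 each step.
converging = simulate_p_control(s0=0.0, s_goal=1.0, k=-1.0, dt=0.1, steps=100)
print(abs(converging[-1] - 1.0) < 1e-3)  # True

# Positive gain: the error grows by a factor 1.1 each step.
diverging = simulate_p_control(s0=0.0, s_goal=1.0, k=1.0, dt=0.1, steps=10)
print(abs(diverging[-1] - 1.0) > 1.0)  # True
```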

  5. Behavior-based control (“Bottom-up”)
     Combine policies by running them in parallel
     • Behavior = policy + trigger
     • Bottom-up integration of behaviors
       • Map several behaviors to a single composite behavior (or composite policy)
     • Several different composition operators
       • Behavior-or (prioritization/subsumption)
       • Behavior-+ (motor schemas/potential fields)
       • Behavior-max
       • Weighted voting
       • Etc.
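
The composition operators listed above could be sketched as follows. This is a simplified illustration, not GRL's implementation: the `Behavior` class, the numeric activation levels, the 2-D motor vectors, and the `avoid`/`seek` behaviors are all assumptions, and `behavior_plus` is an unweighted sum.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

Vector = Tuple[float, float]


@dataclass
class Behavior:
    name: str
    activation: Callable[[dict], float]  # trigger; 0.0 means inactive
    policy: Callable[[dict], Vector]     # motor output for the current state


def behavior_or(behaviors, state):
    """Prioritization: the first active behavior in list order wins."""
    for b in behaviors:
        if b.activation(state) > 0:
            return b.policy(state)
    return None


def behavior_plus(behaviors, state):
    """Potential-field style: sum the outputs of all active behaviors."""
    x = y = 0.0
    for b in behaviors:
        if b.activation(state) > 0:
            vx, vy = b.policy(state)
            x, y = x + vx, y + vy
    return (x, y)


def behavior_max(behaviors, state):
    """The behavior with the highest activation level wins."""
    active = [b for b in behaviors if b.activation(state) > 0]
    if not active:
        return None
    return max(active, key=lambda b: b.activation(state)).policy(state)


avoid = Behavior("avoid",
                 lambda s: 0.9 if s["obstacle_near"] else 0.0,
                 lambda s: (-1.0, 0.0))
seek = Behavior("seek", lambda s: 0.3, lambda s: (1.0, 1.0))

near = {"obstacle_near": True}
clear = {"obstacle_near": False}
print(behavior_or([avoid, seek], near))    # (-1.0, 0.0)
print(behavior_or([avoid, seek], clear))   # (1.0, 1.0)
print(behavior_plus([avoid, seek], near))  # (0.0, 1.0)
```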

  6. Plan-based control (“Top-down”)
     Combine policies by running them serially
     • Behaviors → atomic actions
       • Still policy + activation level
       • Externally triggered
       • Self-terminating
     • Combine behaviors using serial controllers (plans)
       • Finite state machines
       • Individual states can
         • Trigger actions
         • Wait for them to terminate
         • Wait for other external conditions
         • Etc.
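
A minimal sketch of a serial controller: a linear plan in which each state triggers one action and waits for it to terminate. The `Action` class (which simply runs for a fixed number of ticks) and the `run_plan` loop are hypothetical stand-ins for real self-terminating behaviors.

```python
class Action:
    """Hypothetical self-terminating action that runs for a fixed number of ticks."""

    def __init__(self, name, duration):
        self.name = name
        self.duration = duration
        self.remaining = 0

    def start(self):
        self.remaining = self.duration

    def tick(self):
        self.remaining -= 1

    def done(self):
        return self.remaining <= 0


def run_plan(actions, max_ticks=100):
    """Run the actions one after another; return a log of (event, name, tick)."""
    log = []
    index = 0
    started = False
    for tick in range(max_ticks):
        if index == len(actions):
            break
        action = actions[index]
        if not started:  # entering this plan state: trigger the action
            action.start()
            started = True
            log.append(("start", action.name, tick))
        action.tick()
        if action.done():  # the action terminated: move to the next state
            log.append(("done", action.name, tick))
            index += 1
            started = False
    return log


log = run_plan([Action("turn", 2), Action("forward", 1)])
print(log)
# [('start', 'turn', 0), ('done', 'turn', 1),
#  ('start', 'forward', 2), ('done', 'forward', 2)]
```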

  7. Planning-based control (“Top-down”)
     Combine policies “non-deterministically”
     • Idea: “guess the action that will ultimately work”
       • i.e. guess the one that leads to the goal
     • Problem: this doesn’t help much
       • Don’t know which action(s) will ultimately work
       • If you guess wrong, you’re screwed
     • Solution: simulation
       • Run actions in simulation
       • Search through possible sequences of actions (plans) to find one that works and remember it
       • Execute the successful plan in the real world
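
The "search in simulation" idea can be sketched with a breadth-first search over action sequences. Breadth-first search is just one possible strategy; the function names and the toy number world (start at 1, reach 10 using "inc" and "double") are assumptions made for this example.

```python
from collections import deque


def plan_by_search(start, is_goal, actions, simulate, max_depth=10):
    """Breadth-first search over action sequences, run entirely in simulation.

    Returns the first (shortest) plan that reaches a goal state, or None.
    """
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, plan = frontier.popleft()
        if is_goal(state):
            return plan
        if len(plan) >= max_depth:
            continue
        for action in actions:
            nxt = simulate(state, action)
            if nxt not in visited:
                visited.add(nxt)
                frontier.append((nxt, plan + [action]))
    return None


# Hypothetical simulated world: the state is an integer.
ACTIONS = ["inc", "double"]


def simulate(state, action):
    return state + 1 if action == "inc" else state * 2


plan = plan_by_search(1, lambda s: s == 10, ACTIONS, simulate)

# "Execute" the remembered plan (here, simply replay it).
state = 1
for action in plan:
    state = simulate(state, action)
print(plan, state)  # the plan has 4 actions and ends in state 10
```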

  8. Logic-based representations of the state space
     • Represent states using propositions (true/false statements)
       • Find a set of propositions that let you distinguish all the states you care about
       • State = truth of each proposition
     • Advantage: partial state descriptions
       • $P \wedge Q$ is the set of states in which both $P$ and $Q$ are true
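
One way to picture a partial state description is as a set of propositions that must be true, with a state matching whenever it contains all of them. The proposition names and helper functions below are invented for illustration, and the sketch only handles conjunctions of positive propositions.

```python
from itertools import combinations

# Hypothetical propositions; a state is the set of propositions that are true.
PROPS = ["holding_cup", "door_open", "at_kitchen"]


def all_states(props):
    """Enumerate every truth assignment as a frozenset of true propositions."""
    return [frozenset(c)
            for r in range(len(props) + 1)
            for c in combinations(props, r)]


def satisfies(state, description):
    """A partial description matches every state that contains all of it."""
    return description <= state


states = all_states(PROPS)
matching = [s for s in states if satisfies(s, {"door_open", "at_kitchen"})]
print(len(states), len(matching))  # 8 2
```

The description says nothing about `holding_cup`, so it picks out two of the eight states.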

  9. Means-ends analysis
     • Pair goals up with actions
       • For each proposition, keep track of the actions that can make it true
       • For each action, write the precondition (partial state descriptions) for being able to run it
     • To solve the goal $P \wedge Q$
       • Look up the action $A$ that achieves $P$
       • Recursively solve precondition($A$)
       • Run $A$
       • Recursively solve $Q$ without “clobbering” $P$

  10. GPS (Newell and Simon)
     • “General Problem Solver”
     • Used means-ends analysis
     • Assumed priority ordering on propositions
     • Algorithm:
         GPS(goal)
           while goal not yet true
             p = highest priority unsatisfied subgoal
                 (subgoal = proposition inside of goal)
             a = action to solve p
             GPS(precondition(a))
             do a
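
The GPS loop above translates fairly directly into code. The sketch below assumes one achieving action per proposition and a numeric priority table. The `Action` class, the tea-making domain, and the `budget` guard are all made up for the example and are not part of the original GPS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    preconditions: frozenset
    adds: frozenset
    deletes: frozenset = frozenset()


def gps(goal, state, achievers, priority, trace, budget=50):
    """Mutate `state` (a set of true propositions) until `goal` holds.

    Cyclic preconditions would recurse without bound; this sketch does not
    guard against that, it only caps the number of executed actions.
    """
    while not goal <= state:
        if len(trace) >= budget:
            raise RuntimeError("gave up: action budget exhausted")
        p = max(goal - state, key=lambda q: priority[q])  # top unsatisfied subgoal
        a = achievers[p]                                  # action that solves p
        gps(a.preconditions, state, achievers, priority, trace, budget)
        state -= a.deletes                                # "do a"
        state |= a.adds
        trace.append(a.name)
    return trace


# Hypothetical domain.
ACHIEVERS = {
    "have_tea": Action("make_tea", frozenset({"hot_water", "have_cup"}),
                       frozenset({"have_tea"}), frozenset({"hot_water"})),
    "hot_water": Action("boil_water", frozenset({"kettle_filled"}),
                        frozenset({"hot_water"})),
    "kettle_filled": Action("fill_kettle", frozenset(),
                            frozenset({"kettle_filled"})),
}
PRIORITY = {"have_tea": 3, "hot_water": 2, "kettle_filled": 1, "have_cup": 0}

state = {"have_cup"}
plan = gps(frozenset({"have_tea"}), state, ACHIEVERS, PRIORITY, [])
print(plan)  # ['fill_kettle', 'boil_water', 'make_tea']
```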

  11. The STRIPS representation
     Define actions in terms of
     • Add list: propositions the action makes true
     • Delete list: propositions the action may make false
     • Precondition list: propositions that must be true in order to run the action

  12. Planning with STRIPS
     • Goal = set of propositions to make true
     • Algorithm:
         STRIPS(initial, goal)
           for each subgoal p in goal not in initial
             for each action a with p in its add list
               try the plan:
                 STRIPS(initial, precondition(a))
                 a
                 STRIPS(initial - delete_list(a) + add_list(a), goal)
               if both recursive calls worked, we win
               else, try another action, or another subgoal
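
A small Python sketch of the recursive planner above. It departs from the slide's pseudocode in two ways, both assumptions of this example: it threads the state produced by the precondition sub-plan into the second recursive call rather than reusing `initial`, and it adds a depth bound so the recursion always terminates. The two operators are loosely modelled on the `make-happy` and `make-content` example later in the deck.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    preconditions: frozenset
    adds: frozenset
    deletes: frozenset = frozenset()


def strips(initial, goal, actions, depth=6):
    """Return (plan, final_state), or None if no plan is found within `depth`."""
    if goal <= initial:
        return [], initial
    if depth == 0:
        return None
    for p in sorted(goal - initial):          # each unsatisfied subgoal
        for a in actions:
            if p not in a.adds:
                continue
            first = strips(initial, a.preconditions, actions, depth - 1)
            if first is None:
                continue
            plan1, state1 = first
            after = (state1 - a.deletes) | a.adds   # effect of running a
            rest = strips(after, goal, actions, depth - 1)
            if rest is None:
                continue                      # try another action or subgoal
            plan2, final = rest
            return plan1 + [a.name] + plan2, final
    return None


ACTIONS = [
    Action("make_happy", frozenset({"content"}), frozenset({"happy"})),
    Action("make_content", frozenset(), frozenset({"content"}),
           frozenset({"happy"})),
]

result = strips(frozenset(), frozenset({"happy", "content"}), ACTIONS)
print(result[0])  # ['make_content', 'make_happy']
```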

  13. Reactive planning in fully reactive systems
     • Collection of behaviors
       • Each behavior achieves some goal(s)
       • Each behavior has some precondition(s)
     • Higher level system drives some goal signal
     • Goal signals
       • Activate behaviors that achieve them
       • Drive goal signals of preconditions
     • Examples: GAPPS (Kaelbling 90), Behavior Networks (Maes 90)

  14. GRL example
     • Extended STRIPS representation
     • Operators are just behaviors that activate themselves when they can achieve a goal.

         (define-operator name motor-vector
           (achieves add-list-signals …)
           (clobbers delete-list-signals …)
           (preconditions precondition-signals ...)
           (serial-preconditions precondition-signals ...)
           (required-resources names ...)
           (priority number))

  15. Computing activation-levels
     • A behavior is runnable if all its preconditions are satisfied
     • It is desirable if
       • It satisfies a maintenance goal
       • It satisfies some unsatisfied goal of achievement
     • It is unconflicted if
       • It doesn’t clobber a satisfied goal or a maintenance goal, and
       • None of its resources are required by desirable operators of higher priority
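
The three conditions above can be written out as plain predicates. The sketch below is a run-time illustration in Python, not GRL's compile-time machinery. The `Operator` class and the function names are assumptions, and it omits the step in which goal signals drive the goals of preconditions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    name: str
    achieves: frozenset
    preconditions: frozenset = frozenset()
    clobbers: frozenset = frozenset()
    resources: frozenset = frozenset()
    priority: int = 0


def runnable(op, true_props):
    return op.preconditions <= true_props


def desirable(op, true_props, achieve_goals, maintain_goals):
    unsatisfied = achieve_goals - true_props
    return bool(op.achieves & maintain_goals) or bool(op.achieves & unsatisfied)


def unconflicted(op, operators, true_props, achieve_goals, maintain_goals):
    satisfied = achieve_goals & true_props
    if op.clobbers & (satisfied | maintain_goals):
        return False
    for other in operators:
        if (other is not op
                and other.priority > op.priority
                and other.resources & op.resources
                and desirable(other, true_props, achieve_goals, maintain_goals)):
            return False
    return True


def should_run(op, operators, true_props, achieve_goals,
               maintain_goals=frozenset()):
    return (desirable(op, true_props, achieve_goals, maintain_goals)
            and runnable(op, true_props)
            and unconflicted(op, operators, true_props, achieve_goals,
                             maintain_goals))


make_happy = Operator("make-happy", frozenset({"happy"}),
                      preconditions=frozenset({"content"}),
                      resources=frozenset({"a"}))
make_happy2 = Operator("make-happy2", frozenset({"happy2"}),
                       preconditions=frozenset({"not_depressed", "content"}),
                       resources=frozenset({"a", "b"}), priority=5)
OPS = [make_happy, make_happy2]

world = frozenset({"content", "not_depressed"})
both = frozenset({"happy", "happy2"})
print(should_run(make_happy, OPS, world, both))   # False: loses resource a
print(should_run(make_happy2, OPS, world, both))  # True
```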

  16. Compile-time property lists

         (define-signal-property (add-list x) '())
         (define-signal-property (delete-list x) '())
         (define-signal-property (preconditions x) '())
         (define-signal-property (serial-preconditions x) '())
         (define-signal-property (priority x) 0)
         (define-signal-property (required-resources x) '())

  17. Making the behavior

         (letrec ((the-behavior (behavior (run? the-behavior)
                                          motor-vector-signal)))
           the-behavior)

         (define-signal (run? x)
           (and (desirable? x)
                (runnable? x)
                (not (conflicted? x))))

  18. Computing desirable?

         (define-signal-modality (desirable? x)
           (define adds (add-list x))
           (signal-expression
             (or (apply or (unsatisfied-goal adds))
                 (apply or (maintain-goal adds)))
             (drives (goal (runnable? x)))
             (operator x)))

  19. Computing runnable?

         (define-signal-modality (runnable? x)
           (signal-expression
             (parallel-and (apply parallel-and (preconditions x))
                           (apply serial-and (serial-preconditions x)))))

  20. Gatherer functions
     • Accumulators can be declared with a gatherer
       • The gatherer is called by the compiler with a list of all signals being compiled
       • It returns the signals that should be used as inputs to the accumulator
     • Gatherers are called after signal expansion
       • They’re only passed the list of primitive signals into which calls to signal procedures have been expanded

  21. Computing conflicted?

         (define-signal-procedure (conflicted? x)
           (define my-priority (priority x))
           (let ((high-priority-clobbered-goals
                  (filter (lambda (g)
                            (>= (priority g) my-priority))
                          (delete-list x))))
             (signal-expression
               (apply or
                      (accumulate or (make-conflict-set-gatherer x))
                      (satisfied-goal ,high-priority-clobbered-goals)))))

  22. Computing conflicted?

         ;;; A signal is conflicted if some other higher priority signal needs
         ;;; one of its resources or if it clobbers a goal we’ve already achieved.
         ;;; Since conflicted? already checked for clobbering, we only need to
         ;;; worry about finding operators that need our resources.
         (define (make-conflict-set-gatherer me)
           (lambda (ignore signal-list)
             (define resources (required-resources me))
             (define (desired-resource? r)
               (memq r resources))
             (define (steals-my-resource? op)
               (any desired-resource? (required-resources op)))
             … ))

  23. Computing conflicted?

         (define (make-conflict-set-gatherer me)
           (lambda (ignore signal-list)
             …
             (define my-priority (priority me))
             (define (higher-priority? op)
               (define pri (priority op))
               (cond ((= pri my-priority)
                      (error "Conflict detected between actions of equal priority"
                             me op))
                     (else
                      (> pri my-priority))))
             …))

  24. Computing conflicted?

         (define (make-conflict-set-gatherer me)
           (lambda (ignore signal-list)
             …
             (define (conflictor? desirable?-sig)
               (define op (operator desirable?-sig))
               (and (not (eq? op me))
                    (steals-my-resource? op)
                    (higher-priority? op)))
             ;; Now really compute the list of inputs
             (filter conflictor? signal-list)))

  25. A pathetic example

         (define-operator make-happy 'running-make-happy
           (achieves happy?)
           (preconditions (maintain content?))
           (requires-resources a))

         (define-operator make-happy2 'running-make-happy2
           (achieves happy2?)
           (serial-preconditions not-depressed? content?)
           (requires-resources a b)
           (priority 5))

         (define-operator make-content 'running-make-content
           (achieves content?)
           (clobbers happy?)
           (priority 0))

  26. A pathetic example

         (define-operator make-not-depressed 'running-make-not-depressed
           (achieves not-depressed?))

         (define-signal doit
           (behavior-or make-happy
                        make-happy2
                        make-content
                        make-not-depressed))

  27. Compiled code

         > (compile doit)
         (begin
           (define (run)
             (update-grl-time!)
             (let* ((desirable?-make-happy #f)
                    …)
               (while #t
                 (update-grl-time!)
                 (before-signal-update)
                 (set! desirable?-make-happy
                       (and want-happy? (not happy?)))
                 (set! desirable?-make-happy2
                       (or (and want-happy2? (not happy2?))
                           stay-happy2?))
                 (set! make-happy-activation-level
                       (and desirable?-make-happy
                            content?
                            (not desirable?-make-happy2)))
                 (set! make-happy2-activation-level
                       (and desirable?-make-happy2
                            not-depressed?
                            content?
                            (not #f)))
                 (set! make-content-activation-level
                       (and (or (and desirable?-make-happy2
                                     not-depressed?
                                     (not content?))
                                desirable?-make-happy)
                            (not (and want-happy? happy?))))
                 (set! doit-activation-level
                       (or make-happy-activation-level
                           make-happy2-activation-level
                           make-content-activation-level
                           (and desirable?-make-happy2
                                (not not-depressed?)
                                (not #f))))
                 (set! doit-motor-vector
                       (cond (make-happy-activation-level 'running-make-happy)
                             (make-happy2-activation-level 'running-make-happy2)
                             (make-content-activation-level 'running-make-content)
                             (else 'running-make-not-depressed)))
                 (after-signal-update)))))
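
To trace what the compiled update does for particular inputs, one iteration of the loop can be ported to Python by hand. The function name `doit_step` and its keyword arguments are assumptions. The boolean expressions mirror the `set!` forms above, with the constant `(not #f)` terms dropped.

```python
def doit_step(*, want_happy=False, happy=False, want_happy2=False,
              happy2=False, stay_happy2=False, content=False,
              not_depressed=False):
    """One pass of the compiled update; returns (activation_level, motor_vector)."""
    desirable_make_happy = want_happy and not happy
    desirable_make_happy2 = (want_happy2 and not happy2) or stay_happy2

    make_happy = (desirable_make_happy and content
                  and not desirable_make_happy2)
    make_happy2 = desirable_make_happy2 and not_depressed and content
    make_content = (
        ((desirable_make_happy2 and not_depressed and not content)
         or desirable_make_happy)
        and not (want_happy and happy)
    )
    make_not_depressed = desirable_make_happy2 and not not_depressed

    activation = (make_happy or make_happy2 or make_content
                  or make_not_depressed)
    if make_happy:
        motor = "running-make-happy"
    elif make_happy2:
        motor = "running-make-happy2"
    elif make_content:
        motor = "running-make-content"
    else:
        motor = "running-make-not-depressed"
    return activation, motor


# content? already holds, so make-happy can run directly.
print(doit_step(want_happy=True, content=True))
# (True, 'running-make-happy')

# content? does not hold yet, so make-content runs first.
print(doit_step(want_happy=True))
# (True, 'running-make-content')

# Both goals wanted: make-happy2 has higher priority and shares resource a.
print(doit_step(want_happy=True, want_happy2=True,
                content=True, not_depressed=True))
# (True, 'running-make-happy2')
```

As in the compiled `cond`, the motor vector falls through to `'running-make-not-depressed` even when the activation level is false, so a consumer of the output has to check the activation level as well.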
