1. Contemporary Learning Theory Dr Pam Blundell
Lecture Five
2. Today Introduce instrumental conditioning
How it's different from Pavlovian conditioning
Describe theoretical accounts of instrumental conditioning
3. Objectives At the end of this lecture, students should be able to:
Describe instrumental conditioning procedures
Discuss what is learnt in instrumental conditioning
Evaluate formal models of instrumental conditioning
4. Reading Dickinson p102-122
Pearce Ch 4
5. Instrumental conditioning Both prediction and control are required for successful adaptation in a changing environment
6. Instrumental conditioning Instrumental behaviour refers to those actions whose acquisition and maintenance depend upon the fact that the action is instrumental in causing some outcome
Allows us (and animals) to control our environment in service of our needs and desires
7. Instrumental conditioning Consider approach to a food source
Hungry chick will learn to approach a bowl
Instrumental analysis suggests the animal is sensitive to the contingency between its own behaviour, and access to food
Pavlovian account suggests that predictive relationship between bowl and food is important
8. Instrumental conditioning In a stable environment, we cannot discriminate between those two accounts of the behaviour
We need to change the causal structure of the environment to determine what is governing behaviour
9. Hershberger (1986) Arranged a looking glass world
Approach to a food bowl actually increased the distance to the food bowl
Pavlovian animal (insensitive to the consequences of its actions) would never be able to adapt
Instrumental animal would learn to withdraw
Chicks showed little evidence of learning to run away across 100 minutes of training, and thus were not sensitive to the instrumental contingency
10. Miller & Konorski (1969) Passive leg flexion in a dog, in the presence of a stimulus, was paired with food.
After a number of pairings, dog began to flex leg in presence of stimulus
At odds with the notion of stimulus substitution
Termed type II conditioning.
But doesn't demonstrate the instrumental character of type II conditioning
11. Grindley (1932) Guinea pigs
Trained to turn head to left or right when buzzer sounded to receive food
Reversed the contingency (i.e. making them turn the head the other way)
Animals could perform this
So S-O (Pavlovian) kept constant (buzzer-food)
Behaviour changed
12. Instrumental conditioning Many instrumental tasks have a strong Pavlovian component
Pigeon key peck
Free operant lever pressing in rats is a fairly pure instrumental task
13. Free operant lever pressing Is sensitive to contingency reversal (Davis & Bitterman, 1971)
Trained rats to lever press
Then changed the contingency: either no contingency, or each press postponed delivery of food
Postponed group reduced responding more
14. Bolles, Holtz, Dunn & Hill (1980) Trained rats to press a lever down and push a lever up for food; which action was rewarded was randomly determined, so rats tended to alternate
Punish one category of responding with shock
Suppression only of the response that was punished: sensitivity to the consequences of their actions
16. What is learned in instrumental conditioning? Earliest explanation of instrumental conditioning is the Law of Effect (Thorndike)
Association between stimulus and response, strengthened by presentation of a reinforcer
17. What is learned in instrumental conditioning As in Pavlovian conditioning, this suggests no knowledge of the consequences of the action
Instrumental action is simply a habitual response triggered by the training stimuli
Drive will potentiate habits
18. Tolman (1932, 1959) Cognitive theory of instrumental action
Belief about consequences of action (means-end readiness)
Value assigned to outcome, interacts with expectancy to produce the behaviour
19. How to distinguish between the S-R and the cognitive accounts? As in Pavlovian conditioning, examine the effects of changing reward value!
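The logic of the devaluation test can be sketched in a few lines. This is only a toy illustration, not a model from the lecture: the function names are hypothetical, and the multiplicative expectancy-by-value rule is an assumption made for the example (Tolman's account does not specify the mechanism).

```python
def sr_response_strength(habit_strength, outcome_value):
    """S-R (Law of Effect) prediction: the current outcome value is never consulted."""
    return habit_strength


def cognitive_response_strength(expectancy, outcome_value):
    """Expectancy-value prediction. The product rule is an assumption for illustration."""
    return expectancy * outcome_value


# Same training history in both cases; only the outcome value differs at test.
habit = expectancy = 0.8
for label, value in [("valued", 1.0), ("devalued", 0.0)]:
    print(label,
          sr_response_strength(habit, value),
          cognitive_response_strength(expectancy, value))
```

On the S-R account responding in extinction should be the same whether or not the outcome has been devalued; on the cognitive account it should fall.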
20. Adams and Dickinson (1981) (see also Colwill & Rescorla 1986)
LP1 → food 1
Food 2 delivered non-contingently
4 groups: PF, PS, UF, US (paired, unpaired, food, sucrose)
Test lever pressing in extinction
22. Adams & Dickinson (1981) Supports the cognitive theory of instrumental action
BUT some residual responding in the paired (devalued) groups
When training was reintroduced, the devalued foods were ineffective reinforcers, so the residual responding was not due to ineffective devaluation
Perhaps both S-R and cognitive processes are going on?
24. What is learned in instrumental conditioning Problem with Tolman's account: no specification of the psychological mechanism by which expectancies, beliefs and values interact to cause instrumental action
25. Bidirectional theory Pairing two events not only causes a forward connection between them, but also a backwards connection
26. Bidirectional theory If E1 is instrumental action and E2 is the reinforcer
27. Bidirectional theory Gormezano & Tait (1976)
E1: Airpuff
E2: Water delivery
If backward associations form, then water delivery should elicit eyeblink
But it doesn't
28. Bidirectional theory Can't explain punishment
Action is reduced by punishment
29. Associative-cybernetic model Associative: involves the formation of a connection between representation of action and outcome
Cybernetic: activation of the outcome representation feeds back to modulate performance
31. Habit memory Array of stimulus detecting units linked to an array of response units
Corresponds to URs or pretrained responses
33. Associative memory Representations of actions, and outcomes
Performance of an action activates the representation of that action
Contiguous activation of actions in habit and associative memory allows growth of a connection between the habit and associative representations of the action
Making a response becomes associated with the outcome of that response
35. Incentive system Any event in associative memory that has motivational significance has associations with units in the incentive system
Innate? But learnable! (see next lecture)
Activation of reward units exerts a general and indiscriminate excitation on units in the motor system
Similarly, activation of punishment units inhibits all units in the motor system
37. Associative cybernetic model Important associations:
Habit response --- associative action
(ability to detect and represent the animal's own behaviour)
Associative action --- associative outcome
(ability to detect and represent contingency between action and outcome)
Associative outcome --- reward incentive system
(Represents the desires of the animal)
38. Representations of instrumental actions Habit response --- associative action
(ability to detect and represent the animal's own behaviour)
Shettleworth (1975) compared sensitivity of a variety of behaviours to food conditioning in hamsters
Rearing could be conditioned
Face washing and scratching couldn't
39. Morgan & Nicholas (1979) Presented two levers in the operant chamber after the rat had either face-washed or reared
Animal had to make one response if it had just reared, or the other if it had just washed
They could learn this
Second group, scratching and face washing
Couldn't learn this
Scratching and face washing are poorly represented (no associative memory of scratching in the associative-cybernetic model)
41. Heyes Observational learning
Seeing a conspecific carrying out an action also activates associative representation of the action
Observer rats pushed pole in same direction as demonstrator rats
42. Instrumental learning Associative action --- associative outcome
(ability to detect and represent contingency between action and outcome)
In instrumental conditioning, it is the causal relationship that determines behaviour.
BUT can instrumental behaviour be explained simply by a sensitivity to the temporal contiguity of events?
43. Contiguity
44. Contiguity Animals are certainly sensitive to the contiguity between action and outcome
BUT learning is still maintained, even with a 30s gap between action and outcome
Perhaps the sensitivity to contiguity is due to a difficulty in discriminating a causal relationship in which A → O from a non-contingent schedule in which the outcome occurs frequently, but independently of behaviour
45. Contingency Hammond (1980)
Varied P(O/A) and P(O/-A)
47. Contingency With P(O/-A)=0, higher pressing with higher P(O/A)
As P(O/-A) increases, responding decreases
So outcomes following no response don't act as a delayed reinforcer (which should increase responding), and animals appear sensitive to the causal relationships
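One common way to summarise the two probabilities is the difference P(O/A) - P(O/-A), often written ΔP. The slides do not name this index, so treat it as an added convenience; the function name and the probability values below are illustrative only, not Hammond's actual parameters.

```python
def delta_p(p_outcome_given_action, p_outcome_given_no_action):
    """Contingency index: P(O/A) - P(O/-A). Positive means the action makes the outcome more likely."""
    return p_outcome_given_action - p_outcome_given_no_action


# Hold P(O/A) fixed and add progressively more "free" outcomes (illustrative values).
for p_no_action in (0.0, 0.025, 0.05):
    print(p_no_action, delta_p(0.05, p_no_action))
```

With P(O/A) fixed, the index shrinks to zero as P(O/-A) rises to match it, which is the condition under which the action no longer has any causal effect on the outcome.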
48. Alternatively Perhaps non-contingent reinforcers enhanced Pavlovian approach behaviours at the expense of the instrumental responses?
49. Dickinson & Mulatero (1989) (also see Dickinson, Campos, Varga, & Balleine 1996)
L1 → food; L2 → sucrose
Present non-contingent food
Only L1 responding reduced
Rats are sensitive to the contingencies
50. However
We can still claim contiguity as a crucial element of conditioning, as we can explain the results of Hammond very easily
Context-food associations will be higher in the non-contingent groups, which will block learning of the action-food associations (recall blocking).
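The blocking argument can be made concrete with a small simulation. This is a sketch under stated assumptions: it uses a Rescorla-Wagner style error-correction rule (the slide only says "recall blocking"), treats the context and the action as two cues, and uses the expected outcome in each time bin as the teaching signal so that the run is deterministic. The function name and parameter values are hypothetical.

```python
def rescorla_wagner_asymptote(p_o_given_a, p_o_given_no_a, rate=0.1, n_cycles=2000):
    """Return (context strength, action strength) after repeated training cycles."""
    v_context = 0.0
    v_action = 0.0
    for _ in range(n_cycles):
        # Bin without a response: only the context is present.
        error = p_o_given_no_a - v_context
        v_context += rate * error
        # Bin with a response: context and action share one prediction error.
        error = p_o_given_a - (v_context + v_action)
        v_context += rate * error
        v_action += rate * error
    return v_context, v_action


for p_free in (0.0, 0.05):
    v_context, v_action = rescorla_wagner_asymptote(0.05, p_free)
    print(p_free, round(v_context, 4), round(v_action, 4))
```

Under these assumptions the context absorbs the free outcomes, and the action ends up with only the strength left over, roughly P(O/A) - P(O/-A). A purely contiguity-based rule therefore behaves as if it were sensitive to contingency.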
51. Signalling the non-contingent outcomes increases the rate of instrumental lever pressing (context is overshadowed by the signal; Dickinson & Charnock 1985)
52. In summary Simple contiguity-based learning process provides an account of the sensitivity of instrumental performance to variations in the causal effectiveness of an action
BUT effect of schedules
53. Ratio vs interval schedules Ratio schedules: the more you press the more you earn! FR15. VR20. RR20
Interval schedules: amount of reinforcement received is largely independent of the rate of responding. FI15. VI25. RI30.
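The difference between the two schedule types shows up in how reward rate depends on response rate. The sketch below is an idealisation, not taken from the lecture: it assumes responses occur at random (Poisson) at a steady rate, and that on a random interval schedule a reward is set up after an exponentially distributed time and then held until the next response. The function names are hypothetical.

```python
def ratio_reward_rate(responses_per_s, mean_ratio):
    """Random ratio: each response pays off with probability 1 / mean_ratio."""
    return responses_per_s / mean_ratio


def interval_reward_rate(responses_per_s, mean_interval_s):
    """Random interval: mean time between rewards is the set-up time plus
    the expected wait for the next response (1 / response rate)."""
    return 1.0 / (mean_interval_s + 1.0 / responses_per_s)


# Rewards per second on RR20 and RI30 at three response rates.
for rate in (0.5, 1.0, 2.0):
    print(rate, ratio_reward_rate(rate, 20), interval_reward_rate(rate, 30))
```

Doubling the response rate doubles earnings on the ratio schedule but barely changes them on the interval schedule, which is why the two schedules arrange different action-outcome relationships.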
54. Dawson & Dickinson 1990 Trained rats to chain pull on either RR20 or RI schedules.
Inter-reinforcement interval (IRI) on RI schedules was determined by a yoked animal on an RR20 schedule
Temporal distribution of reinforcers was matched
56. P(O/A) higher on RI schedule
But RR schedule produces more responding
Sensitive to the causal relationship between performance and reward rates
Further evidence from Dickinson (1983), which found responding on ratio schedules more sensitive to reward devaluation
58. Summary Instrumental action mediated by two systems
Associative cybernetic system
Next time we'll discuss incentive learning, and Pavlovian-instrumental interactions!