Operant Conditioning. Year 12 Psychology Unit 4 Area of Study 1 (chapter 10, page 476). Trial and Error Learning. Learning by trying different possibilities until the correct outcome is achieved.
Operant Conditioning Year 12 Psychology Unit 4 Area of Study 1 (chapter 10, page 476)
Trial and Error Learning • Learning by trying different possibilities until the correct outcome is achieved. • Also known as ‘instrumental learning’ because the individual is ‘instrumental’ in learning the correct response. • More recently known as ‘Operant Conditioning’ because the individual ‘operates’ on the environment to solve a problem.
Trial and Error Learning: Edward Thorndike’s Cats • First studies of trial and error learning; he was interested in the study of animal intelligence. • Hungry cat put in a ‘puzzle box’; piece of fish put outside box (could be seen and smelt but was just out of cat’s reach). • To get fish, cat had to push a lever to open door on side of box. • Learning was measured as the time it took to escape from the box. • Cat tried numerous ineffective strategies (trial and error). • Eventually, cat accidentally pushed the lever and the door opened. • The cat was then rewarded with the food. • Cat put into the box again to repeat test: each time, cat used trial and error but became progressively quicker at using the lever. • Number of incorrect behaviours was also reduced. • After approximately 7 trials, cat went directly to lever. • Lever pushing became a deliberate response because the cat had learned the positive consequence of making that response.
Trial and Error Learning: Edward Thorndike’s Cats Activity: 10.11 • Based on his results, Thorndike developed the Law of Effect: • Behaviour that is accompanied or followed by ‘satisfying’ consequences is strengthened (more likely to occur). • E.g. pushing the lever is followed by getting the fish. • A behaviour that is followed by an ‘annoying’ consequence is weakened (less likely to occur). • E.g. not pushing the lever (doing anything else) is followed by still being stuck in the box.
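The Law of Effect can be pictured as a simple bookkeeping rule. The sketch below is only an illustration, not Thorndike's own model: the whole-number 'strength' scores, the starting values and the function names (`apply_consequence`, `most_likely`) are all assumptions made up for this example.

```python
def apply_consequence(strengths, behaviour, satisfying):
    """Return a new strength table after one behaviour and its consequence.

    Strengths are whole-number scores (a made-up unit for this sketch):
    a 'satisfying' consequence adds 1, an 'annoying' one subtracts 1,
    and a score never drops below 0.
    """
    updated = dict(strengths)
    change = 1 if satisfying else -1
    updated[behaviour] = max(0, updated[behaviour] + change)
    return updated


def most_likely(strengths):
    """Return the behaviour with the highest strength score."""
    return max(strengths, key=strengths.get)


# Hypothetical starting point: ineffective behaviours begin stronger
# than lever pushing.
cat = {"push lever": 1, "scratch walls": 2, "bite bars": 2}

# Trial history: (behaviour, was the consequence satisfying?)
trials = [
    ("scratch walls", False),  # still stuck in the box
    ("bite bars", False),      # still stuck in the box
    ("push lever", True),      # door opens, cat gets the fish
    ("push lever", True),
]

for behaviour, satisfying in trials:
    cat = apply_consequence(cat, behaviour, satisfying)

print(cat)
print(most_likely(cat))
```

After these four trials the strengthened response (pushing the lever) outranks the weakened ones, which mirrors the cat going directly to the lever in later trials.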
Operant Conditioning • An organism will tend to repeat behaviours (operants = responses) that have desirable consequences (i.e. rewards) or that will enable it to avoid undesirable consequences (i.e. punishments). • Also, an organism will tend not to repeat behaviours that lead to undesirable consequences. • Stemmed from Thorndike’s work on ‘Instrumental Learning’ with cats. • Most famous experiments in Operant Conditioning were conducted by B.F. Skinner using his ‘Skinner Box’.
Operant Conditioning: Three-Phase Model • Based on Thorndike’s law of effect. • Stimulus (S); • Operant Response (R); • Consequence (C); • The consequence is sometimes also labelled (S) because it is a stimulus in the form of a consequence. So, S → R → C, where the probability of (R) occurring after (S) depends on previous experiences of (C).
Operant Conditioning: Skinner’s Rats • Hungry rat was placed in a Skinner Box. • Scurried around randomly touching floor, walls etc. • Eventually accidentally pressed lever, which dispensed a food pellet: rat ate. • Rat continued random movements and eventually pressed the lever again: rat ate. • With additional repetitions of lever pressing followed by food, the rat’s random movements began to disappear and were replaced by more consistent lever pressing. • Eventually the rat was pressing the lever as fast as it could eat each pellet. • Pellet was a reward (reinforcer) for the correct response.
Elements of Operant Conditioning Activities: 10.13 & 10.17 • Reinforcement: applying a reward/positive stimulus (positive reinforcement) or removing a negative stimulus (negative reinforcement) to encourage the production of desired behaviour. • Reinforcer: any object/event that increases the probability that an operant behaviour will occur again. • Punishment: applying a negative/unpleasant stimulus to discourage unwanted behaviour. • Schedules of Reinforcement: frequency and manner in which a desired response is reinforced (either positively or negatively).
Punishment; Positive Reinforcement (Reward); Negative Reinforcement (if they lay eggs, they don’t get cooked!)
Elements of Operant Conditioning: Schedules of Reinforcement Activity: 10.14 • Continuous Reinforcement: reinforcer is applied immediately after every correct response/behaviour. • Partial Reinforcement: reinforcer is only applied after some correct responses, but not all. Behaviour learned this way is more difficult to change and more resistant to extinction. • Ratio: reinforcement given after a certain number of correct responses. • Interval: reinforcement given after a certain amount of time has passed since the last correct response. • Fixed: reinforcement given on a regular basis, such as after every 3rd response or every 10 seconds. • Variable: reinforcement given in an unpredictable or random way.
Elements of Operant Conditioning: Schedules of Reinforcement • So, using the info from the previous slide, there are four main schedules of partial reinforcement: • Fixed-ratio schedule: ? • Variable-ratio schedule: ? • Fixed-interval schedule: ? • Variable-interval schedule: ? See pages 484-485.
Elements of Operant Conditioning: Schedules of Reinforcement • So, using the info from the previous slide, there are four main schedules of partial reinforcement: • Fixed-ratio schedule: reinforcer given after a set (fixed) number (ratio) of correct responses. • Variable-ratio schedule: reinforcer given after an unpredictable (variable) number (ratio) of correct responses. • Fixed-interval schedule: reinforcer given after a set (fixed) period of time (interval) since the last correct response. • Variable-interval schedule: reinforcer given after an unpredictable (variable) period of time (interval) since the last correct response.
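The four schedules can be compared side by side with a small simulation. This is a rough sketch under stated assumptions, not a standard implementation: the function names, the `run` helper and the random ranges are invented for illustration, and the interval schedules here time from the last reinforced response (starting at time 0), which is one possible reading of the definitions above.

```python
import random


def fixed_ratio(n):
    """Reinforce every n-th correct response."""
    count = 0

    def respond(_time):
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False

    return respond


def variable_ratio(mean_n, rng):
    """Reinforce after an unpredictable number of responses (1 to 2*mean_n - 1)."""
    count = 0
    target = rng.randint(1, 2 * mean_n - 1)

    def respond(_time):
        nonlocal count, target
        count += 1
        if count >= target:
            count = 0
            target = rng.randint(1, 2 * mean_n - 1)
            return True
        return False

    return respond


def fixed_interval(seconds):
    """Reinforce the first response made at least `seconds` after the last reinforcer."""
    last = 0.0

    def respond(time):
        nonlocal last
        if time - last >= seconds:
            last = time
            return True
        return False

    return respond


def variable_interval(mean_seconds, rng):
    """Like fixed_interval, but the waiting time is redrawn after each reinforcer."""
    last = 0.0
    wait = rng.uniform(0, 2 * mean_seconds)

    def respond(time):
        nonlocal last, wait
        if time - last >= wait:
            last = time
            wait = rng.uniform(0, 2 * mean_seconds)
            return True
        return False

    return respond


def run(schedule, times):
    """Feed one correct response per time stamp; return which ones were reinforced."""
    return [schedule(t) for t in times]


# Fixed ratio of 3: every 3rd response is reinforced.
print(run(fixed_ratio(3), range(1, 10)))
# Fixed interval of 10 seconds, with a response every 4 seconds.
print(run(fixed_interval(10), [4, 8, 12, 16, 20, 24, 28]))
# Variable schedules depend on the random draws, so only totals are shown.
rng = random.Random(0)
print(sum(run(variable_ratio(3, rng), range(1, 101))))
print(sum(run(variable_interval(5, rng), range(1, 101))))
```

With the fixed schedules the pattern of reinforcers is fully predictable from the response count or the clock, whereas the variable schedules give a different pattern each time the random seed changes.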
Factors That Influence the Effectiveness of Operant Conditioning • Order of Presentation: reinforcement/punishment must be presented after behaviour so that it is learned as a consequence of that behaviour. • Timing: reinforcement/punishment is most effective when presented immediately after behaviour (also increases strength of response). • Appropriateness: reinforcement/punishment must be specific to the likes/dislikes of the individual (otherwise my ‘reward’ could be your ‘punishment’).
Key Processes in Operant Conditioning Activity: 10.21 • Acquisition: speed may vary depending on complexity of behaviour being learned. • Extinction: less likely to occur when partial reinforcement is used. • Organism is used to not getting reinforcer every time. • Spontaneous Recovery, Stimulus Generalisation and Stimulus Discrimination: same as when discussed in Classical Conditioning.
Applications of Operant Conditioning Activity: 10.22 • Shaping: reinforcement is given for each response that moves closer to the final goal behaviour. • e.g. teaching a baby to talk: “Ddd”, “Daaa”, “Dad”. • Also known as ‘method of successive approximations’. • Token Economies: reinforcers (tokens) are given for desired behaviour and can then be exchanged for other reinforcers (rewards). • Tokens may also be removed as punishment. • Ensures reinforcement (reward) is appropriate. • Could backfire if token is misunderstood or underlying cause of behaviour is not addressed (see page 500).
Classical vs. Operant Conditioning Activity: 10.26 • Role of the Learner: • Passive (classical) vs. active (operant). • Timing of the Stimulus and Response: • Immediate (classical) vs. delayed (operant); • Response depends on stimuli (classical) vs. reinforcer depends on response (operant). • Nature of the Response: • Reflexive/involuntary (classical) vs. voluntary (operant).
Reminders… • The next section of your textbook is ‘One-Trial Learning’ but we have already discussed this in the Classical Conditioning slides. • Page 507 outlines a good experiment. • Don’t forget to keep track of the key knowledge dot points that we are covering and tick each one as you become confident with it. • The person who can best monitor your progress and understanding is YOU – don’t cheat yourself. • Miss Moore is awesome. As if you’d forget that.