Operant Conditioning

Psychology 001: Introduction to Psychology
Christopher Gade, PhD
Office: 621 Heafey
Office hours: F 3-6 and by appt.
Email: gadecj@gmail.com
Class: WF 7:00-8:30, Heafey 650

Operant Conditioning
• The process of learning to associate a behavior with its consequence, so as to behave in ways that maximize reinforcing events and minimize punishing ones.
• Reinforcement: any event that increases the future probability of the preceding behavior.
• Punishment: any event that decreases the frequency of the preceding behavior.
• What makes something reinforcing or punishing?
  • Biologically useful
  • Intrinsically satisfying
  • Restores equilibrium
  • Disequilibrium principle: any behavior that leads to a return to equilibrium will be reinforcing.

Edward Thorndike (1874-1949)
• Originated the idea of instrumental learning.
• Studied cats and other animals learning by trial and error to escape from puzzle boxes.

The Thorndike Laws
• Law of Effect: Behaviors followed by favorable consequences become more likely; behaviors followed by unfavorable consequences become less likely.
• Law of Readiness: A series of responses can be chained together if they belong to the same action sequence and will result in annoyance if blocked.
• Law of Exercise: Connections become strengthened with practice and weakened when practice is discontinued.

B. F. Skinner (1904-1990)
• Skinner attempted to expand on Thorndike’s original theories of instrumental learning. He proposed that learning responds in a very predictable way to rewards and punishments. His work set out to show how the consequences of a behavior influence future behavior (i.e. operant conditioning). The majority of Skinner’s work was done with rats and pigeons in elaborate boxes that he designed. These boxes were called “Skinner Boxes”.

But there’s more…
Not only are there reinforcements and punishments in operant conditioning, but each of these consequences is also either positive (adding something) or negative (taking something away).
The 2x2 Matrix of Operant Conditioning:
• Positive reinforcement: the introduction of a pleasurable stimulus, which increases the likelihood that the behavior will occur again (e.g. chocolate cake).
• Negative reinforcement: the removal of an aversive stimulus, which increases the likelihood that the behavior will occur again (e.g. nagging that stops once you comply).
• Positive punishment: the introduction of an aversive stimulus, which decreases the likelihood that the behavior will occur again (e.g. spanking).
• Negative punishment: the removal of a pleasurable stimulus, which decreases the likelihood that the behavior will occur again (e.g. taking away your allowance).
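
The 2x2 matrix above can be written out as a tiny lookup. This is only an illustrative sketch: the names `Change`, `Stimulus`, and `classify` are made up for this example and do not come from the lecture.

```python
from enum import Enum


class Change(Enum):
    """Hypothetical label: what happens to the stimulus after the behavior."""
    ADDED = "positive"    # something is introduced
    REMOVED = "negative"  # something is taken away


class Stimulus(Enum):
    """Hypothetical label: how the stimulus is experienced."""
    PLEASANT = "pleasant"
    AVERSIVE = "aversive"


def classify(change: Change, stimulus: Stimulus) -> str:
    """Return the cell of the 2x2 matrix for a given consequence."""
    # Adding something pleasant or removing something aversive makes the
    # behavior more likely (reinforcement). The other two combinations
    # make it less likely (punishment).
    reinforcing = (change is Change.ADDED) == (stimulus is Stimulus.PLEASANT)
    kind = "reinforcement" if reinforcing else "punishment"
    return f"{change.value} {kind}"


if __name__ == "__main__":
    for change in Change:
        for stimulus in Stimulus:
            print(f"{change.name:8} {stimulus.name:9} -> {classify(change, stimulus)}")
```

Note that "positive" and "negative" describe only whether a stimulus is added or removed, not whether the outcome is good or bad for the learner.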

What type of reinforcement is this?

What about this one?

A final review of the 2x2 matrix…

A review of other related behavioral concepts…
• Extinction: the weakening of a learned behavior when it is no longer followed by reinforcement.
• Stimulus Generalization: responding to a stimulus that is similar to the originally reinforced stimulus.
• Discrimination: not responding to a stimulus that does not result in reinforcement.

Different Schedules of Reinforcement
• Continuous reinforcement: reinforcement for every correct response.
• Partial/intermittent reinforcement: occasional reinforcement for a correct response.
  a. Fixed ratio: Reward for a behavior after “X” responses. Faster responders get more rewards. Produces high rates of responding, but quick extinction when the reinforcement is removed.
  b. Variable ratio: Reward for a behavior after a variable and unpredictable number of responses. Gambling is a great example of this reward system. The behavior is very hard to extinguish once the connection is made.
  c. Fixed interval: Reward for a behavior after “X” amount of time has passed. Responses are rather sparse during the down time, but become more vigorous right before time X.
  d. Variable interval: Reward for a behavior after a variable and unpredictable amount of time. This produces slow, steady responding.

[Figure: Responses x Time Diagram. Number of responses (0 to 1000) plotted against time (0 to 80 minutes) for the fixed ratio, variable ratio, fixed interval (rapid responding near time for reinforcement), and variable interval (steady responding) schedules.]
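
To make the four partial schedules concrete, here is a minimal simulation sketch. The `Schedule` class, the `rewards` helper, and the uniform draw used for the variable schedules are all assumptions made for illustration; the lecture does not specify how the "variable" requirement is chosen.

```python
import random


class Schedule:
    """Decides whether a response made at time `now` earns a reward.

    kind: "fixed_ratio", "variable_ratio", "fixed_interval",
          or "variable_interval"
    x:    responses required (ratio) or time required (interval); x >= 1
    """

    def __init__(self, kind: str, x: float, seed: int = 0):
        self.kind = kind
        self.x = x
        self.rng = random.Random(seed)
        self.responses = 0      # responses since the last reward
        self.last_reward = 0.0  # time of the last reward
        self.required = self._draw()

    def _draw(self) -> float:
        if self.kind.startswith("fixed"):
            return self.x
        # Variable schedules: the requirement is unpredictable but centered
        # near x. A uniform draw is an illustrative assumption only.
        return self.rng.uniform(1, 2 * self.x - 1)

    def respond(self, now: float) -> bool:
        self.responses += 1
        if self.kind.endswith("ratio"):
            earned = self.responses >= self.required
        else:
            earned = now - self.last_reward >= self.required
        if earned:
            # Reset the counters and pick the next requirement.
            self.responses = 0
            self.last_reward = now
            self.required = self._draw()
        return earned


def rewards(kind: str, x: float, response_times) -> int:
    """Count the rewards earned by responding at the given times."""
    schedule = Schedule(kind, x)
    return sum(schedule.respond(t) for t in response_times)


# Two hypothetical learners over the same 60 minutes.
fast = [t * 0.5 for t in range(1, 121)]  # one response every half minute
slow = [t * 2.0 for t in range(1, 31)]   # one response every two minutes

if __name__ == "__main__":
    for kind in ("fixed_ratio", "fixed_interval"):
        print(kind, rewards(kind, 10, fast), rewards(kind, 10, slow))
```

Under these assumptions, the fast responder earns four times as many rewards as the slow one on the fixed ratio schedule, while both earn the same number on the fixed interval schedule, where extra responses during the down time go unrewarded. Continuous reinforcement corresponds to a fixed ratio with X = 1.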

Effectiveness of Reinforcement
• All else being equal, most people learn fastest with immediate reinforcement or immediate punishment.
• Punishment tends to be ineffective except for temporarily suppressing undesirable behavior.
• Mild, logical, and consistent punishment can be informative and helpful.
• Though vicarious reinforcement can be effective, vicarious punishment often is not.

And now…
• After learning about how we take in information, we’re going to examine how we keep that information in our heads.
• Namely, the next series of lectures is going to discuss memory and the way the mind stores information.
• See you in the next class.
