Chapter 5 – Instrumental Conditioning: Foundations • Outline 1 • Comparison of Classical and Instrumental Conditioning • Early Investigations of Instrumental Conditioning. • Thorndike • Chicks and mazes • Cats and puzzle box • Modern approaches to the study of instrumental conditioning • Discrete trials Procedures • Free-operant procedures • Magazine training and shaping • Response rate as measure of operant behavior • Instrumental Conditioning Procedures • Positive Reinforcement • Negative Reinforcement (Escape or Avoidance) • Positive Punishment (Punishment) • Negative Punishment (Omission Training/DRO)
Comparison of Classical and Instrumental Conditioning. • Classical = S-S relationship • Light-shock – elicits fear • Tone-food – elicits salivation • In Classical Conditioning there is no response requirement. • Instrumental = R-S relationship • We will refer to it as R-O • Behavior (Response) is instrumental in producing the outcome • Press lever – Get food • Pull lever – Get money • In Instrumental Conditioning a particular response is required
Keep in mind that Classical and Instrumental conditioning are approaches to understanding learning • not completely different kinds of learning • Many learning situations could be described by either approach • Child touches hot stove • CS, US, UR, CR? • Pavlovian = fear stove • Instrumental = less likely to approach • Conditioned Taste Aversion • S-S = Taste – LiCl • R-O = Drink liquid – get sick (punished)
Early Investigations of Instrumental Conditioning • Edward Lee Thorndike (American) • Same time as Pavlov • Late 1800s, early 1900s • Interested in animal intelligence • Late 19th century – many people believed that animals reasoned, like people do • Romanes • Stories of amazing abilities of animals • Biased reporting? • Report interesting behavior • Ignore stupid behavior
Thorndike (1898) • “Dogs get lost hundreds of times and no one ever notices it or sends an account of it to a scientific magazine, but let one find his way home from Brooklyn to Yonkers and the fact immediately becomes a circulating anecdote. Thousands of cats on thousands of occasions sit helplessly yowling, and no one takes thought of it or writes to his friend, the professor; but let one cat claw at the knob of a door supposedly as a signal to be let out, and straightway this cat becomes representative of the cat-mind in all the books…In short, the anecdotes give really the…supernormal psychology of animals.”
Thorndike attempted to understand normal or ordinary animal intelligence. • Chicks in a maze • Cats in a box
[Maze diagram: Start and Food (goal) locations]
Thorndike tested many animals • chicks, cats, dogs, fish, and monkeys. • little evidence for reasoning. • Instead learning seemed to result from trial and accidental success
Modern approaches to the study of instrumental conditioning • Discrete trials Procedures • Like Thorndike’s work • Simpler mazes though • Figure 5.3 • Straight alleyway (running speed) • T-Maze (errors) • Later • radial arm maze • Morris Water Maze • Note • each run is separated by an intertrial interval • Just like in Pavlovian Conditioning
Free-operant procedures • There are no “trials” • The animal is allowed to behave freely • Skinner Box • an automated method for gathering data from animals
Skinner • Rat in Operant Chamber
Skinner used these boxes to study operant conditioning • operant • any response that “operated” on the environment • Defined in terms of the effect it has on the environment
Pressing a lever for food • Doesn’t matter how the rat does it • Right paw, left paw, tail • As long as it actuated the switch • Similar to opening a door • Doesn’t matter which hand you use • Or foot (carrying groceries) • Just as long as the result is achieved
Magazine training and shaping • First have to train the animals about availability of food • training by successive approximations (shaping) • Shaping a rat
Response rate as measure of operant behavior • In a free-operant situation you do not have measures such as percent correct, or errors. • Skinner used response rate as a primary measure • We will see later that various schedules of reinforcement affect response rate in various ways.
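As a minimal illustration (the numbers here are hypothetical, not from the text), response rate in a free-operant session is simply the count of responses divided by observation time:

```python
def response_rate(n_responses: int, session_minutes: float) -> float:
    """Responses per minute over a free-operant session."""
    return n_responses / session_minutes

# A rat that presses the lever 120 times in a 30-minute session:
print(response_rate(120, 30))  # 4.0 responses per minute
```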
Instrumental Conditioning Procedures • First let’s get some terminology down • Positive • Behave = Stimulus applied • Negative • Behave = Stimulus removed • Reinforcement • Behavior increased • Punishment • Behavior decreased • 2 x 2 table
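The 2 x 2 table above can be sketched as a small lookup. This is a hypothetical illustration of the terminology, not anything from the textbook's materials: Positive/Negative names whether a stimulus is applied or removed, and Reinforcement/Punishment names whether the behavior increases or decreases.

```python
def classify(stimulus: str, behavior: str) -> str:
    """Classify an instrumental procedure from the 2 x 2 table.

    stimulus: "applied" (positive) or "removed" (negative)
    behavior: "increases" (reinforcement) or "decreases" (punishment)
    """
    sign = {"applied": "Positive", "removed": "Negative"}[stimulus]
    effect = {"increases": "Reinforcement", "decreases": "Punishment"}[behavior]
    return f"{sign} {effect}"

# Press lever -> food delivered, pressing increases:
print(classify("applied", "increases"))   # Positive Reinforcement
# Head banging -> attention withheld, banging decreases:
print(classify("removed", "decreases"))   # Negative Punishment
```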
Application • Box 5.2 in book • Mentally disabled woman • Head banging behavior • Possibly to get attention - reward • Change contingencies • Ignore head banging • Social rewards when not head banging • Procedure? • Negative Punishment • Differential Reinforcement of Other Behavior (DRO)
Your book uses somewhat different terminology • Positive Reinforcement • Same • Escape or Avoidance • Negative Reinforcement • Punishment • Positive Punishment • Omission Training/DRO • Negative Punishment • We will use my terminology
Outline 2 • Fundamental Elements of Instrumental Conditioning • The Response • Behavioral variability vs. Stereotypy • Relevance or Belongingness • Behavior Systems and Constraints on Instrumental Conditioning • The Reinforcer • Quantity and Quality of the Reinforcer • Shifts in Reinforcer Quality or Quantity • The Response-Outcome Relation • Temporal Relation • Contingency • Skinner's Superstition experiment • The Triadic Design • Learned helplessness hypothesis
Behavioral variability vs. Stereotypy • Thorndike emphasized the fact that reinforcement increases the likelihood of the behavior being repeated in the future • Uniformity – stereotypy • This is often true • You can increase response variability, however, by requiring it • Page & Neuringer, 1985 • Two keys (50 trials per session) • Novel group • Peck 8 times • Do not repeat a pattern that was used in the last 50 trials • LRLRLRLR • RLRLRLRL • LLRRLLRR • LLLLRRRR • Control group • RF 8 pecks • Doesn’t matter how they do it. • Figure 5.8
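The Novel-group contingency above can be sketched as a membership check against a sliding window of recent trials. The function names and details below are illustrative assumptions, not Page and Neuringer's actual procedure code:

```python
from collections import deque

def make_novelty_check(window: int = 50):
    """Return a trial function implementing the Novel-group rule:
    an 8-peck L/R pattern is reinforced only if it did not occur
    in the last `window` trials."""
    recent = deque(maxlen=window)  # oldest trials fall off automatically

    def trial(pattern: str) -> bool:
        novel = pattern not in recent
        recent.append(pattern)
        return novel  # True -> reinforce this trial

    return trial

check = make_novelty_check()
print(check("LRLRLRLR"))  # True: first occurrence, reinforced
print(check("LRLRLRLR"))  # False: repeated within the last 50 trials
print(check("LLRRLLRR"))  # True: a different pattern
```

The Control group, by contrast, would reinforce any 8 pecks regardless of pattern.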
Relevance or Belongingness • We have discussed this in Classical Conditioning • Bright Noisy Tasty water • Peck differently for water and Grain • It has also been studied extensively in the Instrumental literature • Originally noted by Thorndike • Cat – puzzle box • Train cat to yawn to escape • Or scratch themselves • Did not go well
The Brelands' "The Misbehavior of Organisms" (1961) • A play on Skinner's "The Behavior of Organisms" (1938) • Students of Skinner • Trained animals to do tricks as advertising gimmicks • Raccoon and coin(s) • Shaped to pick up coin • Then to place it in a bank • Then 2 coins • Pig and wooden coin
Instinctive drift • Arbitrarily established responses drift toward more innately organized behavior (instinct) • The arbitrary operant • place coin in bank • Instinctive drift • species specific behaviors related to food consumption • Wash food • Root for food
Behavior Systems and Constraints on Instrumental conditioning • Behavior Systems (Timberlake) • the response that occurs in a learning episode is related to the particular behavioral system that is active at the time • If you are a hungry rat and food is the reward • behaviors related to foraging will increase • If you are a male quail maintained on a light cycle that indicates mating season and access to a female is being offered • mating behaviors will be elicited
Behavior systems continued • The effectiveness of any procedure for increasing an instrumental response will depend on the compatibility of that response with behavioral system currently activated • rats pressing levers for food? • pigeons pecking keys for food? • Very easy to train • Even works for fish • Easy • bite a stimulus associated with a rival male • Swim through hoops for a stimulus associated with female • Difficult • Bite stimulus associated with access to female • Swim through hoops for access to rival male
The Reinforcer • How do qualities of the reinforcer affect instrumental conditioning? • Quantity and Quality of the Reinforcer • Just like in Pavlovian conditioning • More intense US = better conditioning • Trosclair-Lassere et al. (2008) • Taught autistic child to press button for social reward • Praise, hugs, stories • Social reward • 10 s • 105 s • 120 s • Progressive ratio
Magnitude of RF and drug abstinence • Perhaps not surprising • The more you pay the better addicts do.
Shifts in Reinforcer Quality or Quantity • How well a reinforcer works depends on what subjects are used to receiving • Mellgren (1972) • Straight alleyway • Phase 1 • Half – found 2 pellets (low reward) • Half – found 22 pellets (high reward) • Phase 2 • Half from each group switched to opposite condition • Other half stay the same
4 conditions • H-H high RF control • L-L low RF control • L-H Positive contrast • H-L Negative contrast • Results – Figure 5.10 [note Domjan changed it to Small (S) and Large (L) rewards in text]
Response-Outcome Relation • Temporal relation? • Contiguity • Causal relation? • Contingency • Independent • Contiguous doesn't mean contingent • You wake up – sun rises • Contingent not always contiguous • Submit tax returns • Wait a few weeks for the money
Effects of temporal relationship • Hard to get animals to respond if long delay between response and reward • Delays are tough experiments to run using free operant procedure • Allow barpressing during “delay?” • Some barpresses likely close to RF • Enforce no barpressing after initial barpress? • Delay RF?
In study in book (Fig. 5.11) each bar press resulted in RF at a specified set time • For some animals the delay was short • 2 - 4 s • For some it was long • 64 s • Example (16 s delay) • Bar press 1 at 1 s (RF1 = 17 s) • Bar press again at 3 s (RF2 at 19s) • 14 s from RF1 • Bar press again at 12 s (RF3 at 28 s) • 5 s from RF1 • There will still be some “accidental” contiguity • Graph shows how responding is affected by actually experienced delays
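The bookkeeping in the example above can be sketched as follows: each press schedules a reinforcer at press time plus the programmed delay, but the delay the animal actually experiences is the time from a press to the nearest reinforcer delivered at or after it, which may have been scheduled by an earlier press. This is a sketch of the arithmetic only, not the original study's procedure:

```python
def experienced_delays(press_times, programmed_delay):
    """For each bar press, return the delay actually experienced:
    time until the nearest reinforcer delivered at or after that press."""
    rf_times = [t + programmed_delay for t in press_times]
    return [min(rf - t for rf in rf_times if rf >= t) for t in press_times]

# Slide example: 16 s programmed delay, presses at 1, 3, and 12 s.
# Reinforcers are delivered at 17, 19, and 28 s, so the experienced
# delays are 16, 14, and 5 s -- "accidental" contiguity for later presses.
print(experienced_delays([1, 3, 12], 16))  # [16, 14, 5]
```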
Why are animals so sensitive to the delay between response and outcome? • Delay makes it difficult to determine which behavior actually caused RF • Press lever (contingent after 20 s) • Scratch ear • Dig in bedding on the floor • Rear up • Clean face • Reward • All of the other behaviors are more contiguous with RF than is lever pressing
A marking procedure can help maintain responding over a long delay • Provide a light or click after “target” response • Lever press – click .....20 s food • Helps animals bridge the gap
Response-Reinforcer Contingency • Skinner thought contiguity was more important than contingency • Superstition experiment • Superstition and bowling video from YouTube • relevant content begins at 3:12
Reinterpretation of Superstition Experiment • Behavioral systems again • Different kinds of responses occur when periodic RF is used • Focal search • Behaviors near food cup as time for RF approaches • "I know it's coming" • Post-food focal search • Again – activity near cup • "Did I miss any?" • General search • Move away from cup • This is probably when Skinner saw the turning and head-tossing behaviors • "I have to wait, might as well look around" • "I am also a bit frustrated" • There is evidence for these patterns of responding • Staddon and Simmelhag (1971) • There is also evidence that slightly different patterns emerge with food vs. water RF.
Effects of the controllability of Reinforcers • Is having control a good thing? • We briefly mentioned the Brady Executive Monkey study earlier. • That study implied having operant control over outcomes could be bad for the animal • That study was confounded • Better evidence for the effects of control over outcomes comes from the learned helplessness literature