PSY402 Theories of Learning

PSY402 Theories of Learning Chapter 8, Theories of Appetitive and Aversive Conditioning

Operant Conditioning • The nature of reinforcement: • Premack’s probability differential theory • Response deprivation theory • Behavioral economics: • Behavioral allocation – blisspoint • Choice behavior – Herrnstein’s matching law. • Momentary maximization theory • Delay-reduction theory

Probability-Differential Theory • Premack – a reinforcer can be any activity that is more likely to occur than the reinforced behavior. • Manipulators vs eaters • High probability behaviors can be used as reinforcers of low probability behaviors. • Frequency of the reinforcer decreases when it is made contingent on another response.

Activities can be Reinforcers Playing with toys reinforces working math problems correctly

Response Deprivation Theory • Timberlake & Allison – deprivation occurs when an activity is used as a reinforcer and is not freely emitted. • The activity is reinforcing because it satisfies the deprivation created. • The animal tries to return to its pre-deprivation level of responding. • Activities can be reinforcing even if their initial baselines were not higher.
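The deprivation idea can be stated as a simple test. A minimal sketch, assuming the inequality usually attributed to Timberlake & Allison: a schedule that requires I units of the instrumental response for C units of the contingent activity creates deprivation when I/C is greater than O_i/O_c, where O_i and O_c are the free baseline levels of the two activities. The function name and the numbers in the usage lines are hypothetical.

```python
def creates_deprivation(instrumental_req, contingent_amt,
                        base_instrumental, base_contingent):
    """Return True if the schedule holds the contingent activity below baseline.

    instrumental_req  -- units of the instrumental response required (I)
    contingent_amt    -- units of the contingent activity earned (C)
    base_instrumental -- free baseline level of the instrumental response (O_i)
    base_contingent   -- free baseline level of the contingent activity (O_c)
    """
    # Cross-multiplied form of I/C > O_i/O_c, which avoids division by zero.
    return instrumental_req * base_contingent > contingent_amt * base_instrumental


# Hypothetical baselines: 30 units of running, 10 units of drinking.
# Drinking has the LOWER baseline, yet it can still act as a reinforcer
# if the schedule restricts it.
print(creates_deprivation(30, 2, 30, 10))   # restrictive schedule
print(creates_deprivation(30, 10, 30, 10))  # schedule matches baseline ratio
```

On these made-up numbers the first schedule deprives the animal of drinking and the second does not, which is the sense in which an activity can be reinforcing even when its baseline is not higher.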

Behavioral Allocation • Blisspoint (paired basepoint) – the free operant level of two responses. • Unrestricted responding with two choices of behaviors. • Blisspoint is used to figure out how much behavior an animal will engage in to obtain a reward. • Animals try to get as close to the blisspoint as possible.

Finding the Blisspoint

Contingency Lines for Rewards

Problems with Contingencies • Blisspoint is established by looking at behavior before a contingency is established. • The established contingency must take blisspoint into account or it may not increase desired behavior.
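The claim that animals get as close to the blisspoint as possible can be illustrated with a minimum-distance calculation. This is a sketch under stated assumptions, not the model used in the lecture: the contingency is a straight line through the origin (reward = rate × instrumental responding), both activities are weighted equally, and distance is plain Euclidean distance. The function name and all numbers are hypothetical.

```python
def closest_point_on_schedule(blisspoint, reward_per_response):
    """Point on the contingency line nearest the blisspoint.

    blisspoint          -- (instrumental, reward) levels under free access
    reward_per_response -- slope k of the contingency line: reward = k * instrumental
    """
    bx, by = blisspoint
    k = reward_per_response
    # Orthogonal projection of (bx, by) onto the line through the origin
    # with direction (1, k).
    t = (bx + k * by) / (1 + k * k)
    return t, k * t


bliss = (10.0, 20.0)  # hypothetical baseline: 10 units of responding, 20 of reward
for k in (0.5, 2.0, 4.0):
    x, y = closest_point_on_schedule(bliss, k)
    print(f"k={k}: instrumental={x:.2f}, reward={y:.2f}")
```

With these numbers, the lean schedule (k = 0.5) pushes instrumental responding above its baseline of 10, the schedule that passes through the blisspoint (k = 2) leaves behavior unchanged, and the rich schedule (k = 4) lowers instrumental responding. That is the problem the slide describes: a contingency set without regard to the blisspoint may fail to increase the desired behavior.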

Choice Behavior • Herrnstein’s matching law – describes how animals act when they have two or more choices. • Different responses have different schedules of reinforcement. • Responding to each choice is proportionate to the reinforcement for each choice – after learning. • This can be expressed mathematically.

Mathematical Expression • The formula for the matching law is: R1 / (R1 + R2) = r1 / (r1 + r2) • where R1 and R2 are the rates of response for two alternative responses • and r1 and r2 are the rates of reinforcement for those responses.
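As a worked example, the matching law says the share of responses on alternative 1 equals alternative 1's share of the reinforcement. The sketch below computes that predicted share; the function name and the reinforcement rates are hypothetical.

```python
def matching_law_share(r1, r2):
    """Predicted proportion of responses on alternative 1.

    r1, r2 -- rates of reinforcement for the two alternatives
              (any common unit, e.g. reinforcers per hour)
    """
    if r1 < 0 or r2 < 0:
        raise ValueError("reinforcement rates must be non-negative")
    total = r1 + r2
    if total == 0:
        raise ValueError("at least one alternative must be reinforced")
    return r1 / total


# Hypothetical schedules: key 1 pays 40 reinforcers per hour, key 2 pays 20.
share = matching_law_share(40, 20)
print(f"Predicted share of pecks on key 1: {share:.2f}")
```

With twice the reinforcement on key 1, the law predicts about two thirds of the pecks on key 1.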

Law Predicts Pecking Behavior

Delayed Gratification • Why does anyone choose a smaller reward part of the time? • Animals and people typically choose a small immediate reward over a larger delayed reward. • Large rewards are selected when: • The choice is made in advance of reward. • Reinforcers are not visible or reward is already present (pleasurable activity).

Complexities of the Matching Law • Maximizing law – sometimes the aim is to obtain as many rewards as possible. • Explains FR-10 vs FR-40 schedules. • Doesn’t work for VI vs VR schedules. • Momentary maximization theory – choose best alternative at the time. • Delay reduction theory – choose what will get the reward the fastest.
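The FR-10 versus FR-40 case can be checked with simple arithmetic. A minimal sketch, assuming a fixed budget of responses that can be split between two fixed-ratio schedules and that each schedule counts only its own responses; the function names and the budget of 400 responses are hypothetical.

```python
def rewards_earned(responses_a, responses_b, ratio_a=10, ratio_b=40):
    """Rewards from two fixed-ratio schedules for a given split of responses."""
    # Each schedule pays one reward per completed block of `ratio` responses.
    return responses_a // ratio_a + responses_b // ratio_b


def best_allocation(total_responses, ratio_a=10, ratio_b=40):
    """Number of responses to give schedule A to earn the most rewards."""
    best_a = max(
        range(total_responses + 1),
        key=lambda a: rewards_earned(a, total_responses - a, ratio_a, ratio_b),
    )
    return best_a, rewards_earned(best_a, total_responses - best_a, ratio_a, ratio_b)


responses_on_fr10, rewards = best_allocation(400)
print(responses_on_fr10, rewards)
```

Under these assumptions every response moved from FR-10 to FR-40 earns less, so maximizing predicts exclusive responding on the FR-10 schedule.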

Aversive Theories: Explaining Avoidance • The existence of avoidance behavior implies a cognitive process: • Behaving in order to prevent an aversive event. • Behaviorists like Hull needed to explain this without cognition. • Mowrer’s two-factor theory was developed to explain this – but it has problems needing explanation.

Mowrer’s Two-Factor Theory • Mowrer proposed a drive-based two-factor theory to avoid explaining avoidance using cognitive (mentalistic) concepts. • Avoidance involves two stages: • Fear is classically conditioned to the environmental conditions preceding an aversive event. • Cues evoke fear -- an instrumental response occurs to terminate the fear.

Mowrer’s View (Cont.) • We are not actually avoiding an event but escaping from a feared object (environmental cue). • Miller’s white/black chamber – rats were escaping the feared white chamber rather than avoiding an anticipated shock. • Fear reduction rewards the escape behavior.

Criticisms of Two-Factor Theory • Avoidance behavior is extremely resistant to extinction. • Should extinguish with exposure to CS without UCS, but does not. • Levis & Boyd found that animals do not get sufficient exposure duration because their behavior prevents it. • Avoidance persists if long latency cues exist closer to the aversive event.

Is Fear Really Present? • When avoidance behavior is well-learned the animals don’t seem to be afraid. • An avoidance CS does not suppress operant responding (no fear). • However, this could mean that the animal’s hunger is stronger than the fear. • Strong fear (drive strength) is not needed if habit strength is large.

Avoidance without a CS • Sidman avoidance task – an avoidance response delays an aversive event for a period of time. • There is no external cue to when the aversive event will occur – just duration. Temporal conditioning. • How do animals learn to avoid shock without any external cues for the classical conditioning of fear?
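The timing rule of a Sidman task can be sketched as a small simulation. The sketch assumes the usual two-interval arrangement: with no responding, shocks arrive every shock-shock interval, and each response postpones the next shock by a response-shock interval. The function name, the interval values, and the response times are hypothetical.

```python
def count_shocks(response_times, session_length,
                 ss_interval=5.0, rs_interval=20.0):
    """Count shocks delivered in a Sidman (free-operant) avoidance session.

    response_times -- times (in seconds) at which the animal responds
    session_length -- length of the session in seconds
    ss_interval    -- shock-shock interval when no response occurs
    rs_interval    -- response-shock interval: delay bought by each response
    """
    responses = sorted(response_times)
    shocks = 0
    next_shock = ss_interval
    i = 0
    while next_shock <= session_length:
        if i < len(responses) and responses[i] < next_shock:
            # A response before the scheduled shock resets the clock.
            next_shock = responses[i] + rs_interval
            i += 1
        else:
            # No response in time: the shock is delivered.
            shocks += 1
            next_shock += ss_interval
    return shocks


print(count_shocks([], 60))                # never responds
print(count_shocks([4, 19, 34, 49], 60))   # responds within every 20 s window
```

Nothing in the procedure signals the shock; the only thing the animal can use is the time elapsed since its last response, which is why the task is a problem for accounts that require an external fear cue.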

Kamin’s Findings • Avoidance of the UCS, not just termination of the CS (and the fear), matters in avoidance learning. • Four conditions: • Response ends CS and prevents UCS. • Response ends CS but doesn’t stop UCS. • Response prevents UCS but CS stays. • CS and UCS, response does nothing (control condition).

Both Factors are Important Termination and Avoidance both show greater learning

D’Amato’s Acquired Motive View • D’Amato proposed that both pain and relief motivate avoidance. • Anticipatory pain & relief responses. • Shock elicits an unconditioned pain response R_P, and its stimulus consequences S_P motivate escape. • Classically conditioned cues elicit an anticipatory pain response r_P whose stimulus consequences s_P motivate escape from the CS.

Anticipatory Relief Response • Termination of the UCS produces an unconditioned relief response R_R with stimulus consequences S_R. • Conditioned cues elicit an anticipatory relief response r_R with stimulus consequences s_R. • Example: dog bite elicits pain response, sight of dog elicits anticipatory pain, house elicits relief.

A Discriminative Cue is Needed • During trace conditioning no cue is present when UCS occurs and no avoidance learning occurs. • A second cue presented during avoidance behavior slowly acquires r_R-s_R conditioning. • Similarly, in a Sidman task, cues predict relief -- associated with avoidance behavior, not the UCS.

A Second Cue Helps Trace Learning Group TS saw a second cue associated with termination of shock

Thorndike’s Negative Law of Effect • Thorndike suggested that punishment weakens an S-R bond. • Skinner’s finding that suppression of behavior is temporary contradicts this. • The effect of punishment must be something different than weakening of the S-R bond.

Guthrie’s View of Punishment • When punishment occurs, the response to it is conditioned to the environment during the event. • Freezing, jumping, flinching. • The effect on behavior depends on the UCR elicited by the shock. • Shock to forepaws inhibits running but a shock to hindpaws facilitates it. • Monkeys struggle more when shocked.

Guthrie’s Competing Response Theory • Guthrie suggested that punishment works only if the response elicited by the punishment is incompatible with the punished behavior. • Gerbils punished for standing upright do it more, not less.

Problems with Guthrie’s Theory • Response competition alone is insufficient to make punishment effective. • When punishment is contingent instead of just co-occurring, it is more effective. • Contingent means the punishment happens only when the behavior occurs, not randomly or independently of it.

Estes’s Motivational View • When a behavior is rewarded, the motivational system becomes associated with the behavior. • The response occurs the next time the motivational system is activated. • Punishment works by changing the motives. • Stimuli associated with punishment inhibit the motivational state.

Support for Estes • Thirsty rats were trained to lever press for water and “dry lick” for air on alternate days. • Punishment of both behaviors had a greater effect on dry licking (a thirst-related behavior) than lever pressing. • If the behavior rather than the motive were being suppressed no such difference should occur. • Results differed with hungry rats.
