Lecture 4: Operant Conditioning/Reinforcement Theory Dr. Arra
Reinforcement Theory and Learning • Reinforcement contingencies associated with a behavior have a lot to do w/ whether the organism actually performs the behavior; organisms do not just avoid pain and seek pleasure • Thorndike: Law of effect- learning only occurs if there is reinforcement - refuted; learning does occur w/out reinforcement (Tolman > Latent Learning)
Reinforcement Learning based on Rational Behavior Theory • Organisms learn behavior consequences from exploring the environment • Organisms tend to behave rationally using the contingencies they have learned in the environment; they select the behavior that creates the best state of affairs for them • Rational behavior does not imply conscious deliberation on the organism's part • Simple associative mechanisms (e.g., S>R, R>S) often produce highly adaptive behavior
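The idea that an organism learns consequences by exploring and then selects the behavior with the best outcome can be sketched in a few lines. This is only an illustration, not a model from the lecture: the behavior names, the payoff values, the `epsilon` exploration rate, and the function `learn_contingencies` are all hypothetical. The update is a simple running average, i.e. an associative rule with no deliberation.

```python
import random

def learn_contingencies(outcomes, trials=200, epsilon=0.1, seed=0):
    """Learn the average outcome of each behavior, then mostly pick the best one.

    outcomes: dict mapping a behavior name to a function returning its payoff.
    All names and numbers here are illustrative assumptions.
    """
    rng = random.Random(seed)
    behaviors = list(outcomes)
    estimates = {b: 0.0 for b in behaviors}  # learned contingencies
    counts = {b: 0 for b in behaviors}
    choices = []
    for trial in range(trials):
        if trial < len(behaviors):
            behavior = behaviors[trial]           # explore: try each behavior once
        elif rng.random() < epsilon:
            behavior = rng.choice(behaviors)      # occasional further exploration
        else:
            behavior = max(behaviors, key=estimates.get)  # best state of affairs
        payoff = outcomes[behavior]()
        counts[behavior] += 1
        # Running average: a simple associative update
        estimates[behavior] += (payoff - estimates[behavior]) / counts[behavior]
        choices.append(behavior)
    return estimates, choices

# Hypothetical environment: lever pressing pays off most
environment = {
    "press_lever": lambda: 1.0,
    "groom": lambda: 0.2,
    "wait": lambda: 0.0,
}
estimates, choices = learn_contingencies(environment)
print(max(estimates, key=estimates.get))
```

After the initial exploration the learner chooses the lever on most trials, which is the "rational" pattern the slide describes, produced by nothing more than stored averages.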
Operant Conditioning (Skinner, 1938) • Basic Principle of Operant Conditioning- a response followed by a reinforcer is strengthened and is therefore more likely to occur again • Reinforcer- a stimulus that increases the frequency of a response it follows - Must follow the response, Must follow immediately (contiguity, tied to response), Must be contingent on the response • Primary and Secondary Reinforcers
Kinds of Reinforcing Stimuli • Tangible – food, toys, stickers • Social – smile, hug, pat on back, verbal • Activity – opportunity to engage in favorite activity (Premack) • Intrinsic – engage in activities for own personal good feelings • Extrinsic – behavior is externally motivated; engage in activities for approval (to please) of others
Operant Conditioning - Terms • Operant – voluntary response that has an effect on the environment • Free operant level – frequency of an operant in the absence of reinforcement (baseline) • Target behavior – behavior that you wish to change • Terminal behavior – what you want the outcome behavior to look like (form, frequency)
Operant Conditioning - Terms • Extinction – do not reinforce behavior, behavior returns to baseline level • Extinction Burst – brief increase in behavior before it extinguishes • Shaping- differential reinforcement of successive approximations of a desired behavior • Chaining- specific sequence of responses, each associated w/ a particular stimulus - task analysis, B>R, BB>R, BBB>R - total task presentation: training on each step of the task analysis during every session
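The chaining notation B>R, BB>R, BBB>R can be read as a growing chain of behaviors that must be completed before the reinforcer is delivered. The sketch below assumes that reading (forward chaining) and contrasts it with total task presentation. The hand-washing task analysis and both function names are hypothetical.

```python
def forward_chaining_stages(steps):
    """Return the chain required at each stage: [B], [B, B], [B, B, B], ..."""
    return [steps[: i + 1] for i in range(len(steps))]

def total_task_sessions(steps, sessions):
    """Total task presentation: every step is trained in every session."""
    return [list(steps) for _ in range(sessions)]

# Hypothetical task analysis for hand washing
task_analysis = ["turn on water", "wet hands", "apply soap", "rinse"]

for stage in forward_chaining_stages(task_analysis):
    print(" > ".join(stage), "> reinforcer")
```

Each printed line is one training stage; the reinforcer moves to the end of a longer chain as the learner masters the earlier steps.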
Reward and Punishment • Positive reinforcement – a stimulus presented after a response increases the likelihood of the behavior reoccurring • Negative reinforcement – removal of an aversive (unpleasant) stimulus after a response increases the likelihood of the behavior reoccurring • Punishment – an event that suppresses a behavior; acts immediately; behaviors can reappear when punishment stops or the punisher is absent - 2 types: present an aversive (verbal, physical) or withdraw a positive event (timeout, response cost) - pros and cons of punishment
Reward and Punishment • Factors that affect punishment: - delay between response and punishment - severity of punishment (stronger = greater suppression) - consistency of use
Premack’s Theory of Reinforcement (Premack Principle) • Premack (1959) • All behaviors have value to the organism • Therefore, more valued behavior reinforces a less valued behavior • Thus, more desirable behaviors are made contingent upon less desirable behaviors • Example: classroom
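As a small illustration of the Premack principle in a classroom, suppose each activity has a relative value for a given child. The values below are made up for the example, and `possible_reinforcers` is a hypothetical helper: it lists the activities that could be made contingent on completing a less valued one.

```python
def possible_reinforcers(values, target):
    """Return behaviors valued more highly than `target`, most valued first."""
    higher = [b for b in values if values[b] > values[target]]
    return sorted(higher, key=values.get, reverse=True)

# Hypothetical relative values of classroom activities for one child
classroom_values = {"recess": 0.9, "drawing": 0.6, "math worksheet": 0.2}

print(possible_reinforcers(classroom_values, "math worksheet"))
```

Here recess or drawing could follow the worksheet ("finish the worksheet, then you may draw"), while nothing in the list would reinforce recess, because no activity is valued more.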
Schedules of Reinforcement • Continuous vs. Intermittent • Ferster & Skinner (1957) • 4 basic schedules: • Fixed Ratio – reinforcer presented after a set number of responses • Variable Ratio – reinforcer presented after a varying # of responses • Fixed Interval – reinforcement contingent on 1st response emitted after a fixed time interval has elapsed • Variable Interval – reinforcement contingent on 1st response emitted after a varying time interval has elapsed
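The four basic schedules can be sketched as small rules that decide, for each response, whether a reinforcer is delivered. This is a minimal sketch under stated assumptions: the class names, the `respond(t)` interface, the uniform distributions used for the variable schedules, and the choice to restart an interval at the reinforced response are all illustrative, not taken from Ferster & Skinner.

```python
import random

class FixedRatio:
    """Reinforce every n-th response."""
    def __init__(self, n):
        self.n = n
        self.count = 0

    def respond(self, t):
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False

class VariableRatio:
    """Reinforce after a varying number of responses (mean of mean_n)."""
    def __init__(self, mean_n, rng):
        self.mean_n = mean_n
        self.rng = rng
        self.count = 0
        self.required = self._draw()

    def _draw(self):
        # Assumption: requirement drawn uniformly from 1 .. 2*mean_n - 1
        return self.rng.randint(1, 2 * self.mean_n - 1)

    def respond(self, t):
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._draw()
            return True
        return False

class FixedInterval:
    """Reinforce the first response once `interval` time units have elapsed."""
    def __init__(self, interval):
        self.interval = interval
        self.available_at = interval

    def respond(self, t):
        if t >= self.available_at:
            self.available_at = t + self.interval  # interval restarts here
            return True
        return False

class VariableInterval:
    """Reinforce the first response after a varying interval (mean of mean_interval)."""
    def __init__(self, mean_interval, rng):
        self.mean_interval = mean_interval
        self.rng = rng
        self.available_at = self._draw()

    def _draw(self):
        # Assumption: interval drawn uniformly from 0 .. 2*mean_interval
        return self.rng.uniform(0, 2 * self.mean_interval)

    def respond(self, t):
        if t >= self.available_at:
            self.available_at = t + self._draw()
            return True
        return False

# One response per time unit, t = 1 .. 12
fr = FixedRatio(3)
fi = FixedInterval(5)
print([t for t in range(1, 13) if fr.respond(t)])  # every 3rd response
print([t for t in range(1, 13) if fi.respond(t)])  # first response after each 5 units
```

Note the difference the sketch makes visible: on ratio schedules, responding faster earns reinforcers faster, while on interval schedules extra responses before the interval has elapsed earn nothing.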
Differential Reinforcement • Strengthen alternate behaviors to replace inappropriate behaviors • Shaping: reinforce one behavior, withhold reinforcement from another • D.R.I. (incompatible) Behaviors: if seeking to reduce out-of-seat behaviors, reinforce in-seat behaviors • D.R.A. (alternate) Behaviors: if seeking to reduce sleeping in class, reinforce academic work
Differential Reinforcement • D.R.O. (other) Behaviors: reinforcement for any behavior except for a certain response Ex: everyone who does not talk during movie gets a treat - not talking is being reinforced • D.R.L. (low rates of responding) Behaviors: reinforce 1st response after a time period in which no response has occurred Ex: reinforce question asking after a period of quiet working • D.R.H. (high rates of responses) Behaviors: reinforce only after large # of responses Ex: the more my dog begs, the greater likelihood of a reinforcer
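The D.R.O., D.R.L., and D.R.H. rules differ only in what must be true of the response record before a reinforcer is given. The sketch below expresses each as a check on a list of response times. The function names, the time units, and the choice to time the first D.R.L. gap from the session start are assumptions made for illustration.

```python
def dro_reinforce(response_times, start, end):
    """D.R.O.: reinforce at `end` if the target response did not occur in [start, end)."""
    return not any(start <= t < end for t in response_times)

def drl_reinforced_responses(response_times, min_gap, session_start=0.0):
    """D.R.L.: a response is reinforced only if at least `min_gap` time units
    have passed since the previous response (or since the session start)."""
    reinforced = []
    previous = session_start
    for t in sorted(response_times):
        if t - previous >= min_gap:
            reinforced.append(t)
        previous = t  # every response, reinforced or not, restarts the clock
    return reinforced

def drh_reinforce(response_times, start, end, min_count):
    """D.R.H.: reinforce if at least `min_count` responses occurred in [start, end)."""
    count = sum(1 for t in response_times if start <= t < end)
    return count >= min_count

# Hypothetical record: times (in minutes) at which a student asked a question
questions = [2, 12, 14, 30]
print(drl_reinforced_responses(questions, min_gap=10))
```

With a 10-minute requirement, only the questions that follow a quiet stretch are reinforced, which matches the slide's example of reinforcing question asking after a period of quiet working.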
Differential Reinforcement When using D.R. be sure to consider: • When choosing alternate/incompatible responses choose behaviors that the child regularly engages in • Consequences must be reinforcing • Continuous > Intermittent Schedule - baseline> increase intervals over time • Use with other reductive procedures
Antecedent Stimulus • Does not elicit a response; rather, sets the occasion for a response to be reinforced • Discriminative stimulus: an antecedent stimulus that influences the likelihood that a response will occur • Stimulus control: you are under stimulus control when a particular response is acceptable only under certain conditions - bell rings > class leaves
Antecedent Stimulus • Stimulus Generalization: when an organism responds to a range of stimuli; based on initial pairing • Generalization gradient: tendency for organisms to generalize more readily as stimuli become more similar to the discriminative stimulus
Antecedent Stimulus • Stimulus Discrimination • Learning when a response will and will not be reinforced • Process of learning the conditioned response R>S • Ex: child sees a baseball bat and calls it a bat: reinforced; child sees a bucket and calls it a bat: no reinforcement - If this is done, the child should eventually learn to discriminate between bat and bucket