Classical v. Operant Conditioning • Both classical and operant conditioning involve acquisition, extinction, spontaneous recovery, generalization, and discrimination. • Classical conditioning uses reflexive behavior – behavior that occurs as an automatic response to a stimulus that comes before the behavior. Ask: Is the behavior something the animal does NOT control? YES. Does the animal have a choice in how to behave? NO. → Classical conditioning. • Operant conditioning uses operant, or voluntary, behavior – behavior that is shaped by consequences that come after the behavior. Ask: Is the behavior something the animal can control? YES. Does the animal have a choice in how to behave? YES. → Operant conditioning.
What is it? • Operant conditioning is a type of learning whereby the consequences of a behavior determine the likelihood that it will be performed again in the future. • Operant conditioning theory proposes that an organism will tend to repeat a behavior (an 'operant') that has desirable consequences (such as receiving a treat), or that enables it to avoid undesirable consequences (such as being given detention). • Furthermore, organisms will tend not to repeat a behavior that has undesirable consequences (such as disapproval or a fine).
In Summary • A learning process in which the likelihood of a particular behavior occurring is determined by the consequences of that behavior. • The frequency will increase if the consequence is reinforcing to the subject. • The frequency will decrease if the consequence is punishing to the subject, or if reinforcement stops.
B.F. Skinner (1904-1990) • Believed that internal factors like thoughts, emotions, and beliefs could not be used to explain behavior; instead, he argued that new behaviors are actively emitted by the organism • Looked at 'operants', active behaviors that operate on the environment to generate consequences • Developed the fundamental principles and techniques of operant conditioning and devised ways to apply them in the real world • Designed the Skinner box, or operant chamber
The Skinner Box https://www.simplypsychology.org/operant-conditioning.html
Operant conditioning as a three-phase model • Skinner believed that virtually all behaviour can be analysed and explained by the relationship between the behaviour, its antecedents (what happens just before it) and its consequences (what happens just after it). • The three-way relationship between these elements and the order in which they occur is called the three-phase model of operant conditioning.
The ABC Model • The three-phase model of operant conditioning has three parts that occur in a specific sequence: • the antecedent (A), a stimulus that occurs before the behaviour; the term 'discriminative stimulus' also describes the antecedent (the condition that influences behaviour by predicting the likely outcome of a behaviour) • the behaviour (B) that occurs due to the antecedent • the consequence (C) of the behaviour, an environmental event that occurs immediately after the behaviour and has an effect on the occurrence of the behaviour. • Skinner argued that any behaviour which is followed by a consequence will change in strength (become more, or less, established) and frequency (occur more, or less, often) depending on the nature of that consequence (reward or punishment).
This is usually expressed as antecedent (A) → behavior (B) → consequence (C) and is therefore sometimes called the A-B-C model of operant conditioning. EXAMPLE: Consider a car stopped at a red traffic light at a busy intersection. When the traffic light turns green, the car is driven through the intersection. In this situation, the green traffic light is the antecedent stimulus that prompts the behaviour of gently pressing on the accelerator for the known, likely and desirable consequence of safely travelling across the intersection.
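To make the A → B → C sequence concrete, here is a minimal Python sketch (an illustration added to this summary, not part of the original slides) that records the traffic-light example as one episode; the Episode class and its field names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Episode:
    """One A-B-C episode of operant conditioning (illustrative, not standard code)."""
    antecedent: str   # discriminative stimulus that occurs before the behaviour
    behaviour: str    # the voluntary response the organism emits
    consequence: str  # environmental event that follows the behaviour

# The traffic-light example from the slide, expressed as one episode:
crossing = Episode(
    antecedent="traffic light turns green",
    behaviour="gently press the accelerator",
    consequence="safely travel across the intersection (reinforcing)",
)

print(f"A: {crossing.antecedent} -> B: {crossing.behaviour} -> C: {crossing.consequence}")
```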
Skinner's Air Crib: A room fit for a… Baby!
Reinforcement/Punishment • Reinforcement - Any consequence that increases the likelihood of the behavior it follows • Reinforcement is ALWAYS GOOD!!! • Punishment - Any consequence that decreases the likelihood of the behavior it follows • The subject determines if a consequence is reinforcing or punishing
Positive Reinforcement • Strengthens a response by presenting a stimulus that you like after a response • Anything that increases the likelihood of a behavior by following it with a desirable event or state • The subject receives something they want (something is added) • Will strengthen the behavior
Negative Reinforcement • Strengthens a response by reducing or removing an aversive (disliked) stimulus • Anything that increases the likelihood of a behavior by following it with the removal of an undesirable event or state • Something the subject doesn't like is removed (subtracted) • Will strengthen the behavior • Negative reinforcement allows you to either: escape something you don't like that is already present (negative reinforcement by escape), or avoid something before it occurs (negative reinforcement by avoidance)
Positive/Negative Reinforcement BOTH ARE GOOD THINGS!!!
Types of Punishment • Punishment is the delivery of an unpleasant consequence following a response, or the removal of a pleasant consequence following a response • An undesirable event following a behavior • Its effect is opposite that of reinforcement – it decreases the frequency of behavior
Positive Punishment (Punishment by Application) • Something you do NOT like is added to the environment. • Examples: a verbal reprimand or something painful like a spanking; an electric shock for a rat in a Skinner box; having to run extra laps around a basketball court for being late to training; being given extra chores at home for doing something wrong. All of these involve positive punishment.
Negative Punishment (Punishment by Removal) • Something is taken away that you DO LIKE. • taking food away from a hungry rat • not being allowed to join basketball training because you are late • your parents taking away your internet access for doing something wrong
Response Cost • Negative punishment involves the removal or loss of a stimulus, which weakens the likelihood of a response occurring again: 'do something wrong, get something taken away' (e.g. losing your mobile phone as a punishment). • The response cost is the thing that is lost (e.g. the mobile phone). • Response cost = the removal of any valued stimulus, whether or not that stimulus caused the behavior. There is a 'cost' for making a 'response'.
E.g. for a speeding fine, your money (a valued stimulus) is taken away from you. In addition, the stimulus of money was unlikely to have been the reason (or 'cause') for your speeding! So a speeding fine is considered to be a response cost, but also negative punishment, as something of value has been taken away. It is a form of punishment because it decreases the likelihood of a behavior occurring. Response cost does not necessarily involve something of monetary value. The loss of a grade or two for late submission of school work is a response cost that can decrease the likelihood of lateness in the future. Similarly, making a rude comment during a conversation might result in the loss of a smile; this would be the response cost if the smile is a valued stimulus.
Reinforcement vs. Punishment
• Reinforcing/desirable stimulus presented or added to the animal's environment → Positive (+) Reinforcement: add something you DO LIKE → behavior increases
• Aversive/undesirable stimulus presented or added to the animal's environment → Positive (+) Punishment: add something you DO NOT LIKE → behavior decreases
• Reinforcing/desirable stimulus removed or taken away from the animal's environment → Negative (-) Punishment: take away something you DO LIKE → behavior decreases
• Aversive/undesirable stimulus removed or taken away from the animal's environment → Negative (-) Reinforcement: take away something you DO NOT LIKE → behavior increases
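The grid above can also be read as two yes/no questions about the consequence: is a stimulus added or removed, and does the subject find it desirable or aversive? The short Python sketch below (my own illustration, not from the slides) encodes that decision; the function name and parameters are hypothetical.

```python
def classify_consequence(stimulus_added: bool, stimulus_desirable: bool) -> str:
    """Label a consequence using the reinforcement/punishment grid.

    stimulus_added:     True if the stimulus is presented/added, False if removed.
    stimulus_desirable: True if the subject likes the stimulus, False if it is aversive.
    """
    if stimulus_added and stimulus_desirable:
        return "positive reinforcement (behavior increases)"
    if stimulus_added and not stimulus_desirable:
        return "positive punishment (behavior decreases)"
    if not stimulus_added and stimulus_desirable:
        return "negative punishment (behavior decreases)"
    return "negative reinforcement (behavior increases)"

# Example: an aversive chore is taken away after the desired behavior occurs.
print(classify_consequence(stimulus_added=False, stimulus_desirable=False))
# -> negative reinforcement (behavior increases)
```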
The Good Effects of Punishment • Punishment can effectively control certain behaviors if… • It comes immediately after the undesired behavior • It is consistent and not occasional • Especially useful if teaching a child not to do a dangerous behavior • Most still suggest reinforcing an incompatible behavior rather than using punishment
Bad Effects of Punishment • Does not teach or promote alternative, acceptable behavior. • Only tells what NOT to do while reinforcement tells what to do. • Doesn’t prevent the undesirable behavior when away from the punisher in a “safe setting” • Can lead to fear of the punisher, anxiety, and lower self-esteem • Children who are punished physically may learn to use aggression as a means to solve problems.
How is Negative Reinforcement different from Punishment? • Negative reinforcement will always increase a behavior • Punishment will always decrease a behavior • Negative reinforcement is something YOU DO to take away something bad. • Punishment is something DONE TO YOU that is bad and makes you stop doing a behavior.
Stimulus Generalization • In operant conditioning, stimulus generalization occurs when the correct response is made to another stimulus that is similar (but not necessarily identical) to the stimulus that was present when the conditioned response was reinforced. • In everyday life, we frequently generalize our responses from one stimulus to another, e.g. our generalizations from past experiences with people, events and situations influence many of our likes and dislikes of new people, events and situations.
Stimulus Discrimination • In operant conditioning, stimulus discrimination occurs when an organism makes the correct response to a stimulus and is reinforced, but does not respond to any other stimulus, even when the stimuli are similar (but not identical), e.g. sniffer dogs used by police, customs and border protection officers to find hidden drugs, explosives and other illegal goods.
Extinction • In operant conditioning, the loss of a conditioned behavior when consequences no longer follow it. • The subject no longer responds since the reinforcement or punishment has stopped. • Can you think of an example?
Spontaneous Recovery • As in classical conditioning, extinction is often not permanent in operant conditioning. After the apparent extinction of a conditioned response, spontaneous recovery can occur and the organism will once again show the response in the absence of any reinforcement. The response is likely to be weaker and will probably not last very long. A spontaneously recovered response is often stronger when it occurs after a lengthy period following extinction of the response than when it occurs relatively soon after extinction.
Parts of Operant Conditioning
• Discriminative stimulus – the specific environmental stimulus (e.g. gas gauge on empty; wallet on the sidewalk)
• Operant response – the voluntary behavior (e.g. fill the car with gas; give the wallet to security)
• Consequence – the event that will make the operant response more or less likely to reoccur (e.g. avoid running out of gas; get a $50 reward)
• Effect on future behavior – if reinforcement, the response is more likely to reoccur; if punishment, less likely to reoccur
Shaping • Reinforcement of behaviors that are more and more similar to the one you want to occur • Technique used to establish a new behavior
Shaping Principles • Skinner box – a soundproof box with a bar that an animal presses or pecks to release a food or water reward, and a device that records these responses. • Shaping – a procedure in which rewards, such as food, gradually guide an animal's behavior toward a desired behavior. • Successive approximations – a shaping method in which you reward responses that are ever closer to the final desired behavior and ignore all other responses (see the sketch below). • Shaping nonverbal animals can show what they perceive: an animal can be trained to discriminate between classes of events or objects. • After being trained to discriminate between flowers, people, cars, and chairs, a pigeon can usually identify in which of these categories a new pictured object belongs.
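As a rough illustration of successive approximations (added here, not taken from the slides), the Python sketch below rewards only responses that exceed the best level reached so far and ignores everything else, so the reinforced behaviour creeps toward the target; the numbers and function name are arbitrary.

```python
import random

def shape_response(target: float = 10.0, trials: int = 200, seed: int = 0) -> float:
    """Toy shaping loop: reinforce ever-closer approximations to the target."""
    rng = random.Random(seed)
    criterion = 0.0  # the response must exceed this level to earn a reward

    for _ in range(trials):
        # The animal's response varies around what it has already learned to do.
        response = criterion + rng.gauss(0.0, 1.0)
        if response > criterion:
            # Closer to the target than before: reinforce, and raise the bar.
            criterion = min(response, target)
        # Otherwise: no reward; the response is ignored.

    return criterion

print(f"Behavior shaped up to {shape_response():.1f} (target 10.0)")
```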
In a demonstration for LOOK magazine, Skinner shaped the behaviour of a dog named Agnes. He attached horizontal stripes to the wall, which he used to gauge the dog's responses of lifting its head higher and higher. Then he simply set about shaping a jumping response by flashing the strobe (and simultaneously taking a picture), followed by giving a meat treat, each time the dog satisfied the criterion for reinforcement. The result of this process appeared in LOOK magazine as pictures taken at different points in the shaping process. Within 20 minutes, Skinner had Agnes "running up the wall".
For the second shaping demonstration, Skinner trained Agnes to press the pedal and pop the top on the wastebasket. Again, the photographer's flash served as the conditioned reinforcer, and each step in the process was photographed.
Continuous reinforcement • A schedule of reinforcement in which a reward follows every correct response • Learning occurs rapidly • But the behavior will extinguish quickly once the reinforcement stops. • Once that reliable candy machine eats your money twice in a row, you stop putting money into it.
Partial Reinforcement • A schedule of reinforcement in which a reward follows only some correct responses • Learning of the behavior will take longer • But the behavior will be more resistant to extinction • Includes the following types: • Fixed-interval and variable-interval • Fixed-ratio and variable-ratio
Fixed-Ratio Schedule • A partial reinforcement schedule that rewards a response only after a set number of correct responses • The faster the subject responds, the more reinforcements they will receive. • E.g. piece work: you get $5 for every 10 widgets you make.
Variable-Ratio Schedule • A partial reinforcement schedule that rewards after an unpredictable number of correct responses, varying around an average • Produces high rates of responding with little pause, in order to increase the chances of getting reinforcement • This schedule is very resistant to extinction. • Vegas rules! Sometimes called the "gambler's schedule"; similar to a slot machine or fishing
Fixed-Interval Schedule • A partial reinforcement schedule that rewards only the first correct response after a set period of time • Produces few responses at first, with responding increasing as the time of reinforcement approaches • The "procrastinator schedule" • Examples: a known weekly quiz in a class; checking cookies after the 10-minute baking period.
Variable-Interval Schedule • A partial reinforcement schedule that rewards the first correct response after an unpredictable amount of time • Produces slow and steady responding • Example: a "pop" quiz in a class • "Are we there yet?" – ask all you want; it doesn't make the reinforcement of arriving happen any sooner • The four schedules are simulated in the sketch below.
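The sketch below (an illustration written for this summary, not from the slides) simulates all four partial reinforcement schedules with one correct response per second for ten minutes; the class names and the FR-10/VI-60s parameters are arbitrary choices.

```python
import random

rng = random.Random(0)

class FixedRatio:
    """Reward every n-th correct response (e.g. $5 per 10 widgets)."""
    def __init__(self, n): self.n, self.count = n, 0
    def respond(self, _t):
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False

class VariableRatio:
    """Reward after an unpredictable number of responses averaging about n (slot machine)."""
    def __init__(self, n): self.n, self.count, self.target = n, 0, rng.randint(1, 2 * n)
    def respond(self, _t):
        self.count += 1
        if self.count >= self.target:
            self.count, self.target = 0, rng.randint(1, 2 * self.n)
            return True
        return False

class FixedInterval:
    """Reward the first response after a set time has elapsed (known weekly quiz)."""
    def __init__(self, secs): self.secs, self.last = secs, 0.0
    def respond(self, t):
        if t - self.last >= self.secs:
            self.last = t
            return True
        return False

class VariableInterval:
    """Reward the first response after an unpredictable time averaging secs (pop quiz)."""
    def __init__(self, secs): self.secs, self.last, self.wait = secs, 0.0, rng.expovariate(1 / secs)
    def respond(self, t):
        if t - self.last >= self.wait:
            self.last, self.wait = t, rng.expovariate(1 / self.secs)
            return True
        return False

# One correct response per second for 10 minutes; count the rewards delivered.
schedules = {"FR-10": FixedRatio(10), "VR-10": VariableRatio(10),
             "FI-60s": FixedInterval(60), "VI-60s": VariableInterval(60)}
for name, schedule in schedules.items():
    rewards = sum(schedule.respond(float(t)) for t in range(600))
    print(f"{name}: {rewards} rewards over 600 responses")
```

The simulation only shows when rewards would be delivered for a fixed stream of responses; it does not model response rate, which is where the behavioural differences between the schedules (fast ratio responding, post-reinforcement pauses, resistance to extinction) actually show up.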