Comparing Training Methods Evidence for and against the advantages of clicker training
Clicker training is misunderstood and misused • People assume that they can just use the clicker • They do not realize that the click has to be conditioned: it is a conditioned reinforcer, not a primary reinforcer • A cue for the response has to be added • The clicker marks the behavior; it does not elicit it! • Fade the clicker as the behavior becomes fluent • The marker is no longer needed • In a chain, a cue becomes a conditioned reinforcer
Cueing Basics: • A fluent cue response is: • Precise: animal performs the behavior exactly as you had envisioned • Performed with low latency • Performed with optimal speed • Resistant to distraction • Performed at any distance from the handler • Performed for the duration required by the handler
Example of a chain: • Each cue serves as: • A reinforcer for the prior behavior • A cue for the next behavior • A cue to keep going
Note that with a behavior chain • Several responses are emitted before a click/treat (C/T) is given • HOW the behaviors are taught is what is different • Shape each step • Often backward chain: start from the end and work backward • Work towards • Accuracy • Fluency
Very few experimental reports on the value of animal training • People just assume it works • Lots of incidental evidence • Does clicker training work? Is it better? • Let’s look at 2 studies
Thorn et al. (2006): Training shelter dogs • Shelters are high-stress environments • Dogs receive very little training • Lots of inappropriate behavior is inadvertently reinforced • Staff have little time to train • Need for quick but effective training programs • Teach basic manners • Sit is the most basic
Three experiments • Experiment 1: • 3 researchers: handler, timer, “stranger” • Approach dog: record time to sit/start sit • Used verbal “good dog” and a treat • Experiment 2: • Compared verbal vs. clicker • Verbal: waited, said “good dog” and treat • Clicker: waited, clicked, treat • 2 days of training for each: look at speed of training and retention • Experiment 3: • Compared contingent vs. noncontingent reward • 3 days of 15-minute training with food/verbal reward for sit • 4 post-training conditions: • Same trainer, same room • Different trainer, same room • Same trainer, different room • Different trainer, different room
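The four post-training conditions in Experiment 3 amount to crossing two factors (trainer and room) with two levels each. A minimal Python sketch of that layout follows; the variable names are illustrative and do not come from Thorn et al.

```python
from itertools import product

# Two factors with two levels each, as listed on the slide.
# The names `trainers` and `rooms` are illustrative assumptions.
trainers = ["same trainer", "different trainer"]
rooms = ["same room", "different room"]

# Crossing the two factors yields the four post-training test conditions.
conditions = [f"{trainer}, {room}" for trainer, room in product(trainers, rooms)]

for condition in conditions:
    print(condition)
```

Laying the conditions out this way makes it plain that a change of trainer and a change of room are each tested separately and together.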
Results • Latency to sit significantly decreased. • All dogs were able to sit within 60 sec
Experiment 1, cont’d • Mean latency in the second session was less than ½ that of the first session.
Experiment 2 • Day 1 of training: No difference between clicker and verbal training • Significant decrease in latency to sit • Day 2: verbal training appeared better than clicker training • dogs showed better retention • Dogs showed lower latency
Experiment 3 • No differences between groups on first day • Dogs in contingent trials sat significantly longer, and this increased across trials • There was a day × treatment interaction • Dogs in noncontingent condition sat less • Dogs in contingent condition sat more
Experiment 3 • No differences in generalization tests • No difference across four test conditions for the dogs in the contingent reward condition • Shows that the dogs generalized the “sit” across new settings and new trainers
Summary: • Data show that minimal training in shelter (10-15 minutes over 2 days) works! • Even with novice trainers • Dogs quickly learned sit • Able to retain new response • Why did verbal reward work better than the clicker? • Limited supply of verbal reward in shelter, so more valuable? • Negative association or no conditioning to clicker • Incomplete conditioning of dogs to clicker • Other factors?
Summary: • Shelter implemented new training policy: • All staff required to have dog sit when moving dog, feeding dog, interacting with the dog • Dog exposed to continuous training across settings • Found that other behaviors also affected: • Decreased inappropriate responses such as barking, stress responses, jumping on cage • Better response to the dogs; increased adoption probability
Ferguson and Rosales-Ruiz (2001) • Trailer loading = critical horse behavior • Horses often do not like trailers • Small, dark, confined • Aversive methods often (usually) used • Too much negative reinforcement and punishment, which often escalates to increasingly aversive treatment
Horse Loading Behaviors • Back up • Move Forward • Turn left/right • Step up • Loading problems = leading problems • Horse not going where led • Balks, turns away, etc.
Method • 5 horses with poor loading history • 2-horse straight load step-up trailer • Butt chain instead of butt bar • Side windows and rear doors left open • White inside and outside • Railroad tie used as extension of trailer deck • Target: Red pot holder • Reinforcers = typical horse treats • 15 min training sessions
Method • Baseline compared to training • Loading behavior chain: horse approaches trailer: • Within 3 meters (about 10 feet) • Within 1.5 meters • With head/neck in trailer • With front legs in trailer • With ½ body in trailer • With 3 legs in trailer • With 4 legs in trailer; less than 5 sec • 4 legs in, allowed butt chain to be fastened, door to be closed.
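Because the results refer to these approximations by number (e.g., "step 4", "step 6"), it can help to keep the chain as an ordered list. The sketch below simply transcribes the eight steps from the slide; the names `LOADING_STEPS` and `describe_step` are hypothetical helpers, not part of the study.

```python
# The eight approximations from the slide, in order (list index 0 = step 1).
# LOADING_STEPS and describe_step are hypothetical names for illustration.
LOADING_STEPS = [
    "within 3 meters of trailer",
    "within 1.5 meters of trailer",
    "head/neck in trailer",
    "front legs in trailer",
    "half of body in trailer",
    "3 legs in trailer",
    "4 legs in trailer, less than 5 sec",
    "4 legs in, butt chain fastened, door closed",
]


def describe_step(step: int) -> str:
    """Return the description for a 1-based step number in the chain."""
    if not 1 <= step <= len(LOADING_STEPS):
        raise ValueError(f"step must be between 1 and {len(LOADING_STEPS)}")
    return LOADING_STEPS[step - 1]


print(describe_step(4))
```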
Behaviors Recorded: • Inappropriate responses/stress responses: • Amount of horse in trailer (using 8 step chain) • Freezing • Head toss • Standing • Turning • Loading: • Getting into trailer (less than 5 sec) • Loading and staying in trailer • Number of prompts • New leads (re-approaches) • Latency to respond to cue: • Within 5 sec • Greater than 5 sec • No response • Also obtained interobserver agreement (IOA)
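The slide does not say how interobserver agreement was calculated. One common approach is point-by-point percentage agreement: the number of observations on which two observers agree, divided by the total number of observations, times 100. The sketch below assumes that formula; the function name and the sample records are hypothetical.

```python
def percent_agreement(observer_a, observer_b):
    """Point-by-point agreement between two observers, as a percentage.

    Assumes both observers scored the same observations in the same order.
    This is one common IOA formula, not necessarily the one used in the study.
    """
    if len(observer_a) != len(observer_b):
        raise ValueError("both observers must score the same number of observations")
    if not observer_a:
        raise ValueError("at least one observation is required")
    agreements = sum(a == b for a, b in zip(observer_a, observer_b))
    return 100.0 * agreements / len(observer_a)


# Hypothetical records for five observation intervals (not data from the study).
records_a = ["standing", "turning", "none", "head toss", "none"]
records_b = ["standing", "turning", "none", "none", "none"]

print(percent_agreement(records_a, records_b))  # observers agree on 4 of 5 intervals
```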
Procedure: • Baseline: 1 day of repeated 5 min baselines • Target training: 2 days; 20-30 trials/day • Touch target • Criteria: 80% of prompts • Trailer training: • Trials to touch (just inside trailer) • Upon entry, lead back to start and another trial • Started at each horse’s baseline distance • Added then faded trailer extension • Trained to load on left/right sides • Added limited hold with Fancy: gave several steps to move forward • Multiple baseline design across horses
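The 80% target-training criterion can be checked with simple arithmetic. The sketch below assumes the criterion means the horse touched the target on at least 80% of the prompts in a session; the function name and the example counts are hypothetical.

```python
def meets_criterion(touches: int, prompts: int, criterion_percent: int = 80) -> bool:
    """True if the share of prompts followed by a target touch meets the criterion.

    Assumes the criterion is "at least criterion_percent of prompts".
    Integer arithmetic avoids floating-point rounding at the boundary.
    """
    if prompts <= 0:
        raise ValueError("prompts must be positive")
    if not 0 <= touches <= prompts:
        raise ValueError("touches must be between 0 and prompts")
    return touches * 100 >= criterion_percent * prompts


# Hypothetical session counts (not data from the study).
print(meets_criterion(24, 30))  # 24 of 30 prompts is exactly 80%
print(meets_criterion(15, 25))  # 15 of 25 prompts is 60%
```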
Results • All horses learned to target during first training session • First session: average of about 60% accuracy • Second session: average of 80% • Sammy took 3 sessions, but reached 90%
Red’s data: • Red was shaped quickly: • reached criterion at each step of the chain before moving on to the next • Change in setting disrupted behavior, but Red recovered criterion performance quickly
All horses able to be loaded • Baseline: no horse able to get beyond step 4 • After initial target training: • Red: step 6, some 7 • Penny/Shadow: steps 5,6 mastered; some 7,8 • Sammy: steps 3 to 5 mastered; some 6-8 • Fancy: 5 to 6 mastered • When the extension was added: all but Fancy reached criterion performance • Fancy outwitted researchers: could stretch to touch target even with extension • Had to add the limited hold condition to outwit him.
Combined horse data • All horses maintained loading behaviors when extension removed • Loading left and right and new trailers produced some disruption but quickly recovered • All reached 90%
Inappropriate Responses • Most common: • Standing • Turning • Head toss • Immediately decreased with training • Note: not targeted • Suggests these are stress responses
Leads and Prompts • During baseline: Few leads and LOTS of prompts • During training: • Fewer prompts • Leads were about 1:1 with prompts
Summary • Why is clicker training better? • Faster responding with fewer disruptive responses • Fewer avoidance responses • Happier horses and trainers • Changes in loading procedures (e.g., left vs. right position, new trailer) easily dealt with and overcome
Summary • Target training decreased inappropriate responses secondary to increasing trailer loading • Target training established stimulus control • This allowed stimulus control during situations which usually elicited problem behaviors • Horses so busy focusing on target that they ignored poisoned cues
Conclusions • Did the Thorn study show that clicker training was not as good, or that in limited use with naïve users it was less effective? • Did Thorn, et al., measure inappropriate responses or the effects of the two training methods? • Which do YOU think is more effective, and why?