Valued Evidence: Some Issues in the Evaluation of Behavioral Practices
Philip N. Chase, Cambridge Center for Behavioral Studies
Acknowledgements:
• Dan Bernstein, University of Kansas
• Chapter to appear in Greg Madden (Ed.), APA Handbook of Behavior Analysis
• Former students: Chata Dickson, Vennessa Walker, Harold Lobo, Andrew Lightner, and Bob Collins
Been There Before!
Personalized System of Instruction
The Effectiveness of a Lecture…
Beyond the Methods
• It is not just a matter of good methods for gathering evidence
• The evidence has to matter
Behavior analytic solutions are only as good as their methods of gathering evidence, but both the methods and the evidence itself are only as good as their importance to the culture.
Experimental Analysis
Criteria for a good experiment:
• Reliability. Can the results be reproduced?
• Internal validity. Have plausible alternative reasons for the results been eliminated?
• External validity. Can the treatment produce the results under other conditions?
• Comparison. Is there a difference between this condition and that condition?
The Gold Standard: Randomized Controlled Trials
Randomized Controlled Experiments
• The variables that might threaten internal and external validity are assumed to be randomly distributed across the students in the study
• Dividing into groups allows for comparison
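The random-assignment logic above can be sketched in a few lines. This is a minimal illustration, not any specific trial's procedure; the roster and group names are invented for the example.

```python
import random

def randomly_assign(students, n_groups=2, seed=None):
    """Shuffle the roster and deal students into groups round-robin.

    Random assignment is what licenses the assumption that nuisance
    variables (prior knowledge, motivation, etc.) are distributed
    evenly across groups, making between-group comparison meaningful.
    """
    rng = random.Random(seed)
    shuffled = list(students)
    rng.shuffle(shuffled)
    return [shuffled[i::n_groups] for i in range(n_groups)]

# Hypothetical roster of 20 students (names are placeholders).
roster = [f"student_{i}" for i in range(20)]
treatment, control = randomly_assign(roster, n_groups=2, seed=42)
```

Note that the seed is only fixed here so the example is reproducible; in an actual trial the assignment would not be chosen to produce any particular split.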
Single-Subject Experiments
• Reversals
• Multiple baselines
• Changing-criterion designs
Issues
• Populations versus Individual Behavior
• Priority to Internal Validity
• Impracticality
Priority of Internal Validity
• Without internal validity, you cannot address external validity
• Establish precise control over the variable of interest
• And then examine questions of generality
Typical Tests and Other DVs
• Defining terms and concepts
• Naming scientists and their findings
• Identifying simple examples
• Selecting from multiple choices
Tests and Other DVs Needed
• Critical thinking
• Creative thinking
• Quantitative literacy
• Teamwork
• Problem solving
• Civic engagement
• Intercultural competence
• Ethical reasoning
External Validity Problems
• Seek practices that work successfully for the culture
• If the successes achieved are not valued by the culture, they will not survive
• The collection of evidence congruent with the current educational values and goals of the culture is needed
Practicality
• Student Mobility
• Scheduling
• Treatment Integrity
• Etc.
Markle (1967)
• Developmental Testing
• Validation Testing
• Field Testing
Developmental Testing
• Intensive single-subject design
• Convenient student(s)
• Frequent interaction: queries about the material
• Frequent evaluation: quizzing on what the student can say and do
Developmental Testing Answers Questions About:
• Communication Problems
• Motivation Problems
• Learning Problems
• Combination Problems
Implementation: iPASS
Individual Prescription for Achieving State Standards
iPASS (last time I looked)
• Grades 1-7
• 20 Units
• 142 Chapters
• 423 Lessons
• Lots of practice, lots of cumulative reviews
• Excellent data management/decisions for the individual student
[Figure: Percent correct on Unit Challenge and Mastery Tests across sessions (panels labeled MIL and BER)]
To evaluate motivation, we examine:
• Student, teacher, and parent satisfaction surveys
• Cumulative number of hours logged in over time
• Cumulative number of units completed over time
• Preference between two forms of instruction
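The cumulative measures above amount to running totals over session logs. A minimal sketch, with invented numbers standing in for a real student's log: a steep, steady cumulative curve suggests sustained engagement, while a flattening curve flags a motivation problem worth investigating.

```python
from itertools import accumulate

def cumulative_hours(session_hours):
    """Running total of hours logged, one value per session.

    The slope of this series over time is the motivation indicator:
    flat stretches (sessions with zero hours) show up as plateaus.
    """
    return list(accumulate(session_hours))

# Hypothetical log: hours per session for one student (invented data).
log = [0.5, 1.0, 0.75, 0.0, 1.25]
totals = cumulative_hours(log)
```

The same function applies unchanged to cumulative units completed; only the input series differs.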
Criticism of Developmental Testing
• Relies on subjective judgment: the "clinical" skills of the designer or evaluator
• What's the big deal?
The Real Problems
• Could the results be an artifact of something else?
• Selection biases
• Testing biases
• Simple test-retest effects
• Do the results have generality?
Validation (Beta) Testing
• Demonstration phase
• Does the curriculum meet its own goals under controlled circumstances?
• Representative students
• A good place for an experiment
Miller and Weaver (1976)
• Multiple Baseline Achievement Test
• An experimental design similar to a multiple baseline across behaviors
[Figure: Hypothetical data. Multiple-baseline graph with panels for Unit 1, Unit 2, and Unit 3 test items; y-axis: percent correct; phases: baseline (BL) and iPASS instruction; probes drawn from achievement tests]
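The logic of the multiple-baseline design can be sketched with idealized data. All numbers here are invented for illustration: each unit's test scores stay at a low baseline level and jump only after instruction is introduced, with the introduction staggered across units.

```python
def multiple_baseline(n_sessions, starts, baseline=30, treated=90):
    """Generate idealized percent-correct series for several baselines.

    starts[i] is the session at which instruction begins for unit i.
    Scores jump from the baseline level to the treated level only once
    instruction starts; the staggered jumps are the evidence that the
    instruction, not time or repeated testing, produced the change.
    """
    return [
        [baseline if s < start else treated for s in range(n_sessions)]
        for start in starts
    ]

# Hypothetical: three units, instruction introduced at sessions 3, 6, 9.
units = multiple_baseline(12, starts=[3, 6, 9])
```

Real data would of course be noisier; the point of the sketch is the staggered structure, which is what the design's inferential power rests on.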
Addressing Internal Validity
• Practical design
• Continuous assessments help to handle many threats to internal validity
Examples of Threats
• History and maturation threats: Multiple-baseline designs make sure that it is not the repeated measures themselves that affect performance.
• Differential selection and testing: No problem! Subjects serve as their own controls.
Threats Not Addressed
• Selection biases, however, might occur if we do not have representative students
• Have we only evaluated students who are prepared to do well?
• Have we assured that attrition has not affected whether the students are representative?
In addition, we have not necessarily taken care of all testing biases:
• Have we assured that the tests used are not biased toward the curriculum?
• Have independent tests, for example achievement tests developed and validated by others, been used?
• Have we used tests that assess outcomes that are important to the culture?
Problems of External Validity
• Evaluate the curriculum with:
• Different Students
• Different Schools
• Different Tests, particularly tests valued by the culture
• Markle's Field Testing
• Utilization Phase: tests of generality