Overview of New State Data Forensics Analysis March 2011
Outline of Presentation • Data Forensics (DF) Process • FDOE DF Goals • DF Tools & Methods • Spring 2011 DF Program • Conservative thresholds • Students & Schools • Summary • Q&A
Caveon Data Forensics™ Process • Analyses of test data • First building a “model” of typical question responses • Identify unusual behaviors with potential of unfair advantage
Caveon Data Forensics Process (continued) • Examples of “Unusual” Behavior • Very high agreement among pairs or groups of test takers • Very unusual number of erasures, particularly wrong to right • Very substantial gains or losses from one occasion to another
Overview of the Use of Data Forensics • Many high-stakes testing programs now using Data Forensics • Standards for testing, e.g., “CCSSO’s Operational Best Practices for State Assessment Programs” • Essential to act on the results
FDOE Data Forensics Goals • Uphold fairness and validity of test results • Identify risks and irregularities • Take action based on data and analysis • “Measure and Manage” • Communicate zero tolerance for cheating
Testing Examiner’s Role • Ensure (and then certify) the test administration is fair and proper • Declare scores invalid when fairness and validity are negatively impacted • Absolute due diligence when proctoring a test • Administering or proctoring a test is not a passive activity!
Forensic Tools and Methods • Similarity: answer-copying, collusion • Erasures: tampering • Gains: pre-knowledge, coaching • Aberrance: tampering, exposure • Identical tests: collusion • Perfect tests: answer key loss
Similarity • Our most powerful and “credible” statistic • Measures the degree of similarity between two or more test instances • Analyzes each test instance against all other test instances in the test • Probable causes of extremely high similarity: • Answer copying • Test coaching • Proxy test taking • Collusion
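As a rough illustration of comparing each test instance against all others, the sketch below counts identical answers and identical wrong answers for every pair. It is not Caveon's statistic; the function name `pairwise_matches`, the student ids, and the answer strings are hypothetical, and a real analysis would convert such counts into a probability under a model of typical responses.

```python
from itertools import combinations

def pairwise_matches(responses, key):
    """Count matching answers for every pair of test instances.

    responses: dict mapping a (hypothetical) student id to a string of answers
    key: string holding the correct answer for each item
    Returns {(id_a, id_b): (identical_answers, identical_wrong_answers)}.
    """
    results = {}
    for (id_a, ans_a), (id_b, ans_b) in combinations(sorted(responses.items()), 2):
        # Items where both students chose the same option.
        same = sum(a == b for a, b in zip(ans_a, ans_b))
        # Items where both chose the same option and that option is wrong.
        same_wrong = sum(a == b != k for a, b, k in zip(ans_a, ans_b, key))
        results[(id_a, id_b)] = (same, same_wrong)
    return results

# Made-up data: s1 and s2 share the same three wrong answers; s3 is all correct.
key = "ABCDABCDAB"
responses = {
    "s1": "ABCCABDDAA",
    "s2": "ABCCABDDAA",
    "s3": "ABCDABCDAB",
}
matches = pairwise_matches(responses, key)
```

Identical wrong answers are counted separately because two students who both know the material will agree on correct answers anyway; agreement on the same incorrect options is much harder to explain by ability alone.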
Erasures • Based on estimated answer-changing rates for: • Wrong-to-Right (WtR) • Anything-to-Wrong • Find answer sheets with an unusual number of WtR answer changes • Extreme statistical outliers could involve tampering, “panic cheating”, etc.
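One simple way to picture "unusual WtR" answer sheets is a z-score against the group's typical erasure count. This is only a sketch under assumptions: the counts are invented, `flag_wtr_outliers` is a hypothetical helper, and the cutoff of 2.5 is chosen for the tiny example, far looser than the conservative thresholds described in this presentation.

```python
from statistics import mean, pstdev

def flag_wtr_outliers(wtr_counts, z_cutoff=2.5):
    """Return ids whose wrong-to-right erasure count is far above the group mean.

    wtr_counts: dict mapping a (hypothetical) answer-sheet id to its WtR count
    z_cutoff: assumed cutoff in standard deviations, for illustration only
    """
    values = list(wtr_counts.values())
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation of the group
    if sigma == 0:
        return []  # every sheet has the same count, so nothing stands out
    return sorted(
        sheet_id
        for sheet_id, count in wtr_counts.items()
        if (count - mu) / sigma > z_cutoff
    )

# Made-up counts: most sheets have 0 to 2 WtR erasures, one sheet has 15.
wtr_counts = {"a": 1, "b": 0, "c": 2, "d": 1, "e": 1,
              "f": 0, "g": 2, "h": 1, "i": 1, "j": 15}
flagged = flag_wtr_outliers(wtr_counts)
```

Only unusually high counts are flagged here, since the concern on this slide is an excess of wrong-to-right changes rather than a shortage.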
Unusual Gains/Losses • Predict score using prior-year information • Measure large score increases/decreases against predicted score • Which score truly reflects the student’s actual ability or competence? • Extreme gains/losses may result from: • Pre-knowledge • Coaching • Student development, e.g., visual acuity
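The idea of predicting a score from prior-year information and measuring the gap can be sketched with an ordinary least-squares line. The presentation does not say which prediction model is used, so the straight-line fit, the helper names `fit_line` and `score_residuals`, and the scores below are all assumptions for illustration.

```python
from statistics import mean

def fit_line(x, y):
    """Least-squares slope and intercept for predicting y from x."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx

def score_residuals(prior, current):
    """Actual current-year score minus the score predicted from the prior year."""
    slope, intercept = fit_line(prior, current)
    return [c - (intercept + slope * p) for p, c in zip(prior, current)]

# Made-up scores: the last student gains far more than the others.
prior = [300, 310, 320, 330, 340, 350]
current = [305, 315, 325, 335, 345, 420]
residuals = score_residuals(prior, current)
largest_gain_index = residuals.index(max(residuals))
```

A large positive residual marks a gain well beyond the prediction and a large negative one marks a loss; as the slide notes, either still needs interpretation, because legitimate student development can also produce it.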
Spring 2011 Data Forensics • Focus on two groups • Student-level • School-level • Utilize VERY conservative thresholds
A quick discussion of conservative thresholds… • Chance of being hit by lightning = 1 in a million • Chance of winning the lottery = 1 in 10 million • Chance of DNA false-positive = 1 in 30 million • Chance of tests being flagged despite being taken independently = 1 in a TRILLION
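A short calculation shows why such an extreme threshold matters when every test instance is compared against all others. The sketch assumes the 1-in-a-trillion figure applies to each pairwise comparison and uses a hypothetical count of 200,000 tests; neither assumption comes from the presentation.

```python
def expected_false_flags(n_tests, p_false_positive=1e-12):
    """Expected number of independent pairs flagged purely by chance.

    n_tests: number of test instances compared all-against-all (hypothetical)
    p_false_positive: assumed chance that one independent pair is flagged
    """
    pairs = n_tests * (n_tests - 1) // 2  # number of distinct pairs
    return pairs * p_false_positive

# About 20 billion pairs arise from 200,000 tests.
at_one_in_a_trillion = expected_false_flags(200_000)        # roughly 0.02
at_one_in_a_million = expected_false_flags(200_000, 1e-6)   # roughly 20,000
```

Under these assumptions, a 1-in-a-million threshold would flag about 20,000 innocent pairs by chance alone, while the 1-in-a-trillion threshold is expected to flag essentially none.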
Student-Level Analysis • Similarity analysis only • Most credible • Chance of tests being so similar, and taken independently = 1 in a trillion • Invalidate test scores beyond the 1-in-10^12 threshold • Fairness and validity of test instance must be questioned • Appeals process to be implemented
Example: 2010 9th Grade Cluster • Identifies apparent student collusion • Definitions • “Dominant” = same answer selected by majority of group members • “Non-dominant” = answer different from the one selected by majority of group members • Example of two students who passed, but not independently • i.e., they didn’t do their own work
School-Level Analysis • Similarity, gains, and erasures • Flagged schools conduct local review • Extreme instances may prompt formal investigations and sanctions
Benefits of Conservative Thresholds • Focus on most egregious instances • Provides results that are • Explainable • Defensible • Can move later to different thresholds • Easier to manage • Walk before we run
Program Results • Monitored behavior improves • Invalidations deter cheating
Summary • Goal: Fair and valid testing for all students • FDOE will conduct Data Forensics on FCAT/FCAT 2.0/EOC test data • Focus on • Individual students—extremely similar tests • Schools—similarity, gains, and erasures