Overview of New State Data Forensics Analysis March 2011

Presentation Transcript


  1. Overview of New State Data Forensics Analysis March 2011

  2. Outline of Presentation • Data Forensics (DF) Process • FDOE DF Goals • DF Tools & Methods • Spring 2011 DF Program • Conservative thresholds • Students & Schools • Summary • Q&A

  3. Caveon Data Forensics™ Process • Statistical analyses of test data • First, build a “model” of typical question responses • Then, identify unusual behaviors that suggest a potential unfair advantage

  4. Caveon Data Forensics Process (continued) • Examples of “Unusual” Behavior • Very high agreement among pairs or groups of test takers • Very unusual number of erasures, particularly wrong to right • Very substantial gains or losses from one occasion to another

  5. Overview of the Use of Data Forensics • Many high-stakes testing programs now using Data Forensics • Standards for testing, e.g., “CCSSO’s Operational Best Practices for State Assessment Programs” • Essential to act on the results

  6. FDOE Data Forensics Goals • Uphold fairness and validity of test results • Identify risks and irregularities • Take action based on data and analysis • “Measure and Manage” • Communicate zero tolerance for cheating

  7. Testing Examiner’s Role • Ensure (and then certify) that the test administration is fair and proper • Declare scores invalid when fairness and validity are compromised • Exercise absolute due diligence when proctoring a test • Administering or proctoring a test is not a passive activity!

  8. Forensic Tools and Methods • Similarity: answer copying, collusion • Erasures: tampering • Gains: pre-knowledge, coaching • Aberrance: tampering, exposure • Identical tests: collusion • Perfect tests: answer key loss

  9. Similarity • Our most powerful and most “credible” statistic • Measures the degree of similarity between 2 or more test instances • Analyzes each test instance against every other test instance on the same test • Probable causes of extremely high similarity: answer copying, test coaching, proxy test taking, collusion
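Caveon’s operational similarity index is proprietary, so the following is only a minimal sketch of the underlying idea: compare every pair of answer strings and ask how improbable the observed number of matching answers would be if the two students had responded independently. The binomial model, the per-option match probability `p_match = 0.25`, and the function names are illustrative assumptions, not the program’s actual method.

```python
from itertools import combinations
from math import comb

def match_probability(n_items: int, n_matches: int, p_match: float = 0.25) -> float:
    """Binomial tail: chance of at least n_matches identical answers on
    n_items if two students answered independently (p_match is assumed)."""
    return sum(comb(n_items, k) * p_match**k * (1 - p_match)**(n_items - k)
               for k in range(n_matches, n_items + 1))

def flag_similar_pairs(sheets: dict[str, str], threshold: float = 1e-12):
    """Compare every pair of equal-length answer strings; flag pairs whose
    agreement is improbably high under the independence model."""
    flagged = []
    for (id_a, a), (id_b, b) in combinations(sheets.items(), 2):
        matches = sum(x == y for x, y in zip(a, b))
        p = match_probability(len(a), matches)
        if p < threshold:  # 1e-12 mirrors the deck's "1 in a trillion"
            flagged.append((id_a, id_b, matches, p))
    return flagged
```

The `1e-12` default mirrors the “1 in a trillion” threshold discussed later in the deck.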

  10. Erasures • Based on estimated answer-changing rates: wrong-to-right (WtR) and anything-to-wrong • Find answer sheets with an unusually high number of WtR changes • Extreme statistical outliers could involve tampering, “panic cheating”, etc.
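The deck does not specify the flagging statistic; as a hedged illustration, a simple z-score screen over raw WtR counts captures the outlier idea (an operational analysis would likely model per-item erasure rates instead). The cutoff of 4 standard deviations is an assumption.

```python
from statistics import mean, stdev

def flag_wtr_outliers(wtr_counts: dict[str, int], z_cutoff: float = 4.0):
    """Flag answer sheets whose wrong-to-right (WtR) erasure count is an
    extreme high outlier relative to the whole population."""
    values = list(wtr_counts.values())
    mu, sigma = mean(values), stdev(values)
    return {sheet: (n - mu) / sigma           # z-score of the WtR count
            for sheet, n in wtr_counts.items()
            if sigma > 0 and (n - mu) / sigma > z_cutoff}
```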

  11. Unusual Gains/Losses • Predict each score using prior-year information • Measure large score increases or decreases against the predicted score • Which score truly reflects the student’s actual ability or competence? • Extreme gains/losses may result from: pre-knowledge, coaching, or genuine student development (e.g., improved visual acuity)
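One way to read “measure against the predicted score” is a regression residual: predict this year’s score from last year’s, then flag students whose actual score sits far from the prediction in either direction. A minimal ordinary-least-squares sketch, with an assumed 4-standard-deviation cutoff:

```python
def flag_unusual_gains(prior: dict[str, float], current: dict[str, float],
                       sd_cutoff: float = 4.0):
    """Regress current-year scores on prior-year scores, then flag students
    whose residual (actual minus predicted) is extreme in either direction."""
    ids = [s for s in prior if s in current]
    x = [prior[s] for s in ids]
    y = [current[s] for s in ids]
    n = len(ids)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    intercept = my - slope * mx
    resid = {s: current[s] - (intercept + slope * prior[s]) for s in ids}
    sd = (sum(r * r for r in resid.values()) / n) ** 0.5
    return {s: r / sd for s, r in resid.items() if abs(r / sd) > sd_cutoff}
```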

  12. Spring 2011 Data Forensics • Focus on two groups: student-level and school-level • Utilize VERY conservative thresholds

  13. A quick discussion of conservative thresholds… • Chance of being struck by lightning: 1 in a million • Chance of winning the lottery: 1 in 10 million • Chance of a DNA false positive: 1 in 30 million • Chance that two flagged tests are this similar yet were taken independently: 1 in a TRILLION

  14. Student-Level Analysis • Similarity analysis only (our most credible statistic) • Chance of the flagged tests being so similar, yet taken independently: 1 in a trillion (1 in 10^12) • Scores flagged beyond the 10^12 threshold will be invalidated • The fairness and validity of such a test instance must be questioned • An appeals process will be implemented
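To see why 1 in 10^12 is conservative, note that the per-pair false-positive rate is multiplied by the number of student pairs compared. A back-of-the-envelope check (the group size below is a hypothetical assumption, not an FDOE figure):

```python
# Expected false flags at a 1-in-a-trillion per-pair threshold.
students = 200_000                          # hypothetical testing group size
pairs = students * (students - 1) // 2      # all pairwise comparisons
expected_false_flags = pairs * 1e-12
print(f"{pairs:,} pairs -> ~{expected_false_flags:.3f} expected false flags")
# ~2e10 pairs * 1e-12 = about 0.02: essentially no honest pair gets flagged.
```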

  15. Example: 2010 9th Grade Cluster • Identifies apparent student collusion • Definitions: • “Dominant” = the answer selected by the majority of group members • “Non-dominant” = an answer different from the one selected by the majority • Example: 2 students who passed, but not independently • i.e., they did not do their own work
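To make the dominant/non-dominant definitions concrete, here is a minimal sketch (illustrative only; not the layout of Caveon’s actual cluster report) that derives the dominant answer for each item across a cluster:

```python
from collections import Counter

def dominant_answers(cluster_responses: list[str]) -> str:
    """For each item, the 'dominant' answer is the option chosen by the
    most group members; any other option is 'non-dominant'."""
    n_items = len(cluster_responses[0])
    return "".join(
        Counter(r[i] for r in cluster_responses).most_common(1)[0][0]
        for i in range(n_items))
```

Two test takers who match each other on many non-dominant (especially wrong) answers are far more suspicious than two who merely share the dominant responses.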

  16. Impact of “1 in a Trillion” Threshold, Math & Reading 2010

  17. School-Level Analysis • Similarity, gains, and erasures • Flagged schools conduct a local review • Extreme instances may prompt formal investigations and sanctions

  18. Benefits of Conservative Thresholds • Focus on the most egregious instances • Provides results that are explainable and defensible • Can move to different thresholds later • Easier to manage • Walk before we run

  19. Program Results • Monitored behavior improves • Invalidations deter cheating

  20. Summary • Goal: Fair and valid testing for all students • FDOE will conduct Data Forensics on FCAT/FCAT 2.0/EOC test data • Focus on • Individual students—extremely similar tests • Schools—similarity, gains, and erasures
