
John Klaric 1,2

The North Carolina Online Computer Skills Assessment: Relationships between Item Response Times and Item Residuals – or – Can Item Response Times Tell Us Anything about the Probability of a Correct Response?



  1. The North Carolina Online Computer Skills Assessment: Relationships between Item Response Times and Item Residuals – or – Can Item Response Times Tell Us Anything about the Probability of a Correct Response? John Klaric (1, 2). (1) Department of Educational Research Methodology, The University of North Carolina at Greensboro; (2) NC Department of Public Instruction: Accountability Services Division. Accountability Conference, 2009

  2. Purpose and Significance • Purpose • Accountability Services at DPI faces a daunting task: • The Testing Policy and Operations team and the Test Development team concentrate on developing policies and tests that assess student proficiency in a number of content areas. • Various groups analyze student test results – results that are used by many stakeholders. These results are also used for federal and state reports as officials make data-driven policy decisions. • What if the tempo, or pace, with which students respond – correctly or incorrectly – is a stable student characteristic that is informative about that student’s proficiency?

  3. Nuisance Variables • These factors can influence the variable of most interest – and DPI is most interested in measures of student proficiency. • Speed-accuracy tradeoff (van der Linden, 2005) • Proficient students who are slow to respond during a test can be penalized, compared to those at the same proficiency who respond quickly. • Potential Significance: consider the chemical viscosity of O-rings, such as those used in mechanical applications. Temperature is here a nuisance variable – it isn’t of much interest. But differences in temperature can have devastating impacts.

  4. Research Methods: The NC Online Computer Skills Assessment (OCSA, 3rd Edition, Administered Fall 2005) • Because it is a computer-based assessment, examinee actions with the mouse and/or keyboard can be captured accurately. • Response times: length of item presentation vs. “time to overt response” • 2 datasets built from the Fall 2005 data: • “Complete” dataset, containing item responses and times from all students taking the exam. • “Time Truncated” dataset, from which some slower-responding examinees were systematically excluded (about 2,000 examinees who took longer than roughly 2 hours to complete the test)
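The time-truncation step described on this slide can be sketched as a simple filter on total testing time. This is only an illustration: the slide gives the cutoff as "roughly 2 hours", so the exact threshold, the `Examinee` record, and the function names below are assumptions rather than the procedure actually used.

```python
from dataclasses import dataclass


@dataclass
class Examinee:
    # Hypothetical record layout; the real data files are not described in the slides.
    examinee_id: str
    item_times_sec: list  # one response time per item, in seconds


def total_time(examinee):
    """Total testing time: the sum of the examinee's item response times."""
    return sum(examinee.item_times_sec)


def time_truncate(examinees, max_total_sec=2 * 60 * 60):
    """Keep examinees whose total time is at or below the cutoff.

    The 2-hour default is an assumed stand-in for "roughly 2 hours".
    """
    return [e for e in examinees if total_time(e) <= max_total_sec]
```

Applied to the complete dataset, a filter of this kind would yield the smaller "Time Truncated" dataset used for most of the later slides.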

  5. Research Methods: The NC Online Computer Skills Assessment (OCSA, 3rd Edition, Administered Fall 2005) • Test Description • Computer-based (non-adaptive) assessment • 54 items in length: approximately half are multiple-choice (MC) items with 4 response options; the remainder are performance-based items arranged in problem-based item sets • 0/1 scoring procedures • Fall 2005 Administration • 8 forms spiralled within schools, administered to over 100,000 8th graders • Motivation: NC graduation requirement. Source: North Carolina Department of Public Instruction, 2008.

  6. Research Methods: The NC Online Computer Skills Assessment (OCSA, 3rd Edition, Administered Fall 2005) • 6 content-related strands: • Societal/Ethical Issues (12-14%) • Spreadsheet (22-25%) • Multimedia & Presentation (10-12%) • Database (22-25%) • Keyboarding/Word Processing/Desktop Publishing (18-20%) • Telecommunications and Internet (10-12%)

  7. Figure 1. Total Test Response Times, Complete Dataset (N=105917): Fall 2005, NC Online Computer Skills Assessment

  8. Figure 2. Total Test Response Times, Time-Truncated Dataset (N=103751): Fall 2005, NC Online Computer Skills Assessment

  9. Figure 3. Total Test Score, Time Truncated Dataset: Fall 2005, NC Online Computer Skills Assessment

  10. Table 1. Statistical Moments for Total Test Scores during the Fall 2005 Administration of NC’s Online Computer Skills Assessment

      Dataset             N        Mean   Standard Deviation   Skewness   Kurtosis
      Complete dataset    105917   28.2   10.65                -0.12      -0.77
      Truncated dataset   103751   28.3   10.66                -0.13      -0.77

      Note: Means and standard deviations are from sums of dichotomized item scores (0=incorrect, 1=correct) across all items where a response was made. Items with missing responses are excluded.

      Comparison of Score Distributions from Complete vs. Time Truncated Data
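The moments reported in Table 1 can be computed along the following lines. This is a minimal sketch: it uses population (divide-by-N) moments and excess kurtosis, which is an assumption, since the slide does not say which estimators were used. The function names are illustrative.

```python
import math


def total_score(item_scores):
    """Sum of 0/1 item scores, skipping missing responses (None), as in the table note."""
    return sum(s for s in item_scores if s is not None)


def score_moments(scores):
    """Mean, SD, skewness, and excess kurtosis from central moments (divide-by-N)."""
    n = len(scores)
    mean = sum(scores) / n
    m2 = sum((x - mean) ** 2 for x in scores) / n
    m3 = sum((x - mean) ** 3 for x in scores) / n
    m4 = sum((x - mean) ** 4 for x in scores) / n
    return {
        "n": n,
        "mean": mean,
        "sd": math.sqrt(m2),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2 - 3.0,  # excess kurtosis: 0 for a normal distribution
    }
```

Under this convention, the negative kurtosis values in the table would indicate a score distribution somewhat flatter than a normal distribution.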

  11. Classical Item Statistics from Edition 3 of the NC Computer Skills Assessment (Time Truncated Data): Fall 2005 Administration

  12. Item Summary Statistics from a 3-Parameter Logistic IRT Model: Fall ’05 NC Computer Skills Assessment, Time Truncated Data
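For readers unfamiliar with the model named on this slide, the standard 3-parameter logistic (3PL) item response function can be sketched as below. The parameters are a (discrimination), b (difficulty), and c (lower asymptote, or pseudo-guessing). The scaling constant `D=1.7` is a common convention and an assumption here; the slides do not say whether the calibration used it.

```python
import math


def p_correct_3pl(theta, a, b, c, D=1.7):
    """Probability of a correct response under the 3PL model.

    theta: examinee proficiency; a: discrimination; b: difficulty;
    c: lower asymptote (pseudo-guessing). D is an assumed scaling constant.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))
```

When theta equals b, the probability is halfway between c and 1. With c = 0.25 (chance level on a 4-option multiple-choice item), that is 0.625.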

  13. Figure 4. Response Times by Score, Item 7 (Time Truncated Data): Fall 2005, NC Online Computer Skills Assessment

  14. Figure 5. Response Times by Score, Item 9 (Time Truncated Data): Fall 2005, NC Online Computer Skills Assessment

  15. Figure 6. Response Times by Score, Item 14 (Time Truncated Data): Fall 2005, NC Online Computer Skills Assessment

  16. Figure 7. Response Times by Score, Item 23 (Time Truncated Data): Fall 2005, Online Computer Skills Assessment

  17. Figure 8. Response Times by Score, Item 45 (Time Truncated Data): Fall 2005, NC Online Computer Skills Assessment

  18. A Relationship between Error and Response Time? Item #1: Components of Potential Interest • A: Variability in item error when estimating the probability of a correct response. • B: Variability in item response time.

  19. Non-zero Correlation between Error and Response Time • A: Variance in Item Residual Error • B: Variance in Item Response Time • Intersection: Portion of A explained by B, quantified by a “semi-partial correlation”
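One way to read the overlap of A and B is as the correlation between an item's residuals and its response times. The sketch below assumes the residual is the observed 0/1 score minus the model-predicted probability of a correct response. Proficiency is then removed from the score but not from the response time, which is what makes the correlation semi-partial. The slides do not give the exact computation, so this is an interpretation; the function names are illustrative.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def item_residuals(scores, expected):
    """Residual per examinee: observed 0/1 score minus model-predicted probability."""
    return [u - p for u, p in zip(scores, expected)]


def residual_time_correlation(scores, expected, times):
    """Correlation between item residuals and (unadjusted) item response times."""
    return pearson(item_residuals(scores, expected), times)
```

A value near zero for an item would mean that its response times explain little of the residual variance left over after the IRT model has accounted for proficiency.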

  20. Summary of the NC Online Computer Skills Assessment and Ongoing Studies • Apparently, little intersection between residual error and response time (see table on next slide) • Good news for the NC testing program: the OCSA appears to primarily measure student proficiency. Proficiency estimates are not highly influenced by response time measures when: • 0/1 responses are calibrated with a unidimensional IRT model, and • Calibration is performed with sufficiently informative priors on the IRT c-parameter.

  21. Correlations expressing relationships between residual errors and item response times

  22. Ongoing Studies • Simulation studies are being conducted to see whether this intersection can be detected, and under what circumstances. • Also under study: whether possible intersections affect estimates of student proficiency, as shown by bias and RMSE statistics.
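In a simulation study the true proficiencies are known, so bias and RMSE can be computed directly by comparing estimates with the generating values. The sketch below uses the standard definitions; the function names are illustrative and not taken from the study.

```python
import math


def bias(estimates, truths):
    """Mean signed error: average of (estimate - true value)."""
    return sum(e - t for e, t in zip(estimates, truths)) / len(estimates)


def rmse(estimates, truths):
    """Root mean squared error between estimates and true values."""
    mse = sum((e - t) ** 2 for e, t in zip(estimates, truths)) / len(estimates)
    return math.sqrt(mse)
```

Bias shows whether proficiency is systematically over- or underestimated, while RMSE also reflects the spread of the estimation errors.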

  23. Selected References • Lord, F.M. (1980). Applications of Item Response Theory to Practical Testing Problems. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc. • Luecht, R.M. (2008). MIRTGEN 2.0 with Response Times. Greensboro, NC: University of North Carolina at Greensboro. • North Carolina Department of Public Instruction. (2008). Test of computer skills (Graduation – Requirement) [Electronic Version]. Retrieved September 12, 2008 from http://www.dpi.state.nc.us/accountability/testing/computerskills/ • Thissen, D. (1983). Timed testing: An approach using item response theory. In D.J. Weiss (Ed.), New Horizons in Testing: Latent Trait Test Theory and Computerized Adaptive Testing. New York, NY: Academic Press. • van der Linden, W.J. (2006). A lognormal model for response times on test items. Journal of Educational and Behavioral Statistics, 31, 181-204. • Wise, S.L., & Kong, X. (2005). Response time effort: A new measure of examinee motivation in computer-based tests. Applied Measurement in Education, 18, 163-183.

  24. Acknowledgements Many thanks are owed to many people. Here are a few: Dr. Ric Luecht (UNCG, ERM) Dr. Terry Ackerman (UNCG, ERM) Dr. Lou Fabrizio (NCDPI, Accountability Services) Dr. Gary Williamson (NCDPI, Accountability Services) Dr. Laura Kramer (NCDPI, Test Development) Dr. Wim van der Linden (U. Twente)
