
CADIgen: A New Approach to Skills Diagnosis Data Simulation


Presentation Transcript


  1. CADIgen: A New Approach to Skills Diagnosis Data Simulation. Final Presentation. Becky Norman, Abdullah Ferdous, Louis Roussos

  2. Background: What is Skills Diagnosis?
     • Predefined list of skills (attributes) on the assessment
     • Each item requires mastery of one or more skills
     • The Q matrix specifies whether a skill is required for an item
     • Outcome is a diagnostic report
       • Lists each skill and whether the student is a master or non-master
     • Report can be used to aid instruction
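
As a minimal sketch of the ideas on this slide, the Python below builds a small Q matrix and a diagnostic report. The matrix values, the skill names, and the helper functions `has_all_required_skills` and `diagnostic_report` are all hypothetical, chosen only for illustration; none of them come from the presentation.

```python
# Hypothetical 4-item, 3-skill Q matrix (values invented for illustration).
# Q[i][k] == 1 means item i requires skill k.
Q = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 1],
    [0, 0, 1],
]

def has_all_required_skills(alpha, q_row):
    """True if the mastery vector alpha covers every skill the item requires."""
    return all(a == 1 for a, q in zip(alpha, q_row) if q == 1)

def diagnostic_report(alpha, skill_names):
    """Map each skill name to 'master' or 'non-master'."""
    return {name: ("master" if a == 1 else "non-master")
            for name, a in zip(skill_names, alpha)}

alpha = [1, 1, 0]  # hypothetical student: masters the first two skills only
covered = [has_all_required_skills(alpha, row) for row in Q]
report = diagnostic_report(alpha, ["skill_A", "skill_B", "skill_C"])
```

Here `covered` flags the items for which the student holds every required skill, and `report` is the per-skill master/non-master listing that the slide calls the diagnostic report.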

  3. Background: Fusion Model Characteristics
     • Skill space includes Q matrix and additional relevant attributes
     • Conjunctive
     • Assumes attribute homogeneity
     • Fusion Model Attribute Response Function (ARF) is a step function
     • Attribute location or jump-point (kk)
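
The slides do not write out the item response function. For orientation only, the reparameterized form commonly given for the Fusion Model in the skills diagnosis literature is sketched below; the notation (examinee j, item i, skill k, residual ability θ) is an assumption of this note rather than something shown in the presentation.

```latex
P(X_{ij} = 1 \mid \boldsymbol{\alpha}_j, \theta_j)
  = \pi_i^{*} \prod_{k=1}^{K} \left( r_{ik}^{*} \right)^{(1 - \alpha_{jk})\, q_{ik}} \, P_{c_i}(\theta_j)
```

In this form the model is conjunctive: the product multiplies in a penalty r*ik for every skill that the item requires (qik = 1) and the examinee has not mastered (αjk = 0). The final factor carries the additional attributes outside the Q matrix.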

  4. Background: Attribute Heterogeneity
     • pks: the proportion of masters for a given attribute
     • piks: the proportion of masters for a given item associated with an attribute
     • Attribute heterogeneity occurs when the proportion of masters is allowed to vary across the individual items associated with an attribute. Homogeneity is the case in which every item associated with the skill has the same proportion of masters.
     • DADIgen allows the user to specify the degree of heterogeneity
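
One way to picture the difference between homogeneity and heterogeneity is with item-level mastery cut points on a continuous ability scale. The sketch below is an illustration under stated assumptions, not DADIgen's actual parameterization: it assumes a standard-normal ability, a step-function mastery rule, and a hypothetical `heterogeneity` multiplier applied to invented per-item offsets.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # assumed standard-normal continuous attribute ability

def proportion_of_masters(cut_point):
    """Share of examinees whose ability exceeds the cut point (step-function mastery)."""
    return 1.0 - STD_NORMAL.cdf(cut_point)

def item_cut_points(attribute_cut, offsets, heterogeneity):
    """Spread item-level cut points around the attribute-level cut.

    heterogeneity = 0 reproduces the homogeneous case (all items share one cut).
    """
    return [attribute_cut + heterogeneity * d for d in offsets]

attribute_cut = STD_NORMAL.inv_cdf(1.0 - 0.57)  # cut point that gives pk = .57
offsets = [-1.0, 0.0, 1.0]                      # hypothetical per-item offsets

homogeneous = [proportion_of_masters(c)
               for c in item_cut_points(attribute_cut, offsets, 0.0)]
heterogeneous = [proportion_of_masters(c)
                 for c in item_cut_points(attribute_cut, offsets, 0.5)]
```

With `heterogeneity = 0.0` every item's pik equals the attribute-level pk; with `heterogeneity = 0.5` the piks spread out around it, which is the situation the slide describes.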

  5. DADIgen
     • Dichotomous attributes, dichotomous items, generation
     • pk estimates based on dichotomous piks
     • Generates data based on user specifications
     • Input files
       • qmatrix.in, xpar.in, iparms.in, xpc.in, xrstar.in, corrparm.in, corr.txt, pikparms.in, xpik.in, xpk.in
     • Output files
       • data.txt, alfcthet.out, alphad.out
     Roussos, Xu, & Stout (2003)

  6. Research Questions
     • How does increasing heterogeneity affect the shape of the score distribution of data simulated with the DADIgen program?
     • Does a modified program that simulates item-correct probabilities based on a continuous ability (CADIgen) produce more realistic score distributions?
     • How do CADIgen-simulated score distributions compare to distributions estimated by applying the homogeneous fusion model to the CADIgen-simulated data?

  7. Methods: DADIgen
     • Q matrices:
       • Low complexity: average of 1.5 items per attribute
       • High complexity: average of 2.5 items per attribute
     • Examinees: 10,000
     • Range of π*: .75 - .95
       • Attribute difficulty parameter
     • Range of r*: .40 - .85
       • Attribute discrimination parameter
     • pks: .45, .49, .53, .57, .61, .65, .69
     • Range of correlation between skills: .55 - .89
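
To make the simulation design concrete, here is a rough sketch of generating total scores from dichotomous attributes. It is not the DADIgen code. It assumes a reduced conjunctive response rule (π* multiplied by one r* penalty per missing required skill, with no residual-ability term), a single common factor to induce correlated skills, and a hypothetical 6-item, 3-skill Q matrix. The function name `simulate_scores` and every parameter value are illustrative, though the values sit inside the ranges listed on the slide.

```python
import random
from statistics import NormalDist

def simulate_scores(q_matrix, pi_star, r_star, pks, rho, n_examinees, seed=0):
    """Simulate total scores with dichotomous attributes (illustrative only).

    Correlated attributes come from thresholding normals that share one
    common factor, so every pair of latent abilities has correlation rho.
    """
    rng = random.Random(seed)
    cuts = [NormalDist().inv_cdf(1.0 - p) for p in pks]  # mastery cut per skill
    scores = []
    for _ in range(n_examinees):
        common = rng.gauss(0.0, 1.0)
        alpha = []
        for cut in cuts:
            ability = rho ** 0.5 * common + (1.0 - rho) ** 0.5 * rng.gauss(0.0, 1.0)
            alpha.append(1 if ability > cut else 0)
        score = 0
        for i, q_row in enumerate(q_matrix):
            p_correct = pi_star[i]
            for k, required in enumerate(q_row):
                if required and not alpha[k]:
                    p_correct *= r_star[i][k]  # penalty for each missing skill
            score += int(rng.random() < p_correct)
        scores.append(score)
    return scores

# Hypothetical 6-item, 3-skill setup; values chosen inside the slide's ranges.
Q = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0], [0, 1, 1], [1, 0, 1]]
pi_star = [0.75, 0.80, 0.85, 0.90, 0.95, 0.85]
r_star = [[0.40 if q else 1.0 for q in row] for row in Q]
scores = simulate_scores(Q, pi_star, r_star, pks=[0.45, 0.57, 0.69],
                         rho=0.7, n_examinees=2000)
histogram = [scores.count(s) for s in range(len(Q) + 1)]
```

`histogram` holds the number of simulated examinees at each total score, which is the kind of score distribution the following slides plot.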

  8. DADIgen Score Distributions

  9. Score Distributions: Low Complexity (panels: Heterogeneity = 0.00; Heterogeneity = 0.50)

  10. Score Distributions: High Complexity (panels: Heterogeneity = 0.00; Heterogeneity = 0.50)

  11. Methods: CADIgen
      • Continuous attributes, dichotomous items, generation
      • Probabilities based on continuous attribute ability parameters and continuous attribute application functions
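
The slides do not give the functional form of the continuous attribute application functions. Purely as an illustration, the sketch below assumes a logistic curve in place of a step function and combines the required attributes conjunctively. The names `application_probability` and `item_correct_probability`, the slope of 1.7, and the location values are all hypothetical.

```python
import math
import random

def application_probability(ability, location, slope=1.7):
    """Hypothetical continuous attribute application function (logistic)."""
    return 1.0 / (1.0 + math.exp(-slope * (ability - location)))

def item_correct_probability(abilities, q_row, locations):
    """Conjunctive rule: every required attribute must be applied successfully."""
    p = 1.0
    for ability, required, location in zip(abilities, q_row, locations):
        if required:
            p *= application_probability(ability, location)
    return p

rng = random.Random(1)
abilities = [rng.gauss(0.0, 1.0) for _ in range(3)]  # one simulated examinee
p = item_correct_probability(abilities, q_row=[1, 1, 0],
                             locations=[-0.5, 0.0, 0.5])
response = int(rng.random() < p)                     # dichotomous item score
```

Because the probability changes smoothly with ability instead of jumping at a single point, examinees are no longer split into two sharply separated groups on each attribute.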

  12. CADIgen Score Distributions

  13. Score Distributions: Low Complexity (panels: Heterogeneity = 0.50; Heterogeneity = 0.00)

  14. Score Distributions: High Complexity

  15. Background: Research Question 3
      • Jang (2005) examined the relationship between observed cumulative score frequencies and estimated distributions on a 39-item test.
        • Score < 10: observed slightly less than estimated
        • Middle range: observed slightly more than estimated
        • Score > 30: observed slightly less than estimated

  16. Methods: Research Question 3
      • Use CADIgen score matrices
        • Low complexity, heterogeneity = .50
        • High complexity, heterogeneity = .25
      • Run existing skills diagnosis estimation program
      • Obtain r* and π* estimates and specify them in DADIgen
      • Compare simulated score distributions to score distributions based on fitting the fusion model to CADIgen simulated data
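
The final comparison step on this slide can be sketched as a comparison of cumulative score proportions. The helper names and the two short score lists below are made up for illustration; in practice the inputs would be the CADIgen-simulated scores and the scores generated from the fitted model.

```python
def cumulative_proportions(scores, max_score):
    """Proportion of examinees scoring at or below each score from 0 to max_score."""
    counts = [0] * (max_score + 1)
    for s in scores:
        counts[s] += 1
    total = len(scores)
    running, out = 0, []
    for c in counts:
        running += c
        out.append(running / total)
    return out

def max_cumulative_gap(scores_a, scores_b, max_score):
    """Largest absolute difference between the two cumulative distributions."""
    cdf_a = cumulative_proportions(scores_a, max_score)
    cdf_b = cumulative_proportions(scores_b, max_score)
    return max(abs(a - b) for a, b in zip(cdf_a, cdf_b))

cadigen_scores = [0, 1, 1, 2, 3, 3, 4]  # made-up scores for illustration
refit_scores = [0, 0, 1, 2, 2, 3, 4]    # made-up scores for illustration
gap = max_cumulative_gap(cadigen_scores, refit_scores, max_score=4)
```

A small `gap` means the two score distributions track each other closely across the whole score range. Inspecting the signed differences by score region would mirror the low, middle, and high comparison reported for Jang (2005).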

  17. Simulated vs. Generated Score Distribution: Low Complexity

  18. Simulated vs. Generated Score Distribution: High Complexity

  19. Summary and Conclusions
      • DADIgen tends to produce bimodal distributions
        • Most pronounced in the high-complexity condition and with less heterogeneity
      • Using a continuous attribute ability, CADIgen eliminated the bimodality
      • CADIgen produced score distributions similar to those obtained in real data situations for the low-complexity condition, and somewhat similar for the high-complexity condition

  20. Next Step
      • Calculate and output expected homogeneous r*’s and π*’s in CADIgen using heterogeneous r*’s and π*’s
        • Use for comparison with existing skills diagnosis program estimates

  21. Any Questions?
