Jimmy de la Torre Rutgers, The State University of New Jersey Young-Sun Lee

Estimating Cognitive Diagnosis Models based on Alternative Link Functions using the Generalized DINA Model Framework
Jimmy de la Torre, Rutgers, The State University of New Jersey
Young-Sun Lee, Teachers College-Columbia University
Yuan Hong, Rutgers, The State University of New Jersey

The Generalized DINA model
• The G-DINA model is a generalization of the deterministic input, noisy “and” gate (DINA) model
• As with many CDMs, the G-DINA model requires a Q-matrix
• The G-DINA model partitions the latent classes into latent groups
• Each latent group represents one reduced attribute vector
• Each latent group has its own associated probability of success, as in:

[Figure: DINA model, axis from 0 to 1]

[Figure: G-DINA model, axis from 0 to 1]
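The partition of latent classes into latent groups can be illustrated with a short sketch. It assumes a hypothetical three-attribute test and a hypothetical q-vector `(1, 0, 1)`; the function names are illustrative and do not come from the presentation.

```python
from itertools import product


def reduced_vector(alpha, q):
    """Keep only the attributes that the item requires (where q is 1)."""
    return tuple(a for a, required in zip(alpha, q) if required)


def latent_groups(q):
    """Map each reduced attribute vector to the latent classes it contains."""
    groups = {}
    for alpha in product((0, 1), repeat=len(q)):
        groups.setdefault(reduced_vector(alpha, q), []).append(alpha)
    return groups


# Hypothetical q-vector: the item requires attributes 1 and 3 out of 3.
q = (1, 0, 1)
groups = latent_groups(q)
for reduced, members in sorted(groups.items()):
    print(reduced, members)
```

With this q-vector the 8 latent classes collapse into 4 latent groups of 2 classes each, and every class within a group shares one probability of success on the item.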

Link Functions for General Models
• Several general models for cognitive diagnosis that are linear in the parameters exist
• These models use different link functions
• Three will be considered: identity, logit, and log

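The G-DINA model is based on the identity link:

$$P(\boldsymbol{\alpha}^*_{lj}) = \delta_{j0} + \sum_{k=1}^{K_j^*} \delta_{jk}\alpha_{lk} + \sum_{k'=k+1}^{K_j^*}\sum_{k=1}^{K_j^*-1} \delta_{jkk'}\alpha_{lk}\alpha_{lk'} + \cdots + \delta_{j12\cdots K_j^*}\prod_{k=1}^{K_j^*}\alpha_{lk}$$

where
• $\delta_{j0}$ is the intercept
• $\delta_{jk}$ is the main effect due to $\alpha_k$
• $\delta_{jkk'}$ is the interaction effect due to $\alpha_k$ and $\alpha_{k'}$
• $\delta_{j12\cdots K_j^*}$ is the interaction effect due to $\alpha_1, \ldots, \alpha_{K_j^*}$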

The log-odds CDM is based on the logit link • The log-CDM is based on the log link
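The equations for these two models did not survive in the transcript. As a hedged sketch, the logit and log links applied to the same linear predictor would take the form below, where $P(\boldsymbol{\alpha}^*_{lj})$ is the success probability of latent group $l$ on item $j$ and $K_j^*$ is the number of attributes the item requires. The parameter symbols $\lambda$ and $\nu$ are assumptions chosen for illustration.

```latex
% Log-odds CDM (logit link); the lambda notation is an assumption
\[
\operatorname{logit} P(\boldsymbol{\alpha}^*_{lj})
  = \lambda_{j0} + \sum_{k=1}^{K_j^*} \lambda_{jk}\alpha_{lk}
  + \sum_{k'=k+1}^{K_j^*}\sum_{k=1}^{K_j^*-1} \lambda_{jkk'}\alpha_{lk}\alpha_{lk'}
  + \cdots + \lambda_{j12\cdots K_j^*}\prod_{k=1}^{K_j^*}\alpha_{lk}
\]

% Log CDM (log link); the nu notation is an assumption
\[
\log P(\boldsymbol{\alpha}^*_{lj})
  = \nu_{j0} + \sum_{k=1}^{K_j^*} \nu_{jk}\alpha_{lk}
  + \sum_{k'=k+1}^{K_j^*}\sum_{k=1}^{K_j^*-1} \nu_{jkk'}\alpha_{lk}\alpha_{lk'}
  + \cdots + \nu_{j12\cdots K_j^*}\prod_{k=1}^{K_j^*}\alpha_{lk}
\]
```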

Link Functions for General Models
• Several general models for cognitive diagnosis that are linear in the parameters exist
• These models use different link functions
• Three will be considered: identity, logit, and log
• All models in their saturated forms have the same number of parameters: $2^{K_j^*}$, where $K_j^*$ is the number of attributes required by item $j$
• Several interpretable models can be shown to be special cases of these general models

Estimation and Weight and Design Matrices
• Using the EM algorithm, the parameters of the G-DINA model are estimated as [equation not preserved in the transcript]
• The weight matrix W is a diagonal matrix that accounts for the differential sizes of the latent groups, with the size of the lth latent group as the lth diagonal entry
• The design matrix M can be used to specify different models under the identity link
• Parameters of these models are estimated as [equation not preserved in the transcript]
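One way to combine a design matrix M with a diagonal weight matrix W is ordinary weighted least squares, $(M^\top W M)^{-1} M^\top W \hat{P}$. The sketch below assumes that form rather than quoting the presentation, whose estimating equation is not preserved here. The two-attribute item, the success probabilities, and the group sizes are all made up for illustration.

```python
import numpy as np


def fit_design(M, group_sizes, p_hat):
    """Weighted least squares: solve (M' W M) x = M' W p_hat, W = diag(group_sizes)."""
    M = np.asarray(M, dtype=float)
    W = np.diag(np.asarray(group_sizes, dtype=float))
    p_hat = np.asarray(p_hat, dtype=float)
    return np.linalg.solve(M.T @ W @ M, M.T @ W @ p_hat)


# Latent groups of a two-attribute item, ordered 00, 01, 10, 11.
# Columns: intercept, attribute 1, attribute 2, interaction.
M_saturated = [[1, 0, 0, 0], [1, 0, 1, 0], [1, 1, 0, 0], [1, 1, 1, 1]]
M_additive = [row[:3] for row in M_saturated]  # drop the interaction column

p_hat = [0.2, 0.4, 0.5, 0.9]   # made-up group success probabilities
sizes = [300, 250, 200, 250]   # made-up expected latent group sizes

delta_saturated = fit_design(M_saturated, sizes, p_hat)
delta_additive = fit_design(M_additive, sizes, p_hat)
print(delta_saturated)
print(delta_additive)
```

For the saturated design the weights do not affect the solution, because M is square and invertible and the fit reproduces the group probabilities exactly. The weights matter only for reduced designs such as the additive one.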

Example: M for saturated model when

M for the A-CDM when

[Figure: additive CDM, axis from 0 to 1]

M for the DINA model when

[Figure: DINA model, axis from 0 to 1]

M for the DINO model when

[Figure: DINO model, axis from 0 to 1]
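The design matrices shown on these slides are not preserved in the transcript. The sketch below builds plausible versions under the identity link for the saturated, additive, DINA, and DINO cases. The number of required attributes (2 in the usage lines), the row ordering, and the function names are assumptions made for illustration.

```python
from itertools import combinations, product


def attribute_patterns(n_attrs):
    """All reduced attribute vectors, e.g. (0, 0), (0, 1), (1, 0), (1, 1)."""
    return list(product((0, 1), repeat=n_attrs))


def saturated(n_attrs):
    """Intercept, main effects, and every interaction (2 ** n_attrs columns)."""
    subsets = [s for r in range(n_attrs + 1)
               for s in combinations(range(n_attrs), r)]
    # all() over the empty subset is True, which yields the intercept column.
    return [[int(all(a[k] for k in s)) for s in subsets]
            for a in attribute_patterns(n_attrs)]


def additive(n_attrs):
    """Intercept plus main effects only."""
    return [[1, *a] for a in attribute_patterns(n_attrs)]


def dina(n_attrs):
    """Intercept plus one effect that applies only when all attributes are mastered."""
    return [[1, int(all(a))] for a in attribute_patterns(n_attrs)]


def dino(n_attrs):
    """Intercept plus one effect that applies when at least one attribute is mastered."""
    return [[1, int(any(a))] for a in attribute_patterns(n_attrs)]


for name, builder in [("saturated", saturated), ("A-CDM", additive),
                      ("DINA", dina), ("DINO", dino)]:
    print(name, builder(2))
```

Each row corresponds to one latent group and each column to one parameter, so the reduced models differ from the saturated one only in which columns they keep or merge.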

M for the “and-within-or” model when

[Figure: “and-within-or” model, axis from 0 to 1]

M for the “or-within-and” model when

[Figure: “or-within-and” model, axis from 0 to 1]

CDMs for Multiple Strategies: “and-within-or” [Diagram with attributes A1, A2, A3]

CDMs for Multiple Strategies: “or-within-and” [Diagram with attributes A1, A2, A3]

Reduced Models Based on the Log and Logit Link Functions
• Log link: Log CDM; additive version → Reduced Reparameterized Unified Model / Generalized NIDA model
• Logit link: Log-odds CDM; additive version → Additive General Diagnostic Model
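To see why the additive log-link model matches the reduced reparameterized unified model, it helps to exponentiate the additive predictor. The derivation below is a sketch: the symbols $\nu$, $\pi^*_j$, and $r^*_{jk}$ are conventional names assumed here, not notation recovered from the slides.

```latex
% Additive log-link model for item j (nu notation is an assumption)
\[
\log P(\boldsymbol{\alpha}^*_{lj}) = \nu_{j0} + \sum_{k=1}^{K_j^*} \nu_{jk}\alpha_{lk}
\quad\Longrightarrow\quad
P(\boldsymbol{\alpha}^*_{lj}) = e^{\nu_{j0}} \prod_{k=1}^{K_j^*} \left(e^{\nu_{jk}}\right)^{\alpha_{lk}}
\]

% The same probabilities in reduced RUM form (pi* and r* are assumed names)
\[
P(\boldsymbol{\alpha}^*_{lj}) = \pi^*_j \prod_{k=1}^{K_j^*} \left(r^*_{jk}\right)^{1-\alpha_{lk}},
\qquad
\pi^*_j = \exp\Bigl(\nu_{j0} + \sum_{k=1}^{K_j^*} \nu_{jk}\Bigr),
\qquad
r^*_{jk} = e^{-\nu_{jk}}
\]
```

Substituting the expressions for $\pi^*_j$ and $r^*_{jk}$ into the second form gives $\exp\bigl(\nu_{j0} + \sum_k \nu_{jk}\alpha_{lk}\bigr)$, which is the first form.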

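Estimation Details: The Logit Link
• For the saturated model, the estimate of the item parameters can be obtained by [equation not preserved in the transcript]
• The standard error of the estimate is computed from [equation not preserved in the transcript], which involves a gradient term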
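Since the slide's equations are missing, the sketch below shows one generic way such estimates could be derived from identity-link estimates: apply the logit to the estimated group probabilities, invert the saturated design matrix, and propagate uncertainty with the delta method. This is an assumed approach, not necessarily the authors' exact formulas, and the single-attribute item, probabilities, and covariance matrix are made up.

```python
import numpy as np


def logit(p):
    return np.log(p / (1.0 - p))


def saturated_logit_estimates(M, p_hat, cov_p):
    """Return logit-link estimates and delta-method standard errors."""
    M_inv = np.linalg.inv(np.asarray(M, dtype=float))
    p_hat = np.asarray(p_hat, dtype=float)
    estimates = M_inv @ logit(p_hat)
    # Jacobian of the estimates with respect to p_hat:
    # d logit(p) / dp = 1 / (p (1 - p)), followed by the linear map M^{-1}.
    jacobian = M_inv @ np.diag(1.0 / (p_hat * (1.0 - p_hat)))
    cov_est = jacobian @ np.asarray(cov_p, dtype=float) @ jacobian.T
    return estimates, np.sqrt(np.diag(cov_est))


# Made-up single-attribute item: groups are non-masters and masters.
M = [[1, 0], [1, 1]]
p_hat = [0.2, 0.8]
cov_p = np.diag([0.0004, 0.0004])  # assumed covariance of p_hat

estimates, std_errors = saturated_logit_estimates(M, p_hat, cov_p)
print(estimates)
print(std_errors)
```

The approach requires every estimated probability to lie strictly between 0 and 1, since the logit and its derivative are undefined at the boundaries.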

• The parameters of the additive model cannot be written in closed form, but their estimates can be obtained by maximizing [objective function not preserved in the transcript]
• The standard error can be obtained from [equation not preserved in the transcript]

Simulation Study - Design
• The saturated and additive versions of the log and logit links were considered
• For each model, I = 500, 1,000, and 2,000 examinees were used
• The numbers of items and attributes were fixed at J = 30 and K = 5
• The test contained 10 one-attribute, 10 two-attribute, and 10 three-attribute items
• 1,000 data sets were generated for each of the 12 conditions
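The count of 12 conditions follows from crossing two links, two model forms, and three sample sizes. The short sketch below enumerates that design; the variable names are illustrative only.

```python
from itertools import product

links = ("log", "logit")
forms = ("saturated", "additive")
sample_sizes = (500, 1000, 2000)

N_ITEMS = 30
N_ATTRIBUTES = 5
N_REPLICATIONS = 1000

# Number of items by the number of attributes each item requires.
items_by_attribute_count = {1: 10, 2: 10, 3: 10}

conditions = list(product(links, forms, sample_sizes))
total_data_sets = len(conditions) * N_REPLICATIONS

print(len(conditions))   # 12
print(total_data_sets)   # 12000
```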

Simulation Study - Preliminary Results Bias: Logit Link, Saturated Model

Simulation Study - Preliminary Results SE: Logit Link, Saturated Model

Simulation Study - Preliminary Results Bias and SE: Logit Link, Additive Model

Simulation Study - Preliminary Results Bias: Log Link, Saturated Model

Simulation Study - Preliminary Results SE: Log Link, Saturated Model

Simulation Study - Preliminary Results Bias and SE: Log Link, Additive Model

Summary
• The G-DINA model is a general CDM that subsumes several commonly used CDMs
• Results show that item-level estimates of CDMs based on other links can be obtained from the initial G-DINA model estimates
• Additional work needs to be done to improve the accuracy and precision of the derived estimates
• Implemented properly, this approach can increase the practical usefulness of CDMs
