IRT Equating

IRT Equating Kolen & Brennan, 2004

IRT • If data used fit the assumptions of the IRT model and good parameter estimates are obtained, we can estimate person abilities independent of the particular items (sample invariance) • When forms are scaled similarly, a person would obtain the same ability estimate regardless of the specific form taken

IRT Equating • Because of the sample invariance properties of IRT, equating test forms is simply a process of placing item parameters on the same scale • Samples of 2000 seem adequate while 3000 is preferred for calibrating 3-PL models • 3000 is more than needed typically for linear equating • 3000 is not enough for most equipercentile methods

IRT Scale Transformation • With nonequivalent groups, parameters from different tests need to be placed on the same scale. • If an IRT model fits the data, any linear transformation of the -scale also fits the data. • A linear equation can be used to convert IRT parameter estimates to the same scale. • For 2 groups on a common scale, the Ms and SDs will differ.

Item Parameters Scale I Scale J Items aJj bJj cJj Person Ji • Items • aIj • bIj • cIj • Person • Ii

Transformation • Scale I and Scale J are 3-PL IRT scales that differ by a linear transformation • -values for two scales are related as: Ji = A Ii + B

Transformation Equations • Item parameters on the 2 scales are related Note that the lower asymptote is independent of the scale transformation; because the c parameter is estimated from the probability metric and not the ability metric, no transformation is needed.

A and B Constants

Example (Table 6.1) Ji = A Ii + B = .5(-2.00) + (-.5) = -1.5

Expressing A & B by Groups • It is more typical to express these relations in terms of groups of items or people The means and SDs are based on groups of items or persons

Example (Table 6.1)

In Practice • In nonequivalent group designs, a common-item set (anchor items) is used on each form. • Parameter estimates from common items are used to find the scaling constants • Conversion formulas using  can only be used when the same population is used to estimate parameters on both scales

Estimating Parameters: Random Groups Equating Design • Groups are assumed to be randomly equivalent • If the same IRT scaling convention is used (e.g., M = 0, SD = 1), parameter estimates for the two forms are on the same scale • No further transformation is necessary

Estimating Parameters:Single Group Designwith Counterbalancing • The parameters for all examinees on each form can be estimated simultaneously.

Estimating Parameters:Nonequivalent Group Anchor Test Design • Item parameters estimated on common items can be used to find the transformation constants • Alternatively, parameters on Forms X and Y can be estimated simultaneously through concurrent calibration – considering unique items on each form as “not administered”

Mean/Sigma Transformation • Uses the means and SDs of the b-parameters from the common items • The values of A and B are then substituted into the rescaling equations. Ji = A Ii + B, and those for the item parameters

Equating True Scores • The equating is complete if we report ability estimates; s are the same (within measurement error) when estimated from item parameters on the same scale. • If scaled scores are reported,  can be used to estimate true scores on the two forms which can be used as equated scores through the TCC. • However, we equate observed scores, not true scores; studies suggest similar results

Equating True Scores • s are transformed to true scores on both Form X and Y • Create conversion of true scores to the reporting scale (using observed scores) New Form Observed Score  i Old Form Observed Score  Scaled Score No justification exists for using observed scores.

Equating Observed Scores • The IRT model is used to produce an estimated distribution of observed scores on each form • Observed score distributions are equated through equipercentile methods

IRT Equating

IRT Equating

Presentation Transcript

Equating And Scaling

IRT

Linking Equating

Standard Error of Equating

IRT Interview

IRT Research Overview

April IRT Meeting

Adventures in Equating Land:

A Practitioner’s Introduction to Equating

Test Equating

IRT UPDATES

Sidevõrgud IRT 4060/ IRT 0020 vooruloeng 2 / 22. sept 2004

Current IRT Projects

Introduction to IRT

December 2004 IRT

LECTURE 14 NORMS, SCORES, AND EQUATING

CDG IRT Co-Chair

Linking & Equating

Rasch vs. IRT

Linking & Equating

Sidevõrgud IRT 4060/ IRT 0020 vooruloeng 2 / 22. sept 2004

Test co-calibration and equating

IRT Equating

IRT Equating

Presentation Transcript

Equating And Scaling

IRT

Linking Equating

Standard Error of Equating

IRT Interview

IRT Research Overview

April IRT Meeting

Adventures in Equating Land:

A Practitioner’s Introduction to Equating

Test Equating

IRT UPDATES

Sidevõrgud IRT 4060/ IRT 0020 vooruloeng 2 / 22. sept 2004

Current IRT Projects

Introduction to IRT

December 2004 IRT

LECTURE 14 NORMS, SCORES, AND EQUATING

CDG IRT Co-Chair

Linking &amp; Equating

Rasch vs. IRT

Linking &amp; Equating

Sidevõrgud IRT 4060/ IRT 0020 vooruloeng 2 / 22. sept 2004

Test co-calibration and equating

Linking & Equating

Linking & Equating