Using Differential Item Functioning Analyses to Enhance the Curriculum
Dr Juho Looveer, ACSPRII, Sydney, December 2006
Using Modern Psychometric Theory to Identify Differential Item Functioning in Polytomously Scored Constructed Response Items: Linking Results from Differential Item Functioning Analyses to the Curriculum
BIAS
• Where one group has an unfair advantage over another.
“Educational or psychological tests are biased if the test scores of equally able test takers are systematically different between racial, ethnic, cultural, and other similar sub-groups.” (Kelderman, 1989, p. 681)
“When a test item unfairly favours one group of students compared to another, the item is biased.” (Gierl, Rogers and Klinger, 1999, p. 2)
Impact
• Where one group performs differently from another group.
“A between-group difference in test performance caused by group ability differences on the valid skill (e.g., the differences between the proportion correct for two groups of interest on a valid item).” (Ackerman, 1994, p. 109)
Differential Functioning - Differential Item Functioning (DIF)
“When persons from one group answer an item correctly more often than equally knowledgeable persons from another group, the item exhibits DIF.” (Ackerman, 1994, p. 142)
“DIF refers to differences in item functioning after groups have been matched with respect to the ability or attribute that the item purportedly measures.” (Dorans and Holland, 1993, p. 37)
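One way to make these definitions concrete (a sketch in Rasch-model terms, not taken from the presentation; the symbols θ, δ and γ are our own notation): for a dichotomous item, the probability of a correct response depends only on person ability and item difficulty, and uniform DIF can be written as a group-specific shift in that difficulty.

```latex
% Dichotomous Rasch model (illustrative sketch; notation assumed, not from the slides)
P(X_{ni} = 1 \mid \theta_n, \delta_i) = \frac{\exp(\theta_n - \delta_i)}{1 + \exp(\theta_n - \delta_i)}

% Uniform DIF expressed as a group-specific shift in item difficulty:
\delta_{ig} = \delta_i + \gamma_g, \qquad
\text{item } i \text{ shows no DIF} \iff \gamma_g = 0 \ \text{for every group } g
```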
Previous Methodology (1)
• Test Level
  • Comparing group means
  • Meta-analyses
• Item Level
  • Correlations
  • ANOVA
  • Factor analyses
  • Other multivariate techniques
• Most studies were based on unmatched samples
Previous Methodology (2)
• Item Level with matched samples
  • Transformed Item Difficulty Index (TID-DIF) (Angoff, 1972; 1982)
  • Contingency Table methods, e.g. the Standardisation Method (Dorans & Kulick, 1986)
  • Chi-Square methods, e.g. Mantel-Haenszel (Holland & Thayer, 1988)
  • Logistic Regression
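As a hedged illustration of one of these contingency-table approaches, the sketch below computes the Mantel-Haenszel chi-square and the ETS delta-scale statistic (MH D-DIF) for a single dichotomous item, with examinees matched on total score. The function name, variable names and data layout are assumptions for the example; this is not code from the study.

```python
# Mantel-Haenszel DIF sketch for one dichotomous item, stratified by total test score.
import numpy as np

def mantel_haenszel_dif(item, group, total_score):
    """item: 0/1 responses; group: 0 = reference, 1 = focal; total_score: matching variable."""
    item, group, total_score = map(np.asarray, (item, group, total_score))
    A_sum = E_sum = V_sum = 0.0   # observed, expected and variance terms for the reference group
    num = den = 0.0               # numerator and denominator of the common odds ratio
    for k in np.unique(total_score):
        s = total_score == k
        A = np.sum((group[s] == 0) & (item[s] == 1))   # reference, correct
        B = np.sum((group[s] == 0) & (item[s] == 0))   # reference, incorrect
        C = np.sum((group[s] == 1) & (item[s] == 1))   # focal, correct
        D = np.sum((group[s] == 1) & (item[s] == 0))   # focal, incorrect
        T = A + B + C + D
        if T < 2:
            continue
        n_ref, n_foc = A + B, C + D
        m1, m0 = A + C, B + D
        A_sum += A
        E_sum += n_ref * m1 / T
        V_sum += n_ref * n_foc * m1 * m0 / (T ** 2 * (T - 1))
        num += A * D / T
        den += B * C / T
    chi2 = (abs(A_sum - E_sum) - 0.5) ** 2 / V_sum     # MH chi-square with continuity correction
    alpha = num / den                                   # MH common odds ratio
    return chi2, -2.35 * np.log(alpha)                  # ETS delta scale (MH D-DIF)
```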
Previous Methodology (3)
Previous methods used the simple sum of scores as the measure of ability. With classical test theory . . .
“. . . perhaps the most important shortcoming is that examinee characteristics and test characteristics cannot be separated: each can be interpreted only in the context of the other.” (Hambleton, Swaminathan and Rogers, 1991, p. 2)
The results from one test cannot be directly compared with the results from another test or another group of examinees.
Item level methods with students matched on ability (IRT/Rasch)
• Comparative plots
• Simple parameter designs
• Model comparison measures
• Item Characteristic Curves (ICCs)
• Area between ICCs
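The "area between ICCs" idea in the last bullet can be sketched as follows for a dichotomous Rasch item, assuming the item difficulty has been estimated separately in each group; the two difficulty values below are made up for illustration.

```python
# Minimal sketch of signed and unsigned area between two group-specific ICCs (Rasch, dichotomous).
import numpy as np

def icc(theta, delta):
    """Rasch item characteristic curve: P(correct) for ability theta and difficulty delta."""
    return 1.0 / (1.0 + np.exp(-(theta - delta)))

theta = np.linspace(-4.0, 4.0, 801)            # ability grid
dx = theta[1] - theta[0]
delta_ref, delta_foc = -0.20, 0.35             # hypothetical group-specific difficulties
gap = icc(theta, delta_ref) - icc(theta, delta_foc)

signed_area = np.sum(gap) * dx                 # sign shows which group the item favours
unsigned_area = np.sum(np.abs(gap)) * dx       # overall magnitude of the DIF
print(f"signed area = {signed_area:.3f}, unsigned area = {unsigned_area:.3f}")
```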
Methods for Identifying DIF in Polytomous Items
• Group means (Garner & Engelhard, 1999);
• Standardised mean differences, correlations and covariance analyses (Pomplun & Capps, 1999);
• Factor analysis (Wang, 1998);
• Logistic discriminant function analysis (Hamilton & Snow, 1998; Miller & Spray, 1993);
• Polynomial loglinear model (Hanson & Feinstein, 1997);
• Mantel-Haenszel test and the generalised Mantel-Haenszel test (Cohen, Kim, & Wollack, 1998; Hamilton & Snow, 1998; Henderson, 2001);
• Lord’s (1980) chi-square test (Cohen, Kim, & Wollack, 1998);
• Likelihood ratio test (Cohen, Kim, & Wollack, 1998; Kim, Cohen, Di Stefano & Kim, 1998);
• Separate calibration and comparative plots, and between-fit (Smith, 1994; Smith, 1996);
• Poly-SIBTEST (Chang, Mazzeo & Roussos, 1996; Henderson, 2001; Zwick, Thayer & Mazzeo, 1997); and
• Raju’s (1988; 1990) signed and unsigned area tests.
Using RUMM to Identify DIF in Polytomous Items (1)
• RUMM produces separate Expected Value Curves (EVCs) for each group being considered.
• EVCs are based on mean scores for sub-groups (from the actual data) partitioned according to ability.
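A rough sketch of how such empirical EVC points could be formed from the data is given below; this is not RUMM's internal code, and the function name and the choice of eight class intervals are assumptions.

```python
# Empirical EVC points: mean observed item score per ability class interval, per group.
import numpy as np

def evc_points(ability, item_score, group, n_intervals=8):
    """Return, for each group, the mean item score in each ability class interval."""
    ability, item_score, group = map(np.asarray, (ability, item_score, group))
    # class-interval boundaries taken from the ability distribution (quantile-based partition)
    edges = np.quantile(ability, np.linspace(0.0, 1.0, n_intervals + 1))
    interval = np.digitize(ability, edges[1:-1])          # interval index 0 .. n_intervals - 1
    points = {}
    for g in np.unique(group):
        means = []
        for k in range(n_intervals):
            scores = item_score[(group == g) & (interval == k)]
            means.append(scores.mean() if scores.size else np.nan)
        points[g] = np.array(means)
    return points   # plotted against interval mean abilities, these trace each group's EVC
```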
Using RUMM to Identify DIF in Polytomous Items (2)
Using the same data points on which the EVCs are based, RUMM calculates an Analysis of Variance (ANOVA) to assess the amount of DIF.
[Figure: Expected Value Curves Showing DIF for Gender]
Using RUMM to Identify DIF in Polytomous Items (3) Extract from Analysis of Variance Results for DIF – ITEM 1 [I0001:Q21a i]
Context of this study
• New South Wales Higher School Certificate (HSC)
• Mathematics in Society (MIS) examination
• N = 2630 (from a total of 22,828 candidates in MIS); 1130 males, 1481 females
Classifying Mathematical Skills, Knowledge and Understandings
Example of identifying the skills necessary for deriving correct answers
Analyses of Data
• Data for 71 items were analysed using RUMM 2010.
• Item locations ranged from -2.909 to +2.246.
• 9 items showed poor fit to the model (based on residuals and chi-square values).
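For readers reproducing this kind of screening, a small sketch of a misfit flagging rule is shown below, assuming item fit residuals and chi-square p-values have already been exported from the calibration software (e.g. RUMM). The cut-offs used here (|fit residual| > 2.5 and a Bonferroni-adjusted alpha of 0.05) are common conventions, not necessarily those applied in this study.

```python
# Flag items as misfitting on either an absolute-residual or an adjusted chi-square criterion.
import numpy as np

def flag_misfit(fit_residual, chi2_p, n_items, alpha=0.05, resid_cut=2.5):
    """Return a boolean array marking items whose fit statistics exceed the chosen cut-offs."""
    fit_residual, chi2_p = np.asarray(fit_residual), np.asarray(chi2_p)
    return (np.abs(fit_residual) > resid_cut) | (chi2_p < alpha / n_items)
```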
Future Directions . . .
• Verifying actual results:
  • Are these results consistent across time and other cohorts? Across other mathematics courses? Across other states?
• What are the causes of the DIF?
  • When and where do these differences first appear?
  • Are they due to teaching strategy or to inherent weaknesses / differences?
General Comments
• Identification of items where DIF is evident can be linked to actual curriculum areas.
• Identifying skills which lead to DIF can indicate where students need more support.
• The methodology demonstrated can be used for polytomous items and constructed response items in any subject area.
Questions . . .