Assessing Structural and Metric Equivalence: A Case Study Fons J. R. van de Vijver Tilburg University, the Netherlands and North-West University (Potchefstroom Campus), South Africa Chantale Jeanrie Laval University, Canada
Outline • Theoretical and Methodological Background • Structural and metric equivalence in translations/adaptations • Example • Adaptation of the California Psychological Inventory (CPI-434) for use among French-Canadians • Conclusion
Theoretical and Methodological Background • A crucial concept in translations/adaptations is equivalence: • Linguistic • Mapping of linguistic aspects of meaning (word meaning, sentence meaning) • Psychological • Mapping of psychological meaning (does the item serve the same psychological function in all languages?) • A good translation/adaptation combines these considerations
Equivalence in Adaptations • Structural Equivalence • Does the instrument measure the same underlying construct in all language versions? Assessed with factor analysis • Metric Equivalence • Can scores be compared across all language versions? Assessed with analysis of item bias, also known as Differential Item Functioning (DIF)
Example • Adaptation of the California Psychological Inventory (CPI-434) for use among French-Canadians (Jeanrie & Van de Vijver, in preparation) • Project modeled on the Guidelines on Adapting Tests by the International Test Commission (www.intestcom.org) (Hambleton, 1994)
Participants • 1129 English-speaking and 1018 French-speaking Canadians • Mainly college and university students (social science and law) • Majority of both language groups were female • The English-Canadian group had an average age of 23.53 yrs (SD = 7.53), the French-Canadian group an average of 20.96 yrs (SD = 5.94).
Instrument • The latest version of the California Psychological Inventory (CPI; Gough, 1996) • 434 items, measuring 20 basic folk scales and 3 vector scales: • Folk scales: Do (Dominance), Cs (Capacity for Status), Sy (Sociability), Sp (Social Presence), Sa (Self-Acceptance), In (Independence), Em (Empathy), Re (Responsibility), So (Socialization), Sc (Self-Control), Gi (Good Impression), Cm (Communality), Wb (Well-Being), To (Tolerance), Ac (Achievement via Conformity), Ai (Achievement via Independence), Ie (Intellectual Efficiency), Py (Psychological-Mindedness), Fx (Flexibility), F/M (Femininity/Masculinity) • Vector scales: V1 (Externality/Internality), V2 (Norm-Doubting/Norm-Favoring), and V3 (Realization) • Three scales are meant to detect response styles: faking good, faking bad, and random responding • The response scale is dichotomous (true/false)
Translation/Adaptation Procedure • Four independently working translators with an academic background in psychology or education • Both English and French were represented as first languages in the group • All were given written instructions on the kind of translation expected from them, as well as instructions on how to write test items
Adaptation Procedure • Step 1: • Each translated item was analyzed by a team of five (other) bilingual judges • A four-point scale was used to rate conceptual equivalence: “Compared to the meaning of the original item, the meaning of the translated item is: 1) identical, 2) rather similar, 3) rather different or 4) different.” • Step 2: • Two researchers combined the results and prepared a preliminary version of the French CPI • Many items were adapted; few items were extensively changed
Step 3: • Pilot of the French version: Two research assistants conducted (two-hour) interviews with twelve participants from Quebec and New Brunswick • Step 4: • Composition of final instrument
Results: Internal Consistencies • Median Cronbach’s alpha across the 20 scales was .70 in the French-Canadian group and .69 in the English-Canadian group • Values were quite comparable to • each other (two scales showed significantly higher values in the French-Canadian group) • U.S.A. values (reported by Gough)
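As a reminder of what the alpha values on this slide summarize, here is a minimal sketch of Cronbach's alpha for one scale with dichotomous (true/false) items. The function name `cronbach_alpha` and the toy response matrix are illustrative assumptions, not the study's data or software.

```python
from statistics import pvariance

def cronbach_alpha(responses):
    """Cronbach's alpha for one scale.

    responses: list of respondents, each a list of item scores (0/1 here).
    """
    k = len(responses[0])  # number of items in the scale
    # Variance of each item across respondents
    item_vars = [pvariance([r[i] for r in responses]) for i in range(k)]
    # Variance of the total (sum) score across respondents
    total_var = pvariance([sum(r) for r in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical toy data: 4 respondents, 3 true/false items
toy = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
alpha = cronbach_alpha(toy)
```

For dichotomous items this is the same quantity as KR-20. In a comparison like the one above, the function would be applied per scale and per language group, and the medians of the resulting values compared.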
Results: Construct Equivalence • To what extent do the scales measure the same constructs in both cultural groups? • We did not find unequivocal support for Gough’s (empirically derived) scales • Gough: 20 scales; current study: 31 scales
Equivalence Analyses • Comparison of factors in 4 groups: male and female English-Canadian and French-Canadian samples • [Figure: boxplot of values of Tucker’s phi] • Conclusion: Strong evidence for structural equivalence
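Tucker's phi (the congruence coefficient) compares the loadings of one factor in two groups. The sketch below shows the computation; the function name `tucker_phi` and the loading vectors are made up for illustration and are not taken from the study.

```python
from math import sqrt

def tucker_phi(loadings_a, loadings_b):
    """Tucker's congruence coefficient between two factor-loading vectors."""
    cross = sum(x * y for x, y in zip(loadings_a, loadings_b))
    norm = sqrt(sum(x * x for x in loadings_a) * sum(y * y for y in loadings_b))
    return cross / norm

# Hypothetical loadings of the same factor in two language groups
english = [0.80, 0.60, 0.10]
french = [0.40, 0.30, 0.05]   # proportional to the English loadings
phi = tucker_phi(english, french)
```

Phi is insensitive to multiplying all loadings by a positive constant, so proportional loading patterns give a value of 1. Values above roughly .90 are commonly read as evidence that a factor is similar across groups.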
Item Bias/DIF • Uniform and nonuniform bias studied • Logistic regression analysis • Independent Variables: • Culture (2 levels), Score Level (4 levels) • Dependent Variable: • Item response (dichotomous) • Indicators of Bias: • Effect size, evaluated as the partial correlation between an independent variable (culture for uniform bias, the culture by score level interaction for nonuniform bias) and the dependent variable; Cohen’s cutoff values (conservative): .10, .25, and .40 • Proportion of significantly biased items
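A minimal sketch of logistic-regression DIF screening for a single item follows. It rests on several assumptions that are not from the slides: score level is entered as a numeric predictor, bias is judged with likelihood-ratio chi-square tests between nested models instead of the partial-correlation effect size used in the study, the data are simulated, and the names `fit_logistic` and `dif_tests` are hypothetical helpers.

```python
import numpy as np

def fit_logistic(X, y, n_iter=50, tol=1e-8):
    """Fit a logistic regression by Newton-Raphson; return (coefs, log-likelihood)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        weights = p * (1.0 - p)
        gradient = X.T @ (y - p)
        hessian = X.T @ (X * weights[:, None])
        step = np.linalg.solve(hessian, gradient)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    p = np.clip(1.0 / (1.0 + np.exp(-(X @ beta))), 1e-12, 1 - 1e-12)
    loglik = float(np.sum(y * np.log(p) + (1 - y) * np.log(1 - p)))
    return beta, loglik

def dif_tests(item, score_level, culture):
    """Compare nested models: score only, + culture, + culture x score."""
    ones = np.ones_like(score_level, dtype=float)
    base = np.column_stack([ones, score_level])
    uniform = np.column_stack([base, culture])
    nonuniform = np.column_stack([uniform, score_level * culture])
    _, ll_base = fit_logistic(base, item)
    coefs, ll_uniform = fit_logistic(uniform, item)
    _, ll_nonuniform = fit_logistic(nonuniform, item)
    return {
        "culture_coef": float(coefs[2]),               # uniform bias, in logits
        "lr_uniform": 2 * (ll_uniform - ll_base),         # chi-square, df = 1
        "lr_nonuniform": 2 * (ll_nonuniform - ll_uniform) # chi-square, df = 1
    }

# Simulated data (assumption): one item with uniform bias of 1 logit
rng = np.random.default_rng(0)
n = 2000
score_level = rng.integers(0, 4, size=n).astype(float)  # 4 score levels
culture = rng.integers(0, 2, size=n).astype(float)      # 2 cultural groups
logit = -1.0 + 0.6 * score_level + 1.0 * culture
item = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(float)
result = dif_tests(item, score_level, culture)
```

A main effect of culture beyond score level indicates uniform bias; a culture by score level interaction indicates nonuniform bias. With samples as large as those in this study, significance tests flag even very small effects, which is why an effect-size criterion is reported alongside the proportion of significant items.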
(a) Mean Effect Size and (b) Proportion of Biased Items • (a) Mean Effect Size: M = .03, SD = .01 • (b) Proportion of Biased Items: M = .61, SD = .09
Correlations of Bias Statistics and Item Characteristics • [Table not reproduced] • Note a: Double apostrophes indicate non-literal word usage.
Conclusion • Quality of an adaptation is the net result of the quality of various stages and a long chain of interdependent decisions • Structural Equivalence: • Strongly supported • Metric Equivalence: • Many items showed small bias; their removal hardly influences the size of the cross-cultural differences observed
Analysis of nature of bias: • More bias in items • that showed a larger difference in means across the two groups, • that had lower endorsement rates, • that contained words with apostrophes • The removal of the biased items had a remarkably small effect on the size of the mean differences between the two groups. • Conclusion: combined expertise/skills in language, culture, and research methodology and statistics can yield equivalent instruments