
Presented by Zhu Jinxin

Comparison of Reliability Measures under Factor Analysis and Item Response Theory, by Ying Cheng, Ke-Hai Yuan, and Cheng Liu. Presented by Zhu Jinxin. Outline of the Presentation. Introduction of four reliability coefficients: α, ω, π, and ρ. The relationship among them.


Presentation Transcript


  1. Comparison of Reliability Measures under Factor Analysis and Item Response Theory, by Ying Cheng, Ke-Hai Yuan, and Cheng Liu. Presented by Zhu Jinxin

  2. Outline of the Presentation • Introduction of four reliability coefficients: α, ω, π, and ρ • The relationship among them • Conclusion and discussion

  3. Cronbach’s alpha • One of the definitions is $\alpha = \frac{K}{K-1}\left(1 - \frac{\sum_{i=1}^{K}\sigma_{Y_i}^2}{\sigma_X^2}\right)$ • K is the number of components (items or testlets) • $\sigma_X^2$ is the variance of the observed total test scores • $\sigma_{Y_i}^2$ is the variance of component i for the current sample of persons.
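The definition on this slide can be sketched in a few lines of Python. This is a minimal illustration, not code from the presentation: the function names and the small data set are hypothetical, and the sample variance uses an n − 1 denominator (an assumption; the ratio of variances is the same for either denominator as long as it is used consistently).

```python
def variance(values):
    """Unbiased sample variance (n - 1 denominator)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def cronbach_alpha(scores):
    """Cronbach's alpha for a persons-by-components score table.

    scores: list of rows, one row per person, each row holding K component scores.
    """
    k = len(scores[0])
    # Variance of the observed total (raw sum) scores.
    total_var = variance([sum(person) for person in scores])
    # Sum of the K component variances.
    component_var_sum = sum(
        variance([person[i] for person in scores]) for i in range(k)
    )
    return k / (k - 1) * (1 - component_var_sum / total_var)


# Hypothetical data: 5 persons, 3 items.
scores = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 4, 5],
    [3, 3, 4],
]
alpha = cronbach_alpha(scores)
print(round(alpha, 4))
```

Because only raw sum scores and item variances enter the formula, no factor model has to be fitted to compute it.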

  4. Cronbach’s alpha’s features • It is the most widely used coefficient • The raw sum score is used • α may underestimate reliability at the population level when the assumption of essential tau-equivalence is violated

  5. about Tau-equivalency

  6. about Tau-equivalency

  7. about Tau-equivalency In this case, the reliability is underestimated by α, which is only a lower-bound estimate of the true reliability of the scale when the measures are congeneric.

  8. ω & ρ in congeneric measures in the single-factor model

  9. ω & ρ in congeneric measures in the single-factor model Suppose we have m items

  10. ω & ρ in congeneric measures in the single-factor model Variance of true score Variance of unweighted composite score

  11. Features of ω 1. It neglects that people with the same sum score can have completely different response patterns. 2. ω≧α, when

  12. ω & ρ in congeneric measures in the single-factor model ρ≧ω≧α When is ω equal to ρ?
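The ordering on this slide can be checked numerically. The sketch below is an assumption-laden illustration, since the slide's own formulas did not survive extraction: it uses the standard population formulas for a single-factor model with unit factor variance and uncorrelated errors, namely ω = (Σλ)² / ((Σλ)² + Σψ) for the unweighted sum and maximal reliability ρ = S / (1 + S) with S = Σ λ²/ψ. The function name and the loading values are hypothetical.

```python
def reliabilities(loadings, error_vars):
    """Population alpha, omega, and maximal reliability rho under a single-factor
    model x_j = loading_j * factor + e_j with Var(factor) = 1 and uncorrelated errors."""
    m = len(loadings)
    true_var = sum(loadings) ** 2              # variance of the true score of the sum
    total_var = true_var + sum(error_vars)     # variance of the unweighted composite
    omega = true_var / total_var
    # Sum of the item variances, needed for alpha.
    item_var_sum = sum(l * l + p for l, p in zip(loadings, error_vars))
    alpha = m / (m - 1) * (1 - item_var_sum / total_var)
    # Maximal reliability of the optimally weighted composite.
    s = sum(l * l / p for l, p in zip(loadings, error_vars))
    rho = s / (1 + s)
    return alpha, omega, rho


# Congeneric items (unequal loadings), standardized so that psi = 1 - lambda^2.
congeneric = [0.4, 0.6, 0.8]
a1, w1, r1 = reliabilities(congeneric, [1 - l * l for l in congeneric])
print(a1 < w1 < r1)  # strict ordering for these unequal loadings

# Parallel items (equal loadings and error variances): all three coincide.
parallel = [0.7, 0.7, 0.7]
a2, w2, r2 = reliabilities(parallel, [1 - l * l for l in parallel])
print(abs(a2 - w2) < 1e-12 and abs(w2 - r2) < 1e-12)
```

Under these formulas, ω equals ρ when the ratio λ/ψ is the same for every item, because the unweighted sum is then proportional to the optimally weighted composite.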

  13. Reliability in IRT • The variance of the MLE is (approximately) given by the inverse of the information • The variance of θ is 1 in MLE, in which • The study uses information in a broader sense by equating it with the inverse of a variance even when the parameter estimate is not an MLE • so
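The formula that followed "so" is not recoverable from the transcript. As a hedged sketch of the reasoning the bullets describe, assume reliability is the ratio of true variance to true-plus-error variance, with Var(θ) = 1 and the error variance taken as the inverse of the information I. Whether the slide used exactly this form is an assumption.

```latex
\text{reliability}
  = \frac{\operatorname{Var}(\theta)}{\operatorname{Var}(\theta) + \operatorname{Var}(\hat\theta \mid \theta)}
  \approx \frac{1}{1 + 1/I}
  = \frac{I}{1 + I}
```

Under this reading, any coefficient that can be written as I / (1 + I) can be compared with the others simply by comparing the corresponding information quantities.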

  14. ω from information perspective

  15. ρ from information perspective

  16. ω & ρ from information perspective

  17. Reliability in IRT • With a single parameter, I, the information is defined as the negative expected value of the second derivative of the log likelihood function. • The IRT models directly relate the discrete responses to an underlying latent factor. • When θ is normally distributed, the normal ogive IRT models are equivalent to the item factor analysis model.
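The equivalence in the last bullet can be illustrated with the usual parameter conversion for a binary item. This is a sketch under stated assumptions, not the presentation's notation: the underlying continuous response is standardized, so an item with loading λ has error variance 1 − λ², and the response is 1 when the continuous response exceeds a threshold τ. Then P(x = 1 | θ) = Φ(a(θ − b)) with a = λ / √(1 − λ²) and b = τ / λ. The function and variable names are hypothetical.

```python
import math


def fa_to_normal_ogive(loading, threshold):
    """Item factor analysis (loading, threshold) -> normal ogive (a, b).

    Assumes a standardized underlying response, so error variance = 1 - loading^2.
    """
    a = loading / math.sqrt(1.0 - loading ** 2)  # discrimination
    b = threshold / loading                      # difficulty
    return a, b


def normal_ogive_to_fa(a, b):
    """Normal ogive (a, b) -> item factor analysis (loading, threshold)."""
    loading = a / math.sqrt(1.0 + a * a)
    threshold = b * loading
    return loading, threshold


a, b = fa_to_normal_ogive(0.6, 0.3)
print(a, b)
loading, threshold = normal_ogive_to_fa(a, b)
print(loading, threshold)
```

The round trip recovers the original loading and threshold, which is what makes it possible to compare factor-analytic and IRT-based reliability coefficients on the same items.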

  18. Reliability in IRT • For binary response Where is the response and Approximately

  19. Reliability in IRT • For binary response

  20. Reliability in IRT • For binary response The information is defined as the negative expected value of the second derivative of the log likelihood function: For each item For test

  21. Reliability in IRT • For binary response the reliability is and (the derivation is given in the appendix)

  22. Reliability in IRT • For responses with ordered categories, suppose the continuous response to item j is discretized by g thresholds. • The information of the jth item is given by

  23. The relationship • ρ≧ω≧α • It is expected that • There is no dominant relationship between π(2) • Simulation demonstrated that, as the number of response categories increases, π can exceed ω in practice.

  24. Conclusion • Keep as many response categories as possible and use ML factor scores. • However, beyond a certain number of response options, it may not be worth adding more.

  25. Discussion • Only graded response (ordered-category) models are studied, not other types of polytomous IRT models. • Only unidimensional models are studied.

  26. Thank you!
