Lesson 5.1 Evaluation of the measurement instrument: reliability I

wbarfield

Presentation Transcript


  1. Lesson 5.1. Evaluation of the measurement instrument: reliability I

  2. 1. The problem of measurement error. One of the basic requirements of any measurement process is accuracy. Measurement error: the difference between the empirical score a person obtains on a test and his/her true score, where a test is any psychological measurement tool. If we apply the same test n times to the same person, the scores obtained, although similar, will never be identical: they are affected by random errors (the person's motivation, state of mind, etc.) that make the empirical score differ from the person's supposed true score. How can we know the person's real score on the construct we are measuring? Through the linear model by Spearman (Classical Test Theory, CTT).

  3. 2. Linear Model by Spearman. The empirical score obtained in a test by a person is the sum of two components: his/her true score in the construct measured plus his/her measurement error. X = T + E

  4. 2. Linear Model by Spearman. Assumptions 1. The expected value of the random variable measurement error is zero, E(Ei) = 0, for a population measured with the same test or for infinite measurements of a single person. 2. The true scores and their errors are not correlated, rTE = 0: no systematic pattern of positive or negative errors exists.

  5. 2. Linear Model by Spearman. Assumptions 3. The measurement errors of two different tests are not correlated. This assumption does not seem reasonable if the scores are affected by factors such as tiredness, practice or state of mind (Allen and Yen, 1979). 4. The errors of one test are not correlated with the true scores of a different test. CTT considers the measurement error a random, non-systematic deviation from the true score.

  6. 2. Linear Model by Spearman. Deductions 1. Because E(Ei) = 0, the expected value of X is equal to the expected value of T, E(X) = E(T): the means in the population are equal. 2. Because E(Ei) = 0 and the errors are uncorrelated with the true scores, the covariance (COV) between true scores and errors is zero: COV(T, E) = 0.

  7. 2. Linear Model by Spearman. Deductions 3. Because T and E are uncorrelated, the variance (VAR) of X is equal to the VAR of T plus the VAR of E: VAR(X) = VAR(T) + VAR(E). 4. Because the COV between E and T is zero, the COV between X and T is the VAR of T: COV(X, T) = VAR(T).

  8. 2. Linear Model by Spearman. Deductions 5. Because the COV between X and T is equal to the VAR of T, the correlation between X and T is the ratio of the standard deviation (SD) of T to the SD of X: rXT = SD(T) / SD(X). This is the reliability index. The reliability index squared is the reliability coefficient, rXT^2 = VAR(T) / VAR(X), which represents the proportion of the VAR of X explained by its linear relationship with T.

  9. 2. Linear Model by Spearman. Deductions 6. The squared correlation between X and E is the proportion of the VAR of X that is not explained by its relationship with T but by its linear relationship with E: rXE^2 = VAR(E) / VAR(X). 7. Based on the previous formula, the reliability coefficient can also be expressed as 1 minus the squared correlation between X and E: rXT^2 = 1 - rXE^2. When the VAR of the errors is small, the reliability coefficient will be high.
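The deductions above can be checked numerically. The following is a minimal simulation sketch, not part of the original slides: the sample size, the normal distributions, and the chosen values VAR(T) = 64 and VAR(E) = 16 are all assumptions made for illustration, and the helper names `covariance` and `correlation` are hypothetical.

```python
import random
import statistics


def covariance(a, b):
    """Population covariance of two equal-length sequences."""
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)


def correlation(a, b):
    """Pearson correlation built from the covariance above."""
    return covariance(a, b) / (statistics.pstdev(a) * statistics.pstdev(b))


rng = random.Random(0)      # fixed seed so the run is repeatable
n_people = 100_000          # assumed sample size, large enough to be stable

T = [rng.gauss(50, 8) for _ in range(n_people)]  # true scores, VAR(T) = 64
E = [rng.gauss(0, 4) for _ in range(n_people)]   # random errors, VAR(E) = 16
X = [t + e for t, e in zip(T, E)]                # linear model: X = T + E

var_T, var_E, var_X = (statistics.pvariance(v) for v in (T, E, X))

reliability_index = correlation(X, T)              # rXT
reliability_coefficient = reliability_index ** 2   # rXT squared
one_minus_rxe2 = 1 - correlation(X, E) ** 2        # deduction 7

print("mean of E:", statistics.fmean(E))
print("VAR(X) vs VAR(T) + VAR(E):", var_X, var_T + var_E)
print("rXT^2 vs VAR(T)/VAR(X):", reliability_coefficient, var_T / var_X)
print("1 - rXE^2:", one_minus_rxe2)
```

With these assumed variances the theoretical reliability coefficient is 64 / 80 = 0.8; the simulated values should land close to it, with small differences due to sampling.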

  10. 2. Linear Model by Spearman. Examples 1. Dividing the standard deviation of the errors by the standard deviation of the empirical scores gives 0.45. Calculate the reliability coefficient. 2. Calculate the reliability coefficient of a test knowing that the VAR of the empirical scores is equal to 36 and the standard error of measurement is 3. 3. Calculate the reliability coefficient if the proportion of the VAR of the empirical scores that is due to the VAR of the true scores is 0.9.

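  11. 2. Linear Model by Spearman. Examples (solutions) 1. Reliability coefficient = 1 - 0.45^2 = 1 - 0.2025 = 0.7975, about 0.80. 2. VAR(E) = 3^2 = 9, so the reliability coefficient = 1 - 9/36 = 0.75. 3. Reliability coefficient = VAR(T) / VAR(X) = 0.9.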

  12. 3. Parallel tests. Conditions of parallelism. The reliability coefficient cannot be calculated empirically because we do not know the values of T and E. That is why Spearman defined the concept of parallel tests. Two tests X and X' for which the previous assumptions hold are parallel when: 1. The true scores are equal in both tests: T = T'. 2. The VAR of the measurement errors is equal in both tests: VAR(E) = VAR(E').

  13. 3. Parallel tests. Conditions of parallelism. Deductions 1. Because E(Ei) = 0, the means of the empirical scores obtained in two supposedly parallel tests are equal: E(X) = E(X'). 2. Because the VAR of E is the same, the VAR of X is also the same in two parallel tests: VAR(X) = VAR(X').

  14. 3. Parallel tests. Conditions of parallelism. Deductions 3. The correlation between the scores obtained in two parallel tests is equal to the squared correlation between the empirical and the true scores: rXX' = rXT^2 = VAR(T) / VAR(X). Practical consequence: we can express the reliability coefficient as the correlation between two parallel tests. 4. When we have more than two parallel tests, the correlations between each pair are equal: rX1X2 = rX1X3 = rX2X3.
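The step behind deduction 3 can be written out as a short derivation. This is a sketch in LaTeX notation that the slides themselves do not use: the symbols \rho (correlation) and \sigma (standard deviation) are assumed here as stand-ins for the r and SD of the text, and X = T + E, X' = T + E' are the two parallel tests.

```latex
\begin{align*}
\rho_{XX'} &= \frac{\operatorname{COV}(X, X')}{\sigma_X \, \sigma_{X'}}
            = \frac{\operatorname{COV}(T + E,\; T + E')}{\sigma_X^2}
            && \text{(parallel tests: } \sigma_X = \sigma_{X'}\text{)} \\
           &= \frac{\sigma_T^2 + \operatorname{COV}(T, E') + \operatorname{COV}(E, T) + \operatorname{COV}(E, E')}{\sigma_X^2} \\
           &= \frac{\sigma_T^2}{\sigma_X^2} = \rho_{XT}^2
            && \text{(all covariances involving errors are zero)}
\end{align*}
```

The three covariances that vanish in the last step correspond to the CTT assumptions: errors uncorrelated with true scores, within and across tests, and errors of different tests uncorrelated with each other.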

  15. 3. Parallel tests. Conditions of parallelism. After estimating the reliability coefficient, the VAR of T and E can also be estimated. 5. Isolating the VAR of T from the previous equation, we find that the VAR of the true scores can be calculated by multiplying the VAR of the empirical scores by the correlation between the parallel measures: VAR(T) = VAR(X) · rXX'. 6. Likewise, the VAR of E can be calculated by multiplying the VAR of X by one minus the correlation between the parallel measures: VAR(E) = VAR(X) · (1 - rXX'). Its square root is the standard error of measurement: SD(E) = SD(X) · sqrt(1 - rXX').
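As a small illustration of deductions 5 and 6, the sketch below splits the VAR of X into its true and error parts from a parallel-forms correlation. The function name `decompose_variance` is hypothetical, and the input values (VAR of X = 36, correlation 0.75) are borrowed from the second example of the previous section.

```python
import math


def decompose_variance(var_x, r_parallel):
    """Split VAR(X) into VAR(T) and VAR(E) given the parallel-forms correlation."""
    var_t = var_x * r_parallel          # deduction 5
    var_e = var_x * (1 - r_parallel)    # deduction 6
    sem = math.sqrt(var_e)              # standard error of measurement
    return var_t, var_e, sem


var_t, var_e, sem = decompose_variance(var_x=36, r_parallel=0.75)
print(var_t, var_e, sem)  # 27.0 9.0 3.0
```

The standard error of measurement of 3 matches the value given in that example, which is a quick consistency check on the two formulas.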

  16. 3. Parallel tests. Conditions of parallelism. Path diagram. Labels in the diagram: 1. True score (eta). 2. Empirical scores in test Y1, supposedly parallel to Y2. 3. Measurement error. 4. Change in Y for each unit of change in eta (lambda). Under this logic, two tests are strictly parallel when lambda1 = lambda2 and VAR(E1) = VAR(E2).

  17. 3. Parallel tests. Conditions of parallelism. Path diagram. Strictly parallel tests. That is, Y1 and Y2 are strictly parallel if the relationship between T and Y and the VAR of their corresponding measurement errors are exactly the same. 1. Under the assumptions of the CTT, the COV between both tests is equal to the VAR of the true score. 2. The VAR of Y is equal to the VAR of T plus the VAR of E. Labels in the covariance matrix: 1. VAR of each Y. 2. COV (equal to the VAR of T).

  18. 3. Parallel tests. Conditions of parallelism. Tau-equivalent measures. Do parallel tests really exist, or are they just a theoretical chimera? Even when we obtain equal T, it is difficult to find exactly equal VAR of E, so we can relax this assumption and obtain tau-equivalent measures.

  19. 3. Parallel tests. Conditions of parallelism. Tau-equivalent measures 1'. Under the assumptions of the CTT, the COV between both tests is again equal to the VAR of T. 2'. In the second expression, however, the VAR of each Y now has its own error term, different for each test. Labels in the covariance matrix: 1. VAR of each Y, with its specific E. 2. COV = VAR of T.

  20. 3. Parallel tests. Conditions of parallelism. Congeneric measures. If we relax the assumptions even further, we obtain congeneric tests. This implies that the T obtained with the two tests are linear transformations of each other.

  21. 3. Parallel tests. Conditions of parallelism. Congeneric measures 1''. Under the assumptions of the CTT, the COV between the Y is now calculated by multiplying the lambdas and the VAR of T: COV(Y1, Y2) = lambda1 · lambda2 · VAR(T). 2''. The VAR of each Y is its lambda squared multiplied by the VAR of T, plus its VAR of E: VAR(Yi) = lambda_i^2 · VAR(T) + VAR(Ei).
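The three levels of equivalence differ only in which parameters are allowed to vary, which a small sketch can make concrete. The function `implied_covariance` and all numeric values below are hypothetical; the code simply applies COV(Yi, Yj) = lambda_i · lambda_j · VAR(T), adding VAR(Ei) on the diagonal, and treats parallel and tau-equivalent tests as the special case where every lambda is 1.

```python
def implied_covariance(var_t, lambdas, error_vars):
    """Covariance matrix of the tests implied by one true score T."""
    size = len(lambdas)
    matrix = []
    for i in range(size):
        row = []
        for j in range(size):
            cell = lambdas[i] * lambdas[j] * var_t   # shared true-score part
            if i == j:
                cell += error_vars[i]                # each test's own error VAR
            row.append(cell)
        matrix.append(row)
    return matrix


# Strictly parallel: equal lambdas and equal error variances.
parallel = implied_covariance(10, [1, 1], [4, 4])
# Tau-equivalent: equal lambdas, error variances free to differ.
tau_equivalent = implied_covariance(10, [1, 1], [4, 6])
# Congeneric: lambdas and error variances both free to differ.
congeneric = implied_covariance(10, [1, 0.5], [4, 6])

print(parallel)        # [[14, 10], [10, 14]]
print(tau_equivalent)  # [[14, 10], [10, 16]]
print(congeneric)      # [[14, 5.0], [5.0, 8.5]]
```

In the first two matrices the off-diagonal entry equals the VAR of T (10), as stated for parallel and tau-equivalent measures; only in the congeneric case does it depend on the lambdas.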

  22. 4. Theoretical interpretation of the reliability coefficient. The reliability coefficient is the correlation between the empirical scores obtained by a sample in two parallel tests. It is the quotient between the VAR of T and the VAR of X, that is, the proportion of the VAR of X due to T. 1. When T moves closer to X, E decreases. 2. When T moves away from X, E increases.

  23. 4. Theoretical interpretation of the reliability coefficient. When the reliability coefficient is 1: the measures contain no E; X = T for all the participants; the VAR of X is equal to the VAR of T; all the differences in X reflect differences in T; the correlation between X and T is 1; the correlation between X and E is 0. When the reliability coefficient is 0: X contains only E; X = E for all the participants; the VAR of X is equal to the VAR of E; all the differences in X reflect measurement errors; the correlation between X and T is 0; the correlation between X and E is 1.

  24. 4. Theoretical interpretation of the reliability coefficient. Example. Calculate the reliability coefficient of an abstract reasoning test, knowing that the true VAR of this test is 80% of its empirical VAR.

  25. 4. Theoretical interpretation of the reliability coefficient. rXX' = VAR(T) / VAR(X) = 0.80. That is, 80% of the VAR of the empirical scores is a true measure of the construct. The reliability coefficient can also be expressed in terms of the VAR of the errors (deduction number 7).

  26. 4. Theoretical interpretation of the reliability coefficient. Example. We can obtain rXX' by subtracting from 1 the proportion of the VAR of the empirical scores that is due to measurement error: rXX' = 1 - VAR(E) / VAR(X). Isolating SD(E) in the previous equation, we obtain the standard error of measurement, SD(E) = SD(X) · sqrt(1 - rXX'): a group measure of error, that is, of the difference between X and T for the sample chosen.

  27. 5. Types of measurement errors 1. Measurement error: the difference between the empirical score of a participant and his/her true score. The standard error of measurement is the standard deviation of the measurement errors of all the participants in the sample, that is, a group measure of error. 2. Error of estimation of the true score: the difference between the T of a participant and his/her estimated true score. Its standard deviation is the standard error of estimation of the true score.

  28. 5. Types of measurement errors 3. Substitution error: the difference between the score obtained by a participant in a test and the score obtained by the same person in a parallel test. Its standard deviation is the standard error of substitution. 4. Prediction error: the difference between the score obtained by a participant in a test and his/her predicted score in the same test. Its standard deviation is the standard error of prediction.
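The four standard errors can be computed side by side. The sketch below uses the expressions commonly given in CTT textbooks, which are assumed here to correspond to the ones intended on the slides; the function name `standard_errors` and the input values (SD of X = 6, reliability 0.75) are hypothetical.

```python
import math


def standard_errors(sd_x, r_xx):
    """Standard errors of the four error types, from SD(X) and reliability."""
    return {
        # SD of X - T
        "measurement": sd_x * math.sqrt(1 - r_xx),
        # SD of T minus the regression estimate of T
        "estimation": sd_x * math.sqrt(r_xx * (1 - r_xx)),
        # SD of X - X' for two parallel forms
        "substitution": sd_x * math.sqrt(2 * (1 - r_xx)),
        # SD of X minus its prediction from a parallel form
        "prediction": sd_x * math.sqrt(1 - r_xx ** 2),
    }


errors = standard_errors(sd_x=6, r_xx=0.75)
for name, value in errors.items():
    print(f"{name}: {value:.3f}")
```

For any reliability strictly between 0 and 1, these expressions order the errors as estimation < measurement < prediction < substitution.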

  29. 6. Factors that affect the reliability. The reliability of a test depends on factors such as the length of the test and the variability of the sample. This implies that the accuracy of the instrument depends on the people who are measured, which is not desirable in any measurement process.

  30. 6.1. Factors that affect the reliability. The test length. The higher the number of items, the higher the reliability: with more items, we obtain more information about the feature we are measuring. The relationship between reliability and length is given by the Spearman-Brown formula: RXX' = n · rXX' / (1 + (n - 1) · rXX'), where RXX' = reliability coefficient of the shortened or lengthened test; rXX' = reliability coefficient of the original test; n = number of times the test is lengthened, n = FE/IE, where FE = number of final elements and IE = number of initial elements.

  31. 6.1. Factors that affect the reliability. The test length. Example. A sample completed a 50-item test, which obtained a reliability coefficient of 0.60. What would the reliability coefficient be if the length of the test were doubled? And if it were increased 6 times?

  32. 6.1. Factors that affect the reliability. The test length 1. Doubling the length, the reliability coefficient goes from 0.60 to 2(0.60) / (1 + 0.60) = 0.75. 2. If we increase the length 6 times, the reliability coefficient is 6(0.60) / (1 + 5(0.60)) = 0.90. 3. When n approaches infinity, the reliability coefficient approaches 1. This limit would be reached only if infinitely many items were added.
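The two results above can be reproduced with a one-line function. This is a minimal sketch of the Spearman-Brown formula; the function name `spearman_brown` is hypothetical, and the formula assumes the added items are parallel to the original ones.

```python
def spearman_brown(r_original, n):
    """Reliability of a test lengthened n times (n < 1 means shortened)."""
    return n * r_original / (1 + (n - 1) * r_original)


doubled = spearman_brown(0.60, 2)     # about 0.75
six_times = spearman_brown(0.60, 6)   # about 0.90
halved = spearman_brown(0.60, 0.5)    # shortening lowers the reliability

print(round(doubled, 2), round(six_times, 2), round(halved, 2))
```

Passing n below 1 covers the shortened-test case mentioned in the definition of RXX'.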

  33. 6.1. Factors that affect the reliability. The test length. We can see in the graphic that the reliability coefficient increases when n increases; nevertheless: 1. The increase is more pronounced in more reliable tests. 2. As n increases, the gain in reliability levels off. 3. The increase is smaller when rXX' is lower.

  34. 6.1. Factors that affect the reliability. The test length. Up to what point is it worthwhile to increase the number of items to obtain a significantly better reliability than the original? How many items should we add to reach a specific value of the reliability coefficient? The Spearman-Brown formula allows us to estimate the n needed to reach a specific value of the reliability coefficient: n = RXX' · (1 - rXX') / (rXX' · (1 - RXX')). Based on the previous example, we would like to obtain a reliability coefficient of 0.90. How many parallel elements should we add?

  35. 6.1. Factors that affect the reliability. The test length. n = 0.90 · (1 - 0.60) / (0.60 · (1 - 0.90)) = 6. We should increase the length 6 times, so we would have a test with 300 elements. Therefore, we should add 250 items.
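The same answer follows from solving the Spearman-Brown formula for n. The sketch below is illustrative only; the function name `required_lengthening` is hypothetical, and the 50-item length and the two reliability values come from the example.

```python
def required_lengthening(r_original, r_target):
    """Number of times a test must be lengthened to reach r_target."""
    return r_target * (1 - r_original) / (r_original * (1 - r_target))


initial_items = 50
n = required_lengthening(r_original=0.60, r_target=0.90)
final_items = round(n * initial_items)
items_to_add = final_items - initial_items

print(round(n, 2), final_items, items_to_add)  # 6.0 300 250
```

Rounding is applied because n is computed in floating point and the number of items must be a whole number.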

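  36. 6.2. Factors that affect the reliability. The variability of the sample. The reliability coefficient can vary depending on how homogeneous the group is: the higher the variability, the higher the reliability. Formula that connects variability and reliability, assuming the VAR of the errors is the same in both groups: r2 = 1 - (VAR(X1) / VAR(X2)) · (1 - r1), where r1 and r2 are the reliability coefficients, and VAR(X1) and VAR(X2) the VAR of the empirical scores, in the first and second group.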

  37. 6.2. Factors that affect the reliability. The variability of the sample. Example. A sample completed a test. The standard deviation of the empirical scores was 20. The ratio of the standard deviation of the errors to the standard deviation of the empirical scores was 0.4. The same test was then completed by another sample, in which the standard deviation of the empirical scores was 10. Calculate the reliability coefficient of the test in the second sample.

  38. 6.2. Factors that affect the reliability. The variability of the sample 1. Reliability coefficient in the original sample: 1 - 0.4^2 = 0.84. 2. Using the formula, the reliability coefficient in the second sample is 1 - (20^2 / 10^2) · (1 - 0.84) = 0.36: the lower the variability, the lower the reliability.
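The example can be reproduced in a few lines. This sketch assumes, as the formula does, that the VAR of the errors is the same in both samples; the function name `reliability_in_new_group` is hypothetical.

```python
def reliability_in_new_group(r_first, sd_first, sd_second):
    """Reliability in a second sample, keeping the error variance fixed."""
    var_error = sd_first ** 2 * (1 - r_first)   # VAR(E), shared by both samples
    return 1 - var_error / sd_second ** 2


r_first = 1 - 0.4 ** 2                           # ratio SD(E)/SD(X) = 0.4
r_second = reliability_in_new_group(r_first, sd_first=20, sd_second=10)

print(round(r_first, 2), round(r_second, 2))     # 0.84 0.36
```

The error variance is 64 in both samples; it is 16% of a total variance of 400 but 64% of a total variance of 100, which is why the coefficient drops.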

  39. 7. Reliability as equivalence and stability of the measurements. Stability: when a feature that has not changed is measured with the same test on different occasions, we should obtain very similar scores. The main correlation-based methods to calculate the reliability coefficient are parallel forms reliability and test-retest reliability.

  40. 7.1. Reliability as equivalence and stability of the measurements. Parallel forms reliability. Stages: 1. Construct two parallel forms of a test, X and X'. 2. A sufficiently large sample completes both forms of the test. 3. Calculate the Pearson correlation coefficient between the two sets of scores. The result is the equivalence coefficient (the degree to which both forms are equivalent). Advantage: the situation is more controlled, because the answers to both forms can be obtained at the same time. Disadvantage: it is really difficult to construct parallel forms.

  41. 7.2. Reliability as equivalence and stability of the measurements. Test-retest reliability. The same test is answered by the same sample on two different occasions. We then calculate the correlation coefficient between the two sets of scores (stability coefficient). Advantage: it is not necessary to construct two parallel forms. Disadvantages: memorization of some items, which artificially inflates the participants' scores; the time interval, since a long time between measurements is recommended, but then the feature being measured may change; and the participants' attitude, since a change in cooperation can raise or lower the scores.
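Both the equivalence coefficient and the stability coefficient come down to a Pearson correlation between two columns of scores. The sketch below shows that calculation; the function name `pearson` and the six pairs of scores are invented purely for illustration and do not come from any real sample.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores of six participants on two occasions (test and retest).
first_occasion = [12, 15, 9, 20, 17, 11]
second_occasion = [13, 14, 10, 19, 18, 10]

stability_coefficient = pearson(first_occasion, second_occasion)
print(round(stability_coefficient, 3))
```

For parallel forms the same function applies, with the second list holding the scores on form X' instead of the retest scores.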
