1. Techniques for Explaining Item Response Theory to Stakeholders
Kate DeRoche
Antonio Olmos
C.J. Mckinney
Mental Health Center of Denver
2. Explaining IRT to Stakeholders
3. IRT in Evaluation
There has been an increase in the application of IRT in evaluation, due to the advantages it provides (Hambleton, Swaminathan, & Rogers, 1991)
Multiple applications of IRT for evaluation purposes:
Psychometrics-measurement validation
Need to shorten a measurement tool
Rank elements of a dimension for difficulty
Equating instruments
Differential Item Functioning (DIF)
Item Response Theory Advantages:
*Separation of item parameters and participants' ability
*Items are monotonically increasing in the latent trait
*Assumptions of unidimensionality & local independence
*Multiple models (1PL, 2PL & 3PL)
Add example for each one of these
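As one possible illustration of the "monotonically increasing" property and of the 1PL, 2PL, and 3PL models listed above, here is a minimal Python sketch of the standard logistic item characteristic curve. The function name `icc`, the parameter defaults, and the two example difficulties are assumptions made for illustration; they are not taken from the presentation.

```python
import math

def icc(theta, b, a=1.0, c=0.0):
    """Probability of a correct/endorsed response under the 3PL logistic model.

    theta: person ability, b: item difficulty,
    a: item discrimination, c: lower asymptote ("guessing").
    With c=0 this is the 2PL form; with c=0 and a held at one
    common value for all items (here 1.0) it is the 1PL form.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical items: one easy (b = -1.0) and one hard (b = 1.5).
abilities = [-3, -2, -1, 0, 1, 2, 3]
easy_item = [icc(t, b=-1.0) for t in abilities]
hard_item = [icc(t, b=1.5) for t in abilities]

for t, e, h in zip(abilities, easy_item, hard_item):
    print(f"ability {t:+d}: easy item {e:.2f}, hard item {h:.2f}")
```

Each curve rises as ability rises, and at every ability level the harder item has the lower probability. That second point is the one stakeholders usually need: items are not all equal.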
4. Limitation to Applying IRT in Evaluation
Less training in measurement among evaluators and researchers
IRT can be taught in a class, but it is difficult to explain to people who do not understand psychometrics and advanced statistics.
PROBLEM: It is difficult for evaluators to apply Item Response Theory techniques because of the problems related to:
1. Explaining the advantages of IRT in simple terms
2. Explaining the results of IRT so that stakeholders can be involved in the analysis process
5. Explaining the Reasoning for Using IRT
Why do we want to use a more complex method?
6. Explaining Benefits of IRT
Hallmark: Separation of item and person parameters (or item/person invariance)
Instead, say: “There are 2 things:
1. Are all of our items equal? Should they all contribute equally to our score, or are some questions harder than others?
2. Tests created with Classical Test Theory can be very reliable (i.e., a person scores consistently if measured twice) and still not measure all of our participants well.
Let me show you an example on the next page.”
7. Are Items Equal?
Often items are not equal, yet they are assumed to be equal. This question is about the assumption of a monotonically increasing function.
8. Test Level on Trait
You could give both of the scales and then correlate them, but this still does not tell you which one is harder.
9. Explaining IRT Results to Stakeholders
You can use IRT analysis and involve stakeholders!
10. First, go into detail regarding the scaling: how it has a mean of 0 and is expressed in standard deviations, and why it runs from -4 to +4.
Problems arise when the results do not look like this. We will discuss 4 potential problems that may arise (clumping, range, ordering & gaps).
11. Problem #1: Clumping
I did not show the people, but I make sure that my analysis and desired range are where the person ability scores lie.
12. Problem #2: Full Range
The scale does not contain any responses above +0.5, suggesting that the highest IQ we can measure is 107.
People with an IQ score higher than 107 (e.g., 130) would only be able to know that they have an IQ above 107.
THE RESPONSES NEED TO RANGE FROM +3 TO -3
The -3 to +3 could also be -2 to +2, or whatever ability range you desire for the sample. For our example we want a large range because we wanted to be able to use the scale with a wide range of people.
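The IQ figures quoted in these examples (107 at +0.5, 145 at +3) are consistent, up to rounding, with the conventional IQ metric of mean 100 and standard deviation 15. The sketch below assumes that convention; the function names `theta_to_iq` and `iq_to_theta` are hypothetical helpers, not part of any IRT package.

```python
def theta_to_iq(theta, mean=100.0, sd=15.0):
    """Convert a standardized trait value (mean 0, SD 1) to an IQ-style score.

    Assumes the conventional IQ metric of mean 100 and SD 15.
    """
    return mean + sd * theta

def iq_to_theta(iq, mean=100.0, sd=15.0):
    """Convert an IQ-style score back to the standardized scale."""
    return (iq - mean) / sd

# The highest response on the example scale sits at +0.5.
ceiling = theta_to_iq(0.5)
print(f"Highest measurable IQ on this scale: about {ceiling}")

# A person with an IQ of 130 sits at this point on the standardized scale,
# well above the +0.5 ceiling, so the scale cannot locate them.
print(f"IQ 130 corresponds to theta = {iq_to_theta(130.0)}")
```

Translating the standardized scale into a metric stakeholders already know is the point of the IQ analogy. Any normed scale with a known mean and standard deviation could be substituted by changing the two defaults.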
13. Problem #3: Ordering
Notice that the order from easy to hard goes A, B, C, D, E, G, then F, suggesting that F and G are out of order.
A response that we think is hard (only for super-duper smart people) is really not that hard and will only produce an IQ score of 100, not 145 as assumed.
THE RESPONSES NEED TO BE IN CORRECT ORDER
14. Problem #4: Large Gaps
Again we see gaps, but they are within the scale (not just at the top or bottom).
This suggests that there are no items able to measure an IQ score between 107 and 141, or between 56 and 96;
people are not able to receive a score in these areas.
WE CANNOT HAVE LARGE GAPS
This is related to problem #1, because if we have large gaps then we usually have clumping.
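The four problems (clumping, range, ordering, gaps) can also be screened for before the item map is shown to stakeholders. The following is a rough sketch only: the function `check_item_map`, its thresholds (`max_gap`, `min_spacing`), and the item difficulties are all hypothetical values chosen for illustration, and in practice the cut-offs would depend on the measure and the desired ability range.

```python
def check_item_map(difficulties, intended_order, low=-3.0, high=3.0,
                   max_gap=1.0, min_spacing=0.1):
    """Flag clumping, range, ordering, and gap problems on an item map.

    difficulties: dict mapping item label -> estimated difficulty
                  (standardized units, mean 0).
    intended_order: labels listed from easiest to hardest, as designed.
    low/high, max_gap, min_spacing: illustrative thresholds (assumptions).
    """
    ranked = sorted(difficulties, key=difficulties.get)
    values = [difficulties[label] for label in ranked]
    # Neighbouring items on the map, with their two difficulties.
    pairs = list(zip(ranked, ranked[1:], values, values[1:]))
    return {
        # Neighbours that sit almost on top of each other.
        "clumping": [(p, q) for p, q, x, y in pairs if y - x < min_spacing],
        # Unmeasured stretch at the bottom or top of the desired range.
        "range": {
            "bottom_uncovered": values[0] - low > max_gap,
            "top_uncovered": high - values[-1] > max_gap,
        },
        # Items whose estimated position differs from the intended one.
        "order": [want for want, got in zip(intended_order, ranked)
                  if want != got],
        # Large holes between neighbours inside the scale.
        "gaps": [(p, q) for p, q, x, y in pairs if y - x > max_gap],
    }

# Hypothetical difficulties in which G turns out to be easier than F.
example = {"A": -2.8, "B": -1.9, "C": -1.0, "D": -0.2,
           "E": 0.6, "F": 2.4, "G": 1.5}
report = check_item_map(example, ["A", "B", "C", "D", "E", "F", "G"])
print(report)
```

A report like this is meant as a checklist for the evaluator, not as a replacement for the discussion with stakeholders. The flagged items become the questions to bring to them.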
15. Activity with Stakeholders
Keep in mind the 4 problems presented
(C.R.O.G.)- Clumping, Range, Order & Gaps
Present the results of the IRT item map
Have problems listed with potential reasons and solutions
You should also have previously reviewed the other IRT output from the software program (ICCs, information functions, a, b, or c parameters, infit, outfit, etc.)
Stakeholders were able to interpret the results in terms of their program
Added context to the results
Most importantly, stakeholders felt that they were involved in the process
We would also use the problems listed on the item maps as questions for our clinicians (like mini focus groups).
I would keep reminding them of the 4 problems (you could also give out a handout with all 4 of them on it).
Stakeholders came up with problems right away, but sometimes missed some of them, so I listed the problems and the potential solutions at the bottom.
The stakeholders could understand where the changes to the survey were coming from without my ever using a statistical term, so they could see I wasn’t just making things up.
Stakeholders felt more involved. This was determined through informal feedback, and we plan to systematically measure stakeholders' opinions of this activity in the future.
16. How to Present the Results
Two ways the item map was presented
1. Display only the current item map
Beneficial for individuals familiar with looking at data or graphs
Beneficial for long measures when viewing individual items
2. Present the current item map and ideal item map, with suggestions and explanations at the bottom
Beneficial for individuals who are less familiar with data or graphs, or who have “number anxiety”
Ideal for short measures when viewing individual items, because it takes up a lot of room
We will display both ways next.
17. Example 1: Current Item Map
Multiple on one page (again, only important for item-level analysis)
18. We found this one to be the most beneficial with our stakeholders, because it helped them to understand exactly what was going on.
19. Applications in Other Domains
Any commonly understood test that has a normed mean and standard deviation
Education- standard test scores
SAT, GRE, K-12 state testing, t-scores
Health- any biological measure
heart rate, blood pressure, blood sugar
Should I add the examples on the scale here?
20. Questions?
For future questions or comments:
Kate DeRoche
Kathryn.DeRoche@MHCD.org
(303) 504-6664
21. IRT Resources
IRT 101
Reise, S. P., Ainsworth, A. T., & Haviland, M. G. (2005). Item Response Theory: Fundamentals, Applications, and Promise in Psychological Research. Current Directions in Psychological Science, 14, 95-101.
IRT Books
Hambleton, R. K., Swaminathan, H., & Rogers, H. J. (1991). Fundamentals of Item Response Theory. Newbury Park, CA: Sage Publications, Inc.
Embretson, S. E., & Reise, S. P. (2000). Item Response Theory for Psychologists. Mahwah, NJ: Lawrence Erlbaum Associates.
And many more resources