1. Consequences of using Rasch models for educational assessment: Where are we today? Claus H. Carstensen,
Institute for Science Education IPN Kiel, Germany
IRDP Neuchâtel, January 14, 2008
2. Consequences of using Rasch models for educational assessment: Where are we today? Outline
Educational Assessment and the Rasch model
The Rasch model in brief
Assessing student competencies in PISA
A Rasch model for educational assessments
A Rasch model for longitudinal data
A Rasch model for competency profiles
Where are we today?
3. Educational Assessment and the Rasch model Consequences of using Rasch models for educational assessment
4. Educational Assessment to diagnose a student's progress (to support teaching and learning)
to evaluate/mark a performance (in accountability systems)
to assess the quality of a system (to provide support to policy makers)
to compare educational outcomes
to the performance of others
or to a standard
5. Educational Assessment Research questions in Educational Research
describe distributions of educational outcomes
analyze the relation of some outcome with conditions of teaching and learning
explain how outcomes like competence are related to teaching or context conditions
6. How does the Rasch model help us with these issues? The Rasch model is a measurement model for educational outcomes
it assumes a latent trait that explains the probability of the item responses
it assumes unidimensionality or “Rasch homogeneity” of the items
this assumption can be tested empirically for any test and a given dataset
7. How does the Rasch model help us with these issues? The “original” Rasch model (Rasch, 1960) addresses dichotomous responses, generalizations address more complex response data like
multi-categorical data (Rasch, 1961),
ordinal data (Andrich, 1978; Masters, 1982)
or even continuous rating scales (Müller, 1987)
8. How does the Rasch model help us with these issues? Other generalizations model more heterogeneity in the response data:
Mixture Distribution Rasch Model (Rost, 1989; Yamamoto, 1987)
Multidimensional Rasch Models (Stegelmann, 1983; Andersen, 1985)
9. How does the Rasch model help us with these issues? Further generalizations of the Rasch model combine measurement and data analysis into one model
the Linear Logistic Test Model LLTM (Fischer, 1972) models structures in a test, i.e. systematically constructed items.
Explanatory Item Response Models (De Boeck & Wilson, 2004) model structures in the test and in the population under investigation
10. Outline again
Educational Assessment and the Rasch model
The Rasch model in brief
Assessing student competencies in PISA
A Rasch model for educational assessments
A Rasch model for longitudinal data
A Rasch model for competency profiles
Where are we today?
11. The Rasch model in brief Consequences of using Rasch models for educational assessment
12. The Rasch model in brief: Modeling item response probabilities Using the logistic function (the inverse of the logit) as item characteristic curve (ICC)
13. The Rasch model in brief: An assumption Assuming the response probability is determined only by
a person ability
and an item difficulty
gives the model equation
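In one common notation, the model equation can be written as follows. The symbols are assumptions made here for illustration and are not necessarily those used on the slide: \theta_v is the ability of person v, \beta_i the difficulty of item i, and X_{vi} the dichotomous response of person v to item i.

```latex
P(X_{vi} = 1 \mid \theta_v, \beta_i)
  = \frac{\exp(\theta_v - \beta_i)}{1 + \exp(\theta_v - \beta_i)}
```

On the logit scale this reads \operatorname{logit} P(X_{vi} = 1) = \theta_v - \beta_i, so person and item parameters enter additively.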
14. The Rasch model in brief: Modeling item response probabilities An item characteristic curve (ICC)
15. The Rasch model in brief: Likelihood Multiplying the model probability over items × subjects gives the likelihood,
which depends on the data only through the number of correct responses per subject and the number of correct responses per item
The number of correct responses per subject is a sufficient statistic for a person's ability
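A sketch of how the likelihood factorizes, assuming complete data and a notation that is chosen here for illustration (\theta_v person ability, \beta_i item difficulty, x_{vi} \in \{0, 1\} the response, r_v the raw score of person v, s_i the number of correct responses to item i):

```latex
L = \prod_{v} \prod_{i}
      \frac{\exp\bigl(x_{vi}(\theta_v - \beta_i)\bigr)}{1 + \exp(\theta_v - \beta_i)}
  = \frac{\exp\bigl(\sum_{v} r_v \theta_v - \sum_{i} s_i \beta_i\bigr)}
         {\prod_{v} \prod_{i} \bigl(1 + \exp(\theta_v - \beta_i)\bigr)},
\qquad
r_v = \sum_{i} x_{vi}, \quad s_i = \sum_{v} x_{vi}
```

The responses enter only through r_v and s_i, which is why the raw score carries all the information the data hold about a person's ability.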
16. The Rasch model in brief: model properties the model is one dimensional
it assumes equally discriminating items
the order of items is the same for all persons
the order of persons is the same with all items
(the graph shows ICCs of three items)
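The ordering properties listed above can be checked numerically. The following is a minimal sketch with hypothetical item difficulties and person abilities (none of these values come from the slides); it tabulates the Rasch response probabilities for three items and three persons.

```python
import math


def icc(theta, beta):
    """Rasch probability of a correct response (ability theta, difficulty beta)."""
    return 1.0 / (1.0 + math.exp(-(theta - beta)))


# Hypothetical difficulties for three items, from easy to hard.
betas = [-1.0, 0.0, 1.5]
# Hypothetical abilities for three persons, from low to high.
thetas = [-2.0, 0.0, 2.0]

# Rows are persons, columns are items.
table = [[icc(theta, beta) for beta in betas] for theta in thetas]

for theta, row in zip(thetas, table):
    print(theta, [round(p, 3) for p in row])
```

Within every row the probabilities fall from the easiest to the hardest item, and within every column they rise with ability: because all items discriminate equally, the curves never cross.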
17. The Rasch model in brief: parameter estimation Parameters are estimated by maximizing the likelihood with respect to each parameter
This gives one estimation equation per parameter; the resulting set of equations can be solved with iterative maximization algorithms.
In a Newton-Raphson algorithm, an estimation equation for a person parameter is:
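As an illustration, a Newton-Raphson update for one person parameter can be sketched as below. This is a minimal sketch, not the equation from the slide: it assumes known item difficulties, and the function names and example values are hypothetical. The update uses the first derivative of the person's log-likelihood (raw score minus expected score) and the test information as the negative second derivative.

```python
import math


def rasch_p(theta, beta):
    """Rasch probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(theta - beta)))


def estimate_ability(raw_score, betas, tol=1e-8, max_iter=50):
    """ML estimate of a person parameter given known item difficulties."""
    if not 0 < raw_score < len(betas):
        raise ValueError("no finite ML estimate for zero or perfect scores")
    theta = 0.0
    for _ in range(max_iter):
        probs = [rasch_p(theta, b) for b in betas]
        gradient = raw_score - sum(probs)                 # first derivative
        information = sum(p * (1.0 - p) for p in probs)   # minus second derivative
        step = gradient / information
        theta += step
        if abs(step) < tol:
            break
    return theta


# Hypothetical five-item test, symmetric around zero.
betas = [-1.5, -0.5, 0.0, 0.5, 1.5]
theta_hat = estimate_ability(3, betas)
print(round(theta_hat, 3))
```

At the solution the expected score equals the observed raw score, which is the estimation equation being solved.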
18. The Rasch model in brief: parameter estimation JML (Joint Maximum Likelihood)
joint calibration of item and person parameters; estimates are biased because of the incidental parameters (one parameter per person)
CML (Conditional Maximum Likelihood)
estimate item parameters only, conditioning on the subjects' scores in place of their abilities
MML (Marginal Maximum Likelihood)
estimate item parameters making an assumption (like a normal distribution) on the subject ability distribution
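The idea behind conditioning on scores can be illustrated with a small check: under the Rasch model, the probability of a response pattern given its raw score does not depend on the person's ability. The sketch below uses three hypothetical item difficulties and brute-force enumeration of patterns; it is an illustration of the conditioning argument, not a CML estimation routine.

```python
import math
from itertools import product


def pattern_prob(pattern, theta, betas):
    """Probability of a 0/1 response pattern under the Rasch model."""
    prob = 1.0
    for x, beta in zip(pattern, betas):
        p = 1.0 / (1.0 + math.exp(-(theta - beta)))
        prob *= p if x else 1.0 - p
    return prob


def conditional_prob(pattern, theta, betas):
    """Probability of the pattern given its raw score."""
    score = sum(pattern)
    same_score = [q for q in product((0, 1), repeat=len(betas)) if sum(q) == score]
    total = sum(pattern_prob(q, theta, betas) for q in same_score)
    return pattern_prob(pattern, theta, betas) / total


betas = [-0.5, 0.3, 1.2]   # hypothetical item difficulties
pattern = (1, 0, 1)

low = conditional_prob(pattern, -1.0, betas)
high = conditional_prob(pattern, 2.0, betas)
print(low, high)
```

Both calls give the same value up to rounding, so the conditional likelihood contains item parameters only.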
19. Outline again
Educational Assessment and the Rasch model
The Rasch model in brief
Assessing student competencies in PISA
A Rasch model for educational assessments
A Rasch model for longitudinal data
A Rasch model for competency profiles
Where are we today?
20. A Rasch model for educational assessments Assessing student competencies in PISA
21. A Rasch model for educational assessments: PISA study Now, looking at the PISA study as an example of educational assessment
Purpose of PISA
monitor the outcomes of educational systems
across participating countries
over time within participating countries
22. A Rasch model for educational assessments: System monitoring Multi-matrix sampling makes it possible to use more tasks than a single student could work on
only a few items for each student and assessment domain
more information on aggregated levels
The item response model has to equate the different test forms (booklets)
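A toy booklet design shows how test forms can be linked through shared item clusters. The sketch below is a hypothetical seven-cluster design built from a cyclic rotation; it is not the design used in PISA. Each booklet holds three clusters, and the offsets are chosen so that every pair of clusters appears together in exactly one booklet.

```python
from itertools import combinations

N_CLUSTERS = 7
OFFSETS = (0, 1, 3)  # a perfect difference set modulo 7

# Booklet k contains clusters k, k+1 and k+3 (modulo 7).
booklets = [
    tuple(sorted((start + off) % N_CLUSTERS for off in OFFSETS))
    for start in range(N_CLUSTERS)
]

# Count how often each pair of clusters shares a booklet.
pair_counts = {pair: 0 for pair in combinations(range(N_CLUSTERS), 2)}
for booklet in booklets:
    for pair in combinations(booklet, 2):
        pair_counts[pair] += 1

for number, booklet in enumerate(booklets, start=1):
    print(number, booklet)
```

Each student works on three of the seven clusters, yet all clusters are connected through common booklets, which is what allows the item response model to place them on one scale.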
23. A Rasch model for educational assessments: Population level results Results on competency distributions like means, variances, percentiles or percentages above cutpoints are requested on the country level
Using individual level competency estimates leads to biased variance estimates for the population
Instead, population parameters are estimated directly with latent regression models in MML
some simulation results (a few slides later) will illustrate this
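Why individual point estimates distort the population variance can be sketched with two standard decompositions. The notation is an assumption made here: \theta is the true competency, x the response data, \hat\theta_{EAP} = E(\theta \mid x) a posterior mean estimate, and \hat\theta_{ML} = \theta + e an ML-type estimate with an error e that is approximately unbiased given \theta.

```latex
\operatorname{Var}(\theta)
  = \operatorname{Var}\bigl(E(\theta \mid x)\bigr) + E\bigl(\operatorname{Var}(\theta \mid x)\bigr)
  \;\Rightarrow\;
  \operatorname{Var}(\hat\theta_{EAP}) \le \operatorname{Var}(\theta)

\operatorname{Var}(\hat\theta_{ML})
  \approx \operatorname{Var}(\theta) + E\bigl(\operatorname{Var}(e \mid \theta)\bigr)
  \ge \operatorname{Var}(\theta)
```

Posterior means are shrunken and understate the population variance, while ML-type estimates add measurement error variance and overstate it; with only a few items per student, both effects can be substantial.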
24. A Rasch model for educational assessments: Multidimensional models in PISA Three domains are assessed: maths, reading & science
a multidimensional model assumes a multivariate normal distribution of the competencies
and estimates its parameters (variances & covariances)
25. The response model for longitudinal data: item response model The model presented here is a submodel of the MRCMLM (Adams, Wilson & Wang, 1997; implemented in ConQuest)
response model
population model
latent regression model
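The three components can be sketched in a notation commonly used for the multidimensional random coefficients multinomial logit model. All symbols below are assumptions for illustration: x is a response vector, \theta the vector of latent dimensions, \xi the item parameters, A and B the design and scoring matrices, \Omega the set of possible response vectors, and y_n the background variables of person n.

```latex
\text{response model:}\quad
P(x \mid \theta; \xi)
  = \frac{\exp\bigl[x'(B\theta + A\xi)\bigr]}
         {\sum_{z \in \Omega} \exp\bigl[z'(B\theta + A\xi)\bigr]}

\text{population model:}\quad
\theta \sim N(\mu, \Sigma)

\text{latent regression model:}\quad
\theta_n = \Gamma' y_n + \varepsilon_n, \qquad \varepsilon_n \sim N(0, \Sigma)
```

In the latent regression model the population mean \mu is replaced by a regression on the background variables, so \Sigma becomes a conditional (residual) covariance matrix.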
26. A Rasch model for educational assessments: Aggregated and individual level analysis Population model results may (theoretically) be read from the covariance matrix and the regression coefficients
Multiple imputations (plausible values) are drawn to obtain complete data sets on the individual level for further analyses of the solution, using standard statistical software
Given the appropriate conditioning model, hierarchical or structural equation models may be fitted using the plausible values
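How plausible values behave can be sketched with a small simulation. The setup is heavily simplified and every value is hypothetical: ten items with known difficulties, a standard normal population without a conditioning model, a grid approximation of the posterior, and one draw per student (operational studies draw several plausible values per student).

```python
import math
import random
import statistics


def rasch_p(theta, beta):
    return 1.0 / (1.0 + math.exp(-(theta - beta)))


def posterior_by_score(betas, grid):
    """Posterior weights on the grid for every raw score, standard normal prior."""
    posteriors = []
    for score in range(len(betas) + 1):
        weights = []
        for theta in grid:
            prior = math.exp(-0.5 * theta * theta)
            # The likelihood depends on the responses only through the raw score.
            kernel = math.exp(score * theta) / math.prod(
                1.0 + math.exp(theta - b) for b in betas
            )
            weights.append(prior * kernel)
        total = sum(weights)
        posteriors.append([w / total for w in weights])
    return posteriors


rng = random.Random(2008)
betas = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.0, 0.5, 1.0, 1.5, 2.0]
grid = [-5.0 + 0.1 * k for k in range(101)]
posteriors = posterior_by_score(betas, grid)

# Simulate 4000 students from N(0, 1) and their raw scores.
true_thetas = [rng.gauss(0.0, 1.0) for _ in range(4000)]
scores = [sum(rng.random() < rasch_p(t, b) for b in betas) for t in true_thetas]

# Posterior means (EAP) versus one random posterior draw (plausible value).
eap = [sum(g * w for g, w in zip(grid, posteriors[s])) for s in scores]
pv = [rng.choices(grid, weights=posteriors[s])[0] for s in scores]

var_eap = statistics.pvariance(eap)
var_pv = statistics.pvariance(pv)
print(round(var_eap, 2), round(var_pv, 2))
```

The variance of the plausible values stays close to the generating variance of 1, while the variance of the posterior means falls clearly below it.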
27. A Rasch model for longitudinal data Assessing student competencies in PISA
28. A Rasch model for longitudinal data: PISA-I-Plus Germany 2003 9th graders assessed in 387 classrooms,
2 classrooms per school,
different types of school (Gymnasium, Realschule, Hauptschule)
Second assessment of the same students one year later: data from 297 10th grade classrooms in the analysis
Assessments in mathematics and science
Questionnaire information is available from schools, teachers, students and parents (first or both assessments)
29. A Rasch model for longitudinal data: PISA Germany 2003 General Research Questions
How do mathematical and scientific literacy grow/change from grade 9 to grade 10
for the whole population and for particular subpopulations (gender, socioeconomic status, migration, type of school)?
How are these changes related to context and treatment conditions?
Instruction in the last year and school level conditions
Student level variables, parental support, peers etc.
30. A Rasch model for longitudinal data: multidimensional modeling of time points Two dimensional response model assuming a bivariate normal distribution
Item parameters fixed across time points, both dimensions measure the same construct
The latent correlation reflects the connection between observed responses from the same persons at different time points
Andersen (1985), CML estimated two dimensional model
31. A Rasch model for longitudinal data: multidimensional modeling of time points Reformulation of the dimensions as
pretest proficiency and change (the change model)
Embretson (1991), CML estimated difference model
Fischer (2000, 2003), CML estimation and individual confidence intervals
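The two parametrizations describe the same bivariate distribution. If change is defined as the difference between the second and the first time point, the moments of the change dimension follow from those of the time points. The sketch below assumes exactly that definition; the function name and the numbers are hypothetical.

```python
import math


def change_parametrization(var_t0, var_t1, cov_t0_t1):
    """Moments of change = t1 - t0, derived from the time-point moments."""
    var_change = var_t0 + var_t1 - 2.0 * cov_t0_t1
    cov_t0_change = cov_t0_t1 - var_t0
    corr_t0_change = cov_t0_change / math.sqrt(var_t0 * var_change)
    return var_change, cov_t0_change, corr_t0_change


# Hypothetical latent variances and covariance of the two time points.
var_change, cov_t0_change, corr_t0_change = change_parametrization(1.0, 1.21, 0.88)
print(var_change, cov_t0_change, corr_t0_change)
```

Even with a high latent correlation between time points, the correlation between pretest proficiency and change can be small or negative, which is why it is worth estimating it directly on the latent level.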
32. A Rasch model for longitudinal data: the combined model Multidimensional and latent regression model:
latent correlations between change and covariates are modeled
Submodel of the MRCML model (ConQuest)
33. A Rasch model for longitudinal data: some results
34. A Rasch model for longitudinal data: Summary A multidimensional MML estimated IRT model with latent regression to analyze true change was presented (which is a submodel of the MRCMLM in ConQuest)
time point scores are modeled as multivariate normally distributed (latent correlation estimated)
change can be specified as a latent dimension => latent correlations between change and background variables/treatment assignments can be estimated
35. A Rasch model for longitudinal data: Simulation study Common practice:
In a first step, individual level point estimates are obtained, maybe from one dimensional IRT modeling (MLE, WLE)
In a second step, change is analyzed (as differences or residuals, using simple or complex SEM/HLM models)
In consequence,
the step two model will be based on point estimates with measurement error and not on a population model
Analysis results may be affected by “attenuation”
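The size of the attenuation can be gauged with the classical formula from test theory: the observed correlation equals the true correlation times the square root of the product of the two reliabilities, assuming uncorrelated measurement errors. The sketch below uses hypothetical values; the low reliability stands in for a difference score, which is typically measured less reliably than the time point scores.

```python
import math


def attenuated_correlation(true_corr, reliability_x, reliability_y):
    """Expected observed correlation under classical test theory."""
    return true_corr * math.sqrt(reliability_x * reliability_y)


# Hypothetical example: a covariate with reliability 0.8 and a
# change score with reliability 0.5, true correlation 0.5.
observed = attenuated_correlation(0.5, 0.8, 0.5)
print(round(observed, 3))
```

With these values the observed correlation is noticeably smaller than the true one, which illustrates why a population model is preferable for relations involving change.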
36. A Rasch model for longitudinal data: Simulation study Looking at
distributions of the change score (means and SD),
correlations between t0 and change,
in a reasonably realistic setup for our PISA assessment
How well do the different models,
the time points model
or the change model
reproduce the generating values?
How well does the two step procedure work?
37. A Rasch model for longitudinal data: Simulation results - group means. Notes:
r(t0;change) = 0 for this table, results very similar for other r(t0;change);
t0 (t1) and t1 (t2) values from the Andersen model
change values from Embretson model
38. A Rasch model for longitudinal data: Simulation results - group standard deviations. Notes:
r(t0;change) = 0 for this table, results very similar for other r(t0;change);
t0 (t1) and t1 (t2) values from the Andersen model
change values from Embretson model
39. A Rasch model for longitudinal data: Simulation results - correlation t0 and change
40. A Rasch model for longitudinal data: Conclusions - Simulation Using PVs to analyze results of a combined latent regression latent change model gives unbiased results
for all statistics under investigation
The two step procedure
might be used to analyze time point mean values on aggregated levels,
should not be used to analyze distributions of change or correlations between change and other variables.
41. A Rasch model for competency profiles Assessing student competencies in PISA
42. A Rasch model for competency profiles: German Educational Standards for Mathematics
43. A Rasch model for competency profiles: Test Construction – Competency Model
44. A Rasch model for competency profiles: Research Questions What differential information do we get on the students?
On which reporting scales can (group) profiles be based?
What about
the Overarching Ideas, the Competencies or
interactions of Overarching Ideas and Competencies?
45. A Rasch model for competency profiles: Multidimensional Models I – Variance & Reliability
46. A Rasch model for competency profiles: Interactions of Overarching Ideas and Competencies Why can't the competencies be measured with these items?
Do the competencies have different meanings within the 5 Overarching Ideas?
Different meanings might indicate that the competencies pose specific difficulties within different Overarching Ideas, or might be due to the test construction (hopefully not)
For the next step, Competencies will be estimated within Overarching Ideas (in 5 runs on separate sets of items)
47. A Rasch model for competency profiles: Multidimensional Models II – Variance & Reliability
48. A Rasch model for competency profiles: Defining Interaction Models
Looking at the covariances of the Competencies within Overarching Ideas, different interaction models were derived:
a long Interaction Model with 19 dimensions
a reduced Interaction Model with 15 dimensions
a short Interaction Model with 11 dimensions
49. A Rasch model for competency profiles: Multidimensional Models III – Model Fit
50. A Rasch model for competency profiles: Short Interaction Model – Variance & Reliability
51. A Rasch model for competency profiles: Profile Analysis – Overarching Ideas
52. A Rasch model for competency profiles: Profile Analysis – Short Interaction Model
53. A Rasch model for competency profiles: Profile Analysis - Conclusions A higher degree of differential information on the students' mathematical competence may be obtained from factors on interactions of Overarching Ideas and Competencies
Estimation of high dimensional models is a challenge
For this test, the assignment of competencies has to be reworked towards more highly discriminating competencies
try the General Diagnostic Model (von Davier 2006)
54. Where are we today? Consequences of using Rasch models for educational assessment
55. Where are we today? The Rasch model is a commonly used measurement model in international educational assessment
It has been generalized in many respects, e.g.
to model heterogeneity in response data
or to combine measurement and data analysis into one model
With the increasing availability of such models in software packages that are not IRT specific (SAS, GLLAMM, Mplus), the opportunities for specifying combined measurement and analysis models increase
56. Consequences of using Rasch models for educational assessment: Where are we today? Claus H. Carstensen IPN - Leibniz Institute for Science Education at the University of Kiel, Germany
IRDP Neuchâtel, January 14, 2008