Integrating Measurement and Sociocognitive Perspectives in Educational Assessment Robert J. Mislevy University of Maryland Robert L. Linn Distinguished Address Sponsored by AERA Division D. Presented at the Annual Meeting of the American Educational Research Association, Denver, CO, May 1, 2010. This work was supported by a grant from the Spencer Foundation.
Messick, 1994 • [W]hat complex of knowledge, skills, or other attribute should be assessed... • Next, what behaviors or performances should reveal those constructs, and • what tasks or situations should elicit those behaviors?
Snow & Lohman, 1989 • Summary test scores, and factors based on them, have often been thought of as “signs” indicating the presence of underlying, latent traits. … • An alternative interpretation of test scores as samples of cognitive processes and contents … is equally justifiable and could be theoretically more useful.
Roadmap • Rationale • Model-based reasoning • A sociocognitive perspective • Assessment arguments • Measurement models & concepts • Why are these issues important? • Conclusion
Rationale Measurement frame Sociocognitive frame
Rationale An articulated way to think about assessment: • Understand task & use situations in “emic” sociocognitive terms. • Identify the shift into “etic” terms in task-level assessment arguments. • Examine the synthesis of evidence across tasks in terms of model-based reasoning. • Reconceive measurement concepts. • Draw implications for assessment practice.
Model-based reasoning: a real-world situation is mapped onto entities and relationships in a model; representational forms (e.g., y = ax + b and its rearrangement x = (y - b)/a) support mappings among representational systems; through them the real-world situation is reconceived. Here, measurement models and measurement concepts play these roles.
The same structure, layered: measurement models and measurement concepts appear as entities and relationships in a lower-level model, and are reconceived as entities and relationships in a higher-level model framed in sociocognitive concepts; representational forms (e.g., y = ax + b and x = (y - b)/a) again provide the mappings among representational systems.
Some Foundations Themes from, e.g., cog psych, linguistics, neuroscience, anthropology: • Connectionist metaphor, associative memory, complex systems (variation, stability, attractors) • Situated cognition & information processing • E.g., Kintsch’s Construction-Integration (CI) theory of comprehension; diSessa’s “knowledge in pieces” • Intrapersonal & extrapersonal patterns
Some Foundations Extrapersonal patterns: • Linguistic: grammar, conventions, constructions • Cultural models: what ‘being sick’ means, restaurant script, apology situations • Substantive: F=MA, genres, plumbing, etc. Intrapersonal resources: • Connectionist metaphor for learning • Patterns from experience at many levels
[Figure: two persons, A and B. What is inside each person is not observable; only behavior is observable.]
Context, à la Kintsch: propositional content of text / speech … and internal and external aspects of context.
Context • The C in CI theory is Construction: activation of both relevant and irrelevant bits from LTM, past experience. All L/C/S levels involved. Example: chemistry problems in German. • If a pattern hasn’t been developed in past experience, it can’t be activated (although it may get constructed in the interaction). • A relevant pattern from LTM may be activated in some contexts but not others (e.g., physics models).
Context • The I in CI theory, Integration: • Situation model: synthesis of coherent / reinforced activated L/C/S patterns.
Context The situation model is also the basis of planning and action.
Context Ideally, activation of relevant and compatible intrapersonal patterns…
Context …to lead to (sufficiently) shared understanding; i.e., co-constructed meaning. • Persons’ capabilities, situations, and performances are intertwined; meaning is co-determined, through L/C/S patterns.
What can we say about individuals? Use of resources in appropriate contexts in appropriate ways; i.e., attunement to targeted L/C/S patterns: • Recognize markers of externally-viewed patterns? • Construct internal meanings in their light? • Act in ways appropriate to targeted L/C/S patterns? • What are the range and circumstances of activation? (variation of performance across contexts)
Messick, 1994 • [W]hat complex of knowledge, skills, or other attribute should be assessed... • Next, what behaviors or performances should reveal those constructs, and • what tasks or situations should elicit those behaviors?
Toulmin’s Argument Structure A Claim is supported by Data (“so”), on account of a Warrant (“since”) that rests on Backing, and holds unless an Alternative explanation applies.
The Assessment Argument [Diagram: a Claim about the student, supported (“so”) by Data concerning the student’s performance and Data concerning the task situation, arising from the student acting in the assessment situation; Warrants concerning the assessment, the task design, and the evaluation (“since”), with Backing concerning the assessment situation; qualified (“unless”) by Alternative explanations, which draw on other information concerning the student vis a vis the assessment situation.] • Data concerning the task situation: features of (possibly evolving) context as seen from the view of the assessor; in particular, those seen as relevant to targets of inference. • Data concerning the student’s performance: evaluation of performance seeks evidence of attunement to features of targeted L/C/S patterns; it depends on contextual features implicitly, since it is evaluated in light of targeted L/C/S patterns. • Claim about the student: choice in light of assessment purpose and conception of capabilities. • Note the move from the emic to the etic!
“Hidden” aspects of context: not in the test theory model but essential to the argument. What attunements to linguistic / cultural / substantive patterns can be presumed, or arranged for, among examinees, to condition inference regarding targeted L/C/S patterns? Fundamental to the situated meaning of student variables in measurement models; both critical and implicit.
The argument over time: • Features of context arise over time as the student acts / interacts. • Unfolding situated performance: macro and micro features of performance; macro features of the situation, and micro features of the situation as it evolves. • Features of performance are evaluated in light of the emerging context. • Especially important in simulation, game, and extended performance contexts (e.g., Shute).
Design Argument: the assessment-argument structure above, viewed as the design argument.
Use Argument (Bachman): a parallel structure whose Claim concerns the student in a use situation, supported by Data concerning the use situation, with a Warrant concerning the use situation and its Backing, qualified by Alternative explanations drawing on other information concerning the student vis a vis the use situation. The Design Argument’s claim about the student feeds into it.
• The Claim about the student is the output of the assessment argument and the input to the use argument. • How it is cast depends on psychological perspective and intended use. • When measurement models are used, the claim is an etic synthesis of evidence, expressed as values of student-model variable(s).
• Warrant for inference: increased likelihood of activation in the use situation if it was activated in task situations. • What features do tasks and use situations share? Implicit in trait arguments; explicit in sociocognitive arguments. • Empirical question: degrees of stability, ranges and conditions of variability (Chalhoub-Deville).
What features do tasks and use situations not have in common? • Use-situation features may call for other L/C/S patterns that weren’t in the task and may or may not be in the examinee’s resources. • Target patterns activated in the task but not the use context; target patterns activated in use but not the task context. • Issues of validity & generalizability, e.g., “method factors”. • Knowing about the relation of target examinees and use situations strengthens inferences: “bias for the best” (Swain, 1985).
Multiple Tasks Synthesize evidence from multiple tasks, in terms of proficiency variables in a measurement model. • Snow & Lohman’s sampling • What accumulates? L/C/S patterns, but with variation • What is similar from the analyst’s perspective need not be from the examinee’s.
Measurement Models & Concepts AS IF • Tendencies for certain kinds of performance in certain kinds of situations expressed as student-model variables θ. • Individual performances (X) modeled as probabilistic functions of θ: variability. • Probability models permit sophisticated reasoning about evidentiary relationships in complex and subtle situations, • BUT they are models, with all the limitations implied!
Measurement Models & Concepts • Xs result from particular persons calling upon resources in particular contexts (or not, or how). • Mechanically, θs simply accumulate information across situations. • Our choices of situations and of what to observe drive their situated meaning. • The situated meaning of θs: tendencies toward these actions, in these situations, that call for certain interactional resources, via L/C/S patterns.
Classical Test Theory • Probability model: “true score” = stability along an implied dimension; “error” = variation. • Situated meaning from task features & evaluation. • Can organize around traits, task features, or both, depending on task sets and performance features. • Profile differences unaddressed.
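The stability / variation reading of classical test theory on this slide can be illustrated with a short simulation (a minimal sketch; all numbers are invented for illustration): stability shows up as true-score variance, variation as error variance, and reliability is the share of observed-score variance attributable to true scores.

```python
import numpy as np

rng = np.random.default_rng(0)
n_persons = 10_000

# CTT decomposition: observed score X = true score T ("stability") + error E ("variation")
T = rng.normal(loc=50.0, scale=10.0, size=n_persons)  # true-score SD = 10 (assumed)
E = rng.normal(loc=0.0, scale=5.0, size=n_persons)    # error SD = 5 (assumed)
X = T + E

# Reliability = true-score variance / observed-score variance
reliability = T.var() / X.var()
print(round(reliability, 2))  # theoretical value: 100 / (100 + 25) = 0.80
```

With a large sample the empirical ratio sits close to the theoretical 0.80; the point is only that "true score" and "error" are components of a probability model, not observable quantities.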
Item Response Theory • θ = propensity to act in the targeted way; b_j = typical evocation; the IRT function = typical variation. • Situated meaning from task features & evaluation. • Task features still implicit. • Profile differences / misfit highlight where the narrative doesn’t fit, for sociocognitive reasons. • Complex systems concepts: attractors & stability → regularities in response patterns, quantified in parameters; typical variation → probability model. • Will work best when most nontargeted L/C/S patterns are familiar… • Item-parameter invariance vs. population dependence (Tatsuoka, Linn, Tatsuoka, & Yamamoto, 1988).
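The slide’s reading of IRT (θ as propensity, b_j as a typical evocation point, the item response function as typical variation) can be sketched with a two-parameter logistic model; the parameter values below are hypothetical, chosen only to show the roles of θ and b_j.

```python
import math

def irt_2pl(theta, a, b):
    """2PL item response function: P(targeted response | theta).

    theta: propensity to act in the targeted way
    a:     discrimination (how sharply probability changes near b)
    b:     difficulty, the "typical evocation" point where P = 0.5
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# A person at theta = 0 is more likely to respond in the targeted way
# on an easier item (b = -1) than on a harder one (b = +1);
# at theta = b the probability is exactly 0.5.
p_easy = irt_2pl(0.0, a=1.0, b=-1.0)
p_hard = irt_2pl(0.0, a=1.0, b=1.0)
```

Note how the model captures only a tendency: the same θ yields a probability, not a determinate performance, which is the "typical variation" of the slide.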
Multivariate Item Response Theory (MIRT) • θs = propensities to act in targeted ways in situations with different mixes of L/C/S demands. • Good for controlled mixes of situations.
Structured Item Response Theory • Explicitly model task situations in terms of L/C/S demands. Links task design with the sociocognitive view. • Work explicitly with features in controlled and evolved situations (design / agents). • Can use with MIRT; cognitive diagnosis models.
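One concrete instance of modeling task situations in terms of their demands is an LLTM-style decomposition, in which each item’s difficulty is a weighted sum of the features it carries. The feature matrix and weights below are invented for illustration; in practice the features would come from the task design.

```python
import numpy as np

# Q-matrix: rows = items, columns = hypothesized L/C/S demand features
Q = np.array([
    [1, 0],  # item 1 carries feature 1 only
    [1, 1],  # item 2 carries both features
    [0, 1],  # item 3 carries feature 2 only
])

# eta: difficulty contributed by each feature (assumed values)
eta = np.array([0.5, 1.2])

# LLTM-style structured difficulties: b_j = sum_k Q[j, k] * eta[k]
b = Q @ eta
print(b)  # [0.5 1.7 1.2]
```

Because difficulty is now a function of stated features, the model makes the slide’s point operational: the task-design rationale is written into the measurement model rather than left implicit.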
Mixtures of IRT Models • Different IRT models for different unobserved groups of people. • Modeling different attractor states. • Can be theory driven or discovered in data.
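A minimal sketch of the mixture idea: two unobserved groups (say, solvers settled into different "attractor states") follow Rasch-type models with different item parameters, and the marginal response probability mixes over the class proportions. All parameter values and class names here are hypothetical.

```python
import math

def p_rasch(theta, b):
    """Rasch (1PL) probability of the targeted response."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# Two latent classes with different difficulty for the same item (assumed values)
class_params = {"strategy_A": -0.5, "strategy_B": 1.5}   # per-class b
class_weights = {"strategy_A": 0.6, "strategy_B": 0.4}   # assumed mixing proportions

theta = 0.0
p_marginal = sum(
    class_weights[c] * p_rasch(theta, b) for c, b in class_params.items()
)
# p_marginal lies between the two class-specific probabilities
```

In estimation the class memberships are unobserved and inferred from response patterns, which is how such mixtures can be "discovered in data" as the slide notes.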
Measurement Concepts • Validity • Soundness of the model for local inferences • Breadth of scope is an empirical question • Construct representation in L/C/S terms • Construct-irrelevant sources of variation in L/C/S terms • Reliability • Through the model, strength of evidence for inferences about tendencies, given variabilities … or about characterizations of variability.
Measurement Concepts • Method Effects • What accumulates in terms of L/C/S patterns in assessment situations but not use situations • Generalizability Theory (Cronbach) • Watershed in emphasizing evidentiary reasoning rather than simply measurement • Focus on external features of context; can be recast in L/C/S terms, & attend to correlates of variability
Why are these issues important? • Connect assessment/measurement with current psychological research • Connect assessment with learning • Appropriate constraints on interpreting large scale assessments • Inference in complex assessments • Games, simulations, performances • Assessment modifications & accommodations • Individualized yet comparable assessments
Conclusion Measurement frame ↔ sociocognitive frame: communication at the interface. We have work we need to do, together.