Larry D. Gruppen, Ph.D. University of Michigan From Concepts to Data: Conceptualization, Operationalization, and Measurement in Educational Research
Objectives • Identify key research design issues • Wrestle with the complexities of educational measurement • Explain the concepts of reliability and validity in educational measurement • Apply criteria for measurement quality when conducting educational research
Agenda • A brief nod to design • From theory to measurement • Criteria for measurement quality • Reliability • Validity • Application: analyze an article
Guiding Principles for Scientific Research in Education • Question: pose a significant question that can be investigated empirically • Theory: link research to relevant theory • Methods: use methods that permit direct investigation of the question • Reasoning: provide a coherent, explicit chain of reasoning • Replicate and generalize across studies • Disclose research to encourage professional scrutiny and critique
Study design • Study design consists of: • Your measurement method(s) • The participants and how they are assigned • The intervention • The sequence and timing of measurements and interventions
Comparison Group • Pre-post design - compare intervention group to itself • Non-equivalent control group design - compare intervention group to an existing group • Randomized control group design - compare to equivalent controls
Overview of Study Designs • Symbols • Each line represents a group. • x = Intervention (e.g. treatment) • O1, O2, O3…= Observation (measurement) at Time 1, Time 2, Time 3, etc. • R = Random assignment
One-Group Posttest
x  O1
Posttest-Only Control Group
x  O1
   O1
One-Group Pretest-Posttest
O1  x  O2
Control Group Pretest-Posttest
O1  x  O2
O1      O2
Posttest-Only Randomized Control Group
R  x  O1
R     O1
Randomized Control Group Pretest-Posttest
R  O1  x  O2
R  O1     O2
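To make the notation concrete, here is a minimal sketch (Python, with simulated scores) of one common way the last design, the randomized control group pretest-posttest, might be analyzed: comparing pretest-to-posttest gains between the two groups. The numbers and the choice of a gain-score t-test are illustrative assumptions, not part of the design notation itself.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=0)

# Simulated scores for a randomized control group pretest-posttest design.
# Each participant has O1 = pretest and O2 = posttest.
n = 40
pre_treat = rng.normal(50, 10, n)             # O1 for the intervention group (R)
post_treat = pre_treat + rng.normal(8, 5, n)  # O2 after the intervention (x)
pre_ctrl = rng.normal(50, 10, n)              # O1 for the control group (R)
post_ctrl = pre_ctrl + rng.normal(2, 5, n)    # O2 with no intervention

# Compare gain scores (O2 - O1) between groups with an independent-samples t-test.
gain_treat = post_treat - pre_treat
gain_ctrl = post_ctrl - pre_ctrl
t, p = stats.ttest_ind(gain_treat, gain_ctrl)
print(f"Mean gain (intervention): {gain_treat.mean():.1f}")
print(f"Mean gain (control):      {gain_ctrl.mean():.1f}")
print(f"t = {t:.2f}, p = {p:.4f}")
```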
From Theory to Measurement • Theory → Constructs → Operational Definition → Measurement
Measurement • Measurement: assignment of numbers to objects or events according to rules • Quality: reliability and validity
The Challenge of Educational Measurement • Almost all of the constructs we are interested in are buried inside the individual • Measurement depends on transforming these internal states, events, capabilities, etc. into something observable • Making them observable may alter the thing we are measuring
Examples of Measurement Methods • Tests (knowledge, performance): defined response, constructed response, simulations • Questionnaires (attitudes, beliefs, preferences): rating scales, checklists, open-ended responses • Observations (performance, skills): tasks (varying degrees of authenticity), problems, real-world behaviors, records (documents)
Reliability • Dependability (consistency or stability) of measurement • A necessary condition for validity
Types of Reliability • Stability (produces the same results with repeated measurements over time): • Test-retest • Correlation between scores at 2 times • Equivalence/Internal Consistency (produces the same results with parallel items or alternate forms): • Alternate forms; split-half; Kuder-Richardson; Cronbach's alpha • Correlation between scores on different forms; calculate coefficient alpha (α) • Consistency (produces the same results with different observers or raters): • Inter-rater agreement • Correlation between scores from different raters; kappa coefficient
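Each of these coefficients can be computed directly from raw score matrices. The sketch below is a minimal illustration in Python (numpy and scipy, with invented scores) of a test-retest correlation, Cronbach's alpha from the standard formula, and Cohen's kappa for two raters; the data and the scale sizes are assumptions chosen only to keep the example small.

```python
import numpy as np
from scipy import stats

# --- Stability: test-retest reliability ----------------------------------
# The same 10 examinees measured at two time points (invented scores).
time1 = np.array([12, 15, 9, 20, 14, 17, 11, 18, 13, 16])
time2 = np.array([13, 14, 10, 19, 15, 18, 10, 17, 12, 17])
r_retest, _ = stats.pearsonr(time1, time2)
print(f"Test-retest r = {r_retest:.2f}")

# --- Internal consistency: Cronbach's alpha -------------------------------
# Rows = examinees, columns = items (0/1 scores here, but the formula works
# for any numeric items): alpha = k/(k-1) * (1 - sum(item var) / var(total))
items = np.array([
    [1, 1, 0, 1, 1],
    [1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0],
    [1, 1, 0, 0, 1],
    [1, 1, 1, 1, 0],
])
k = items.shape[1]
item_var = items.var(axis=0, ddof=1).sum()
total_var = items.sum(axis=1).var(ddof=1)
alpha = k / (k - 1) * (1 - item_var / total_var)
print(f"Cronbach's alpha = {alpha:.2f}")

# --- Consistency across raters: Cohen's kappa -----------------------------
# Two raters classify the same 10 performances as pass (1) or fail (0).
rater_a = np.array([1, 1, 0, 1, 0, 1, 1, 0, 1, 0])
rater_b = np.array([1, 0, 0, 1, 0, 1, 1, 1, 1, 0])
p_o = np.mean(rater_a == rater_b)                         # observed agreement
p_e = (np.mean(rater_a) * np.mean(rater_b)                # agreement expected
       + np.mean(rater_a == 0) * np.mean(rater_b == 0))   # from the marginals
kappa = (p_o - p_e) / (1 - p_e)
print(f"Cohen's kappa = {kappa:.2f}")
```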
Validity • Refers to the accuracy of inferences based on data obtained from measurement • Technically, measures aren’t valid, inferences are • No such thing as validity in the abstract: the key issue is ‘valid’ for what inference • Want to reduce systematic, non-random error • Unreliability lowers correlations, reducing validity claims
Conventional View of Validity • Face validity: logical link between items and purpose; makes sense on the surface • Content validity: items cover the range of meaning included in the construct or domain; based on expert judgment • Criterion validity: relationship between performance on one measurement and performance on another (or actual behavior); concurrent and predictive; evidence from correlation coefficients • Construct validity: directly connects measurement with theory, allowing interpretation of empirical evidence in terms of theoretical relationships; based on the weight of evidence; convergent and discriminant evidence; multitrait-multimethod (MTMM) analysis
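Criterion and construct validity evidence is largely correlational. Below is a minimal sketch (Python, simulated data) of a concurrent criterion-validity coefficient and a convergent vs. discriminant comparison of the kind a full MTMM matrix would organize; the variables and effect sizes are invented for illustration only.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=1)
n = 30

# --- Criterion validity (concurrent) --------------------------------------
# Scores on a new test vs. an established criterion measured at the same time.
ability = rng.normal(0, 1, n)
new_test = ability + rng.normal(0, 0.5, n)
criterion = ability + rng.normal(0, 0.5, n)
r_crit, p_crit = stats.pearsonr(new_test, criterion)
print(f"Criterion validity r = {r_crit:.2f} (p = {p_crit:.3f})")

# --- Construct validity: convergent vs. discriminant evidence -------------
# A measure of the same construct by a different method should correlate
# highly (convergent); a measure of an unrelated construct should not
# (discriminant).
same_construct_other_method = ability + rng.normal(0, 0.6, n)
unrelated_construct = rng.normal(0, 1, n)
r_conv, _ = stats.pearsonr(new_test, same_construct_other_method)
r_disc, _ = stats.pearsonr(new_test, unrelated_construct)
print(f"Convergent r   = {r_conv:.2f}  (want high)")
print(f"Discriminant r = {r_disc:.2f}  (want low)")
```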
Unified View of Construct Validity (Messick S, Amer Psych, 1995) • Validity is not a property of an instrument but rather of the meaning of the scores; it must be considered holistically. • 6 Aspects of Construct Validity Evidence • Content—content relevance & representativeness • Substantive—theoretical rationale for observed consistencies in test responses • Structural—fidelity of the scoring structure to the structure of the construct domain • Generalizability—generalization to the population and across populations • External—convergent and discriminant evidence • Consequential—intended and unintended consequences of score interpretation; social consequences of assessment (fairness, justice)
Finding Measurement Instruments • Scan the engineering education literature (obviously) • Email engineering ed researchers (use the network) • Examine literature for instruments used in prior studies • General education/social science instrument databases • Buros Institute of Mental Measurements (Mental Measurement Yearbook, Tests in Print) http://buros.unl.edu/buros/jsp/search.jsp • ERIC databases http://www.eric.ed.gov/ • Educational Testing Service Test Collection http://www.ets.org/testcoll/index.html • Construct your own (last resort!) • Get some expert consultation (test writing, survey design, questionnaire construction, etc.)
Example • In your groups, analyze the Steif & Dantzler statics concept inventory article. Look for: • Theoretical framework • Constructs used in the study • How constructs were operationalized • Measurement process • Attention to reliability and validity
References • Campbell DT, Stanley JC. Experimental and quasi-experimental designs for research. Chicago: Rand McNally; 1969. • Cook TD, Campbell DT. Quasi-experimentation: design and analysis issues for field settings. Chicago: Rand McNally; 1979. • Messick S. Validity of psychological assessment: validation of inferences from persons' responses and performances as scientific inquiry into score meaning. American Psychologist. 1995;50:741-749. • Messick S. Validity. In: Linn RL, ed. Educational measurement. 3rd ed. New York: American Council on Education & Macmillan; 1989:13-103.