This article provides an overview of developing effective assessments, focusing on the use of well-constructed multiple-choice items in geoscience assessments. It discusses the reliability and validity of assessments, the disadvantages and advantages of multiple-choice tests, and the considerations for item construction and scoring. The article also explores how multiple-choice items can be useful for community tools, such as measuring misconceptions and guiding instruction.
The Effective Use of Well-Constructed Multiple-Choice Items in Geoscience Assessments
Part 1: Overview
Mimi Fuhrman, American Institutes for Research
How to develop effective assessments?
• Assessments should be RELIABLE and VALID
• Reliability: A reliable assessment is one that provides consistent “scores”.
• Validity: An assessment is valid when it accurately measures what it purports to measure.
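One way to put a number on "consistent scores" for a test scored right/wrong is an internal-consistency coefficient. The sketch below, in Python, uses the Kuder-Richardson 20 (KR-20) formula. KR-20 is not named in the slides; it is brought in here only as one common illustration of reliability, and the function name and the response matrix are hypothetical.

```python
def kr20(responses):
    """Kuder-Richardson 20 reliability for right/wrong (1/0) item scores.

    `responses` is a list of examinees, each a list of 0/1 item scores.
    """
    n_examinees = len(responses)
    n_items = len(responses[0])

    # Total score per examinee and the population variance of those totals.
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n_examinees
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_examinees
    if var_total == 0:
        raise ValueError("all examinees have the same total; KR-20 is undefined")

    # Sum of p * (1 - p) over items, where p is the proportion answering correctly.
    sum_pq = 0.0
    for i in range(n_items):
        p = sum(row[i] for row in responses) / n_examinees
        sum_pq += p * (1 - p)

    return (n_items / (n_items - 1)) * (1 - sum_pq / var_total)


# Hypothetical data: 5 examinees, 4 items (made up for illustration only).
example_responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(example_responses), 3))
```

Values closer to 1 indicate that the items rank examinees more consistently. A coefficient like this speaks only to reliability, not to validity.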
Multiple-choice test disadvantages:
• If poorly designed, often limited to measuring simple recall or trivia
• Not appropriate for measuring ability to synthesize ideas, to write effectively, or to perform certain types of problem-solving operations
• Difficulty in making the assessment “authentic”
Example of an “authentic” MC item
A stratigrapher observes a series of cyclothems, each between 5 and 20 meters thick, consisting of, from top to bottom:
Top
  Red, bioturbated shales with small, irregular calcite nodules
  Thin-bedded, ripple-laminated, fine-grained sandstone
  Thick-bedded, medium- to coarse-grained sandstones with unimodal trough cross-bedding
  Erosion surface overlain by coarse, pebbly sandstone
Bottom
What depositional environment is represented?
• Eolian
• Littoral
• Alluvial
• Deltaic
• Neritic
Multiple-choice test advantages
• High reliability; good sampling of content
• Ability to obtain a wide range of scores
• Easier to frame questions so that all examinees will address themselves to the same problem
• Low cost (time)
Short answer vs. Multiple-choice
• 15 short-answer questions were administered, followed immediately by the identical questions in multiple-choice format. Scores were 43% higher on the multiple-choice version (from Shea, 1992).
Example question: “What portion of a parent isotope remains after four half-lives have elapsed?”
What portion of a parent isotope remains after four half-lives have elapsed?
Short answer responses:
• 15
• Daughter isotope
• Nucleus
• Uranium
• 10 million years
Multiple-choice options:
A. one-fourth
B. one-eighth
C. one-sixteenth
D. virtually none
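The arithmetic behind the half-life question above can be checked with a few lines of Python. This is a minimal sketch: the function name is hypothetical, and it only assumes the standard definition that half of the remaining parent isotope decays during each half-life.

```python
from fractions import Fraction


def fraction_remaining(half_lives: int) -> Fraction:
    """Fraction of the parent isotope left after a whole number of half-lives."""
    if half_lives < 0:
        raise ValueError("number of half-lives must be non-negative")
    # Each half-life halves whatever parent isotope is still present.
    return Fraction(1, 2) ** half_lives


for n in range(1, 5):
    print(n, "half-lives:", fraction_remaining(n))
```

After four half-lives the loop prints 1/16, which corresponds to option C, one-sixteenth.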
Scoring IS an issue for multiple-choice:
• Free- or constructed response
  • Use of rubrics for holistic or analytical scoring
  • Most time investment AFTER administration
• Multiple-choice
  • Options determine what is correct and what is incorrect
  • Scoring decisions and major time investment when the items are developed, BEFORE administration
What makes a “Good” Multiple-choice item?
• Alignment: to content, learning outcomes/goals, cognitive demand, difficulty
• Importance: testing trivia is a waste of resources; concentrate on fundamental concepts
• Clarity: the intent of the task and the meaning of the options must speak for themselves and be interpreted in the same way by all examinees
• Item construction: well-constructed items can be answered successfully by examinees who have the knowledge or skill you are testing, and NOT by examinees who are lacking the skill or knowledge
Performance Data: Example item from 8th grade test
How does most sand get to the ocean shore?
A) Boulders near the ocean break down into sand and are deposited on the shore.
B) Rocks break down into smaller pieces as they are transported by rivers to the ocean shore.*
C) Rocks dissolve in rivers and then crystallize as sand when the river reaches the ocean shore.
D) Wave action breaks down ocean floor rock and pushes the resulting sand onto the ocean shore.

Performance Data:
Option   Percent answered   Biserial (1)
A)        9.0953            -0.0875
B)       16.6909*           -0.0560*
C)       17.9971            -0.0291
D)       48.0890            +0.2290
(1) biserial = correlation between performance on item and performance on test
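The performance data above report, for each option, the percentage of examinees who chose it and how that choice correlates with overall test performance. A minimal sketch of how such figures could be computed is given below in Python. It uses the point-biserial form (a plain Pearson correlation between a 0/1 "chose this option" indicator and the total test score), which is an assumption: the slides do not say which biserial formula was used. The function names and the small data set are hypothetical and are not the 8th grade results.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        return float("nan")  # undefined when one variable never varies
    return cov / sqrt(ss_x * ss_y)


def option_analysis(choices, total_scores, options="ABCD"):
    """Return {option: (percent choosing it, point-biserial with total score)}."""
    n = len(choices)
    report = {}
    for option in options:
        # 1 if the examinee picked this option, otherwise 0.
        chose = [1 if c == option else 0 for c in choices]
        percent = 100 * sum(chose) / n
        report[option] = (percent, pearson(chose, total_scores))
    return report


# Hypothetical data: the option each examinee picked and their total test score.
example_choices = ["B", "B", "D", "A", "B", "C", "D", "B"]
example_totals = [18, 16, 9, 7, 17, 8, 10, 15]

for option, (percent, r) in option_analysis(example_choices, example_totals).items():
    print(f"{option}) {percent:5.1f}%  r = {r:+.3f}")
```

In this made-up data the keyed option B is chosen mostly by high scorers, so its correlation is positive and the distractors are negative. In the slide's data the pattern is the reverse: the keyed option B has a slightly negative value while distractor D is positive, the kind of signal that would prompt a closer look at the item.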
Considerations for Community Tools
• Requirements for community use: use across classrooms, instructors, institutions
• Agreement about learning goals
• Context/scenarios should be equally familiar for all examinees
• Vocabulary/terminology should be standard or clearly explained
• Maximize usefulness by sharing performance data
How multiple-choice items may be useful for Community Tools
• If the community can agree on a set of common learning goals (“core” content/concepts/skills)
• MC instruments can be used to measure how widespread misconceptions are
• Results can guide instruction, on a local level AND in the community (compare with Force Concept Inventory used in physics)
• The development of the instruments/items themselves can serve as a focus for the community: agreement = importance