Developing High Quality Items and Assessments… On PURPOSE!. Keystone Item and Test Development IU1 – November 27, 2012. AGENDA. Keystone blueprints & Keystone results Webb’s Depth of Knowledge Multiple-choice item writing rules Practice writing multiple-choice items LUNCH
Developing High Quality Items and Assessments…On PURPOSE! Keystone Item and Test Development IU1 – November 27, 2012
AGENDA • Keystone blueprints & Keystone results • Webb’s Depth of Knowledge • Multiple-choice item writing rules • Practice writing multiple-choice items • LUNCH • Constructed-response item writing rules • Practice writing constructed-response items
Why? • Understand the quality and rigor of the Keystone exams • Many districts are developing their own Keystone-like items and assessments
Two Important Questions – Purpose and Content • What’s the purpose of the test? • What are you going to do with the results? • What decision(s) will be made based on the results? • Prepare students for Keystone-like items. • What’s the content of the test? • Algebra I Assessment Anchors • Biology Assessment Anchors • English Literature Assessment Anchors • At a higher cognitive complexity (Depth of Knowledge)
Defining the Purpose “A test is only reliable and valid for the purpose for which it was designed.” Shula Nedley, Ph.D.
Introducing….the Test Blueprint • The items must be comparable to the items on the Keystones. • Therefore, we need to write items to match the Keystone specifications. • Each Keystone has two Modules • Each Keystone has both multiple-choice and constructed-response items.
Algebra I Keystone – from PDE documents - part of the Blueprint 60% 40%
Biology Keystone – from PDE documents - part of the Blueprint 73% 27%
Literature Keystone – from PDE documents - part of the Blueprint 65% 35%
Webb’s Depth of Knowledge Webb & Bloom DoK vs. Difficulty
Webb’s Depth of Knowledge • Webb developed a process and criteria for systematically analyzing the alignment between standards and assessments. • Webb’s Depth of Knowledge Model can be used to analyze the cognitive complexity required for the student to master a standard or complete an assessment task.
Webb’s Depth of Knowledge • Cognitive complexity refers to the cognitive demand associated with a test question. • The Depth of Knowledge level of the question is determined by the complexity of the mental processing that the student must use to answer the question.
Webb’s four levels of complexity 1. Recall - Recall of a fact, information, or procedure 2. Basic Application of Skill/Concept - Use of information, conceptual knowledge, procedures, two or more steps, etc. 3. Strategic Thinking - Requires reasoning, developing a plan or sequence of steps; has some complexity; more than one possible answer; generally takes less than 10 minutes to do. 4. Extended Thinking - Requires an investigation; time to think and process multiple conditions of the problem or task; and more than 10 minutes to do non-routine manipulations.
But….don’t get stuck on the verbs • Level 2 items measure more than one skill. • What are the skills this item is measuring? • Level 3 items go even deeper and measure more than two skills. Multiple skills must be known to correctly answer the item. • In PA – Level 3 DOK is measured primarily with constructed-response items. • In PA – Level 4 DOK tasks are long-term projects.
Same Verb—Three Different DOK Levels DOK 1-Describe three characteristics of metamorphic rocks. (Requires simple recall) DOK 2-Describe the difference between metamorphic and igneous rocks. (Requires cognitive processing to determine the differences in the two rock types) DOK 3-Describe a model that you might use to represent the relationships that exist within the rock cycle. (Requires deep understanding of rock cycle and a determination of how best to represent it)
DOK ≠ Difficulty The word illeist means a person who: • refers to themselves in the 3rd person. • plays a specialized musical instrument. • is sick. • loves animals. Difficulty: Easy, Medium, Hard? DOK: 1, 2, 3?
DOK ≠ Difficulty In Mr. Bell’s classes, the students voted for their favorite shape for a symbol. Here are the results. Using the information in the chart, Mr. Bell must select one of the shapes to be the symbol. Which one should he select and why? The shape Mr. Bell should select: _________________ Explain: _________________________________________ Difficulty: Easy, Medium, Hard? DOK: 1, 2, 3?
DOK ≠ Difficulty By inserting a gene into crop plants, scientists have developed plants that are resistant to insects. If an insect eats the plant, the insect dies. Which practice is unnecessary with this new plant variety? • Eroding the land by tilling • Overproducing food crops • Removing weeds from crops • Spraying plants with pesticides Difficulty: Easy, Medium, Hard? DOK: 1, 2, 3?
Test Your Depth of Knowledge You are trapped in a room with 2 doors, one of which leads to freedom and the other to certain death. You do not know which is which. Each door is guarded by a man, one of whom always lies, while the other always tells the truth. You do not know which is which. You can ask but one question of either man to gain your freedom. Which question should you ask? • How do I get out of this room and stay alive? • Which one of the two doors is the correct one? • Who is the man who never tells the truth & always lies? • Which door would the other man say leads to death?
Identifying “good” items Best practices for writing/reviewing multiple-choice items
The Test of Franzipanics Who speaks Franzipanics?
The Test of Franzipanics • Take 5 minutes and try to answer each question, individually. • Then, take another 5 minutes and share your answers at your table.
Good Item Writing Rules – What did we learn from Franzipanics? 1. The purpose of the cluss in furmpaling is to remove a. cluss-prags b. tremalis c. cloughs d. plumots Rule - Structure: Do NOT use a ‘cue word’ in the stem and the keyed (correct) answer.
Good Item Writing Rules – What did we learn from Franzipanics? 2. Trassig is true when a. lusptrasses the vom b. the viskal flans, if the viskal is donwil or zortil c. the belgofrulls d. dissleslisk easily Rule - Structure: Make distractors of equal length.
Good Item Writing Rules – What did we learn from Franzipanics? 3. The sigla frequently overfesks the trelsum because a. all siglas are mellious b. siglas are always votial c. the trelsum is usually tarious d. no trelsa are feskable Rule - Structure: Do NOT use absolutes such as ‘all’, ‘always’, ‘no’. Stick with ‘usually’, ‘frequently’.
Good Item Writing Rules – What did we learn from Franzipanics? 4. The fribbled breg will minter best with an a. derst b. morst c. sorter d. ignu Rule - Structure: No grammatical hints such as ‘an’.
Good Item Writing Rules – What did we learn from Franzipanics? 5. Among the reasons for tristal doss are a. the sabs foped and the foths tinzed b. the kredges roted with the orots c. few rakobs were accepted in sluth d. most of the polats were thonced Rule - Structure: No grammatical hints such as ‘are’ or ‘is’, which suggest plural or singular.
Good Item Writing Rules – What did we learn from Franzipanics? 6. Which of the following (is, are) always present when trossels are being gruven? a. rint and vost b. sot and plone c. shum and vost d. vost Rule - Structure: No hints in the distractors. Repeating one word throughout the distractors provides a hint.
Good Item Writing Rules – What did we learn from Franzipanics? 7. The mintering function of the ignu is most effectively carried out in connection with a. raxmatol b. the groshingstantol c. the fribbled breg d. a frallysush Rule - Structure: Students’ opportunity to correctly answer an item must be independent of their performance on another item. Do NOT link items.
Good Item Writing Rules – What did we learn from Franzipanics? 8. a. b. c. d. Rule - Structure: Do NOT create patterns in your answer key.
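For districts that assemble their own tests electronically, the answer-key rule can be spot-checked before printing. The following is only a minimal Python sketch, not part of the workshop materials: the function name `key_pattern_report` and what it chooses to report (longest run of one letter, letter counts, and a repeating cycle) are illustrative assumptions about what counts as a "pattern".

```python
from collections import Counter
from itertools import groupby


def key_pattern_report(key):
    """Summarize obvious patterns in an answer key (hypothetical helper).

    key: sequence of keyed answers, e.g. ["a", "c", "b", "b"].
    Returns the longest run of one letter, how often each letter is keyed,
    and the length of a repeating cycle if the whole key repeats one.
    """
    letters = [k.lower() for k in key]

    # Longest stretch of consecutive items keyed to the same letter.
    longest_run = max((len(list(g)) for _, g in groupby(letters)), default=0)

    # Shortest period p such that every answer equals the one p items earlier.
    cycle = None
    for p in range(1, len(letters) // 2 + 1):
        if all(letters[i] == letters[i - p] for i in range(p, len(letters))):
            cycle = p
            break

    return {
        "longest_run": longest_run,
        "counts": dict(Counter(letters)),
        "cycle": cycle,
    }


patterned = key_pattern_report(list("abcdabcd"))
mixed = key_pattern_report(list("acbbdacd"))
```

Here `patterned["cycle"]` is 4 because the key repeats a-b-c-d, while `mixed["cycle"]` is `None`. What run length or imbalance is acceptable remains a judgment call for the item reviewer.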
The moral of this ‘test’… • ……with poorly written items, you can get them correct without knowing the intended content. • Reliability issue? • Validity issue?
BREAK TIME! 9:50 – 10:00
The Anatomy of the Multiple Choice Item Why are we writing items? • To gauge student learning. • To receive Act 48 hours. • To work with other adults. • To avoid my students. The Stem CORRECT ANSWER Distractor Distractor Distractor
Multiple-choice Items • The advantage is that, with careful construction, this item type can be used to measure knowledge at most levels. • The disadvantage is that it is hard to write good distractors for levels beyond factual recall.
Multiple-Choice Item Writing Guidelines – Rules regarding Structure: • Multiple choice items will have four response options and only one correct answer. • Response options should be parallel in reference to parts of speech. • Response options should not overlap.
Response options should be parallel in reference to parts of speech. EXAMPLE Why did the chicken cross the road? • To get to the other side. • Walking then running through traffic to catch the bus. • To demonstrate proficiency in barnyard standard 3.1. • Chicken feed on the other side.
Response options should not overlap. EXAMPLE The average combined score in the NFL playoff games in 2010 was: • <10 • <20 • >40 • >50 Revised options: • Less than 10 • About 20 • About 40 • More than 50
Multiple-Choice Item Writing Guidelines – Rules regarding Structure: • Avoid absolutes such as “only” or “never”. • If the distractors are numeric, put them in either ascending or descending order. If the distractors are alphabetic and relatively short (one to three words), put them in alphabetical order. • Response options should not include: ‘None of the above’, ‘All of the above’, ‘Not enough information’, ‘Cannot be determined’.
Multiple-Choice Item Writing Guidelines – Rules regarding Structure: • Don’t repeat the same word(s) at the beginning of each response – add the word(s) to the stem. • The item stem should be a complete thought. The easiest way to tell if the stem is a complete thought is to cover up the response options and see if you know what you’re supposed to do. Students should not have to figure out what you’re asking.
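Several of the structure rules above are mechanical enough to check automatically when items are stored electronically. The sketch below is a hedged Python illustration, not a PDE or Keystone tool: the name `check_item`, the word list in `ABSOLUTES`, and the exact wording of the messages are assumptions made for this example. It cannot judge parallel structure, overlap, or whether the stem is a complete thought; those still need a human reviewer.

```python
import re

# Options the guidelines say not to use (compared case-insensitively).
BANNED_OPTIONS = {
    "none of the above",
    "all of the above",
    "not enough information",
    "cannot be determined",
}

# Illustrative list of absolutes; extend or trim it to suit local practice.
ABSOLUTES = {"only", "never", "all", "always", "no"}


def _as_number(text):
    """Return the option as a float, or None if it is not purely numeric."""
    try:
        return float(text.replace(",", ""))
    except ValueError:
        return None


def check_item(stem, options):
    """Return a list of structure-rule problems found in one item.

    stem is accepted for completeness but not inspected here, because
    judging whether it is a complete thought needs a human reader.
    """
    problems = []

    if len(options) != 4:
        problems.append("needs exactly four response options")

    for opt in options:
        if opt.strip().lower().rstrip(".") in BANNED_OPTIONS:
            problems.append(f"banned option: {opt!r}")
            continue  # do not also report the 'all' inside 'All of the above'
        words = set(re.findall(r"[a-z']+", opt.lower()))
        if words & ABSOLUTES:
            problems.append(f"absolute wording in option: {opt!r}")

    # Numeric options should run in ascending or descending order.
    numbers = [_as_number(o) for o in options]
    if options and all(n is not None for n in numbers):
        if numbers != sorted(numbers) and numbers != sorted(numbers, reverse=True):
            problems.append("numeric options are not in ascending or descending order")

    # A word repeated at the start of every option belongs in the stem.
    first_words = {o.split()[0].lower() for o in options if o.split()}
    if len(options) > 1 and len(first_words) == 1 and all(len(o.split()) > 1 for o in options):
        problems.append("options start with the same word; move it into the stem")

    return problems


unordered = check_item("Which number is prime?", ["4", "9", "7", "15"])
clean = check_item("What is 2 + 2?", ["3", "4", "5", "6"])
```

In this example `unordered` contains a single message about numeric order and `clean` is an empty list. A simple word list will produce false alarms (for instance, "no" used in an ordinary phrase), so treat the output as prompts for review rather than verdicts.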
Multiple-Choice Item Writing Guidelines – Rules regarding Content: • Each multiple-choice item should be written to measure only one eligible content statement. • Items should be clear and concise, and they should use vocabulary and sentence structure appropriate for the assessed grade level. • Distractors should be incorrect but plausible based on the topic of the question. • Ensure that there is only one true and defensible answer.
Multiple-Choice Item Writing Guidelines – Rules regarding Content: • Avoid jargon and textbook language. • Avoid clichés. • Use common misinformation purposefully. • Use logical misinterpretations purposefully.
Multiple-Choice Item Writing Guidelines • Get your thoughts down on paper as soon as possible. Then edit. • Ensure that there is only one true and defensible answer. • Have someone else review your work.
We all have our unique skills and talents • Some of us are good item writers • Some of us are good item reviewers and editors