The Tortured History of Reading Comprehension Assessment. P. David Pearson UC Berkeley Professor and Former Dean. Former. Slides available at www.scienceandliteracy.org. ___________ __________ __________ ___________ __________ _________ _________. The guiding questions….
The Tortured History of Reading Comprehension Assessment • P. David Pearson • UC Berkeley • Professor and Former Dean • Slides available at www.scienceandliteracy.org
___________ • __________ • __________ • ___________ • __________ • _________ • _________
The guiding questions… • Are there lessons from the past? • Is there hope for the future? • Will we ever get it right?
My predictions • Are there lessons from the past? • Is there hope for the future? • Will we ever get it right? MAYBE YES MAYBE MAYBE MAYBE NO
Why me? • Have lived the tensions in comprehension assessment, but • An unwitting and unwilling candidate for the task? • Comprehension assessment was on the way to more important goals.
Mea Culpa • Those whose work I omit • Those whose work I DON’T omit
Why now? • Post-NCLB uneasiness among practitioners that the code, as important as it is, may not be the point of reading • New Rhetorical Moves • Deeper Learning • 21st Century Skills • Reading-Writing Connections • Common Core Standards • Read like a detective; write like a reporter • Common Core Standards: The substance • New R&D: Reading for Understanding • New psychometric tools
Why now? More… • National thirst for accountability requires impeccable measures (both conceptually and psychometrically) • When the stakes are high, so too must be the standards • Pleas of teachers desperate for useful tools • Comprehension analogue of running records of oral reading as indices of fluency and accuracy
The real need… • Theoretically elegant, yes • Even more need for an everyday monitoring tool
LESSONS FROM THE PAST • There are many • A few of my favorites
Lesson #1: No matter how hard we try, we never “see” reading comprehension.
artifacts educators
If we never see the click of comprehension… • What criterion do we adopt as our gold standard for determining the validity of the various markings we have to live with? • Cognitive interview • Illinois Goal Assessment Program • NAEP and most other assessments • The immediate apprehension of understanding • The sense of getting it! • Self ratings and think alouds • Criterion variables worth predicting
Lesson #2: No matter how novel you think your idea is, you can find a precedent for it that is at least a century old
I know! Let’s use open ended performance items on our high stakes state test!
Tidbits from the 1919 Wisconsin State High School Reading Exam • Write a few memory gems from your literature experiences in school. • Name three novels you have read this year and give the plot of each. • What is the significance of the “letter” in Hawthorne’s The Scarlet Letter? • Define these terms: poetic license, flashback, and simile.
So as long as we are forced to use MC formats, let’s have the students pick more than one right answer!
A curious example from early 1900s “Every one of us, whatever our speculative opinion, knows better than he practices, and recognizes a better law than he obeys.” Check two of the following statements with the same meaning as the quotation above. • To know right is to do the right. • Our speculative opinions determine our actions. • Our deeds often fall short of the actions we approve. • Our ideas are in advance of our everyday behavior. From Thurstone, undated circa 1910
Chapman 1924 • Find the statements in Part 2 of the paragraph that don’t fit the statements in Part 1…
I know! Let’s do think alouds to get at what students are really doing while they answer the questions!
Touton and Berry (1931) Error analyses (a) failure to understand the question (b) failure to isolate elements of “an involved statement” read in context (c) failure to associate related elements in a context (d) failure to grasp and retain ideas essential to understanding concepts (e) failure to see setting of the context as a whole (f) other irrelevant answers
Lesson #3: Grain size really matters in reading assessment, even within comprehension assessment
The Scene in the US in the 1970s and early 1980s • Behavioral objectives • Mastery Learning • Criterion referenced assessments • Curriculum-embedded assessments • Minimal competency tests: New Jersey • Statewide assessments: Michigan & Minnesota
Historical relationships between instruction and assessment • The 1970s • Skills management mentality: Teach a skill, assess it for mastery, reteach it if necessary, and then go on to the next skill. • Foundation: Benjamin Bloom’s ideas of mastery learning • [Diagram: Skill 1 and Skill 2, each with its own Teach, Assess, Conclude cycle]
The 1970s, cont. • And we taught each of these skills until we had covered the entire curriculum for a grade level. • [Diagram: Skill 1 through Skill 6, each with its own Teach, Assess, Conclude cycle]
Sue’s grandmother lives on a farm. Ellen’s grandmother lives in the city. Sue’s grandmother, who just turned 55, phones Sue every month. Ellen’s grandmother, who is also 55, sends Ellen e-mails several times a week. Both grandmothers love their granddaughters. • How are Sue and Ellen’s grandmothers alike? • They both love their granddaughters • They both use e-mail • They both live on a farm • How are they different? • They live in different places • They have different color hair • They are different ages I wrote this item for Ginn Basic Reading Program in 1981
Fast Forward to 2002 • These specific skill tests have not gone away • Today’s standards are yesterday’s objectives or skills
The 2000s • And we taught each of these standards until we had covered the entire curriculum for a grade level. • [Diagram: Skill/standard 1 through Skill/standard 6, each with its own Teach, Assess, Conclude cycle]
A word about benchmark assessments… • The world is filled with assessments that provide useful information… • But are not worth teaching to • They are good thermometers or dipsticks • Not good curriculum
The ultimate assessment dilemma… • What do we do with all of these timed tests of fine-grained skills: • Words correct per minute • Words recalled per minute • Letter sounds named per minute • Phonemes identified per minute • Scott Paris: Constrained versus unconstrained skills • Pearson: Mastery constructs versus growth constructs
Why they are so seductive • Mirror at least some of the components of the NRP report • Correlate with lots of other assessments that have the look and feel of real reading • Take advantage of the well-documented finding that speed metrics are almost always correlated with ability, especially verbal ability. • Example: alphabet knowledge • 90% of the kids might be 90% accurate but… • They will be normally distributed in terms of LNPM
How to get a high correlation between a mastered skill and something else • [Figure: score distributions for Letter Name Fluency (LNPM) and Letter Name Accuracy] • The wider the distribution of scores, the greater the likelihood of obtaining a high correlation with a criterion
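The point about distribution width can be illustrated with a small simulation. This is only a sketch under invented assumptions, not data from the talk: every number below (sample size, means, noise levels, the 26-letter ceiling, the `simulate` and `pearson` helper names) is hypothetical. Both measures are driven by the same underlying ability with the same signal-to-noise ratio; the only difference is that the accuracy score hits a ceiling because most children have mastered the skill, while the speeded score stays widely spread.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=5000, seed=0):
    """Generate hypothetical scores for n children (all parameters invented)."""
    rng = random.Random(seed)
    fluency, accuracy, criterion = [], [], []
    for _ in range(n):
        ability = rng.gauss(0, 1)
        # Criterion: some other reading measure that depends on ability.
        criterion.append(ability + rng.gauss(0, 1))
        # Speeded score (letters named per minute): wide, roughly normal spread.
        fluency.append(40 + 10 * ability + rng.gauss(0, 5))
        # Untimed accuracy (letters known out of 26): same signal-to-noise
        # ratio as fluency, but most children pile up at the ceiling.
        raw = 27 + 1.5 * ability + rng.gauss(0, 0.75)
        accuracy.append(min(26, max(0, round(raw))))
    return fluency, accuracy, criterion


fluency, accuracy, criterion = simulate()
r_fluency = pearson(fluency, criterion)
r_accuracy = pearson(accuracy, criterion)
share_at_ceiling = sum(a == 26 for a in accuracy) / len(accuracy)

print(f"share of children at the accuracy ceiling: {share_at_ceiling:.2f}")
print(f"correlation of fluency with criterion:  {r_fluency:.2f}")
print(f"correlation of accuracy with criterion: {r_accuracy:.2f}")
```

In this toy setup the speeded score correlates more strongly with the criterion purely because its scores are spread out, not because speed carries any extra information about reading.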
Face validity problem: What virtue is there in doing things faster? • naming letters, sounds, words, ideas • What would you do differently if you knew that Susie was faster than Ted at naming X, Y, or Z???
They meet only one of the tests of validity: criterion-related validity • correlate with other measures given at the same time--concurrent validity • predict scores on other reading assessments--predictive validity
Fail the test of curricular or face validity • They do not, on the face of it, look like what we are teaching…especially the speeded part • Unless, of course, we change instruction to match the test
Really fail the test of consequential validity • Weekly timed trials instruction • Confuses means and ends • Proxies don’t make good goals
The Achilles Heel: Consequential Validity • Give DIBELS • Give Comprehension Test • Use results to craft instruction • Give DIBELS again • Give Comprehension Test • The emperor has no clothes
Collateral Damage • Tight link between instruction and assessment • Assess at a low level of challenge • Basic Skills Conspiracy • First you have to get all the words right and all the facts straight before you can do the what ifs and I wonder whats.
The bottom line on so many of these tests • Never send a test out to do a curriculum’s job!
Lesson #4: It is very difficult to oust the incumbent • Two mini-case studies • Unconventional state assessments • Performance assessments
Valencia and Pearson (1987) Reading Assessment: Time for a Change. In Reading Teacher