Validity

Validity EDUC 5535 Spring 2013

Standardized Tests: A Review • An artifact of the eugenics movement (in the 1920s): an attempt to sort people by their perceived intelligence or ability. • Short buzz: How are tests used to sort?

Examples (Sorting Purpose: Test) • Who goes to what college: SAT • Who is on grade level, who isn’t: ITBS (Iowa Test of Basic Skills) • Who is proficient in English, who isn’t: CELA

Validity and Reliability • These concepts were developed because it was important to know whether the tests used for sorting were accurate sorting measures (e.g., Are we sure that the children we are putting in special education really need to be there? Are we sure the ELL children know enough English to achieve in English-medium classrooms?)

Critical Components of Standardized Tests • Validity - Does the test measure what it purports to measure? • Construct • Content • Concurrent - criterion • Predictive • Consequential • More importantly - threats to validity and reliability

Content Validity • The extent to which an assessment procedure adequately represents the content of the curricular aim being measured. • How do we determine content validity? • Expert review (domain experts) • External reviewers (to check on what the experts created) • Expert Panel

Social Studies Exam • 1. Which of these countries is not in South America? • Brazil • Canada • Argentina • Venezuela • 2. How many continents are there? • Threats to validity? • Reading and writing skill (if you do not read well you may not do well on the test) • Cultural differences

Higher order thinking skills (analogy) • Red is to firetruck as _______________ is to lemons. • Tortillas are to ________________ as _____________ is to a toaster. • Threats to validity: no opportunity to learn (OTL) analogies, cultural issues, reading level.

This is a bureau: What is missing?

Garcia (2005) • Every test in English is first and foremost a test of English. • Every paper/pencil test, no matter what the content is, is first and foremost a test of literacy. • All tests have inherent cultural, linguistic and economic biases.

Construct related validity • The extent to which empirical evidence confirms that an inferred construct exists and that a given assessment procedure is measuring the inferred construct accurately. • Construct validity is also determined by: • Content experts • External reviewers • Expert panels

What constructs are needed to be a good writer? • Content • Conventions • Spelling • Genre • Audience • A writing assessment that has construct validity has all of the above constructs • What might be constructs related to being a good reader?

Answer the following: • Which title should be underlined? • America the Beautiful • Gone with the Wind • Damn Yankees • Threats to construct validity? • Child knows the rule for underlining but does not know which of the above is a song, book or play

Reading Comprehension • Read the text provided to you • Answer the comprehension questions • Final comprehension question: Choose a title for this essay. • Scoring: • Final question weighted more heavily because it is an inferential question • Threats to validity?
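To make the weighting in this scoring scheme concrete, here is a minimal sketch of how one heavily weighted inferential question can separate two students who each answered the same number of items correctly. The weights, the number of questions, and both answer patterns are hypothetical and are not taken from the slides.

```python
def weighted_score(correct, weights):
    """Return the fraction of available weighted points a student earned."""
    if len(correct) != len(weights):
        raise ValueError("correct and weights must have the same length")
    earned = sum(w for c, w in zip(correct, weights) if c)
    return earned / sum(weights)


# Hypothetical weights: four literal questions worth 1 point each,
# and a final inferential question ("choose a title") worth 3 points.
WEIGHTS = [1, 1, 1, 1, 3]

# Both hypothetical students answer 4 of 5 questions correctly.
student_a = [False, True, True, True, True]   # misses a literal question
student_b = [True, True, True, True, False]   # misses the final question

score_a = weighted_score(student_a, WEIGHTS)
score_b = weighted_score(student_b, WEIGHTS)

print(f"Student A: {score_a:.2f}")
print(f"Student B: {score_b:.2f}")
```

Under these assumed weights, the two students receive different scores even though each missed exactly one question, so whatever the final question actually measures has an outsized effect on the reported comprehension score.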

Concurrent Validity • Do outcomes from one assessment correlate positively with those from another assessment that purports to measure the same constructs? • Concurrent validity is measured by comparing outcomes of one assessment to another (e.g., compare LAS to CELA; ITBS to CSAP; SAT to ACT). Look for correlations above .50.
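As a rough illustration of the comparison described above, the sketch below computes a Pearson correlation between two sets of scores from the same students and checks it against the .50 guideline. The slides do not say which correlation statistic is meant, so Pearson's r is an assumption here, and the scores are invented for illustration only.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list has no variation")
    return cov / math.sqrt(var_x * var_y)


THRESHOLD = 0.50  # guideline from the slide

# Hypothetical scores for the same six students on two assessments.
test_a = [12, 15, 9, 20, 17, 11]
test_b = [30, 34, 25, 41, 39, 27]

r = pearson_r(test_a, test_b)
print(f"r = {r:.2f}; above threshold: {r > THRESHOLD}")
```

A correlation above the threshold shows only that the two assessments rank students similarly. If both share the same bias (for example, both depend heavily on English reading skill), they can correlate strongly while neither measures the intended construct.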

Predictive Validity • Does the test/assessment predict future performance or behavior? • Do SATs predict preparedness for college? • Do GREs predict preparedness for graduate school? • Do 3rd grade reading test scores predict who will struggle in high school? • Do school readiness tests predict who is ready for kindergarten? • Does DIBELS predict reading comprehension?

Consequential Validity • What are the consequences of outcomes on the assessment? Are the outcomes used the way the test creators intended? • Threats to validity • Tests given to a population they were not intended for (e.g., CSAP to ELLs) • Tests used for unintended purposes (e.g., to rank schools and deem some ‘good’ and others ‘bad’)

Threats to Validity of Tests for L2 Students • Tests measure language skills of students and not knowledge of content • Tests contain economically, linguistically and culturally biased items • Tests were created for one population and given to another (e.g. ITBS or CSAP with L2 students) • Modifications of tests for L2 are inconsistent • Improperly trained people are administering the tests (e.g. paraprofessionals administering ACCESS)

Miscellaneous Notes on Validity and Reliability • Invalid tests are often highly reliable • The consistent use of invalid tests creates the façade of an achievement gap

Shepard • Shepard discusses uses and abuses of tests (these are validity issues) • In your group identify what you think are the 3 most salient to YOUR work • Do these match practice in your district/school? • What should we do?
