
Validity in Testing

Presentation Transcript


  1. Validity in Testing “Are we testing what we think we’re testing?”

  2. Components of test validity • Construct validity – Are we measuring the theoretical language ‘construct’ we claim to measure (general ones like ‘reading ability’, specific ones like ‘pronoun referent awareness’)? • Content validity – Are the test items representative of all the skills or structures in question? • Criterion-related validity – Does the test give results similar to those of more thorough assessments?

  3. Content validity • The items test the targeted skill • The selection of items is proportional to the importance of the skills (important skills get more items; some less important skills are not addressed at all) • The test accurately reflects the test specs • Selection must be principled, not based on which items are easiest to create or score (a sketch of a spec check follows)
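As a rough illustration (not from the original slides), here is a minimal Python sketch of checking a test against its specifications; the skill names, spec weights, and item counts are all hypothetical:

```python
# Minimal sketch: compare each skill's share of items on the actual
# test against the weight assigned to it in the test specs.
spec_weights = {"main idea": 0.40, "inference": 0.30,
                "pronoun reference": 0.20, "vocabulary in context": 0.10}

items_per_skill = {"main idea": 8, "inference": 6,
                   "pronoun reference": 4, "vocabulary in context": 2}

total = sum(items_per_skill.values())
for skill, weight in spec_weights.items():
    actual = items_per_skill.get(skill, 0) / total
    flag = "OK" if abs(actual - weight) < 0.05 else "CHECK"
    print(f"{skill}: spec {weight:.0%}, actual {actual:.0%} [{flag}]")
```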

  4. Criterion-related validity • Concurrent validity – the test and the other, more thorough assessment occur at the same time • 100 students receive a 5-minute mini-interview to assess speaking skills; of that group, 5 are also given more complete 45-minute interviews. Do the scores of the 5 who did both match? (The degree of agreement is called the validity coefficient, a number between 0 and 1.) • If yes, there is a degree of concurrent validity; if no, there is not.

  5. Validity Coefficients
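The slides do not show how the coefficient is computed, but in practice a validity coefficient is the correlation between the two sets of scores. Below is a minimal Python sketch (Python 3.10+); the interview scores are hypothetical, invented for illustration:

```python
# Validity coefficient as the Pearson correlation between two sets of
# scores for the same students (statistics.correlation, Python 3.10+).
from statistics import correlation

# Hypothetical scores for the 5 students who took both assessments
mini_interview = [72, 85, 64, 90, 78]   # 5-minute mini-interview
full_interview = [70, 88, 60, 93, 75]   # 45-minute full interview

r = correlation(mini_interview, full_interview)
print(f"Validity coefficient: {r:.2f}")  # values near 1 = strong agreement
```

A coefficient near 1 would support using the quick mini-interview in place of the long one; a low value would not.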

  6. Criterion-related validity • Predictive validity – can the test predict future success? • Have you noticed whether passing a test consistently means that a student will do well later on? • This is often a subjective, anecdotal kind of measure, built up by teachers over a long period with many students (a rough sketch of a more formal check follows) • Reality check: there are MANY other factors that affect success (motivation, background knowledge, etc.)
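Where later outcome data are available, the anecdotal impression can be checked more formally. A minimal sketch, assuming hypothetical placement scores and end-of-course grades (and, as the slide warns, ignoring all the other factors that affect success):

```python
# Predictive validity: correlate entry-test scores with a later outcome
# and fit a simple one-variable linear model (Python 3.10+).
from statistics import correlation, linear_regression

placement_scores = [55, 62, 70, 78, 85, 91]   # hypothetical entry-test scores
final_grades     = [60, 58, 75, 80, 82, 95]   # hypothetical grades a term later

r = correlation(placement_scores, final_grades)
slope, intercept = linear_regression(placement_scores, final_grades)

print(f"Predictive validity coefficient: {r:.2f}")
print(f"Predicted grade for a placement score of 75: {slope * 75 + intercept:.1f}")
```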

  7. Limits on validity If you are testing directly, the issue of validity is less complicated: it is usually pretty clear whether you are testing what you mean to test. But consider this: Exercise 9 Pronunciation – odd man out. In the following lists of words, three words rhyme; underline the one that is different. • Creak steak squeak shriek • Spear wear cheer leer • Tomb boom broom bomb • Howl growl bowl prowl • Shed said raid tread (Does a silent, paper-based rhyming task really measure pronunciation?)

  8. Validity in scoring • How items are scored affects their validity • If the construct is reading, scores on short answers that deduct points for punctuation and grammar are not valid • Measuring more than one construct in a single item reduces its validity. How can you address this kind of problem? (One possibility is sketched below.)
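One way to address it (a hypothetical illustration, not taken from the slides) is to score only the target construct and let everything else go unpenalized. A toy Python sketch for a short-answer reading item, with a made-up answer key and student response:

```python
# Construct-pure scoring: a reading item is scored on content points
# only; grammar, spelling, and punctuation cost nothing.
def score_reading_answer(answer: str, key_points: list[str]) -> int:
    """Award one point per key content point mentioned in the answer."""
    text = answer.lower()
    return sum(1 for point in key_points if point in text)

# Hypothetical answer key and student response
key_points = ["moved", "city", "job"]
answer = "He moved to the city becuse he wanted to find a job."

# The spelling error ("becuse") costs nothing, because only the
# reading construct is being measured.
print(score_reading_answer(answer, key_points))  # -> 3
```

A real rubric would be more sophisticated, but the principle is the same: if grammar matters, score it as a separate construct rather than deducting it from the reading score.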

  9. Face validity Teachers, students and administrators usually believe that: • A test of grammar should test… grammar! • A test of pronunciation requires… speaking! • A test of writing means that a student must… write! Do you agree?

  10. Making tests more valid • Write detailed test specs • Use direct testing whenever possible • Check that scoring is based only on the target skill • Keep tests reliable How important is it to check for validity?

  11. Considering Validity • Find a test given at UABC (if possible, one you wrote or one you are familiar with) • Make a judgment on its validity, considering content, criterion-related, scoring, and face validity • Discuss this with several other people and present a short summary of the discussion to the whole group. John Bunting (2004), presentation in the course Testing, Assessment and Teaching: A Program for EFL Teachers, Facultad de Idiomas, UABC.
