Data Collection and Score Interpretation

Dr. Jeffrey Oescher 27 January 2014. Data Collection and Score Interpretation. Technical Issues. Two technical issues Validity Reliability. Technical Issues. Validity – the extent to which inferences made on the basis of scores from an instrument are appropriate, meaningful, and useful


Presentation Transcript


  1. Dr. Jeffrey Oescher 27 January 2014 Data Collection and Score Interpretation

  2. Technical Issues • Two technical issues • Validity • Reliability

  3. Technical Issues • Validity – the extent to which inferences made on the basis of scores from an instrument are appropriate, meaningful, and useful • Characteristics • Refers to the interpretation of the results • Is a matter of degree • Is situation specific • Is a unitary concept • Involves an overall judgment

  4. Data Collection – Technical Issues • Validity evidence • Content (face, content) • Construct • Criterion-related (predictive, concurrent) • Situationally specific

  5. Data Collection – Technical Issues • Reliability • The extent to which scores are free from error • Error is measured by consistency • Two perspectives • Test – the reliability of a test • Agreement – the reliability of an observation

  6. Data Collection – Technical Issues • Test reliability evidence • Stability • Also known as test-retest • Measured on a scale of 0 to 1 • Equivalence • Also known as parallel forms • Measured on a scale of 0 to 1 • Internal consistency • Split half • KR 20 • KR 21 • Cronbach alpha • All measured on a scale from 0 to 1
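As an illustration of the internal-consistency coefficients listed on slide 6, here is a minimal Python sketch of Cronbach's alpha. The function name `cronbach_alpha` and the sample scores are hypothetical and do not come from the presentation; rows are students and columns are items. With items scored 0/1, as in this data, the same formula yields KR-20.

```python
from statistics import pvariance

def cronbach_alpha(scores):
    """Cronbach's alpha for a table of scores (rows = students, columns = items)."""
    k = len(scores[0])  # number of items
    if k < 2:
        raise ValueError("need at least two items")
    # Variance of each item across students
    item_variances = [pvariance([row[i] for row in scores]) for i in range(k)]
    # Variance of the students' total scores
    total_variance = pvariance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores have no variance")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical data: 5 students, 4 items scored 0/1
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
alpha = cronbach_alpha(scores)
print(round(alpha, 3))
```

For this made-up data the coefficient works out to 0.8. Alpha can drop below 0 when items are negatively related, so the 0-to-1 scale describes the range normally seen in practice.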

  7. Data Collection – Technical Issues • Score reliability evidence • Standard error of measurement or SEM • A statistic that allows one to ascertain the probability that a student’s score falls within a given range of scores • Usually reported as the student’s score and ‘SEM = +/- 2.25’ • You can add and subtract one (1) SEM to a student’s score and be confident that their score falls within that range of scores about 68% of the time • You can add and subtract two (2) SEM to a student’s score and be confident that their score falls within that range of scores about 95% of the time • Agreement reliability evidence • Percentage of agreement between observers • More commonly known as inter-rater reliability • Ranges on a scale from 0 to 1
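The SEM bands and the observer-agreement figure from slide 7 can be sketched in a few lines of Python. This is only an illustration: the function names, the observed score of 80, and the observer codes are hypothetical, and only the SEM of 2.25 comes from the slide. The bands assume normally distributed measurement error, under which one SEM on either side covers about 68% and two SEMs about 95%. The helper `sem_from_reliability` uses the classical formula SEM = SD × sqrt(1 − reliability), which the slide does not state.

```python
import math

def sem_from_reliability(sd, reliability):
    """Classical formula (not stated on the slide): SEM = SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1 - reliability)

def score_band(score, sem, n_sem=1):
    """Range obtained by adding and subtracting n_sem SEMs to an observed score."""
    return (score - n_sem * sem, score + n_sem * sem)

def percent_agreement(rater_a, rater_b):
    """Proportion (0 to 1) of observations on which two observers gave the same code."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equally long, non-empty lists of ratings")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)

sem = 2.25      # value quoted on the slide
observed = 80   # hypothetical student score
band_68 = score_band(observed, sem, 1)  # roughly the 68% band
band_95 = score_band(observed, sem, 2)  # roughly the 95% band
print(band_68, band_95)

# Hypothetical codes from two observers watching the same four events
observer_1 = ["on-task", "off-task", "on-task", "on-task"]
observer_2 = ["on-task", "off-task", "off-task", "on-task"]
agreement = percent_agreement(observer_1, observer_2)
print(agreement)
```

With these inputs the one-SEM band runs from 77.75 to 82.25, the two-SEM band from 75.5 to 84.5, and the observers agree on three of four events (0.75).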

  8. Score Interpretation • Two types of interpretations: criterion-referenced and norm-referenced • Criterion-referenced • You need to know the underlying scale (e.g., 0-100, 1-5, etc.) upon which the scores are based • The interpretation of the test score is made relative to this underlying scale • The scores indicated that the students mastered about three-fourths of the objectives • The scores are interpreted relative to what the students know • The scores easily communicate some level of performance (e.g., good, bad, moderate, etc.)

  9. Score Interpretation • Norm-referenced • You need to know the reference group (i.e., norming sample) against which the scores are being compared • The interpretation of test scores is made in relation to the scores of students in the norming group • John’s score put him in the 85th percentile • John’s score indicates he performed better than 85% of the students in the norming group • John’s score doesn’t tell us anything about what John knows in terms of content
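To make the percentile statement on slide 9 concrete, here is a small Python sketch. The function `percentile_rank`, the norming sample, and John's raw score of 84 are all hypothetical. The sketch takes percentile rank to mean the percentage of the norming group scoring strictly below the given score; published tests may handle ties differently.

```python
def percentile_rank(score, norming_scores):
    """Percentage of the norming group scoring strictly below the given score."""
    if not norming_scores:
        raise ValueError("norming group is empty")
    below = sum(1 for s in norming_scores if s < score)
    return 100 * below / len(norming_scores)

# Hypothetical norming sample of 20 scores: 50, 52, ..., 88
norming_group = list(range(50, 90, 2))
john = 84
john_rank = percentile_rank(john, norming_group)
print(john_rank)
```

Seventeen of the twenty norming scores fall below 84, so the result is 85.0. The number says where John stands in the group and nothing about which content he has mastered.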

  10. Score Interpretation • A note of caution • Which of the following is a criterion-referenced interpretation, and which is a norm-referenced interpretation? • The scores for the experimental group were significantly higher than those for the control group. • The scores for the experimental group indicated mastery of about 95% of the objectives, while the scores for the control group indicated only 65% mastery. • These are common examples from the literature you will be reading • Be careful about the first interpretation, as it only tells us which group did better. It does not tell us how well either group performed.
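The caution on slide 10 can be shown with a short Python sketch. Everything here is hypothetical (the function names, the 20-objective test, and the group scores); scenario A is simply chosen to reproduce the 95% and 65% mastery figures. The first kind of statement compares the groups with each other, while the second reads each group's scores against the scale of objectives.

```python
from statistics import mean

def mean_difference(group_a, group_b):
    """Group comparison: how far apart the two group means are."""
    return mean(group_a) - mean(group_b)

def percent_mastery(group, n_objectives):
    """Criterion-referenced summary: mean share of objectives mastered, in percent."""
    return 100 * mean(group) / n_objectives

N_OBJECTIVES = 20  # hypothetical test covering 20 objectives

# Scenario A: reproduces the 95% vs 65% mastery figures
experimental_a = [19, 20, 18, 19]
control_a = [13, 14, 12, 13]

# Scenario B: same gap between the groups, but both perform poorly
experimental_b = [9, 10, 8, 9]
control_b = [3, 4, 2, 3]

for label, exp, ctl in [("A", experimental_a, control_a),
                        ("B", experimental_b, control_b)]:
    print(label,
          "difference:", mean_difference(exp, ctl),
          "mastery:", percent_mastery(exp, N_OBJECTIVES),
          percent_mastery(ctl, N_OBJECTIVES))
```

Both scenarios show the same six-objective gap between groups, yet mastery is 95% and 65% in scenario A and only 45% and 15% in scenario B. The group comparison alone cannot distinguish the two.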
