Data Collection and Score Interpretation
Dr. Jeffrey Oescher
27 January 2014
Technical Issues
  • Two technical issues
    • Validity
    • Reliability
Technical Issues
  • Validity: the extent to which inferences made on the basis of scores from an instrument are appropriate, meaningful, and useful
  • Characteristics
    • Refers to the interpretation of the results
    • Is a matter of degree
    • Is situation specific
    • Is a unitary concept
    • Involves an overall judgment
Data Collection – Technical Issues
  • Validity evidence
    • Content
      • Face
      • Content
    • Construct
    • Criterion-related
      • Predictive
      • Concurrent
    • Situationally specific
Data Collection – Technical Issues
  • Reliability
    • The extent to which scores are free from error
    • Error is measured by consistency
  • Two perspectives
    • Test: the reliability of a test
    • Agreement: the reliability of an observation
Data Collection – Technical Issues
  • Test reliability evidence
    • Stability
      • Also known as test-retest
      • Measured on a scale of 0 to 1
    • Equivalence
      • Also known as parallel forms
      • Measured on a scale of 0 to 1
    • Internal consistency
      • Split-half
      • KR-20
      • KR-21
      • Cronbach's alpha
      • All measured on a scale from 0 to 1
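As an illustration of an internal consistency estimate, here is a minimal sketch of Cronbach's alpha using only the Python standard library. The function name `cronbach_alpha` and the response data are hypothetical and are not part of the slides. The sketch assumes one row per examinee and one column per item, and it uses the sample variance for both the items and the total scores. For right/wrong (1/0) items scored this way, alpha reduces to KR-20.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a score matrix: one row per examinee, one column per item."""
    k = len(rows[0])  # number of items
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two examinees")
    items = list(zip(*rows))  # transpose so each tuple holds one item's scores
    item_variance_sum = sum(variance(col) for col in items)
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical right/wrong (1/0) responses: 4 examinees x 3 items.
responses = [
    [1, 0, 1],
    [1, 1, 1],
    [0, 0, 0],
    [1, 1, 0],
]
print(round(cronbach_alpha(responses), 3))  # 0.632
```

The function fails with a division by zero if every examinee has the same total score, because there is then no score variance to partition.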
Data Collection – Technical Issues
  • Score reliability evidence
    • Standard error of measurement (SEM)
      • A statistic that allows one to ascertain the probability that a student’s score falls within a given range of scores
      • Usually reported alongside the student’s score, e.g., ‘SEM = +/- 2.25’
      • You can add and subtract one (1) SEM to a student’s score and be confident that their score falls within that range of scores about 68% of the time
      • You can add and subtract two (2) SEMs to a student’s score and be confident that their score falls within that range of scores about 95% of the time
  • Agreement reliability evidence
    • Percentage of agreement between observers
    • More commonly known as inter-rater reliability
    • Ranges on a scale from 0 to 1
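The SEM arithmetic and the percentage of agreement can both be sketched in a few lines of Python. The function names, the score of 80, and the observer codes below are hypothetical. Only the SEM of 2.25 comes from the slide. The coverage figures assume that measurement error is normally distributed, which is the usual assumption behind the 68% rule of thumb.

```python
from statistics import NormalDist


def sem_band(score, sem, n_sem=1):
    """Range obtained by adding and subtracting n_sem SEMs to a score."""
    return (score - n_sem * sem, score + n_sem * sem)


def band_coverage(n_sem):
    """Probability within +/- n_sem standard errors, assuming normal error."""
    z = NormalDist()  # standard normal distribution
    return z.cdf(n_sem) - z.cdf(-n_sem)


def percent_agreement(rater_a, rater_b):
    """Proportion (0 to 1) of observations that two observers coded identically."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both observers must code the same, non-empty set of observations")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


# Hypothetical student score of 80 with the slide's SEM of 2.25.
print(sem_band(80, 2.25))          # (77.75, 82.25)
print(sem_band(80, 2.25, 2))       # (75.5, 84.5)
print(round(band_coverage(1), 3))  # 0.683
print(round(band_coverage(2), 3))  # 0.954

# Hypothetical codes from two observers watching the same four events.
observer_1 = ["on-task", "off-task", "on-task", "on-task"]
observer_2 = ["on-task", "off-task", "off-task", "on-task"]
print(percent_agreement(observer_1, observer_2))  # 0.75
```

Simple percentage agreement does not correct for agreement that would occur by chance, so it tends to look more favorable than chance-corrected indices.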
Score Interpretation
  • Two types of interpretations: criterion-referenced and norm-referenced
  • Criterion-referenced
    • You need to know the underlying scale (e.g., 0-100, 1-5, etc.) upon which the scores are based
    • The interpretation of the test score is made relative to this underlying scale
      • The scores indicated the students mastered about three-fourths of the objectives
    • The scores are interpreted relative to what the students know
    • The scores easily communicate some level of performance (e.g., good, bad, moderate, etc.)
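A criterion-referenced reading can be sketched as a calculation against the underlying scale alone, with no reference to other students. The function names, the count of 15 of 20 objectives, and the cut-offs of 80 and 50 are hypothetical choices made for illustration. In practice the cut-offs would come from the instrument's own documentation.

```python
def percent_mastered(objectives_mastered, objectives_total):
    """Percentage of objectives mastered, on a 0-100 scale."""
    if objectives_total <= 0:
        raise ValueError("objectives_total must be positive")
    return 100 * objectives_mastered / objectives_total


def performance_label(percent, cutoffs=((80, "good"), (50, "moderate"))):
    """Map a percentage to a label using hypothetical cut-offs, highest first."""
    for minimum, label in cutoffs:
        if percent >= minimum:
            return label
    return "bad"


pct = percent_mastered(15, 20)  # 15 of 20 objectives mastered
print(pct)                      # 75.0, i.e. about three-fourths
print(performance_label(pct))   # moderate
```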
Score Interpretation
  • Norm-referenced
    • You need to know the reference group (i.e., norming sample) against which the scores are being compared
    • The interpretation of test scores is made in relation to the scores of students in the norming group
      • John’s score puts him in the 85th percentile
      • John’s score indicates he performed better than 85% of the students in the norming group
    • John’s score doesn’t tell us anything about what John knows in terms of content
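A norm-referenced reading depends only on where a score sits in the norming sample. The sketch below computes a percentile rank as the percentage of norming scores strictly below the student's score. This is one common definition, and some instruments count scores at or below instead. The function name and the norming sample of 20 scores are hypothetical.

```python
def percentile_rank(score, norming_scores):
    """Percentage of the norming sample scoring strictly below the given score."""
    if not norming_scores:
        raise ValueError("norming sample must not be empty")
    below = sum(s < score for s in norming_scores)
    return 100 * below / len(norming_scores)


# Hypothetical norming sample: 20 students with scores 1 through 20.
norming_sample = list(range(1, 21))
print(percentile_rank(17.5, norming_sample))  # 85.0
```

The result would be the same if every score in the sample were multiplied by ten, which is why a percentile by itself says nothing about how much content was mastered.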
Score Interpretation
  • A note of caution
    • Which of the following is a criterion-referenced interpretation, and which is a norm-referenced interpretation?
      • The scores for the experimental group were significantly higher than those for the control group.
      • The scores for the experimental group indicated mastery of about 95% of the objectives, while those for the control group indicated only 65% mastery.
    • These are common examples from the literature you will be reading
    • Be careful about the first interpretation, as it only tells us which group is better. It does not tell us how well either group performed.
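The caution can be made concrete with a small sketch. Two hypothetical studies show the same gap between groups, so a purely comparative statement would read the same for both, yet the mastery levels differ sharply. Only the 95% and 65% figures come from the slide. Study B's figures and the function name are invented for illustration.

```python
def group_gap(experimental_pct, control_pct):
    """Difference in percent mastery between the experimental and control groups."""
    return experimental_pct - control_pct


# Study A uses the slide's figures. Study B is a hypothetical contrast.
study_a = {"experimental": 95, "control": 65}
study_b = {"experimental": 40, "control": 10}

for name, study in (("A", study_a), ("B", study_b)):
    gap = group_gap(study["experimental"], study["control"])
    print(name, "gap:", gap, "experimental mastery:", study["experimental"])
# A gap: 30 experimental mastery: 95
# B gap: 30 experimental mastery: 40
```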