Chapter 5 Statistical Concepts: Creating New Scores to Interpret Test Data
Norm-Referenced vs. Criterion-Referenced Testing • Norm-referencing • Compares each person’s score to the average score of a peer or “norm” group • Criterion-referencing • Compares test scores to a predetermined value or set of criteria • See Table 5.1, p. 83
Derived or Converted Scores • Ways of making sense out of normal curve • Types of derived scores: • percentiles • standard scores • developmental norms
Derived or Converted Scores: Percentiles • A percentile is the percentage of people falling below the obtained score. Percentiles range from 1 to 99, with 50 falling at the mean of a normal curve. • Do not confuse percentile scores with “percentage correct.” • See Figure 5.1, p. 85.
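The difference between a percentile and "percentage correct" can be sketched in a few lines. This is only an illustration: the function names, the ten-person norm group, and the 40-item test length are all made up for the example and do not come from the chapter.

```python
def percentile_rank(score, norm_scores):
    """Percentage of the norm group that scored below `score`."""
    below = sum(1 for s in norm_scores if s < score)
    return 100.0 * below / len(norm_scores)


def percentage_correct(items_correct, items_total):
    """Share of test items answered correctly (not a percentile)."""
    return 100.0 * items_correct / items_total


# Hypothetical raw scores for a small norm group (illustration only).
norm_group = [12, 15, 15, 18, 20, 22, 25, 27, 30, 34]
raw = 25  # hypothetical raw score on a hypothetical 40-item test

print(percentile_rank(raw, norm_group))  # compares the person to the group
print(percentage_correct(raw, 40))       # compares the person to the test
```

The same raw score gives two different numbers because the first compares the person to the norm group and the second compares the person to the test itself.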
Derived or Converted Scores: Standard Scores • Z scores • T scores • Deviation IQs • Stanines • Sten scores • Normal Curve Equivalents (NCE) Scores • College and graduate school entrance exam scores (e.g., SATs, GREs, and ACTs), and • Publisher type scores
Derived or Converted Scores: Developmental Norms • Grade equivalents • Age norms
Determining and Understanding Standard Scores • z scores • A standard score that helps us understand where an individual falls on the normal curve. • Practically speaking, z scores run from –4.0 to +4.0 (scores beyond that range are rare) • To find a z score: z = (X – M)/SD, where X is the raw score, M is the mean, and SD is the standard deviation
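As a minimal sketch, the z score formula translates directly into code. The function name and the sample numbers (raw score 65, mean 50, SD 10) are assumptions chosen for illustration.

```python
def z_score(raw, mean, sd):
    """Return (X - M) / SD: how many SDs the raw score lies from the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (raw - mean) / sd


# Hypothetical test: raw score of 65, mean of 50, SD of 10.
print(z_score(65, 50, 10))  # 1.5 standard deviations above the mean
```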
Converting z Scores to Other Types of Scores • Percentiles: Find your z score, then either approximate the percentile or look up the conversion of z scores to percentiles in Appendix D. • Converting to Standard Scores • 1. Get your z score • 2. Plug your z score into the following conversion formula • New score = z score × (SD of desired score) + (mean of desired score)
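When a conversion table such as Appendix D is not at hand, the z-to-percentile lookup can be approximated with the normal cumulative distribution function. The sketch below assumes the scores are normally distributed and uses Python's standard `math.erf`; the function name is hypothetical, and the result may differ slightly from a printed table because of rounding.

```python
import math


def z_to_percentile(z):
    """Percentage of a normal distribution that falls below z."""
    return 100.0 * 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# A z score of 0 sits at the mean, so half the distribution lies below it.
print(round(z_to_percentile(0.0)))    # 50
# A z score slightly below the mean lands a little under the 50th percentile.
print(round(z_to_percentile(-0.25)))  # 40
```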
Converting z Scores to Other Types of Standard Scores • Use your conversion formula by plugging in the means and standard deviations for each of the respective standard scores that follow: • T scores (M = 50, SD = 10). Mostly used for personality tests. • DIQ scores (M = 100, SD = 15). Mostly used for tests of intelligence. • Stanines (M = 5, SD = 2, round to nearest whole number). Mostly used for achievement testing.
Converting z Scores to Other Types of Standard Scores • Sten scores (M = 5.5, SD = 2, round to nearest whole number). Used with personality inventories and questionnaires. • NCEs (M = 50, SD = 21.06). Like percentiles in that they basically range from 0 – 100, but evenly distributed (percentiles bunch up around the mean). Usually used for educational tests.
Converting z Scores to Standard Scores (Cont’d) • SATs/GREs (M = 500, SD = 100). • ACTs (M = 21, SD = 5) • Publisher Type Scores: Mean and standard deviations arbitrarily set by publisher.
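The conversion formula and the means and standard deviations listed above can be gathered into one small lookup. This is a sketch under stated assumptions: the `SCALES` table and `convert` function are hypothetical names, halves are rounded up, and stanines and stens are clamped to their conventional ranges of 1 to 9 and 1 to 10, a detail the slides do not spell out. Publisher type scores are left out because their means and SDs vary by publisher.

```python
import math

# (mean, SD) for each standard score named in the slides.
SCALES = {
    "T": (50, 10),
    "DIQ": (100, 15),
    "stanine": (5, 2),
    "sten": (5.5, 2),
    "NCE": (50, 21.06),
    "SAT/GRE": (500, 100),
    "ACT": (21, 5),
}

# Assumed whole-number ranges for the two rounded scales.
WHOLE_NUMBER_RANGES = {"stanine": (1, 9), "sten": (1, 10)}


def convert(z, scale):
    """Apply: new score = z * (SD of desired score) + (mean of desired score)."""
    mean, sd = SCALES[scale]
    score = z * sd + mean
    if scale in WHOLE_NUMBER_RANGES:
        low, high = WHOLE_NUMBER_RANGES[scale]
        rounded = math.floor(score + 0.5)  # round half up (assumption)
        return min(high, max(low, rounded))
    return score


# A person one SD above the mean (z = 1.0) on each scale.
for name in SCALES:
    print(name, convert(1.0, name))
```

Keeping the means and SDs in one table makes it harder to pair a score with the wrong scale's parameters.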
Developmental Norms: Age Comparisons • You are compared to others of your same age. • Often done for physical attributes: • My 10-year-old weighs 78 lbs. What percentile is she in compared to others her age? • Can use z scores to determine: • E.g., z = (78 – 80)/8 = –0.25, where 80 is the mean and 8 is the SD. This corresponds to a percentile of about 40 (p = 40).
Developmental Norms: Grade Equivalents • Grade Equivalents: Compare the child’s score to those of his or her grade level. • E.g., a child in grade 5.6 (fifth grade, sixth month) who scores at the mean of his or her peer group has GE = 5.6 • GE over 5.6 means that he or she is doing better than his or her peer group • GE below 5.6 means that he or she is doing worse than his or her peer group • Usually not a statement about how much better or how much worse (a child who is in grade 5.6 and gets a GE of 7.5 is not necessarily performing at the 7.5 grade level).
Rule Number 3: z Scores Are Golden • z scores help us see where an individual’s raw score falls on a normal curve and are helpful for converting a raw score to other kinds of derived scores. That is why we like to keep in mind that z scores are golden and can often be used to help us understand the meaning of scores.
Standard Error of Measurement • Based on reliability of test • Tells you how much error there is in the test and ultimately how much any individual’s score might fluctuate due to this error. • Formula: SEM = SD × √(1 – r), where r is the reliability of the test • Multiply the standard deviation of the score you are using (e.g., for T score, SD = 10; for DIQ, SD = 15) by the square root of 1 minus the reliability of the test. • Work practice problem, Box 5.5, p. 98.
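The SEM formula can be sketched as follows. The function names are hypothetical, and the reliability of .91 is a made-up value chosen so the arithmetic is easy to follow; it is not taken from the chapter or from any particular test.

```python
import math


def sem(sd, reliability):
    """Standard error of measurement: SD * sqrt(1 - r)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


def score_band(score, sd, reliability, n_sem=1):
    """Range of plus or minus n_sem SEMs around an obtained score."""
    error = n_sem * sem(sd, reliability)
    return (score - error, score + error)


# DIQ scale (SD = 15) with a hypothetical reliability of .91.
print(round(sem(15, 0.91), 2))
low, high = score_band(120, 15, 0.91)
print(round(low, 1), round(high, 1))
```

Note that the SD passed in must belong to the same scale as the score: a DIQ score goes with the DIQ standard deviation, and a raw score goes with the raw-score standard deviation.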
Rule Number 4: Don’t Mix Apples and Oranges • As you practice various formulas in class, it’s easy to use the wrong raw score, mean or standard deviation. • In determining the SEM for Latisha (p. 95), we used Latisha’s DIQ score of 120 and figured out the SEM using the DIQ standard deviation of 15. • However, if we had been asked to figure out the SEM of her raw score, we would use her raw score and the standard deviation of raw scores. • Whenever asked to figure out a problem, remember to use the correct set of numbers (don’t mix apples and oranges), otherwise your answer will be incorrect.
Scales of Measurement • In assessment, we measure characteristics in quantifiable terms, but “quantifiable” can be defined differently • E.g., gender is measured as either male or female • E.g., achievement can be measured on a scale that has a large range of scores • Four different scales of measurement help us define “quantifiable” • The type of scale you use helps determine the type of instrument used to measure the characteristic.
Scales of Measurement • Four scales to identify how to measure a quality include: • 1. Nominal scale. Numbers are arbitrarily assigned to represent different categories. E.g., race might be: • 1 = Asian • 2 = Latino • 3 = African American • 4 = Caucasian, etc. • 2. Ordinal scale. Magnitude or rank order is implied. E.g., Rating scale: “The counseling I received was helpful in obtaining the goals I came in for.” • 1 = Strongly Disagree • 2 = Somewhat Disagree • 3 = Neutral • 4 = Somewhat Agree • 5 = Strongly Agree
Scales of Measurement • 3. Interval scale. Establishes equal distances between measurements but has no absolute zero reference point. • E.g., a 600 on the GRE is better than a 550, but it is not twice as good as a 300. • 4. Ratio scale. Has a meaningful zero point and equal intervals. • E.g., if I weigh 200 pounds, I weigh twice as much as someone who weighs 100 pounds.