1. 1 All You Ever Wanted to Know About Tests – and More!
2. 2 Outcomes
Identify types of tests
Understand types of scores
Be able to interpret test reports
3. 3
4. 4 Two Major Types of Tests
Norm-Referenced Test (NRT)
Criterion-Referenced Test (CRT)
5. 5 What is a Norm-Referenced Test (NRT)? A standardized assessment in which all students perform under the same conditions.
It compares the performance of a student or group of students to a national sample of students at the same grade and age, called the norm group.
6. 6 What is a Criterion-Referenced Test (CRT)? An assessment where a student's performance is compared to a specific learning objective or performance standard and not to the performance of other students.
It tells us how well students are performing on specific goals or content standards rather than just telling how their performance compares to a norm group of students nationally or locally.
7. 7 Summary NRT and CRT TerraNova is an NRT that measures achievement in all subjects: reading, math, language, science, and social studies. It takes the place of the FCAT and is administered to children living on military bases outside the USA and the Pacific Rim, but under the NCLB.
The NMSQT is the National Merit Scholarship Qualifying Test.
8. 8 Raw Score (RS) The number of items a student answers correctly on a test.
John took a 20 item mathematics test (where each item was worth one point) and correctly answered 17 items.
His raw score for this assessment is 17.
Raw scores communicate nothing more than the total number of items answered correctly on a test.
9. 9 Scale Score (SS) Mathematically converted raw scores that use a new scale to represent levels of achievement or ability.
For FCAT-SSS, a computer program is used to analyze student responses and to compute the scale score. It reports test results on the student’s entire test.
Scale Scores are Raw Scores that have been mathematically converted to account for the varying levels of difficulty among the questions.
In the case of the SSS portion of FCAT, students’ answers are analyzed by a computer in order to determine the scale score.
Because difficulty level is considered, the Scale Score is a more accurate reflection of a student’s true achievement level.
How does this type of score compare to a Raw Score? (Allow for discussion)
10. 10 Scale Score (SS) Higher scale scores indicate higher proficiency.
On a continuous, vertical scale across grade levels you can track a student’s progress from lower to upper grade levels on one scale. Growth in scale score units indicates growth in proficiency.
For FCAT-SSS, the Developmental Scale Score is used to determine a student’s annual progress from grade to grade.
11. 11 Student Growth? The run graph shows that student scale scores are increasing over time.
12. 12 Grade 3-10 Achievement Level Cut Scores comparing the SS and the Developmental SS.
13. 13 Percentile Score The percentage of students in the norm group whose scores fall at or below a given student’s score.
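The definition above can be sketched in a few lines of Python. This is only an illustration: the function name `percentile_rank` and the norm-group scores are hypothetical, and published tests derive percentiles from much larger national samples and their own norming tables.

```python
def percentile_rank(score, norm_group_scores):
    """Percentage of norm-group scores that fall at or below `score`."""
    at_or_below = sum(1 for s in norm_group_scores if s <= score)
    return 100.0 * at_or_below / len(norm_group_scores)


# Hypothetical norm group of ten raw scores on a 20-item test.
norm_group = [8, 10, 11, 12, 13, 14, 15, 16, 17, 19]

# Nine of the ten norm-group scores are at or below 17.
print(percentile_rank(17, norm_group))  # 90.0
```

A student with a raw score of 17 would sit at the 90th percentile of this made-up norm group, which says nothing about how many items were answered correctly.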
14. 14 Percentile Score and NCE Raw scores are mathematically converted to Percentile Scores, which provide information about the standing of a student in relation to other students. Percentile scores are not normally distributed and therefore cannot be used to compute means and standard deviations. Percentiles can be converted into NCEs, an interval scale with a normal distribution. NCEs can be subjected to further statistical analysis.
Looking at the percentile scale, you can see that more students score between the 40th and 60th percentiles, while fewer students score at the ends of the scale, causing greater gaps.
15. 15 Normal Curve Equivalent (NCE) Normal Curve Equivalents (NCEs) are norm-referenced scores ranging from 1 to 99 and have an average score of 50.
NCEs are an equal interval scale which allows for arithmetic calculations.
Changes in academic achievement are usually measured through NCE gains.
A student or group of students makes an average year’s growth if they receive the same NCE score for two consecutive years. For example, if a child scores an NCE of 80 in first grade and an 80 in second grade, that student has made a year’s growth.
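As a rough illustration of how percentiles and NCEs relate, the sketch below uses the commonly cited conversion NCE = 50 + 21.06 × z, where z is the normal-curve z-score for the percentile. The function names are hypothetical, and test publishers normally supply their own conversion tables, so treat this as an approximation rather than the method used for any particular test.

```python
from statistics import NormalDist

# Scaling constant chosen so that NCEs of 1, 50, and 99 line up with
# percentiles 1, 50, and 99.
NCE_SD = 21.06


def percentile_to_nce(percentile):
    """Convert a percentile (strictly between 0 and 100) to an NCE."""
    z = NormalDist().inv_cdf(percentile / 100.0)
    return 50.0 + NCE_SD * z


def nce_to_percentile(nce):
    """Convert an NCE back to a percentile."""
    z = (nce - 50.0) / NCE_SD
    return 100.0 * NormalDist().cdf(z)


for p in (1, 25, 50, 75, 99):
    print(p, round(percentile_to_nce(p), 1))
```

The two scales agree at 1, 50, and 99 but differ in between, which is why averages and gains are computed on NCEs rather than on percentiles.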
16. 16 True Score A score entirely free of error.
Hypothetical value that can never be obtained by testing, since a test score always involves some measurement error.
A student’s "true" score may be thought of as the average of an infinite number of measurements from the same or exactly equivalent tests, assuming no practice effect or change in the student during the testing. A true score would occur in a perfect world where a student would receive the exact same score no matter how many times he took the test.
17. 17 Standard Error of Measurement (SEM) The amount a student’s score is expected to fluctuate around his or her true score.
SEM is frequently used to obtain an idea of the consistency of a student’s score or to set a band around a score.
For example, if a student scores 110 on a test and SEM=6, we would say we are 68% confident the student’s true score was between (110–1 SEM) and (110+1 SEM) or between 104 and 116.
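The band in the example above is simple arithmetic, sketched here in Python. The function name `score_band` is hypothetical; the 68% figure for a band of one SEM (and roughly 95% for two SEMs) assumes measurement error is normally distributed.

```python
def score_band(observed_score, sem, n_sem=1):
    """Return (low, high) for a band of n_sem SEMs around an observed score."""
    margin = n_sem * sem
    return observed_score - margin, observed_score + margin


# The example from the slide: observed score 110, SEM = 6.
print(score_band(110, 6))     # (104, 116), about 68% confidence
print(score_band(110, 6, 2))  # (98, 122), about 95% confidence
```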
18. 18
19. 19 These scores are of particular importance because they let us know whether a child is making the expected progress from year to year. Learning Gain scores are also used in the formula to calculate School Grades.
20. 20 FTE occurs in October and February. Students have to be present for both FTEs. Third graders are not included since they do not have 2 consecutive years to compare.
21. 21 Writing is not included because it is only done in 4th, 8th, and 10th grade.
Remember, Learning Gains can only occur if the student takes the same subject area tests for 2 consecutive years. Therefore, gains cannot be considered for Writing, because students do not take those exams at consecutive grade levels.
22. 22
23. 23 Retained students cannot qualify for Reason C, because they are taking the same grade level test 2 years in a row. The Developmental Scale Score increases are only valid on tests taken at different grade levels.
24. 24 This table shows the cut-off points used to determine whether a student has made adequate progress. To have made a year’s academic growth, the student must have scored at least 1 more point than the number listed for the subject area at the student’s grade level.
For example, for a 4th grader who moves to 5th grade and scores at level 2 both years, the DSS score must be at least 166 plus 1 points higher in 5th grade to demonstrate adequate yearly progress, or learning gains.
25. 25
26. 26 Data Display for FCAT Reading Results Student F gained 167 DSS points (the cut point was 92 + 1 = 93) but cannot be counted as making learning gains because he/she stayed at level 1. In fact, it is not even necessary to look at the DSS points once you have seen that a retained student has stayed at level 1 or level 2.
Student G did not make learning gains. Even though he/she is still performing at a satisfactory level, a student cannot drop a level and still be considered as making gains.
Summary: A, yes (went up a level); B, yes (maintained level 4); C, yes (gained more than the 92-point cut); D, yes (moved up a level); E, yes (maintained level 3); F, no (needed to move up a level); G, no (dropped a level).
27. 27 Teacher Learning Gains Based on Data Display Total number of students included in learning gain calculations = 6 (all had a pre- and post-test score from FCAT)
Reason A = 2 students increased achievement levels. Which students? Students A and D.
Reason B = 2 students maintained satisfactory levels. Which students? Students B and E.
Reason C = 1 student made more than one year’s growth. Which student? Student C was a level 1 student with a pre-test DSS score of 1598 and a post-test DSS score of 1743. Student C earned 145 DSS points, and the cut point was 92 plus 1.
Student F did not make adequate progress because Student F was a retained student who maintained a level 1 on both the pre- and post-test; therefore DSS points do not apply, even though he/she grew by 167 DSS points.
Student G decreased a level and therefore did not make a learning gain in reading for that year.
Grade Reporting for Schools = Standard Curriculum Students (regular education students), ESE students with F (speech), L (gifted), or M (hospital homebound) exceptionalities, and LEP students coded LY or LN and enrolled in the program for more than 2 years.
This chart represents standard curriculum students and would count 5 points toward classroom learning gains and the school’s grade.
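The three reasons walked through above can be sketched as a small decision function. This is an illustration under stated assumptions, not the official calculation: the names `StudentResult` and `learning_gain_reason` are hypothetical, "satisfactory" is assumed to mean level 3 or above, the levels for Students A, D, and G are made up to match the outcomes described, and only the DSS gains for Students C (145) and F (167) and the 92-point cut come from the notes.

```python
from dataclasses import dataclass
from typing import Optional

SATISFACTORY_LEVEL = 3  # assumption: levels 3 and above count as satisfactory


@dataclass
class StudentResult:
    name: str
    pre_level: int
    post_level: int
    dss_gain: int = 0       # placeholder 0 where the notes give no value
    retained: bool = False


def learning_gain_reason(s: StudentResult, dss_cut: int) -> Optional[str]:
    """Return "A", "B", or "C" if the student made a learning gain, else None."""
    if s.post_level > s.pre_level:
        return "A"  # increased an achievement level
    if s.post_level < s.pre_level:
        return None  # dropped a level: no gain, even if still satisfactory
    if s.post_level >= SATISFACTORY_LEVEL:
        return "B"  # maintained a satisfactory level
    if not s.retained and s.dss_gain > dss_cut:
        return "C"  # stayed at level 1 or 2 but grew more than the cut
    return None


students = [
    StudentResult("A", 2, 3),                # levels hypothetical
    StudentResult("B", 4, 4),
    StudentResult("C", 1, 1, dss_gain=145),  # 1743 - 1598
    StudentResult("D", 1, 2),                # levels hypothetical
    StudentResult("E", 3, 3),
    StudentResult("F", 1, 1, dss_gain=167, retained=True),
    StudentResult("G", 4, 3),                # levels hypothetical
]

reasons = {s.name: learning_gain_reason(s, dss_cut=92) for s in students}
gain_count = sum(reason is not None for reason in reasons.values())
print(reasons)
print(gain_count)  # 5
```

Checking for a dropped level before checking for a satisfactory level is what keeps Student G from counting under Reason B.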
28. 28 Class Record Sheet for Learning Gains Here is a snapshot of a class record sheet for determining learning gains. You have a sample of this with your handouts.Here is a snapshot of a class record sheet for determining learning gains. You have a sample of this with your handouts.
29. 29
30. 30
31. 31 References/Acknowledgements Bernhardt, Victoria L. Data Analysis for Comprehensive School Improvement, Eye on Education, Inc., 1998.
Wahlstrom, D. (1999). Using Data To Improve Student Achievement, Virginia Beach, VA. Successline, Inc.
Harcourt Brace Educational Measurement (2001). Glossary Of Measurement Terms. Internet document. San Antonio, TX.
Ferrer, Wilma. PowerPoint development.
Council for Educational Change, Student Performance Snapshot