
Automated Scoring for Next Generation Assessments

Automated Scoring for Next Generation Assessments. Karen Lochbaum, November 17, 2011. Next Generation Assessments: Desired Features. Align to Common Core State Standards; capture student performance and application of higher order skills; track growth toward college readiness.


Presentation Transcript


  1. Automated Scoring for Next Generation Assessments. Karen Lochbaum, November 17, 2011

  2. Next Generation Assessments: Desired Features • Align to Common Core State Standards • Capture student performance and application of higher order skills • Track growth toward college readiness • Shift to digital technologies • Immediate feedback to inform instruction. Automated scoring is a key lever for implementing performance assessments reliably and affordably at scale.

  3. Benefits of Automated Scoring • Immediacy & Efficiency • Evaluate responses in seconds • Reduce score turnaround time • Give students and teachers instant feedback • Reduce costs • Accuracy • Trained based on the collective wisdom of many skilled human scorers • Consistency, Objectivity • Can detect off-topic, inappropriate and “odd” responses

  4. Common Core State Standards: Key Features • Reading: Text complexity and the growth of comprehension • Language: Conventions, effective use, and vocabulary • Writing: Diverse text types, responding to readings and research • Speaking and Listening: Flexible communication and collaboration in a variety of settings • Mathematics: Conceptual understanding and authentic problem solving

  5. Reading

  6. Text Complexity • ACT’s Reading Between the Lines Report • Too few students are prepared for post high school readings • Proficiency in understanding complex texts is key • How can we best measure the complexity of texts? • How can we track student progress?

  7. Pearson’s New Reading Maturity Metric • Traditional approaches: surface level features • sentence length, word length, word lists • New approach: Deep analysis of meaning and substance • Simulates how individual words gradually develop their unique meanings • About 30% more accurate than traditional readability formulas • Identifies most difficult and most important words
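The "traditional approaches" named on this slide rely on surface counts. As a minimal sketch of what such features look like (this is not Pearson's metric and not any specific published readability formula; the function name `surface_features` and the sample text are made up for illustration):

```python
import re


def surface_features(text):
    """Compute two surface-level readability features of a text.

    Assumption: sentences end at '.', '!' or '?', and a word is a run of
    letters or apostrophes. Real readability tools use more careful rules.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    return {
        # Average number of words per sentence.
        "avg_sentence_length": len(words) / len(sentences),
        # Average number of characters per word.
        "avg_word_length": sum(len(w) for w in words) / len(words),
    }


features = surface_features("The cat sat. The dog ran fast.")
print(features)
```

Features like these say nothing about what the words mean, which is the limitation the slide contrasts with the meaning-based approach.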

  8. Demo Common Core Texts Appendix B Grade 9-10 Informational Texts Language Arts Wiesel, Elie. “Hope, Despair and Memory.” Nobel Lectures in Peace 1981–1990. Singapore: World Scientific, 1997. (1986)

  9. Writing & Language

  10. Writing: Performance Tasks. You advise Pat Williams, the president of DynaTech, a company that makes precision electronic instruments and navigation equipment. Sally Evans, a member of DynaTech’s sales force, recommends that DynaTech buy a small private plane (a SwiftAir 235) that she and other members of the sales force could use to visit customers. Pat was about to approve the purchase when there was an accident involving a SwiftAir 235. Document Library • Newspaper article about the accident • Federal Accident Report on in-flight breakups in single engine planes • Internal Correspondence (Pat’s e-mail to you and Sally’s e-mail to Pat) • Charts relating to SwiftAir’s performance characteristics • Excerpt from magazine article comparing SwiftAir 235 to similar planes • Pictures and descriptions of SwiftAir Models 180 and 235. Example from the Collegiate Learning Assessment

  11. Writing: Tacit Leadership Knowledge Scenarios. You are a new platoon leader who takes charge of your platoon when it returns from a lengthy combat deployment. All members of the platoon are war veterans, but you did not serve in the conflict. In addition, you failed to graduate from Ranger School. You are concerned about building credibility with your soldiers. What should you do?

  12. Writing: Automated assessment of diagnostic skills (National Board of Medical Examiners)

  13. Science • Use the technical passage 'Green Ocean Machine' to answer the following. • The passage states that “the new green partner [alga] seems to provide Hatena with most of its energy needs.” • Describe the process that enables organisms to use energy from light to make food. In your description, be sure to include: • the specialized features needed to produce food • the substances needed to produce food • the substances produced during this process Example from the Maryland School Assessment

  14. Speaking & Listening [Figure labels: waveform, spectrum, words, segmentation]

  15. Spoken Item Types: Oral Reading Fluency Demonstration • Assessment • Oral reading rate • Accuracy • Expressiveness
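Two of the measures listed here, oral reading rate and accuracy, reduce to simple arithmetic once the words read and the errors have been counted. A minimal sketch, assuming the common "words correct per minute" convention (the slides do not say how the rate is actually computed; the function name and the sample numbers are hypothetical):

```python
def oral_reading_scores(words_attempted, errors, seconds):
    """Return reading rate and accuracy for one timed oral reading.

    Assumption: rate is words read correctly per minute, and accuracy is
    the share of attempted words that were read correctly.
    """
    if words_attempted <= 0 or seconds <= 0:
        raise ValueError("words_attempted and seconds must be positive")
    words_correct = words_attempted - errors
    return {
        "words_correct_per_minute": words_correct * 60.0 / seconds,
        "accuracy": words_correct / words_attempted,
    }


# Hypothetical reading: 120 words attempted, 6 errors, 90 seconds.
scores = oral_reading_scores(words_attempted=120, errors=6, seconds=90)
print(scores)
```

Expressiveness is not covered by this sketch; it depends on acoustic properties of the recording rather than on counts.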

  16. Passage 1 A boy named Tom was at the bus stop. He was waiting for the school bus. There was no one there, but him. The bus was late. Tom began to talk to himself. “Maybe the bus forgot me,” he said. Then, Tom heard a dog barking. He looked up and saw his dog Spot running down the road. Spot ran to Tom. He was so happy to see Tom that he jumped into Tom’s arms. Just then, Tom heard the bus coming. He didn’t have time to take Spot home. There was no time to think. Tom grabbed Spot and hid him under his coat. The bus pulled up to Tom’s bus stop. Tom got on the bus and went to the back. His friend, Jack, had saved a seat for him. Just as Tom sat down a little yelp came from under his coat. “What do you have under there, Tom?” Jack asked. “If I tell you, do you promise not to tell?” replied Tom. “You bet! I’m your best friend, aren’t I?” asked Jack. Tom told Jack what had happened. He asked his friend what he should do. Jack had an idea. “You can tell the teacher you have something very cool for show and tell. Then, you could call your mom and have her come and pick up Spot.” Tom decided that’s what he would do. His teacher was surprised. His mom was mad, but Spot was very happy.

  17. Describe a picture or graph • Assessment • Vocabulary • Language Use • Pronunciation • Fluency

  18. Hear sentences and repeat them “What are you going to do this weekend?” “It wasn’t bad, but it wasn’t good either.” “The dog was barking all night long.” • Assessment • Sentence mastery • Pronunciation • Fluency

  19. Sentence Builds after the movie ended… went home… we all… • Assessment • Sentence mastery • Pronunciation • Fluency

  20. Story Retelling Overall RETELL Comprehension Sentence Mastery Fluency Pronunciation

  21. Mathematics

  22. Progressive rubrics check for both conceptual understanding and ability to execute.

  23. Here, the student is asked to find an expression for the area in the figure.

  24. Highlighted feedback shows how MathQuery correlates the student’s response with elements of the problem.

  25. Automated Scoring Approach • Learn to score based on several hundred human scored responses • Trained on their collective wisdom • Measure the content and quality of responses by determining • The features that human scorers evaluate when scoring a response • How those features are weighed and combined to produce scores
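The idea of learning how features are "weighed and combined" from human-scored responses can be sketched as a regression problem. The example below is only an illustration under stated assumptions: it uses ordinary least squares via NumPy, two hypothetical features (`content` and `mechanics`), and five made-up training responses, whereas the slide describes training on several hundred human-scored responses and does not name the actual features or model.

```python
import numpy as np

# Made-up training data: one row per response, with two hypothetical
# feature values (content, mechanics), each between 0 and 1.
features = np.array([
    [0.2, 0.5],
    [0.4, 0.1],
    [0.6, 0.9],
    [0.8, 0.3],
    [1.0, 0.7],
])
# Made-up human scores for the same five responses.
human_scores = np.array([2.3, 2.7, 4.3, 4.5, 5.7])


def fit_weights(feature_rows, scores):
    """Fit an intercept plus one weight per feature by least squares."""
    design = np.column_stack([np.ones(len(feature_rows)), feature_rows])
    weights, *_ = np.linalg.lstsq(design, scores, rcond=None)
    return weights


def predict(weights, feature_row):
    """Combine the features of a new response into a predicted score."""
    return float(weights[0] + np.dot(weights[1:], feature_row))


weights = fit_weights(features, human_scores)
print(weights)                       # intercept, content weight, mechanics weight
print(predict(weights, [0.5, 0.5]))  # predicted score for an unseen response
```

The toy scores were generated from an exact linear rule, so the fit recovers it; real human scores are noisy and the fitted weights would only approximate the raters' collective judgment.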

  26. The Intelligent Essay Assessor (IEA)

  27. Other Features of IEA • Uses non-coachable measures • No counts of total words, syllables, characters, etc. • No trigger surface features: “thus”, “therefore” • Detects larding of big words • Knows when it doesn’t know • Detects off-topic or highly unusual essays, non-standard language constructions, too long, too short …

  28. Content Based Scoring • Use Latent Semantic Analysis (LSA) to capture the “meaning” of language • LSA knows that “Surgery is often performed by a team of doctors.” and “On many occasions, several physicians are involved in an operation.” mean about the same thing even though they share no words • Enables evaluating the content of what is written rather than just matching keywords
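The contrast between keyword matching and LSA can be shown on a toy scale. The sketch below is an illustration under heavy assumptions, not the IEA implementation: it builds a term-by-document matrix from six made-up mini-documents, keeps the top two singular vectors from a truncated SVD (NumPy), and compares sentences by cosine similarity in that reduced space. Production LSA systems are trained on very large corpora with weighting schemes not shown here.

```python
import re

import numpy as np

# Made-up training corpus: three medical and three school mini-documents.
CORPUS = [
    "doctors perform surgery",
    "physicians perform surgery",
    "doctors physicians operation",
    "students write essays",
    "teachers write essays",
    "students teachers school",
]


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


VOCAB = sorted({w for doc in CORPUS for w in tokenize(doc)})
INDEX = {w: i for i, w in enumerate(VOCAB)}


def term_vector(text):
    """Bag-of-words counts over the corpus vocabulary (unknown words ignored)."""
    v = np.zeros(len(VOCAB))
    for w in tokenize(text):
        if w in INDEX:
            v[INDEX[w]] += 1.0
    return v


# Term-by-document matrix and its truncated SVD (K latent dimensions).
A = np.column_stack([term_vector(d) for d in CORPUS])
U, S, Vt = np.linalg.svd(A, full_matrices=False)
K = 2
U_k = U[:, :K]


def cosine(x, y):
    denom = np.linalg.norm(x) * np.linalg.norm(y)
    return float(x @ y / denom) if denom else 0.0


def keyword_similarity(a, b):
    """Cosine similarity of raw word counts: plain keyword matching."""
    return cosine(term_vector(a), term_vector(b))


def lsa_similarity(a, b):
    """Cosine similarity after projecting both texts into the latent space."""
    return cosine(U_k.T @ term_vector(a), U_k.T @ term_vector(b))


s1 = "Surgery is often performed by a team of doctors."
s2 = "On many occasions, several physicians are involved in an operation."
s3 = "Teachers grade essays at school."
print(keyword_similarity(s1, s2), lsa_similarity(s1, s2), lsa_similarity(s1, s3))
```

In this toy setup the two medical sentences have zero keyword overlap but land on the same latent dimension, because "doctors" and "physicians" occur in similar contexts in the corpus, while the school sentence lands on the other dimension.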

  29. Spoken Assessments [Figure labels: waveform, spectrum, words, segmentation]

  30. Example: Native Speaker REPEAT: New York City is famous for its ethnic diversity. Pronunciation: 8.7 Fluency: 8.1 Accuracy: 0 word error

  31. Example: Learner REPEAT: New York City is famous for its ethnic diversity. Pronunciation: 5.9 Fluency: 3.3 Accuracy: 1 word error (insertion)
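The "word error" counts in these two examples can be understood as a word-level edit distance between the prompt and what was recognized. A minimal sketch, assuming a standard Levenshtein alignment over words (the slides do not describe the actual alignment method, and the learner transcript below is hypothetical since the slide gives only the error count):

```python
import re


def word_errors(reference, spoken):
    """Count word-level substitutions, insertions and deletions."""
    ref = re.findall(r"[a-z']+", reference.lower())
    hyp = re.findall(r"[a-z']+", spoken.lower())
    # prev[j] = errors aligning the reference words seen so far with hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion: a reference word was skipped
                curr[j - 1] + 1,     # insertion: an extra word was spoken
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


prompt = "New York City is famous for its ethnic diversity."
native = "New York City is famous for its ethnic diversity"
# Hypothetical learner response with one inserted word.
learner = "New York City is very famous for its ethnic diversity"
print(word_errors(prompt, native), word_errors(prompt, learner))
```

Pronunciation and fluency scores are not modeled here; they depend on the audio signal rather than on the word sequence.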

  32. Performance Comparison (Pronunciation, Fluency, Accuracy) • Native speaker: 3.026 seconds • Learner: 5.502 seconds

  33. Keys to Success: Design for automated scoring from the START!

  34. Keys to Success • Item Development • Clear specification of performance, skills, and assessment • Optimize for scoring effectiveness • Item Delivery • Input and capture of student response • Field Test and Human Scoring • Representative samples • Double scoring with resolution
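"Double scoring with resolution" means each field-test response gets two independent human scores, with a further read when they disagree. One possible rule is sketched below; the threshold, the averaging step, and the function name are assumptions for illustration, since the slide does not specify how disagreements are resolved.

```python
def resolve_score(first, second, resolution=None, max_gap=1):
    """Combine two human scores into one training score.

    Assumption: scores within `max_gap` points of each other are averaged;
    larger disagreements are replaced by a third, resolution score.
    """
    if abs(first - second) <= max_gap:
        return (first + second) / 2
    if resolution is None:
        raise ValueError("scores disagree; a resolution read is required")
    return float(resolution)


print(resolve_score(3, 4))                # adjacent scores are averaged
print(resolve_score(2, 5, resolution=4))  # resolution read settles the gap
```

Whatever rule is used, the resolved scores are what an automated scoring model would be trained and evaluated against.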

  35. Keys to Success • Psychometrics • Automated scoring performance as part of field test item evaluation • Operational Scoring & Monitoring • Requirements vary with nature of assessment and acceptable performance criteria • Automated scoring in combination with human scoring
