
Second Science Assessment Webinar

Presentation Transcript


  1. Second Science Assessment Webinar Performance Assessments in Science at the State Level October 2012

  2. Webinar Agenda • Introduction • WestEd and 3 states will give a 10-minute overview of what they are doing in performance assessments at the state level in science: • WestEd (Edys S. Quellmalz and Matt Silberglitt) • Ohio (Lauren V. Monowar-Jones) • Vermont (Gail Hall) • Connecticut (Liz Buttner) • Open discussion

  3. WestEd

  4. Performance Assessments for Science Learning Edys S. Quellmalz and Matt Silberglitt WestEd Presented to the Council of State Science Supervisors October 24, 2012

  5. Goals • Needs for performance assessment • Limitations of current science assessments • Advantages of technology for science assessment • Types of performance assessment • Design principles for the assessment of iSTEM learning outcomes • Promising innovative approaches • Needed research and development

  6. Performance Assessment Features • Assessment targets are those difficult or impossible to measure well with conventional item formats • Students construct responses, solutions, or products • Tasks represent significant, recurring, realistic problems • Criteria for evaluating performances are specified and communicated to examinees • Performances represent both science and engineering practices in progress as well as culminating solutions or products

  7. Limitations of Current Assessments • Emphasis on disconnected, declarative knowledge • Neglect of integrated knowledge structures in fundamental science systems • Emphasis on procedural algorithms and skills • Neglect of strategic inquiry practices in authentic problems

  8. Relevance to Current Assessment Programs • Need innovative, technology-enhanced assessments that align with new frameworks that focus on • fewer, deeper, more integrated core knowledge targets • E.g., the new Framework for Science Education and next generation national science standards • Models as structures for understanding and studying science systems (model-based learning) • Science practices for using knowledge and inquiry in significant, recurring authentic tasks

  9. Relevance to Current Assessment Programs Need innovative, technology-enhanced assessments that • Target 21st century skills within STEM • Use technology to engage students in use of “tools of the trade” • Provide evidence supporting technology-enhanced performance assessment • For summative and formative purposes • Called for in the assessment consortia

  10. Advantages of Technology for Science Assessment • Present authentic, rich, dynamic environments • Support access to collections of information, expertise • Present phenomena difficult or impossible to observe and manipulate in classrooms • Represent temporal, causal, dynamic relationships “in action” • Allow multiple representations of stimuli and their simultaneous interactions (e.g., data generated during a process) • Allow overlays of representations, symbols • Allow student manipulations/investigations, multiple trials • Allow student control of pacing, replay, reiteration • Capture student responses during research, design, problem solving • Allow use of simulations of a range of tools (internet, productivity, domain-based)

  11. Advantages of Technology for Science Assessment • Online search for assessments aligned with standards • Digital collections of assessments • Access to innovative technology-based prototypes and collections • Tools to support online assessment delivery and scoring • Online guidelines for scoring by teachers and students • Online guidelines for interpretation of scores and implications for instruction • Online professional development on assessment literacy

  12. Cognitively-Principled Assessment Design • Learning science research (e.g., How People Learn) • Measurement theory and research on measuring learning (e.g., Knowing What Students Know) • An assessment argument linking claims of learning to evidence of learning to the tasks that elicit the evidence

  13. Evidence-Centered Design • Student Model: What complex of knowledge, skills, or other attributes should be assessed? • Evidence Model: What behaviors or performances should reveal the relevant knowledge and skills described in the student model? • Task Model: What tasks or situations should elicit the behaviors or performances described in the evidence model? (Messick, 1993; Mislevy, Almond, & Lucas, 2004)

  14. Science Constructs (Student Model: Assessment Targets) • From national frameworks and standards for science, mathematics, engineering, technology • Cross-cutting concepts, e.g., Systems and System Models (Next Generation Science Standards) • Cross-cutting practices, e.g., problem solving, communication, collaboration (NAEP Framework for Technology and Engineering Literacy)

  15. Task Models • Integrated applications to the natural and designed world • Applied, significant, recurring problems in the natural and designed world • Scenario-based tasks building across a problem-solving/inquiry/design sequence

  16. Evidence Model • What evidence is collected: explicit responses; logged processes for technology-delivered tasks • How the evidence is evaluated and summarized: scoring; rubrics • How the evidence is reported for intended purposes and users

  17. Limitations of Performance Assessments • Less available information/documentation of measures (descriptive or technical quality) • Lack of attention to alignment of outcome measures to science standards • Few descriptions of coverage/balance • Outcome measures tend to emphasize content, declarative knowledge • Little attention to application of practices

  18. Limitations of Science Assessments • Practices are not measured well by static, conventional formats • Few measures during learning (processes, formative) vs. at the end (summative) • Little measurement of collaboration and communication • Lack of deliberate design to measure for transfer of cross-cutting concepts and practices • Little attention to establishing/documenting technical quality

  19. Challenges for Designing iSTEM Assessments • Specification of desired learning outcomes • Need for focus and coherence of knowledge and processes, and whether they are situated in a domain and/or in integrated problems • Coverage: what is the balance of assessment targets? • What is the balance and coherence between classroom curriculum-embedded assessment for formative purposes and district and state summative tests?

  20. Challenges for Designing Science Assessments • Need to tailor assessment design to assessment purpose (intended use of the data): formative/summative; embedded to monitor (use of feedback and coaching) and adjust vs. culminating to report proficiency status • Duration, scope, time: more extended, spread over multiple classes/periods; embedded vs. external • Documentation of measures: descriptions, technical quality (validity of interpretation, reliability)

  21. Promising New Assessment Designs Enabled by Technology • Alignments • Access to resources and expertise • Network with collaborators, experts • Collections • Delivery • Entry of rubrics, ratings, work in progress, final artifacts • Scoring: automated and online training and scoring, moderated rating sessions • Reporting: customized to users

  22. Promising New Assessment Designs for Science Using Technology to Support Hands-On Projects and Assessments • Blended model of equipment and technology • Entry of rubric ratings, calibrated training sessions • Annotated postings of designs, prototypes, tests • Embedded tasks to test knowledge and skills during projects • Electronic science notebooks • Electronic portfolios • Juried exhibitions posted, streamed, archived

  23. Promising New Assessment Designs for Interactive Task Design Features • Dynamic presentations of spatial, causal, temporal phenomena in a system: multiple overlapping representations • Interactivity: supports iterative, active inquiry and design • Multiple response formats: reduce reliance on text • Rapid, customized interaction, feedback, reporting

  24. Research on Learning in Science Simulations • Facilitate formation of organized mental models of system components, interactions, and emergent behaviors • Facilitate transfer • Facilitate use of systematic problem solving & inquiry • Situate in authentic, significant, recurring problems in the natural and designed world • Highly engaging

  25. NAEP 2014 Framework and Specifications for Technology and Engineering Literacy • SimScientists: Force and Motion-Fire Rescue • PISA: Reactor • http://www.nagb.org/publications/frameworks/tech2014-framework/ch_toc/index.html

  26. SimScientists Test Effects of Pollution on Cells

  27. SimScientists Test Effects of Calories on Activity Level

  28. Research Needs • Analysis of extant assessments (large-scale and classroom, formative and summative) • Analyses of performance assessment opportunities • Review of promising exemplars • Formulation and testing of different purposes, designs, and evidence collection strategies • Pilot studies of performance assessment design models for established and new genres of technology-enhanced learning environments • Documenting technical quality with alternative psychometric methods

  29. Contact Information equellm@wested.org msilberg@wested.org http://simscientists.org

  30. Ohio

  31. The Ohio Performance Assessment Pilot Project Lauren V. Monowar-Jones, PhD Project Coordinator Ohio Performance Assessment Pilot Project Ohio Department of Education Office of Assessment Lauren.Monowar-Jones@education.ohio.gov A Look Into the Future of Ohio’s Science Assessments

  32. Always do what you are afraid to do.

  33. The Task Dyad Learning System • Learning Task • Curriculum embedded • Assessment Task

  34. Ohio’s Task Dyad Learning System

  35. The Dyad System

  36. Ohio’s Next Generation Assessments • PARCC-Developed Assessments: English language arts, grades 3–8 and high school; Mathematics, grades 3–8 and high school; operational school year 2014-15 • State-Developed Assessments: Science, grades 5, 8 and high school; Social Studies, grades 4, 6 and high school; operational school year 2014-15

  37. A Sneak Peek

  38. Pilot: Teachers’ Roles • Coaches for students • Scorers • Developers • Reviewers

  39. OPAPP Participants: Teachers • Cohort 1: Sep 2008 – May 2012 • 15 LEAs • HS: ELA, Math, Science • Cohort 2: Sep 2011 – Dec 2013 • 7 LEAs • HS: ELA, Math, Science, SS, Career Tech • Cohort 3: Jan 2012 – June 2014 • 6 LEAs • ES: ELA, Math, Science, SS • Cohort 4: Nov 2012 – Dec 2013 • 15 LEAs • HS: ELA, Math, Science, Social Studies, Career Tech • Cohort 5: July 2013 – May 2014 • Recruiting in March • ES: ELA, Math, Science, SS

  40. OPAPP Participants: Coaches • Cohort 1: 2 ELA, 3 Math, 2 Science • Cohort 2: 2 ELA, 3 Math, 2 Science, 2 SS • Cohort 3: Grade 3: 3 coaches, Grade 4: 3 coaches, Grade 5: 2 coaches • Cohort 4: 4-5 online coaches • Cohort 5: 4-5 online coaches

  41. OPAPP Participants: Higher Ed • Cohort 1: 3 • Cohort 2: up to 20 • Cohort 3: up to 15 • Cohort 4: none* • Cohort 5: none* • Purpose of HE involvement: • To influence HE teaching • To influence teacher preparation • To provide content expertise

  42. Lessons Learned • Task Writing: • It is hard to write to a non-native delivery system. • It is hard for assessment contractors to learn to write good curriculum. • It is hard to develop good rubrics for Learning Tasks. • It is hard to align Learning and Assessment Tasks well. • Online Delivery System: • Schools are not always “teched up” enough for this model. • School firewalls can be problematic for learning tasks. • Internal internet access may be more of a problem than previously thought.

  43. Lessons Learned • Teachers: • Not all teachers are ready to use technology in their classrooms or labs. • Professional Development needs to be low impact on time and high impact on practice. • Scoring/Reporting: • Need method for identifying student work for re-score (that does not put the state in the position of qualifying teachers to score). • Need more data and information about how to present results to teachers so they make sense (both to psychometricians and to teachers).

  46. Vermont

  47. Vermont State Science Assessment An Overview Gail Hall and Kathy Renfrew Science Assessment Coordinators

  48. Vermont’s Journey State Performance Assessments—since 2000 • Vermont PASS Assessment • Partnership for Assessment of Standards-based Science • NECAP Assessment • New England Common Assessment Program • Collaboration with RI and NH A Winding Trail…

  49. The Details… • Spring Assessment: Grades 4, 8, 11 • Content Domains: Life Science 24%, Physical Science 24%, Earth/Space Science 24%, Inquiry Task 28% • Data from Inquiry Performance Task investigations are collected by student partners. Scored items are answered individually. (Image: Spring in Vermont)

  50. NECAP Science Test Design • Session 3: Grades 4 & 8 • Estimated time needed: 75 minutes (schedule 120 minutes) • 7 or 8 Inquiry Task questions: 2-point Short Answer & 3-point Constructed Response
