Technical Assessment Development and Validation: Methods for Ensuring The Utility, Validity and Reliability of Technical Skill Assessment Systems
Session Outcomes • Build understanding of key criteria for technical skill assessments: • Utility • Validity • Reliability • Understand a state-led process for developing a technical assessment system that meets such criteria.
Carl D. Perkins Career and Technical Education Act of 2006 • Each state established a performance accountability system with multiple measures of student learning, program completion, and transitions to further education, employment and the military • Perkins III allowed wide flexibility in how to measure “technical skill attainment” • Perkins IV requires a more focused assessment approach for technical skill attainment
Technical Skill Attainment -- Secondary • Sec 113 (b)(2)(A) • …”core indicators of performance…that are valid and reliable… measures of each of the following:” • “Student attainment of career and technical skill proficiencies, including student achievement on technical assessments, that are aligned with industry-recognized standards, if available and appropriate.”
Technical Skill Attainment -- Postsecondary • Sec 113 (b)(2)(B) • …”core indicators of performance…that are valid and reliable…measures of each of the following:” • “Student attainment of career and technical skill proficiencies, including student achievement on technical assessments, that are aligned with industry-recognized standards, if available and appropriate.” • “Student attainment of an industry-recognized credential, a certificate, or a degree.”
Critical Features … a) Measure what is important b) Provide useful and timely feedback to stakeholders c) Use fair, consistent and accurate measures (i.e., reliable and valid assessments) Key Concepts: Utility • Validity • Reliability
Utility: Something useful or designed for use. Some Core Assumptions • We have to do this, so let’s do it in a way that is going to be maximally useful to our stakeholders. • Assessment systems should ultimately influence and reflect what is occurring in the educational setting(s). • Without buy-in, the first two assumptions will never be realized. • A systematic process for stakeholder involvement and communication must be explicitly planned and built in.
Validity: To what extent does the assessment measure what it is supposed to measure? Commonly used methods… • Face and content validity • Construct validity (convergent, divergent, factor analytic techniques, etc.) • Criterion-related validity (concurrent validity, predictive validity)
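Criterion-related validity, for example, is often summarized as the correlation between assessment scores and an external criterion such as supervisor ratings of on-the-job performance. A minimal Python sketch of that calculation, using hypothetical data invented purely for illustration (none of the names or numbers come from the Wyoming project):

```python
# Criterion-related (concurrent) validity sketch: correlate technical
# assessment scores with an external criterion measure.
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical data: assessment scores and supervisor performance ratings
assessment_scores = [72, 85, 90, 64, 78, 88, 70, 95]
supervisor_ratings = [3.1, 4.0, 4.5, 2.8, 3.6, 4.2, 3.0, 4.8]

r = pearson_r(assessment_scores, supervisor_ratings)
print(f"Criterion-related validity coefficient: r = {r:.2f}")
```

A strong positive correlation would support the claim that the assessment corresponds to (or predicts) the criterion; a near-zero correlation would not.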
Reliability refers to the stability or consistency of assessment results. Does the assessment yield consistent results across different raters, different periods of time, different samples of tasks, and so forth? Commonly used methods • Internal consistency reliability • Test-retest reliability • Inter-rater reliability • Others (e.g., parallel/equivalent forms reliability, expert-rater reliability)
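For internal consistency, Cronbach's alpha is one widely reported statistic: it compares the variance of individual item scores with the variance of students' total scores. A minimal sketch, assuming item-level scores are available as one row per student (the response matrix below is hypothetical):

```python
# Internal consistency sketch: Cronbach's alpha from a students-by-items matrix.
from statistics import pvariance

def cronbach_alpha(scores):
    """scores: one list of item scores per student (all lists the same length)."""
    k = len(scores[0])                                       # number of items
    item_vars = [pvariance([s[i] for s in scores]) for i in range(k)]
    total_var = pvariance([sum(s) for s in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical responses: 5 students x 4 items (1 = correct, 0 = incorrect)
responses = [
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 0, 1],
]
print(f"Cronbach's alpha = {cronbach_alpha(responses):.2f}")
```

Test-retest and inter-rater reliability are checked analogously, by correlating scores across administrations or raters (inter-rater agreement is also commonly reported with Cohen's kappa).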
Wyoming CTE Assessment Project Goals • Establish shared expectations as to what students should know and be able to do in Wyoming’s CTE programs • Develop a valid and reliable CTE assessment system • Ensure the system provides useful, timely and accurate feedback to teachers, administrators, students and employers
Options to fulfill Perkins IV: Use Industry-Based Certifications (IBCs) or other standardized assessments – AND/OR Develop valid and reliable assessments through a statewide collaborative process
Challenges and Considerations • Access to assessment data to improve classroom instruction • Dealing with the expense of buying IBCs • Making sure the IBCs match up to the course content • Making sure the IBC is valuable to employers and the job market • Getting data from externally administered exams • Deciding when to assess (end-of-program or course by course) • Assessments that are appropriate to various program structures and goals • Assessing CTE skills AND employability skills
Putting First Things First FIRST, decide WHAT to assess. THEN decide HOW to assess.
Source of Standards (diagram): SCANS, SCCI Knowledge & Skills (K&S) statements, industry standards, de facto standards (texts and tests), and state standards. Courtesy of Steve Klein & MPR Associates
Can One Assessment Measure It All? Assessment options for a Program of Study include an industry certification test, a state test built from national item banks, and a commercial employability skills test. No matter the approach, the program will inherently include applied academic skills, employability skills, cluster- and pathway-level skills, and program/occupation skills. When considering assessments, consider whether one assessment can adequately measure all of those skills.
Setting up the Structure Assessment Project Advisory Group • 20-25 participants: CTE administrators, community college administrators, teachers from various clusters, and state agency staff • Provide general input on the development process • Liaison to the education communities at the secondary and postsecondary levels • Identify and prioritize clusters for development in the remainder of the project • Meet in person and through webinars, 2-3 times per year
Setting up the Structure Business/Industry Advisory Group • Cross-section of business/industry representatives • Should include representatives from each of the three initial clusters (Agriculture, Construction, Manufacturing) • Provide general input on the development process from a business/industry perspective • Liaison to the business communities across the state • Review, provide input on, and affirm content developed by the Cluster/Pathway Work Groups • Advise on raising the value of CTE and the CTE assessment system within Wyoming business/industry • Meet in person and through webinars, 2-3 times per year
Setting up the Structure Cluster/Pathway Work Groups • 7-10 content experts in each of three clusters: Agriculture, Construction, Manufacturing • Provide input on the priority competencies to include in the assessment system • Assist in identifying the relative usefulness and applicability of existing assessments • Provide input on any state-developed assessments that are determined to be necessary • Kick-off briefings on March 7, 2008 • Work sessions March 16-20, 2008 • In-person and webinar follow-up sessions through June 2008 • Optional involvement in the assessment pilot phase
Identifying Competencies and Objectives • March 08 • Convene initial Cluster/Pathway Work Groups (CPWGs). • Each CPWG identifies the core competencies (technical, academic, employability) that need to be assessed in its Cluster/Pathway. • April-May 08 • Draft Competencies are completed by the CPWGs and posted online for review. • Other WY teachers and faculty invited to review and comment on the Draft Competencies. • Cluster/Pathway Competencies finalized.
Identifying Test Items and Assessment Options • May-June 08 • 5/12/08, Manufacturing and Arch/Construction CPWGs meet to review sample test items for the Competencies • June 08 • Agriculture/Natural Resources CPWG meets to review sample test items for the Competencies • Consultant team gathers information on assessment resources (NOCTI, SkillsUSA, industry groups) and delivery system options • Consultant team completes a feasibility report for the assessment development phase.
Pilot Testing Assessments and Next Steps • Fall 2008 – Spring 2009 • Development and pilot testing of first phase assessments for initial clusters • Possibly begin to work with additional Cluster/Pathway Work Groups to identify Essential Core competencies for other areas
Overview of Assessment Development Process • Identify competencies and objectives. • Decide what to assess and how (e.g., develop an assessment blueprint). • Conduct a feasibility phase to examine existing options. • Pilot assessments and conduct the analyses needed to document the technical quality of the assessments. • Finalize the assessments, the delivery approach, and the key features the system must possess.
What is an Assessment Blueprint? An assessment blueprint helps determine what should be covered in the assessment(s) and how many test items should be included in each category. It can also be used to determine the total length of the test and the types of items to include.
Example: NOCTI Experienced Worker Assessment Blueprint Areas covered in the Building Construction Occupations, Written Assessment: • Carpentry 29% • Electrical 17% • Plumbing 7% • Math 4% • Metal Work/Guttering 6% • Painting & Decorating 10% • Building Code & Safety 8% • Masonry 19%
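A percentage blueprint like the one above translates directly into item counts once a total test length is fixed. A minimal sketch, assuming a hypothetical 100-item written test (the total is an assumption chosen for illustration, not a NOCTI specification):

```python
# Translate blueprint weights (percent of the test) into item counts
# for an assumed total test length.
blueprint = {
    "Carpentry": 29, "Electrical": 17, "Plumbing": 7, "Math": 4,
    "Metal Work/Guttering": 6, "Painting & Decorating": 10,
    "Building Code & Safety": 8, "Masonry": 19,
}
total_items = 100  # assumed length; adjust to the real test specification

for area, pct in blueprint.items():
    n_items = round(total_items * pct / 100)
    print(f"{area:25s} {pct:3d}%  ->  {n_items} items")
```

If the chosen total does not divide evenly by the weights, the rounded counts may need a small manual adjustment so they sum to the intended test length.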
One last example … A sample blueprint for a test on human geography follows the same approach. Constructing the blueprint is a useful process because it helps ensure that the items produced cover both the content of the program and the educational objectives. It also allows the relative 'worth' of individual items to be balanced and determined.
Developing an Assessment Blueprint • What relative emphasis do you want to place on cluster-level competencies versus pathway-level competencies? • What relative emphasis do you want to place on areas within the pathway? • Within a cluster/pathway, are there objectives that are a greater priority to measure than others? • What parameters do you wish to set for the total length and duration of the test? • How should different types of assessment items be distributed across the areas? How many multiple-choice, short-answer/constructed-response, or performance tasks? (Note: consider Bloom’s Taxonomy, etc.)
Some factors to consider when examining potential existing assessments • Alignment: Do the items align or match up with the competencies/objectives we’ve identified as important? (i.e., is the assessment measuring what we want it to measure?) • Ease of use: Is it manageable for teachers and students (e.g., administration method, clarity of directions, resource requirements)? • Administration time • Cost • Flexibility • Fairness • Content in terms of assessment items (see next slide)
Some characteristics of good assessment items … • Each item has a specific purpose and is designed to test a significant learning outcome. • Items are clear: avoid irrelevant material, and keep language easy to understand; otherwise you may be measuring students’ comprehension of English rather than the trait you wish to measure. • Multiple-choice items contain plausible distractors. • The items within the assessment employ multiple methods, providing a more complete picture of student knowledge and/or skills. • Questions discriminate between more able and less able students, and allow students to go well beyond the threshold requirements if they are able to (a quick check is sketched below).
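Whether items discriminate between more and less able students is commonly checked during pilot testing with a simple upper-lower discrimination index: the difference between the proportion of high scorers and the proportion of low scorers who answer an item correctly. A minimal sketch, using the conventional top and bottom 27% groups and hypothetical pilot data (not from any actual Wyoming pilot):

```python
# Item discrimination sketch: upper-lower index D = p_upper - p_lower.
def discrimination_index(item_correct, total_scores, tail=0.27):
    """item_correct: 1/0 per student for one item; total_scores: each student's total test score."""
    ranked = sorted(range(len(total_scores)), key=lambda i: total_scores[i])
    n_tail = max(1, int(len(ranked) * tail))
    lower, upper = ranked[:n_tail], ranked[-n_tail:]
    p_upper = sum(item_correct[i] for i in upper) / n_tail
    p_lower = sum(item_correct[i] for i in lower) / n_tail
    return p_upper - p_lower

# Hypothetical pilot data for 10 students: one item's correctness and total scores
item = [1, 0, 1, 1, 0, 1, 0, 1, 1, 0]
totals = [88, 52, 91, 76, 48, 83, 55, 95, 70, 44]
print(f"Discrimination index D = {discrimination_index(item, totals):.2f}")
```

As a rough rule of thumb, indices above about 0.3 are usually considered acceptable; values near zero or negative flag items for review or revision.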
Key features of the assessment system? • Timing of assessments • Types of scores produced • Reporting (ongoing documentation? access?) • Order of presentation (items presented in a specific order, randomly, etc.) • If online, desired features (security/access, timed access, etc.) • Other things you want to be sure the system has or is able to do? As a teacher, what are the key features this CTE assessment system should have in order to make it really useful for you and your students?
For more information, contact: • Mariam Azin: mazin@presassociates.com www.presassociates.com • Hans Meeder: Hans@MeederConsulting.com www.MeederConsulting.com