ONR Advanced Distributed Learning

Impact of Language Factors on the Reliability and Validity of Assessment for ELLs
Jamal Abedi
University of California, Los Angeles
National Center for Research on Evaluation, Standards, and Student Testing (CRESST)
July 18, 2003

Classical Test Theory: Reliability

$\sigma^2_X = \sigma^2_T + \sigma^2_E$

X: Observed Score
T: True Score
E: Error Score

$r_{XX'} = \sigma^2_T / \sigma^2_X$
$r_{XX'} = 1 - \sigma^2_E / \sigma^2_X$

Textbook examples of possible sources that contribute to the measurement error:
Rater
Occasion
Item
Test Form
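The two reliability expressions above are algebraically equivalent, which a short numeric check can make concrete. The sketch below is illustrative only: the function name `reliability` and the variance values are assumptions chosen for the example, not figures from the presentation.

```python
def reliability(var_true: float, var_error: float) -> float:
    """Return r_XX' = var_true / var_observed under classical test theory."""
    # Observed variance is the sum of true-score and error variance.
    var_observed = var_true + var_error
    if var_observed <= 0:
        raise ValueError("observed variance must be positive")
    return var_true / var_observed


# Hypothetical variance components, for illustration only.
var_true, var_error = 80.0, 20.0

r = reliability(var_true, var_error)
# Second form from the slide: 1 - error variance / observed variance.
r_alt = 1 - var_error / (var_true + var_error)

print(round(r, 3), round(r_alt, 3))
```

Any increase in error variance with true-score variance held fixed lowers the coefficient, which is why an unrecognized source of error matters for ELL assessment.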

Generalizability Theory: Partitioning Error Variance into Its Components

$\sigma^2(X_{pro}) = \sigma^2_p + \sigma^2_r + \sigma^2_o + \sigma^2_{pr} + \sigma^2_{po} + \sigma^2_{ro} + \sigma^2_{pro,e}$

p: Person
r: Rater
o: Occasion

Are there any sources of measurement error that may specifically influence ELL performance?
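To illustrate the person x rater x occasion decomposition, the following sketch sums a set of variance components and reports each component's share of the total. The helper name `variance_shares` and all numeric values are hypothetical; the slides give no estimates for these components.

```python
from typing import Dict


def variance_shares(components: Dict[str, float]) -> Dict[str, float]:
    """Return each variance component as a proportion of total variance."""
    total = sum(components.values())
    if total <= 0:
        raise ValueError("total variance must be positive")
    return {name: value / total for name, value in components.items()}


# Hypothetical estimates for the seven components of a p x r x o design.
components = {
    "p": 50.0,      # person (the object of measurement)
    "r": 5.0,       # rater
    "o": 5.0,       # occasion
    "pr": 10.0,     # person x rater
    "po": 10.0,     # person x occasion
    "ro": 2.0,      # rater x occasion
    "pro,e": 18.0,  # three-way interaction confounded with residual error
}

shares = variance_shares(components)
for name, share in shares.items():
    print(f"{name:>6}: {share:.2f}")
```

Everything other than the person component represents variation that is not attributable to differences among examinees.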

Validity of Academic Achievement Measures

We will focus on construct and content validity approaches:

A test's construct validity is the degree to which it measures the theoretical construct or trait that it was designed to measure (Allen & Yen, 1979, p. 108).

A test's content validity involves the careful definition of the domain of behaviors to be measured by a test and the logical design of items to cover all the important areas of this domain (Allen & Yen, 1979, p. 96).

Examples:
A content-based achievement test has construct validity if it measures the content that it is supposed to measure.
A content-based achievement test has content validity if the test content is representative of the content being measured.

Two major questions on the psychometrics of academic achievement tests for ELLs:

Are there any sources of measurement error that may specifically influence ELL performance?
Do achievement tests accurately measure ELLs' content knowledge?

Study #9
Impact of students' language background on content-based performance: analyses of extant data (Abedi & Leon, 1999).
Analyses were performed on extant data, such as Stanford 9 and ITBS.
SAMPLE: Over 900,000 students from four different sites nationwide.

Study #10
Examining ELL and non-ELL student performance differences and their relationship to background factors (Abedi, Leon, & Mirocha, 2001).
Data were analyzed for the language impact on assessment and accommodations of ELL students.
SAMPLE: Over 700,000 students from four different sites nationwide.

Findings
• The higher the level of language demand of the test items, the larger the performance gap between ELL and non-ELL students.
• Large performance gap between ELL and non-ELL students on reading, science, and math problem solving (about 15 NCE score points).
• This performance gap was reduced to zero in math computation.

Normal Curve Equivalent Means and Standard Deviations for Students in Grades 10 and 11, Site 3 School District

                  Reading         Science         Math
                  M       SD      M       SD      M       SD
Grade 10
  SD only         16.4    12.7    25.5    13.3    22.5    11.7
  LEP only        24.0    16.4    32.9    15.3    36.8    16.0
  LEP & SD        16.3    11.2    24.8     9.3    23.6     9.8
  Non-LEP & SD    38.0    16.0    42.6    17.2    39.6    16.9
  All students    36.0    16.9    41.3    17.5    38.5    17.0
Grade 11
  SD only         14.9    13.2    21.5    12.3    24.3    13.2
  LEP only        22.5    16.1    28.4    14.4    45.5    18.2
  LEP & SD        15.5    12.7    26.1    20.1    25.1    13.0
  Non-LEP & SD    38.4    18.3    39.6    18.8    45.2    21.1
  All students    36.2    19.0    38.2    18.9    44.0    21.2

Note. LEP = limited English proficient. SD = students with disabilities.
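The pattern in the table is easier to see as mean differences. The sketch below subtracts the LEP-only mean from the "Non-LEP & SD" mean for each subject and grade, using the means reported above. The row label is kept exactly as it appears on the slide, and the choice of these two rows as the comparison groups is an assumption made for illustration.

```python
# NCE means copied from the Site 3 table above: (Reading, Science, Math).
SUBJECTS = ("Reading", "Science", "Math")
means = {
    10: {"LEP only": (24.0, 32.9, 36.8), "Non-LEP & SD": (38.0, 42.6, 39.6)},
    11: {"LEP only": (22.5, 28.4, 45.5), "Non-LEP & SD": (38.4, 39.6, 45.2)},
}

# Gap = comparison-group mean minus LEP-only mean, rounded to one decimal.
gaps = {
    grade: {
        subject: round(rows["Non-LEP & SD"][i] - rows["LEP only"][i], 1)
        for i, subject in enumerate(SUBJECTS)
    }
    for grade, rows in means.items()
}

for grade, by_subject in gaps.items():
    print(grade, by_subject)
```

In both grades the reading gap is the largest and the math gap the smallest, which is consistent with the finding that the gap grows with the language demand of the items.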

The Disparity Index (DI) is an index of performance differences between LEP and non-LEP students.

Site 3 Disparity Index (DI): Non-LEP/Non-SD Students Compared to LEP-Only Students

                      Disparity Index (DI)
Grade    Reading    Math Total    Math Calculation    Math Analytical
3         53.4        25.8            12.9                32.8
6         81.6        37.6            22.2                46.1
8        125.2        36.9            25.2                44.0
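The slide does not state how the DI is computed. One plausible reading, assumed here and not confirmed by the presentation, is a percentage difference: the non-LEP mean minus the LEP mean, divided by the LEP mean, times 100. The sketch applies that assumed formula to the Grade 10 NCE means from the Site 3 table, since the Grade 3, 6, and 8 means behind the DI values above are not shown.

```python
def disparity_index(non_lep_mean: float, lep_mean: float) -> float:
    """Assumed DI: percentage by which the non-LEP mean exceeds the LEP mean."""
    if lep_mean == 0:
        raise ValueError("LEP mean must be non-zero")
    return (non_lep_mean - lep_mean) / lep_mean * 100


# Grade 10 means from the Site 3 table ("Non-LEP & SD" row vs. "LEP only" row).
di_reading = disparity_index(38.0, 24.0)
di_math = disparity_index(39.6, 36.8)

print(round(di_reading, 1), round(di_math, 1))
```

Under this assumed definition, a DI of 0 means the two groups have equal means, and larger values mean a larger relative advantage for non-LEP students.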

Issues and problems in classification of students with limited English proficiency

Findings

The relationship between language proficiency test scores and LEP classification

Since LEP classification is based on students' level of language proficiency and because LAS is a measure of language proficiency, one would expect to find a perfect correlation between LAS scores and LEP levels (LEP versus non-LEP).

The results of analyses indicated a weak relationship between language proficiency test scores and language classification codes (LEP categories).

• Correlation between LAS rating and LEP classification for Site 4

Correlation coefficients between LEP classification code and ITBS subscales for Site 1

Generalizability Theory: Language as an Additional Source of Measurement Error

$\sigma^2(X_{prl}) = \sigma^2_p + \sigma^2_r + \sigma^2_l + \sigma^2_{pr} + \sigma^2_{pl} + \sigma^2_{rl} + \sigma^2_{prl,e}$

p: Person
r: Rater
l: Language

Are there any sources of measurement error that may specifically influence ELL performance?
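To show how a language facet could affect score dependability, the sketch below computes the standard generalizability coefficient for relative decisions in a fully crossed random person x rater x language design. The coefficient formula is the textbook one from generalizability theory, not something shown on the slide, and the variance estimates, the facet sample sizes `n_r` and `n_l`, and the function name `g_coefficient` are all hypothetical.

```python
def g_coefficient(var_p: float, var_pr: float, var_pl: float,
                  var_prl_e: float, n_r: int, n_l: int) -> float:
    """Generalizability coefficient for a crossed p x r x l random design."""
    # Relative error variance: only interactions involving persons contribute,
    # each divided by the number of conditions sampled for that facet.
    rel_error = var_pr / n_r + var_pl / n_l + var_prl_e / (n_r * n_l)
    return var_p / (var_p + rel_error)


# Hypothetical variance components, for illustration only.
with_language = g_coefficient(var_p=50.0, var_pr=10.0, var_pl=20.0,
                              var_prl_e=20.0, n_r=2, n_l=1)
# Same design with no person x language interaction variance.
without_language = g_coefficient(var_p=50.0, var_pr=10.0, var_pl=0.0,
                                 var_prl_e=20.0, n_r=2, n_l=1)

print(round(with_language, 3), round(without_language, 3))
```

With these made-up values, the person x language interaction alone accounts for the drop in the coefficient, which is the kind of effect the closing question asks about.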
