Soyeon Ahn (ahnso@msu) Ph. D. Candidate Measurement and Quantitative Methods

Application of Model-Driven Meta-Analysis & Latent Variable Framework in Synthesizing Studies Using Diverse Measures Soyeon Ahn (ahnso@msu.edu) Ph. D. Candidate Measurement and Quantitative Methods Counseling, Educational Psychology and Special Education Michigan State University

What are Challenges/Difficulties?
• Researchers focus on multiple factors.
• Researchers use diverse measures representing “the same” underlying construct; thus, effects vary due to differences in measures.
• Studies use different statistical methods to link predictors and outcomes.
• Interest is in obtaining the strength of the relationship between underlying constructs.

An Empirical Example Table 1. Example from Ahn, S. & Choi, J. (2004). Teachers’ subject matter knowledge as a teacher qualification: A synthesis of the quantitative literature on student’s mathematics achievement. Paper presented at the Annual Meetings of the American Educational Research Association, San Diego, CA.

Proposed Approach
Goal
• Provide a method for handling sparse data structures
• Acknowledge that different measures reflect the same underlying construct, or latent variable
Approach
• Ideas from model-driven meta-analysis
• Structural Equation Modeling (SEM) with latent variables, and
• A method-of-moments estimation technique

Problem Summary
In meta-analysis, the main goal is to determine the relationship between two constructs. Across all studies included in a meta-analysis, there are p predictors (xs) measuring one construct and q outcomes (ys) measuring the other. In the kth study, the relationship between the two constructs is quantified by various indices such as ρk and βk.

Data Structure Figure 1. A hypothetical meta-analysis with s studies

Structural Equation Modeling (SEM) with Latent Variables
[Path diagram: indicators x1, x2, …, xp-1, xp and y1, y2, …, yq-1, yq]

Model Specification
Let θ be the (p+q)-dimensional column vector of indicators x and y.
Assembling the following four submatrices, the corresponding covariance matrix of θ is
Measurement models:
(2) (3) (4) (5) (6)

Estimation (7)

Estimation Applying the method of moments, the correlation between two constructs is (8) (9)
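
As a hedged sketch of how a method-of-moments estimator can arise in this setting, assume standardized indicators and constructs, a single loading per indicator, and measurement errors that are uncorrelated with each other and with the constructs. The symbols ξ, η, λ, δ, ε, and ρ_{ξη} below are notation assumed for this illustration, not taken from the slides.

```latex
\begin{align*}
x_i &= \lambda_{x_i}\,\xi + \delta_i, \qquad y_j = \lambda_{y_j}\,\eta + \varepsilon_j,\\
\rho_{x_i y_j} &= \operatorname{Cov}(x_i, y_j) = \lambda_{x_i}\,\lambda_{y_j}\,\rho_{\xi\eta},\\
\hat{\rho}_{\xi\eta} &= \frac{\bar{r}_{x_i y_j}}{\lambda_{x_i}\,\lambda_{y_j}} .
\end{align*}
```

Here \bar{r}_{x_i y_j} stands for a pooled sample correlation for the pair (x_i, y_j). Combining the ratio over all available pairs would give one overall estimate; whether Equation 9 combines pairs in exactly this way is an assumption.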

Information Needed for Estimation
Two kinds of information are needed:
• Population correlations between xs and ys
• Factor loadings
Population correlations between xs and ys can be estimated by
• computing a sample-size weighted average of sample rs,
• computing a variance-weighted average of z-transformed sample rs, or
• pooling sample rs using the Generalized Least Squares (GLS) method.
However, factor loadings are rarely reported, so they must be estimated.
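
The first two pooling options can be sketched in a few lines of Python. This is a minimal illustration, not the author's code: the function names and the input values are hypothetical, and the z-based version assumes the usual large-sample variance of Fisher's z, 1 / (n − 3). GLS pooling is not shown.

```python
import math

def weighted_mean_r(rs, ns):
    """Sample-size weighted average of sample correlations."""
    total_n = sum(ns)
    return sum(n * r for r, n in zip(rs, ns)) / total_n

def fisher_z_mean_r(rs, ns):
    """Inverse-variance weighted mean on Fisher's z scale, back-transformed to r.

    Var(z) is approximately 1 / (n - 3), so each study's weight is n - 3.
    """
    weights = [n - 3 for n in ns]
    zs = [math.atanh(r) for r in rs]
    z_bar = sum(w * z for w, z in zip(weights, zs)) / sum(weights)
    return math.tanh(z_bar)

# Hypothetical sample correlations for one (x, y) pair from three studies.
rs = [0.30, 0.45, 0.38]
ns = [30, 30, 30]
es1 = weighted_mean_r(rs, ns)   # sample-size weighted average
es2 = fisher_z_mean_r(rs, ns)   # z-transformed, variance-weighted average
```

With equal sample sizes the first estimate reduces to the simple mean of the rs; the two estimates differ more as correlations grow large or sample sizes become unequal.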

Extracting Unknowns
Use of reliability information
• From Bollen’s (1989) definition of reliability,
Use of expert judgments
• Each content expert would be asked to rank-order all indicators in terms of the validity of the measures.
• Experts would also be asked to provide approximate values for the validity coefficients of the indicators.
Set to 1
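
One way reliability information could supply the unknown loadings is sketched below. It assumes standardized indicators that each load on a single standardized construct, in which case reliability (the squared correlation between an indicator and its latent variable) equals the squared loading. The function names, the unweighted averaging over pairs, and the numbers are all hypothetical choices for illustration.

```python
import math

def loading_from_reliability(reliability):
    """Loading implied by a reliability, assuming a standardized indicator
    with one loading on a standardized construct (reliability = loading ** 2)."""
    if not 0.0 < reliability <= 1.0:
        raise ValueError("reliability must lie in (0, 1]")
    return math.sqrt(reliability)

def construct_correlation(pooled_rs, rel_x, rel_y):
    """Unweighted average of r_ij / (lambda_xi * lambda_yj) over reported pairs.

    pooled_rs maps (i, j) index pairs to pooled correlations between x_i and y_j.
    """
    estimates = []
    for (i, j), r in pooled_rs.items():
        lam_x = loading_from_reliability(rel_x[i])
        lam_y = loading_from_reliability(rel_y[j])
        estimates.append(r / (lam_x * lam_y))
    return sum(estimates) / len(estimates)

# Hypothetical reliabilities and pooled correlations.
rel_x = [0.9, 0.5]
rel_y = [0.9, 0.5]
pooled = {(0, 0): 0.45, (1, 1): 0.25}
rho_hat = construct_correlation(pooled, rel_x, rel_y)
```

In this constructed example both pairs imply the same construct-level correlation, so the average simply returns that common value; with real data the pairwise ratios would disagree and a weighting scheme would matter.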

Simulation
One construct is measured by 3 xs and the other is measured by 3 ys. All indicators are assumed to be standardized with mean 0 and standard deviation 1. A hypothetical meta-analysis with s independent studies is generated. The kth study provides zero-order correlations based on a sample size of ns. Sample correlations are generated from a multivariate normal distribution for the vector of 3 xs and 3 ys. The mean and variance-covariance matrix for the vector of indicators are predetermined.

Choice of Parameters in Simulation
• True correlation between two constructs: .5 & 0
• Reliabilities for xs and ys: .9, .5, & .2
• # of studies included (s): 9 & 36
• Within-study sample size (ns): 30
• Missing patterns of reported rs
• # of missing rs

Simulation
Data Evaluation
• Two indices of the relationship between two constructs, based on Equation 9, using the sample-size weighted average (ES1) and the z-transformed variance-weighted average (ES2)
Evaluation
• Bias & MSE of estimators are obtained.
• MANOVA on bias and MSE of estimators is performed.
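
The bias and MSE step can be sketched as follows, assuming each replication yields one estimate of the construct-level correlation. The function name and the replicate values are hypothetical; the MANOVA step is not shown.

```python
def bias_and_mse(estimates, true_value):
    """Return (bias, mse) of a list of replicate estimates.

    bias = mean(estimates) - true_value
    mse  = mean((estimate - true_value) ** 2)
    """
    n = len(estimates)
    bias = sum(estimates) / n - true_value
    mse = sum((e - true_value) ** 2 for e in estimates) / n
    return bias, mse

# Hypothetical estimates from four replications with a true correlation of .5.
replicate_estimates = [0.48, 0.53, 0.51, 0.46]
bias, mse = bias_and_mse(replicate_estimates, 0.5)
```

Computing both quantities per simulation condition gives the outcome variables that a subsequent MANOVA would compare across conditions.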

Simulation Results

Practical Considerations
• Not all needed data are reported
• Not all relationships are reported
• Method assumes that the model is correctly specified
• Method assumes that the included studies are all from the same population

Discussion The proposed approach may not generalize to every setting, but it can be extended. Future research could examine the robustness of the proposed model.

Acknowledgement This research has been supported by grants from the U.S. Department of Education, Office of Educational Research and Improvement (OERI #R305T010084) and the National Science Foundation (NSF #REC-0335656).

Thank you very much!

Others • Expert judgment scale: Dissertation\IRB application\v2_Expert Judgments on Measure Validity_2007-12-28.pdf • Relationship between reliability and validity: • Point estimation

NOTES!!!
• From here on, I have prepared some slides in case I get questions.
• Contents
• Simulation results for balanced case
• Analysis of paired response

Simulation Overview
One construct is measured by 3 xs and the other is measured by 3 ys. All indicators are assumed to be standardized with mean 0 and standard deviation 1. A hypothetical meta-analysis with s independent studies is generated. The kth study provides zero-order correlations based on a sample size of ns. Sample correlations are generated from a multivariate normal distribution for the vector of 3 xs and 3 ys. The mean and variance-covariance matrix for the vector of indicators are predetermined.

Simulation
Data generation
Sample correlation coefficients in the kth study are generated from a multivariate normal distribution for the vector of xs and ys. The vector of xs and ys is assumed to have a mean vector of 0 and a variance-covariance matrix determined from the factor loadings of the variables, the measurement errors, and the true relationship between the two constructs.
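
A minimal sketch of this data-generation step for one study is given below, using only the Python standard library. Rather than building the covariance matrix explicitly, it draws from a standardized two-construct factor model, which implies a multivariate normal vector with mean 0 and the corresponding covariance structure. The function names, the seed, and the parameter values are hypothetical, and the actual simulation code may differ.

```python
import math
import random

def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / math.sqrt(saa * sbb)

def simulate_study(rho, lam_x, lam_y, n, rng):
    """Draw n observations and return sample correlations r[i][j] of x_i with y_j.

    rho is the correlation between the two constructs; lam_x and lam_y are
    standardized loadings, so each error variance is 1 - loading ** 2.
    """
    xs = [[] for _ in lam_x]
    ys = [[] for _ in lam_y]
    for _ in range(n):
        xi = rng.gauss(0.0, 1.0)
        eta = rho * xi + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        for i, lam in enumerate(lam_x):
            xs[i].append(lam * xi + math.sqrt(1.0 - lam ** 2) * rng.gauss(0.0, 1.0))
        for j, lam in enumerate(lam_y):
            ys[j].append(lam * eta + math.sqrt(1.0 - lam ** 2) * rng.gauss(0.0, 1.0))
    return [[pearson(x, y) for y in ys] for x in xs]

rng = random.Random(2024)  # fixed seed so the sketch is reproducible
r = simulate_study(rho=0.5, lam_x=[0.7, 0.7, 0.7], lam_y=[0.7, 0.7, 0.7], n=30, rng=rng)
```

Under these settings the population correlation between any x and any y is .7 × .7 × .5 = .245, so with 30 observations per study the individual sample rs scatter widely around that value.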

Choice of Parameters in Simulation
• True correlation between two constructs: .5 & 0
• Factor loadings for xs and ys
  • All factor loadings are equally high
  • All factor loadings are equally small
  • Loadings vary from high to small
  • Total 9 patterns (3 for factor loadings of xs × 3 for factor loadings of ys)
• # of studies included (s): 9 & 36
• Within-study sample size (ns): 30

Choice of Parameters in Simulation
• Variances in reported rs
• Balanced: Equal # of sample rs across 9 pairs of xs and ys
• Unbalanced: Unequal # of rs across 9 pairs of xs and ys
  • Case 1: Select more pairs of xp and yq with high reliability
  • Case 2: Select more pairs of xp and yq with low reliability
  • Case 3: Select the x1 and y1, x2 and y2, and x3 and y3 pairs of correlations
[Figure panels: Balanced, Unbalanced: Cases 1-2; Unbalanced: Case 3]

Simulation
1,000 replications per combination
• Condition 1: Factor loadings do not vary → From 10 population variance-covariance matrices, 40 meta-analyses are generated (1 ns × 2 s × 2 rs patterns).
• Condition 2: Factor loadings do vary → From 8 population variance-covariance matrices, 80 meta-analyses are generated (1 ns × 2 s × 4 rs patterns).
Analysis
• Effects of the factors manipulated in the simulation on the bias and MSEs of the estimators will be examined using MANOVA.

Extracting Unknowns
Use of expert judgments
Raters’ preferences for indicators will be coded based on Thurstone’s (1927) discrete utility model. Denote ti as the latent random variable associated with the validity of an indicator xi. Then, the individual rater’s perception of the validity of an indicator xi, uxi, becomes , where The latent comparative response as a linear function of the latent variable uxi is , where A is a design matrix.

Extracting Unknowns
An example of coding the design matrix A for a response of [A, B, C, D] is
[Design matrix with columns A, B, C, D and rows (A,B), (A,C), (A,D), (B,C), (B,D), (C,D)]
Assuming that txi is normally distributed, the mean and covariance of are and . Then, this ranked data can be understood in an SEM framework.
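
A design matrix of this shape can be built as sketched below. The sketch assumes the common contrast coding for paired comparisons, with +1 in the column of the first item of a pair and −1 in the column of the second; the entries of the matrix on the slide may have been coded differently. The function name is hypothetical.

```python
from itertools import combinations

def paired_comparison_design(items):
    """Return (labels, matrix) with one row per unordered pair of items.

    Each row has +1 in the column of the pair's first item, -1 in the column
    of its second item, and 0 elsewhere (an assumed contrast coding).
    """
    pairs = list(combinations(range(len(items)), 2))
    matrix = []
    for first, second in pairs:
        row = [0] * len(items)
        row[first] = 1
        row[second] = -1
        matrix.append(row)
    labels = [(items[a], items[b]) for a, b in pairs]
    return labels, matrix

labels, A = paired_comparison_design(["A", "B", "C", "D"])
for label, row in zip(labels, A):
    print(label, row)
```

For four alternatives this yields six rows, in the order (A,B), (A,C), (A,D), (B,C), (B,D), (C,D). Multiplying the matrix by a vector of latent values gives one latent difference per pair, which is how a ranking can be recoded as a set of paired comparisons.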


Example for SEM with 4 alternatives

Simulation
Based on 1,000 replications of meta-analyses of 36 studies providing balanced correlations across pairs of xs and ys, generated from correlations of 0 and .5 between factors. Factor loadings of xs and ys are all .7, and the within-study sample size is 30.

Result 1
Figure 4. Simulation result 1: Gamma = 0, factor loadings of xs and ys are all .7, k = 36, within-study sample size = 30

Table 3. Descriptive statistics from simulation 1 with Gamma = 0, factor loadings of xs and ys are all .7, k = 36, within-study sample size = 30

Figure 5. Simulation result 2: Gamma = 0.5, factor loadings of xs and ys are all .7, k = 36, within-study sample size = 30

Table 4. Descriptive statistics from simulation 2 with Gamma = 0.5, factor loadings of xs and ys are all .7, k = 36, within-study sample size = 30
