Sample analysis using the ICCS data An application of HLM Daniel Caro November 25
Purpose • Illustrate the use of hierarchical linear models (HLM) with ICCS 2009 data through the evaluation of specific hypotheses
Table of contents • HLM theory • Applied research example • HLM data importing/estimation settings • Hypothesis testing
Data structure • Often participants of studies are nested within specific contexts • Patients treated in hospitals • Firms operate within countries • Families live in neighborhoods • Students learn in classes within schools • Data stemming from such research designs have a multilevel or hierarchical structure
Implications of research design • Observations are not independent within classes/schools • Students within schools tend to share similar characteristics (e.g., socioeconomic background and instructional setting) • Traditional linear regression (OLS) assumes: • Correlation(ei, ej) = 0 for i ≠ j, i.e., the differences between observed and predicted Y (the residuals) are uncorrelated • Ignoring dependence of observations may lead to wrong conclusions
Intra-class correlation coefficient • The intra-class correlation coefficient (ICC) measures the degree of data dependence • It is equal to the proportion of the total variance that lies between schools, i.e., $\mathrm{ICC} = s_b^2 / (s_b^2 + s_w^2)$ • where $s_b^2$ is the variance between schools and $s_w^2$ the variance within schools (i.e., between students) • If ICC = 0, responses of students within schools are uncorrelated • If ICC = 1, responses within schools are identical
Effective sample size • A higher ICC value indicates greater dependence among observations within schools • Effective sample size is smaller than observed sample size • Effective n = mk / (1 + ICC*(m-1)) • where n = sample size, m = number of students per school, and k = number of schools • If ICC = 1, effective n is equal to the number of schools (k) • If ICC = 0, effective n is equal to the observed n (i.e., mk) • In general, effective n lies between k and mk
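The effective sample size formula can be tried out with a few lines of Python. This is a minimal sketch that assumes equal class sizes; the design of 150 schools with 20 students each is hypothetical and not taken from the ICCS data.

```python
def effective_sample_size(m: int, k: int, icc: float) -> float:
    """Effective n = m*k / (1 + ICC*(m - 1)) for k schools of m students each."""
    if m < 1 or k < 1:
        raise ValueError("m and k must be positive")
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie in [0, 1]")
    return (m * k) / (1.0 + icc * (m - 1))


# Hypothetical design: 150 schools with 20 students each (observed n = 3000).
for icc in (0.0, 0.1, 1.0):
    print(f"ICC = {icc}: effective n = {effective_sample_size(20, 150, icc):.1f}")
```

At the two extremes the function returns mk and k, as stated on the slide; with an ICC of 0.1 the 3000 observed students carry roughly as much information as 1034 independent ones.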
Limitations of OLS • OLS neglects the ICC and computes standard errors based on the observed n • But effective n is smaller than observed n when observations are correlated • The standard error is inversely proportional to the square root of n • Thus, OLS tends to underestimate the standard error • Underestimated standard errors can lead to incorrect significance tests and inferences • The JRR (jackknife repeated replication) method produces correct standard errors under a multilevel research design
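How much the standard error is understated can be sketched for the simplest case, the standard error of a mean with equal class sizes. The function name and the numbers (SD of 10, 150 schools of 20 students, ICC of 0.1) are illustrative assumptions, not ICCS results.

```python
import math


def naive_and_adjusted_se(sd: float, m: int, k: int, icc: float):
    """Return (naive SE, adjusted SE) of a mean for k clusters of size m.

    The naive SE uses the observed n = m*k; the adjusted SE uses the
    effective n = m*k / (1 + ICC*(m - 1)).
    """
    n_observed = m * k
    n_effective = n_observed / (1.0 + icc * (m - 1))
    return sd / math.sqrt(n_observed), sd / math.sqrt(n_effective)


naive, adjusted = naive_and_adjusted_se(sd=10.0, m=20, k=150, icc=0.1)
print(f"naive SE = {naive:.3f}, adjusted SE = {adjusted:.3f}")
print(f"ratio    = {adjusted / naive:.2f}")  # equals sqrt(1 + ICC*(m - 1))
```

The ratio of the two standard errors is the square root of 1 + ICC*(m-1), so even a modest ICC inflates the correct standard error noticeably when classes are large.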
Hierarchical linear models • Additionally, hierarchical linear models distinguish effects between and within clusters/schools • For example, they enable evaluating • The effect of SES on student achievement within schools and between schools • The effect of school location (urban/rural) on the average achievement between schools
Hierarchical linear models • Account explicitly for the multilevel nature of the data with the introduction of random effects • Consider ICC for calculation of standard errors, tests, and p-values • Decompose variance within and between schools • Student level variables explain variance within schools or between students • School level variables explain variance between schools • A single R-squared cannot be reported • Instead, there is one for each level
Hierarchical linear models • Estimate regressions within schools • Provide estimates of the intercept and coefficients (e.g., gender gap, SES effect) for each school • Level 1 (student) coefficients can themselves be treated as dependent variables that depend on level 2 (school) characteristics • For example, the gender gap at the student level (i.e., the gender coefficient) may vary between classes depending on the gender of the class teacher at level 2
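The idea that level 1 coefficients become outcomes at level 2 can be written compactly. The block below is a sketch in the standard two-level notation of Raudenbush & Bryk (2002); the single student-level predictor $X_{ij}$ and the single class-level predictor $W_j$ are placeholders chosen for illustration, not the specification used in the applied example.

```latex
\begin{align*}
\text{Level 1 (student $i$ in class $j$):}\quad
  Y_{ij} &= \beta_{0j} + \beta_{1j} X_{ij} + r_{ij} \\
\text{Level 2 (class $j$):}\quad
  \beta_{0j} &= \gamma_{00} + \gamma_{01} W_j + u_{0j} \\
  \beta_{1j} &= \gamma_{10} + \gamma_{11} W_j + u_{1j}
\end{align*}
```

Here $r_{ij}$ is the student-level residual and $u_{0j}$, $u_{1j}$ are the class-level random effects, each with mean zero. In the teacher-gender example, $X_{ij}$ would be the student's gender, $W_j$ the teacher's gender, and $\gamma_{11}$ the extent to which the gender gap differs by teacher gender.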
Table of contents • HLM theory • Applied research example • HLM data importing/estimation settings • Hypothesis testing
Research goal • Evaluate 10 hypotheses related to the attitudes of students towards equal rights for immigrants • The literature underscores the importance of: • Family SES, participation in diverse networks, intergroup discussion about civic issues, gender, social dominance orientation, civic knowledge, religious beliefs, school location (urban/rural), and school climate • References in ‘C:\ICCS2009\HLM training\References.pdf’ • For each hypothesis • Theory and independent variables
Related data and variables • Selected country • England • The analysis is restricted to international scales/variables • A description of the dependent and independent variables, their type, coding scheme, and source is in • C:\ICCS2009\HLM training\List of variables.pdf • The student (england1.sav) and class level (england2.sav) datasets are in • C:\ICCS2009\HLM training\Data
Data structure • Students (level 1 units) are nested in classes (level 2 units) • The ICCS sample design yields an optimal sample of students within classes, but not an optimal sample of students within schools • Usually one class was selected within each school, rather than students across different grades
NOTE • This is a didactic example only. You will not be able to readily repeat this analysis during the presentation
Table of contents • HLM theory • Applied research example • HLM data importing/estimation settings • Hypothesis testing
HLM software • HLM estimates different types of hierarchical linear models • The applied example is for two-level models (students nested in classes) • Several steps are required to estimate a model: • Creating the data specifications file (.mdmt) • Importing data to HLM (.mdm) • Deciding on settings (e.g., weights, plausible values) • Specifying the model (.hlm) • Estimating the model
Missing data • HLM accepts multiply imputed datasets • The multiple imputation (MI) procedure is performed in other software • Consult NORM, PAN, MICE in Stata and R, for example • Since data are normally not missing completely at random, it is recommended to conduct MI before model estimation • But for this example we will use available data only • HLM offers two options at level 1 • Listwise deletion (making mdm): Sample is the same for all models • Pairwise deletion (running analysis): Sample depends on included variables • Missing values at level 2 substantially reduce the sample size
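The practical difference between the two level 1 options can be illustrated outside HLM. The sketch below is plain Python, not HLM syntax: the four student records and the helper `complete_cases` are hypothetical, and only the variable names (IMMRGHT, HISCED, HISEI) come from the example.

```python
# Hypothetical student records; None marks a missing value.
students = [
    {"IMMRGHT": 52.0, "HISCED": 3, "HISEI": 45.0},
    {"IMMRGHT": 47.5, "HISCED": None, "HISEI": 38.0},
    {"IMMRGHT": 60.1, "HISCED": 4, "HISEI": None},
    {"IMMRGHT": None, "HISCED": 2, "HISEI": 51.0},
]


def complete_cases(rows, variables):
    """Keep only the rows with no missing value on the given variables."""
    return [r for r in rows if all(r[v] is not None for v in variables)]


# Deletion when making the MDM: every variable in the file counts,
# so every later model is fitted to the same sample.
sample_mdm = complete_cases(students, ["IMMRGHT", "HISCED", "HISEI"])

# Deletion when running the analysis: only the variables in the model count,
# so the sample can change from one model to the next.
sample_model_a = complete_cases(students, ["IMMRGHT", "HISCED"])
sample_model_b = complete_cases(students, ["IMMRGHT", "HISEI"])

print(len(sample_mdm), len(sample_model_a), len(sample_model_b))
```

The first option keeps models comparable at the cost of a smaller sample; the second keeps more students per model but means that two models may not be fitted to the same students.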
Interpret and save • Folder: ‘C:\ICCS2009\HLM training\Models\model0.txt’ • Class variance = 12.14; student variance = 103.99 • ICC = 12.14 / (12.14 + 103.99) ≈ 0.10 • About 10% of the differences occur between classes
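The ICC of the null model can be checked directly from the two variance components reported above. A minimal sketch; the function name is illustrative.

```python
def icc_from_variances(between: float, within: float) -> float:
    """Share of the total variance that lies between clusters."""
    if between < 0 or within < 0 or between + within == 0:
        raise ValueError("variances must be non-negative and not both zero")
    return between / (between + within)


# Variance components reported for the null model (model0).
class_variance = 12.14
student_variance = 103.99

icc = icc_from_variances(class_variance, student_variance)
print(f"ICC = {icc:.4f}")
```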
Table of contents • HLM theory • Applied research example • HLM data importing/estimation settings • Hypothesis testing
Hypotheses • The SES Hypothesis • The Contact Hypothesis • The Intergroup Discussion Hypothesis • The Gender Hypothesis • The Social Dominance Orientation Hypothesis • The Learning Hypothesis • The Religion Belief Hypothesis • The National Identity Hypothesis • The Urban/Rural Differences Hypothesis • The School Climate Hypothesis
The SES Hypothesis • The SES hypothesis predicts more positive views of minorities among students from higher-SES families than among students from lower-SES families • Competition among low-SES groups • High-SES individuals travel and encounter culturally diverse realities • Independent variables • Parental education (HISCED) • Parental occupational status (HISEI)
Centering of Xs • The intercept is the expected value of Y when all Xs are zero • $E(Y_{ij} \mid X = 0) = E(\beta_{0j}) + \beta_{1j} \cdot 0 + \beta_{2j} \cdot 0 + \dots + \beta_{kj} \cdot 0 + E(r_{ij})$ • With $\beta_{0j} = \gamma_{00} + u_{0j}$, and since $E(r_{ij})$ and $E(u_{0j})$ are zero, $\gamma_{00} = E(Y_{ij} \mid X = 0)$ • But sometimes zero is not in the range of the Xs • E.g., if X is age, an achievement score, etc. • In that case, the intercept is not interpretable • By centering the Xs, the intercept can be interpreted as the expected value of Y at the centering value(s) of the Xs
Centering of Xs • Two options at level 1 • Grand and group (class) mean centering • The type of centering depends on the research interest (Enders & Tofighi, 2007; Raudenbush & Bryk, 2002) • Group mean centering is appropriate for unadjusted or pure within and between school effects • Grand mean centering yields school effects adjusted for student characteristics and is preferable for contextual effects
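The two centering options differ only in which mean is subtracted. The sketch below uses the Python standard library; the SES scores and class labels are made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def grand_mean_center(values):
    """Subtract the overall mean from every value."""
    grand = mean(values)
    return [v - grand for v in values]


def group_mean_center(values, groups):
    """Subtract each observation's own group (class) mean."""
    by_group = defaultdict(list)
    for v, g in zip(values, groups):
        by_group[g].append(v)
    group_means = {g: mean(vs) for g, vs in by_group.items()}
    return [v - group_means[g] for v, g in zip(values, groups)]


# Hypothetical SES scores for five students in two classes.
ses = [40.0, 50.0, 60.0, 70.0, 80.0]
class_id = ["A", "A", "A", "B", "B"]

print(grand_mean_center(ses))            # deviations from the overall mean
print(group_mean_center(ses, class_id))  # deviations from each class mean
```

After group mean centering, every class has a mean of zero on the predictor, so the predictor carries only within-class variation; after grand mean centering, differences between class means are retained.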
The SES Hypothesis • The hypothesis is supported by the parental education data • Effect size? (see stats and model estimates) • For a 1 SD increment in HISCED, IMMRGHT increases by 0.67 points (1.04*0.64), that is, about 6 percent (0.67/10.75) of a SD in IMMRGHT
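The effect-size arithmetic can be reproduced in a few lines. The slide gives the product 1.04*0.64 without saying which factor is the HISCED coefficient and which is the SD of HISCED; the sketch assumes 0.64 is the coefficient and 1.04 the SD, and the result is the same either way. The function name is illustrative.

```python
def standardized_change(coefficient: float, sd_x: float, sd_y: float):
    """Change in Y for a 1 SD increase in X, in raw units and as a share of SD(Y)."""
    change_in_y = coefficient * sd_x
    return change_in_y, change_in_y / sd_y


# Assumed roles: 0.64 = HISCED coefficient, 1.04 = SD of HISCED,
# 10.75 = SD of IMMRGHT (figures from the slide).
change, share_of_sd = standardized_change(0.64, 1.04, 10.75)
print(f"change in IMMRGHT = {change:.2f} points")
print(f"share of one SD   = {100 * share_of_sd:.1f}%")
```

The same function covers the other hypotheses: for the gender gap reported later (2.24 points for a 0/1 indicator), passing a coefficient of 2.24 with `sd_x=1.0` gives the share of a SD in IMMRGHT.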
The Contact Hypothesis • The contact hypothesis anticipates greater tolerance among students participating in diversified and extended social networks (Allport, 1954; Cote & Erickson, 2009) • Independent variables • Students' civic participation in the wider community (PARTCOM) • Students' civic participation at school (PARTSCHL) • Control for SES • Higher-SES individuals have more diversified social networks (Erickson, 2004) and are more active in voluntary associations (Curtis & Grabb, 1992)
The Contact Hypothesis • The hypothesis holds in England • Both students' civic participation in the wider community (PARTCOM) and students' civic participation at school (PARTSCHL) are positively related to the attitudes toward immigrants • For a 1 SD increment in the independent variables, the associated positive change in IMMRGHT amounts to • 7 percent of SD in IMMRGHT for PARTCOM • 11 percent of SD in IMMRGHT for PARTSCHL
The Intergroup Discussion Hypothesis • The intergroup discussion hypothesis posits that more positive attitudes toward minorities develop from dialogue on social and civic issues inside and outside the school (Dessel, 2010a) • Independent variables • Students' discussion of political and social issues outside of school (POLDISC) • Student perceptions of openness in classroom discussions (OPDISC) • Control variables • Parental education (HISCED)
The Intergroup Discussion Hypothesis • The hypothesis is validated by the data • Both students' discussion of political and social issues outside of school (POLDISC) and student perceptions of openness in classroom discussions (OPDISC) are positively related to IMMRGHT • For a 1 SD increment in the independent variables, the associated positive change in IMMRGHT amounts to • 9 percent of SD in IMMRGHT for POLDISC • 18 percent of SD in IMMRGHT for OPDISC
The Gender Hypothesis • The gender hypothesis predicts greater tolerance among girls than boys. Women tend to be more liberal, nurturing and social than men and are also expected to be more tolerant (Cote & Erickson, 2009; Gidengil, Blais, Nadeau, & Nevitte, 2003) • Independent variable • The student’s sex (GIRL)
The Gender Hypothesis • The gender hypothesis holds in England • Differences between girls and boys amount to 2.24 score points on the IMMRGHT scale, that is, 21 percent of a SD in IMMRGHT
The Social Dominance Orientation Hypothesis • The social dominance orientation (SDO) hypothesis states that gender differences are partly explained by differences in support for social inequality (Mata, Ghavami, & Wittig, 2010) • Independent variables • Female (GIRL) • Students' support for democratic values (DEMVAL) • Students' attitudes towards gender equality (GENEQL) • Students' attitudes towards equal rights for all ethnic/racial groups (ETHRGHT)