1.55k likes | 1.88k Views
POLS 606. Hierarchical Models. Intro . Who are you? Fields Substantive interests? I am Dave American politics Campaigns and elections. Logistics. Book—Gelman and Hill Snijders and Bosker is recommended. Chapter 2 of G&H is up to you to master G&H don’t rely on matrix algebra to teach
E N D
POLS 606 Hierarchical Models
Intro • Who are you? • Fields • Substantive interests? • I am Dave • American politics • Campaigns and elections
Logistics • Book—Gelman and Hill • Snijders and Bosker is recommended. • Chapter 2 of G&H is up to you to master • G&H don’t rely on matrix algebra to teach • Probability and simulation • Bayesian • R • Not easier or harder, just different.
Lectures • There will be a mix • Math (no PowerPoint) • There will be lectures using R • Will tend to alternate
R • Very powerful/flexible • Wave of the future • Need to understand stuff more • I have never used it before so we will learn it together
Grades • Homework • There will be a bunch and they will be a mix of practical and theory • Final • Questions I would write for the methods exam • Paper • Original research using HLM. Won’t be due until start of Fall Semester
What is a multilevel model? • Theory tells you that concepts at more than one level of aggregation are related • Usually thought of as geographic • Countries • States • Schools
What is a multilevel model? • Doesn’t have to be • Time • Experimental Condition • Institutions • Regime • Bureaucracies • Individuals (panel data)
Theory is key! • Two types of relationships • Random intercepts • Mean value of DV depends on aggregate unit • Random slopes • Effect of IV depends on aggregate unit • Can have both
So you have multilevel data • Choice 1: Aggregate • Combine data to the highest level of aggregation • Create “average” value of variables for each higher unit • Advantage • Easy! • Can easily weigh based on N • Straightforward
Aggregate • Disadvantages • Shifts meaning • Variables are macro level. Theory is (presumably) micro level. • Ecological fallacy • Classic example: Race and literacy (Robinson 1950)
Why different? • The key is the within region correlations
Both individual and ecological correlation depend on this, but in different ways • Individual depends on the internal cells of the region table • Weighted average of the corrs within the regions • Ecological on depends on the marginals • Only the marginals—no use of info in the cells.
So? • The things that go into the calculation of the ecological correlation do not tell us anything about what we are interested in.
Some Math • Assume • Total group of N Persons • Two variables x & x • N people divided in to m groups. • X & Y are % of x & y in each of the m groups
Three correlations • 1) Total individual correlation (r) • Correlation ignoring the grouping • 2) Ecological correlation (re) • Correlation between m pairs (weighted by n of m) • 3) within area Correlations (rw) • This is the weighted average of the correlations withi the m groups
Two correlation ratios • ηXA & ηYA • Measure the degree of clustering of X&Y by area • High ηXA means wide variation in X across regions
Math • Can write the relationship between the correlations as:
So? • re, then, is the weighted difference between individual correlation (thing we care about) and the average of m within area correlations where weights depend on clustering • Bias is not innocuous. Correlations are inflated. re will be large in magnitude than r. • Cannot infer across levels. Don’t do it. Won’t get away with it.
Disaggregate • You could ignore the higher level of aggregation and pretend everything is observed at the individual level • Advantage: Easy and generous • Disadvantage • You are lying. • Overstate power • Ignores correlations in the errors (not iid)
Dependence of errors • The problem is a function of the intraclass correlation • Simple model: • Y is the DV • μ = Grand Mean • Uj = Macro effects (errors) • Rij = Unit specific errors
Intraclass correlation • Errors all mean 0 • Expected value of macro units are: μ+Uj
Intraclass correlation • It is the proportion of the variance in Y explained by the macro effects • The key concept in HLM • It is the degree of similarity of observations within the groups
Intraclass Correlation • Note that it changes the error variance • OLS assumes that errors are uncorrelated across observations. • This says they aren’t. • Inflates power • Shrinks standard errors • Macro variables will try to account for this
Other solutions to multilevel data • Dummy variables • Doesn’t fix standard errors • Can’t specify interesting effects • Clustering • Fixes errors but not all other problems. • Ignores any systematic problems and the theories associated
Real Solution? HLM • Effects may vary (random slopes) • Use all of the info available and use it accurately • Better predictions • Account for structure in data • Efficiency • Accurate standard errors
How HLM? R! • R is a different kind of stats package • It is a language, not a program • Open source • http://cran.r-project.org/ • Problem is that it is not obviously user-friendly • No point and click front end embedded. • This can be addressed—R is adaptable
R • The computer staff tells me it is installed and they will install it on your office machines • Update by adding packages • Rcmdr – gui interface • arm • BRugs • R2WinBUGS • car • foreign • DAAG • Matrix and lme4 if not automatically
Packages • packages are commands or sets of programs to do things. • sessionInfo() tells you what are currently attached • library(“name”)
R • Need to load packages each time • The basic starting place for R is the command Prompt (>) • R will take anything you type at this line as a command and will respond • Load packages as library(arm) • Can (and probably should) write a script to do it all
R • If you just start typing stuff, R assumes you are telling it to evaluate a statement • 2+2 • pi • Any math equation. • R wants you to define “objects” • Everything needs to be an object
Commands • Basic format • “object”<-”command”(“definition”, option, option) • Example: open data • kidiq <- read.dta(file="c:/R/kidiq.dta") • reads the childrens IQ score data used in Chapter2 • “kidiq” names object kidiq • “<-” tells R that you are going to give it a definition • “read.dta” is the command to read data • “(file=“c:/R/kidiq,dta”)” tells it which data. Note / not \
R • Random things about objects • Case sensitive • Can (and often do) have . in the name • Will remember that they are there • Can see objects by ls() command • <- defines (equivalent to =) • Look at example 1
R working directory and workspace • Each session has a working directory • Where R looks for files • If launched from windows icon can define under properties (right click) • getwd() • ls() • q() • Save workspace image? • Saves all objects in a .RData file
Help! • help.start() • help(“name”)
Script • Can do line by line commands, but those are slow, temporary and error prone • better to use script editor: • File->new script • control+N • Can save and re-load
Missing data • R handles missing data • uses “NA” • Will read in data and convert just fine
Reading in data • kidiq <- read.dta(file="c:/R/kidiq.dta") • We have seen this before • Attaching: • in commands you need to tell R which data you are using (in fact, you can have lots of data sets loaded at once). • fit<-lm(kid.score~mom.hs, data=kidiq) • The command is attach • attach(kidiq) • fit<-lm(kid.score~mom.hs) • detach(kidiq)
Attach • R looks for things in a particular order • search() • Attach moves stuff around in the order • Order matters a lot—names of objects versus names of variables
Rcmdr • Handy front end, point and click • library(Rcmdr) • Has a script window • Nice, but don’t lean on it too hard • Thinks it is smarter than you
JGR (“Jaguar”) • Need to download and install it • Probably need computer staff for machines • Launches separate from R • Package manager is very nice • Runs separate version of R
Graphics • R has wonderful graphics if you can them to do what you want. • demo(graphics) • Starting point is plot() • plot(y~x) • plot(x, y) • graphs.R
Regression • You should know the basics and this should be review • Data being used: kidsiq (same as before)
Regression kid.score = a + b(mom.hs) + error lm(formula = kid.score ~ mom.hs) coef.est coef.se (Intercept) 77.55 2.06 mom.hs 11.77 2.32 n = 434, k = 2 residual sd = 19.85, R-Squared = 0.06 • Interpret • 78 = E(kid.score) if mom.hs=0 • 12 = Expected change in ks when mom.hs = 1
Regression • kid.score = a + b(mom.iq) + error lm(formula = kid.score ~ mom.iq) coef.est coef.se (Intercept) 25.80 5.92 mom.iq 0.61 0.06 n = 434, k = 2 residual sd = 18.27, R-Squared = 0.20 • Interpret • 26 = E(kid.score) when mom.iq=0 • 0.61 = expected change in ks for every iq point of mom
Regression • Both predictors lm(formula = kid.score ~ mom.hs + mom.iq) coef.est coef.se (Intercept) 25.73 5.88 mom.hs 5.95 2.21 mom.iq 0.56 0.06 n = 434, k = 3 residual sd = 18.14, R-Squared = 0.21 • Interpret?
Interactions • Remember, sometimes the effect of a variable is conditional on another variable • In stata you need to create the interaction, in R you can do it on the fly