IRT • Remember the basics of IRT • Individuals have many indicators of some underlying thing that we want to measure. • Unknowns • Latent “ability” (or ideology, or sophistication…) of the person • Latent “difficulty” of the choices • Latent discrimination of the question
IRT • The basic setup is a big logit • You could, if you want, write out the BUGS code to do this • Or you could rely on the programs others have written
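To make the "big logit" concrete, here is a minimal base-R sketch that simulates binary responses from a two-parameter logistic IRT model. The variable names and the parameterization (discrimination times ability, minus difficulty) are assumptions chosen to mirror the three unknowns listed above; they are not taken from any package, and packaged estimators may use a different link such as probit.

```r
# Illustrative simulation only; all names and values here are made up.
set.seed(1)
n <- 200   # persons
m <- 10    # items (questions or votes)

ability        <- rnorm(n)           # latent trait of each person
discrimination <- runif(m, 0.5, 2)   # how sharply each item separates people
difficulty     <- rnorm(m)           # how hard each item is

# Linear predictor: one row per person, one column per item
eta <- outer(ability, discrimination) -
  matrix(difficulty, nrow = n, ncol = m, byrow = TRUE)

# The "big logit": probability of a 1 for every person-item pair
p <- plogis(eta)

# Draw the observed 0/1 indicators
y <- matrix(rbinom(n * m, size = 1, prob = p), nrow = n, ncol = m)

print(dim(y))
print(round(colMeans(y), 2))   # share of 1s per item
```

Estimation runs this logic in reverse: only y is observed, and all three sets of parameters have to be recovered from it.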
IDEAL • Simon Jackman has written a nice package. • Part of the pscl library
Command ideal(object, codes = object$codes, dropList = list(codes = "notInLegis", lop = 0), d = 1, maxiter = 10000, thin = 100, burnin = 5000, impute = FALSE, normalize = FALSE, meanzero = normalize, priors = NULL, startvals = "eigen", store.item = FALSE, file = NULL, verbose=FALSE)
Ideal: calling the command • object is an object of class rollcall • This is a specific kind of object • Binary data
Roll call rollcall(data, yea=1, nay=0, missing=NA, notInLegis=9, legis.names=NULL, vote.names=NULL, legis.data=NULL, vote.data=NULL, desc=NULL, source=NULL)
Roll call, continued • data can be in two forms • Matrix with the rows corresponding to the subjects and the columns to the outcomes • List with an element named votes containing the matrix • yea • Must be numeric • Code for a yea or success • Default is 1
Roll call • nay • A lot like yea • Numeric • Value for no/fail • Default is 0 • missing • Numeric or NA • Default is NA
Roll call • notInLegis • Numeric or NA • Code for not being in the legislature (or not taking the test) when the item is recorded • legis.names • Vector of names of the observations (legislators or subjects) • vote.names • Vector of names of the votes/questions
Roll call • legis.data • A matrix or data.frame of covariates for each subject (party, seniority) • Must have the same number of rows as data • vote.data • A matrix of covariates specific to the items • desc • Character string describing the data • source • Character string indicating where the data came from
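As a small illustration of the arguments above, the sketch below builds a rollcall object from a toy matrix. The vote matrix, legislator names, and vote names are all made up for this example. The requireNamespace() guard is only there so the script still runs where pscl is not installed.

```r
# Toy vote matrix: rows are legislators (subjects), columns are votes (items).
# 1 = yea, 0 = nay, NA = missing, 9 = not in the legislature for that vote.
votes <- matrix(c(1, 0, 1,
                  0, 0, NA,
                  1, 1, 9,
                  0, 1, 1),
                nrow = 4, byrow = TRUE)

rc <- NULL
if (requireNamespace("pscl", quietly = TRUE)) {
  rc <- pscl::rollcall(votes,
                       yea = 1, nay = 0, missing = NA, notInLegis = 9,
                       legis.names = paste0("Legislator", 1:4),  # made-up names
                       vote.names  = paste0("Vote", 1:3),        # made-up names
                       desc   = "Toy example",
                       source = "Made-up data")
  print(class(rc))
  print(dim(rc$votes))
}
```

The resulting object can be passed to ideal() as its first argument, although four legislators and three votes are far too few to estimate anything meaningful.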
Ideal • codes • A list describing the types of voting decisions in the roll call matrix • Defaults to whatever is in the rollcall object • dropList • A list of voting decisions, legislators, and/or votes to drop from the analysis • d • The number of dimensions to estimate • Default is 1
Ideal • maxiter • Maximum number of MCMC iterations • thin • Positive integer • How many iterations between recorded draws in the MCMC • burnin • Number of initial iterations to discard before recording • impute • Logical: whether to impute missing votes during estimation
Ideal • normalize • Constrain the ideal points to mean zero and standard deviation one • priors • List of informative priors • NULL (the default) uses vague priors • startvals • Starting values • "eigen" or "random" • Don't touch this.
Ideal • store.item • Logical • Store the discrimination parameters? • file • String • File to write the output to • verbose • Logical: how much output do you want?
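A quick way to see how maxiter, burnin, and thin interact is to count the draws that survive. The sketch below assumes draws are recorded at every thin-th iteration after the burn-in; the exact bookkeeping inside ideal may differ by a draw at either end, so treat the count as approximate.

```r
# Settings used later in the Senate example
maxiter <- 500
burnin  <- 100
thin    <- 10

# Assumed recording rule: every thin-th iteration after the burn-in
kept <- seq(from = burnin + thin, to = maxiter, by = thin)

print(length(kept))   # number of retained draws under this assumption
print(head(kept))     # first few recorded iteration numbers
```

With only a few dozen retained draws, posterior summaries are rough. The defaults in the function signature (maxiter = 10000, burnin = 5000, thin = 100) keep a similar number of draws but space them much further apart.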
Example: senate data
# LOAD SIMON'S PSCL PACKAGE
library(pscl)
### LOAD DATA FROM THE 109TH SENATE
data(s109)
### IDEAL COMMAND:
# DATA 109TH SENATE
# D INDICATES A SINGLE DIMENSION (PARTISAN CLEAVAGE)
# NORMALIZE CONSTRAINS ESTIMATES TO MEAN 0, SD 1 FOR IDENTIFICATION
# STORE DISCRIMINATION PARAMETERS
id1 <- ideal(s109, d=1, normalize=TRUE, store.item=TRUE, maxiter=500, burnin=100, thin=10, verbose=TRUE)
summary(id1)
plot.ideal(id1)
Note that the information about the senators was saved previously in the data. • The data set is part of the pscl package, so there is no need to import it.
R code • data(s109) • This command loads the data; data() is the general R command for loading data sets. • Again, s109 is part of the package, so R knows where to find it.
id1 <- ideal(s109, d=1, normalize=TRUE, store.item=TRUE, maxiter=500, burnin=100, thin=10, verbose=TRUE) • ideal is the command; the first argument is the data • d determines the number of dimensions in the IRT model • normalize constrains the ideal points to mean zero and standard deviation one • store.item keeps the discrimination parameters • maxiter, burnin, and thin set the length of the MCMC run, the draws discarded at the start, and the spacing between recorded draws • verbose means more output
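Why normalization is needed for identification can be shown with a few lines of base R. The latent scale has no natural origin or unit, so shifting and rescaling the ideal points, with a matching adjustment to the item parameters, leaves every predicted probability unchanged. The sketch assumes a single item with linear predictor beta * x - alpha; the names and numbers are illustrative.

```r
set.seed(2)
x     <- rnorm(5, mean = 3, sd = 2)  # ideal points on an arbitrary scale
beta  <- 1.5                         # discrimination of one item
alpha <- 0.4                         # difficulty of the same item

eta <- beta * x - alpha              # linear predictor on the original scale

# Normalize the ideal points to mean zero, standard deviation one
z <- (x - mean(x)) / sd(x)

# Item parameters that compensate for the change of scale
beta_z  <- beta * sd(x)
alpha_z <- alpha - beta * mean(x)

eta_z <- beta_z * z - alpha_z        # linear predictor on the normalized scale

print(all.equal(eta, eta_z))         # both scales imply the same predictions
print(c(mean = mean(z), sd = sd(z)))
```

Because the data cannot tell these two parameterizations apart, fixing the mean and standard deviation simply picks one of them.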
Number of Legislators: 101
Number of Votes: 477
Number of Dimensions: 1
Number of Iterations: 500
Thinned By: 10
Burn-in: 100

Ideal Points (Posterior Means), by Party
                      Mean   2.5%    98%
D: Dimension 1       -1.04  -1.47  -0.63
Indep: Dimension 1   -1.03  -1.03  -1.03
R: Dimension 1        0.87   0.20   1.34

Ideal Points, Dimension 1 (sorted by posterior means):
                      Mean  Std.Dev.    2.5%   97.5%
BOXER (D CA)        -1.636     0.111  -1.817  -1.478
HARKIN (D IA)       -1.468     0.070  -1.584  -1.367
KENNEDY (D MA)      -1.448     0.054  -1.539  -1.340
CORZINE (D NJ)      -1.406     0.075  -1.589  -1.301
LAUTENBERG (D NJ)   -1.344     0.048  -1.420  -1.272
SARBANES (D MD)     -1.329     0.052  -1.396  -1.216
KERRY (D MA)        -1.328     0.045  -1.391  -1.250
MIKULSKI (D MD)     -1.311     0.077  -1.433  -1.209
DURBIN (D IL)       -1.296     0.060  -1.397  -1.168
REED (D RI)         -1.257     0.062  -1.368  -1.151
CLINTON (D NY)      -1.243     0.073  -1.379  -1.137
DAYTON (D MN)       -1.238     0.043  -1.295  -1.126
CMS example • Underlying latent variable: attention to the campaign • Indicators: Have you heard enough about CANDIDATE to form an opinion about him? • Candidates: • Reagan, Glenn, Kennedy, Mondale, Cranston, Anderson, Hollings, Askew, Jackson, Bush, Baker, McGovern, Dole, Hart, Bentsen, Bumpers, Cuomo
99% had heard of Reagan, Kennedy, Mondale • <10% had heard of Bentsen, Bumpers, Cuomo
Step One: read in the data • Step Two: Declare it to be roll call data • cms3<-rollcall(cms2, yea=1, nay=0, legis.names=names) • Step Three: estimate! • irt1<-ideal(cms3, d=1, store.item=TRUE, maxiter=500, normalize=TRUE, burnin=100, thin=10, verbose=TRUE) • summary(irt1) • plot.ideal(irt1)
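Step One is not spelled out in the slides, so here is a hedged sketch of what it might look like. The file layout (one ID column plus one 0/1 column per candidate), the toy values, and the temporary CSV are all assumptions made for illustration; only the object names cms2 and names come from the Step Two call above.

```r
# Made-up stand-in for the survey file: one row per respondent,
# an ID column plus one 0/1 "heard enough" column per candidate.
toy <- data.frame(id      = paste0("resp", 1:6),
                  Reagan  = c(1, 1, 1, 1, 1, 1),
                  Mondale = c(1, 1, 1, 0, 1, 1),
                  Cuomo   = c(0, 0, 1, 0, 0, NA))
path <- tempfile(fileext = ".csv")
write.csv(toy, path, row.names = FALSE)

# Step One: read in the data
raw   <- read.csv(path, stringsAsFactors = FALSE)
names <- raw$id               # respondent labels (shadows base::names, as in the slides)
cms2  <- as.matrix(raw[, -1]) # numeric 0/1 matrix, respondents by candidates

# Share of respondents who had heard enough about each candidate
heard <- colMeans(cms2, na.rm = TRUE)
print(round(heard, 2))
```

From here, cms2 and names can be handed to rollcall() exactly as in Step Two.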
IRT • Note the scale on the Y axis: it is a proxy for time • Respondents interviewed later in the campaign had heard of more candidates. • This is surprisingly easy.
Saving the terms • There are a lot of things saved by ideal • n is the number of legislators • m is the number of roll calls • d is the number of dimensions • x is the sampled ideal points (the MCMC draws) • beta is the matrix of discrimination parameters • xbar is the mean of the MCMC sample for each legislator • betabar is the mean of the MCMC sample for the discrimination parameters
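The saved components can be pulled out of the fitted object with the usual $ operator. The sketch below refits the Senate example and prints a few of them. The object name fit is an arbitrary choice, and the dimensions of x and beta are printed rather than assumed because their layout may differ between package versions. The requireNamespace() guard only lets the script run where pscl is not installed.

```r
fit <- NULL
if (requireNamespace("pscl", quietly = TRUE)) {
  library(pscl)
  data(s109)
  fit <- ideal(s109, d = 1, normalize = TRUE, store.item = TRUE,
               maxiter = 500, burnin = 100, thin = 10)

  # Basic sizes: legislators, roll calls, dimensions
  print(c(n = fit$n, m = fit$m, d = fit$d))

  # Posterior mean ideal points, most negative first
  xbar_sorted <- fit$xbar[order(fit$xbar[, 1]), , drop = FALSE]
  print(head(xbar_sorted))

  # Posterior means of the item parameters (kept because store.item = TRUE)
  print(head(fit$betabar))

  # Raw MCMC draws
  print(dim(fit$x))
  print(dim(fit$beta))
}
```

Working from xbar is usually enough for plots and rankings; the full draws in x are what you need for uncertainty intervals or comparisons between legislators.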