
Multiple Imputation using SAS

Multiple Imputation using SAS. Don Miller 812 Oswald Tower miller@pop.psu.edu 814-863-3155. Introduction. Missing values occur often in research: refused/don’t know, attrition, skip patterns…


Presentation Transcript


  1. Multiple Imputation using SAS Don Miller 812 Oswald Tower miller@pop.psu.edu 814-863-3155

  2. Introduction
     • Missing values occur often in research: refused/don’t know responses, attrition, skip patterns…
     • Dropping cases with missing values may bias results (e.g. women and overweight respondents tend to disclose their weight less often than others)
     • Imputation attempts to “fill in” the missing values
     • Single imputation (e.g. with the mean) is biased and gives no measure of the uncertainty due to imputation

  3. Paris datasets
     • Open Windows Explorer (or My Computer)
     • Tools – Map Network Drive
     • Drive: P:
     • Folder: \\paris\sas_data
     • For help: help@pop.psu.edu
     • Stat help: stat-core@pop.psu.edu

  4. Data Setup

  5. Multiple Imputation Simple Procedure
     1. Impute using PROC MI
     2. Round off, if you want plausible values (caution: this will bias your results)
     3. Do the analysis (PROC REG, LOGISTIC, etc.) with by _imputation_; in the procedure
     4. Combine results using PROC MIANALYZE
     • For categorical variables: construct binary dummy variables, leaving out the reference category (e.g. race: 1=“white”, 2=“black”, 3=“other” becomes two variables, black and other)
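The dummy-variable construction for the race example could be sketched in a DATA step as below. The data set names demo and demo_dummies are hypothetical, and the choice to leave the dummies missing when race is missing (so that PROC MI can impute them) is an assumption about how the step would be used, not something stated on the slide.

```sas
/* Sketch only: "demo" and "demo_dummies" are hypothetical data set names. */
/* race is coded 1="white" (reference), 2="black", 3="other".              */
data demo_dummies;
  set demo;
  /* black and other start each row as missing; they are assigned 0/1  */
  /* only when race is observed, so rows with missing race stay missing */
  if not missing(race) then do;
    black = (race = 2);   /* 1 if black, 0 otherwise */
    other = (race = 3);   /* 1 if other, 0 otherwise */
  end;
run;
```

The two dummies would then be listed on the var statement of PROC MI in place of race.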

  6. PROC MI
     • Typical syntax:
         proc mi data=bmx out=impdat seed=33155;
           var bmxbmi bmxht bmxwt bmxarmc bmxarml;
         run;
     • data= input data set: 1 copy of the data, with missing values
     • out= output data set: 5 copies of the data with imputed values (imputed values differ across copies)
     • seed= random seed; keep it the same to reproduce your results
     • var: variables with missing values that need imputing, variables in the analysis model, and any others that may help the imputation

  7. PROC MI Sample Output

  8. PROC MI Sample Output

  9. PROC MI Options
     • nimpute=5: number of imputations (default=5); nimpute=0 gives the missing-data patterns only
     • minimum=0 0 0 0, maximum=1 1 1 90: set min & max for imputed values; sometimes doesn’t converge as well
     • round=1 1 1 0.01: round-off option
     • alpha=0.05: confidence limits
     • mu0=0.5 0.5 0.5 25: t test null hypothesis μ=μ0
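As a rough illustration of how these options fit into a call, the sketch below first inspects the missing-data patterns and then imputes with bounds and rounding. It reuses the bmx data and variable list from the earlier slide; the particular minimum= and round= values are made up for illustration, with one value per variable in var-statement order.

```sas
/* Pass 1: nimpute=0 skips imputation and reports missing-data patterns */
proc mi data=bmx nimpute=0;
  var bmxbmi bmxht bmxwt bmxarmc bmxarml;
run;

/* Pass 2: impute 5 copies; bounds and rounding values are illustrative */
proc mi data=bmx out=impdat seed=33155 nimpute=5
        minimum=0 0 0 0 0               /* no negative body measurements */
        round=0.01 0.1 0.1 0.1 0.1;     /* rounding unit per variable    */
  var bmxbmi bmxht bmxwt bmxarmc bmxarml;
run;
```

Checking the patterns first costs little and shows which variables drive the missingness before any imputation model is chosen.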

  10. PROC MI Statements
     • em maxiter=200 out=emdata;
         EM algorithm: maximum likelihood estimates (MLE) computed from the incomplete data
     • freq fweight;
         weights observations by the frequency weight fweight
     • mcmc (options);
         modify the imputation method
     • class sex race;
         specify categorical variables (don’t need dummies) (new / experimental)
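A minimal sketch of how the em and mcmc statements might sit together in one PROC MI step, assuming the same bmx data as before. The mcmc option values shown (chain=, nbiter=, niter=) are illustrative choices rather than recommendations from the slides.

```sas
proc mi data=bmx out=impdat seed=33155;
  /* EM estimates of the means and covariances; emdata holds the data */
  /* with missing values filled in from the EM fit                    */
  em maxiter=200 out=emdata;
  /* Separate chain per imputation, 500 burn-in iterations, */
  /* 200 iterations between imputations within a chain      */
  mcmc chain=multiple nbiter=500 niter=200;
  var bmxbmi bmxht bmxwt bmxarmc bmxarml;
run;
```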

  11. Output dataset

  12. Regression
     • Fit your model as if the data had no missing values, adding by _imputation_;
         proc reg data=impdat outest=parmcov covout;
           model bmxbmi=bmxht bmxwt bmxarmc bmxarml;
           by _imputation_;
         run;
     • You’ll get nimpute (usually 5) sets of output
     • Estimates, covariances and standard errors will be combined in MIANALYZE (the combined R² is just the mean across imputations)
     • Each procedure must write its parameter estimates and covariance matrix to a data set; how to do this varies by procedure
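Since R² is simply averaged across imputations, one way to get that average is sketched below. It assumes the impdat data set from PROC MI; the output data set name rsqdat is hypothetical, and the rsquare option is used here to add the _RSQ_ variable to the outest= data set.

```sas
/* One row of estimates per imputation, with R-square stored in _RSQ_ */
proc reg data=impdat outest=rsqdat rsquare noprint;
  model bmxbmi=bmxht bmxwt bmxarmc bmxarml;
  by _imputation_;
run;

/* Average (and range of) R-square over the imputations */
proc means data=rsqdat mean min max;
  var _RSQ_;
run;
```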

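  13. Parameter Est. & Covariance Matrix
     • PROC LOGISTIC (this procedure expects a categorical, e.g. binary, response, so substitute such an outcome for the continuous bmxbmi in practice):
         proc logistic data=impdat descending;
           model bmxbmi=bmxht bmxwt bmxarmc bmxarml /covb;
           by _imputation_;
           ods output ParameterEstimates=parmsdat CovB=covbdat;
         run;
     • PROC MIXED (the fixed-effect estimates are in the SolutionF table and their covariance matrix in CovB; the SAS documentation example for PROC MIXED passes the latter to MIANALYZE as covb(effectvar=rowcol)=covbdat):
         proc mixed data=impdat;
           model bmxbmi=bmxht bmxwt bmxarmc bmxarml /solution covb;
           by _imputation_;
           ods output SolutionF=parmsdat CovB=covbdat;
         run;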

  14. Parameter Est. & Covariance Matrix
     • PROC GENMOD:
         proc genmod data=impdat;
           model bmxbmi=bmxht bmxwt bmxarmc bmxarml /covb;
           by _imputation_;
           ods output ParameterEstimates=parmsdat CovB=covbdat;
         run;
     • PROC GLM:
         proc glm data=impdat;
           model bmxbmi=bmxht bmxwt bmxarmc bmxarml /inverse;
           by _imputation_;
           ods output ParameterEstimates=parmsdat InvXPX=xpxidat;
         run;

  15. PROC MIANALYZE
     • Syntax depends on what procedure you used in the previous step:
         proc mianalyze data=parmcov;
       (or)
         proc mianalyze parms=parmsdat covb=covbdat;
       (or)
         proc mianalyze parms=parmsdat xpxi=xpxidat;
       (then type this:)
           modeleffects intercept bmxht bmxwt bmxarmc bmxarml;
         run;
     • Note that the variable list now goes in a “modeleffects” statement rather than “var”
     • Note that the dependent variable is omitted
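For reference, the combination step follows Rubin's rules. The notation below is introduced here and is not taken from the slides: for one parameter, let \hat{Q}_i be its estimate and \hat{U}_i its squared standard error from imputation i = 1, …, m.

```latex
\begin{align*}
\bar{Q} &= \frac{1}{m}\sum_{i=1}^{m}\hat{Q}_i
  && \text{(combined estimate)}\\
\bar{U} &= \frac{1}{m}\sum_{i=1}^{m}\hat{U}_i
  && \text{(within-imputation variance)}\\
B &= \frac{1}{m-1}\sum_{i=1}^{m}\bigl(\hat{Q}_i-\bar{Q}\bigr)^2
  && \text{(between-imputation variance)}\\
T &= \bar{U} + \Bigl(1+\frac{1}{m}\Bigr)B
  && \text{(total variance)}\\
\nu &= (m-1)\left[1+\frac{\bar{U}}{\bigl(1+\frac{1}{m}\bigr)B}\right]^{2}
  && \text{(degrees of freedom for the $t$ reference distribution)}
\end{align*}
```

The combined standard error is \sqrt{T}. The between-imputation term B is what a single imputation cannot supply, which is why single imputation understates uncertainty.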

  16. PROC MIANALYZE Output
