
Adjustment Procedures to Account for Nonignorable Missing Data in Environmental Surveys

Designs and Models for Aquatic Resource Surveys (DAMARS), R82-9096-01. Adjustment Procedures to Account for Nonignorable Missing Data in Environmental Surveys. Breda Munoz, Virginia Lesser.


Presentation Transcript


  1. Designs and Models for Aquatic Resource Surveys (DAMARS), R82-9096-01. Adjustment Procedures to Account for Nonignorable Missing Data in Environmental Surveys. Breda Munoz, Virginia Lesser

  2. This presentation was supported under STAR Research Assistance Agreement No. CR82-9096-01 awarded by the U.S. Environmental Protection Agency to Oregon State University. It has not been formally reviewed by EPA. The views expressed in this presentation are solely those of the authors, and EPA does not endorse any products or commercial services mentioned in this presentation.

  3. Outline • Missing data in environmental surveys • Nonignorable missing data mechanism • Model-based approach for nonignorable missing data • Design-based estimation and nonignorable missing data • Illustration • Summary

  4. Missing Data in Environmental Surveys • Researchers in environmental studies must obtain access to selected sites to gather field data • Denial of access: • common problem in environmental surveys • unit non-response • affects the results of data analysis

  5. Response Disposition, 1995/1996 EMAP North Dakota Prairie Wetlands Studies (Lesser, 2001)

     Result                       1995    1996
     Private landowners
       Agreed to access            43%     40%
       Refused access              36%     37%
       Undeliverable                2%      2%
       Not returned/no contact     16%     14%
     Public land                    3%      7%
     Total                        100%    100%

  6. Introduction • The 1995-1997 Maryland Biological Stream Survey reported an overall access denial rate of 10% (Boward et al., 1999) • ODFW habitat surveys, overall rate of access denial (Flitcroft et al., 2002): • 1998: 10.0% • 1999: 6.0% • 2000: 12.5%

  7. Assumptions • A probability sampling design is used to collect outcomes of a spatial random process Y • {s1, ..., sn} is the collection of sampling sites selected using the probability sampling design • Auxiliary variables (X1 and X2 in the diagrams on the following slides)

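  8. Missing Mechanism: Missing Completely at Random (MCAR) • [Diagram relating X1, X2, Y and R; not preserved in the transcript] • Smith, Skinner and Clark (1999), Little and Rubin (2002)

  9. Missing Mechanism: Missing at Random (MAR) • [Diagram relating X1, X2, Y and R; not preserved in the transcript] • Smith, Skinner and Clark (1999), Little and Rubin (2002)

  10. Missing Mechanism: Nonignorable • [Diagram relating X1, X2, Y and R; not preserved in the transcript] • Smith, Skinner and Clark (1999), Little and Rubin (2002)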
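The three mechanisms on slides 8 to 10 differ in what the response indicator R is allowed to depend on: nothing (MCAR), only observed auxiliary information (MAR), or the outcome Y itself (nonignorable). The sketch below is a hypothetical simulation, not the authors' setup: the single covariate `x`, the logistic response models, and all coefficients are assumptions chosen only to make the contrast visible.

```python
import math
import random


def simulate_response(mechanism, n=20000, seed=0):
    """Simulate (x, y, r) triples; r = 1 means the site was observed.

    All distributions and coefficients are illustrative assumptions.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)            # auxiliary variable, always known
        y = 2.0 + x + rng.gauss(0.0, 1.0)  # outcome with true mean 2.0
        if mechanism == "MCAR":
            p = 0.7                                      # constant
        elif mechanism == "MAR":
            p = 1.0 / (1.0 + math.exp(-(0.85 + x)))      # depends on x only
        elif mechanism == "nonignorable":
            p = 1.0 / (1.0 + math.exp(-(y - 1.0)))       # depends on y itself
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        r = 1 if rng.random() < p else 0
        data.append((x, y, r))
    return data


def respondent_mean(data):
    """Mean of y over responding units only (the naive estimate)."""
    observed = [y for _, y, r in data if r == 1]
    return sum(observed) / len(observed)
```

Under MCAR the respondent mean stays close to the true mean of 2.0. Under the nonignorable model it is biased upward, because units with large y respond more often, and conditioning on `x` alone does not remove that dependence.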

  11. Model-based Approach • Under a nonignorable mechanism, we model the joint probability of the data and the missing-mechanism indicator (the "response" indicator) • R(si) ~ Bernoulli(pi) • [Equation not preserved in the transcript; its annotated parts were labeled "missing mechanism model", "data model" and "covariates"]

  12. Model-assisted estimation and nonignorable missing data • Assume the parameter of interest is the total of the response Y [equation not preserved in the transcript; surviving symbols: Y, R]

  13. Model-assisted estimation and nonignorable missing data • Continuous form of the Horvitz-Thompson estimator for the total (Cordy, 1993): [equation not preserved in the transcript] • Let [symbol not preserved] be a collection of fixed values
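The formula on slide 13 did not survive, but the general idea of a Horvitz-Thompson estimator is to weight each sampled value by the reciprocal of its inclusion probability (an inclusion density in the continuous-universe version). A minimal sketch of that generic form follows; the function and argument names are illustrative, and this is not a reconstruction of the slide's exact notation.

```python
def horvitz_thompson_total(values, inclusion):
    """Generic Horvitz-Thompson estimate of a total.

    values    -- observed responses y(s_i) at the sampled sites
    inclusion -- inclusion probability (or density) for each sampled site
    """
    if len(values) != len(inclusion):
        raise ValueError("values and inclusion must have the same length")
    if any(p <= 0 for p in inclusion):
        raise ValueError("inclusion probabilities must be positive")
    return sum(y / p for y, p in zip(values, inclusion))


# Three sampled sites with unequal inclusion probabilities.
estimate = horvitz_thompson_total([3.0, 5.0, 4.0], [0.5, 0.25, 0.5])
```

Sites that were unlikely to be selected stand in for more of the population, so they receive larger weights.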

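  14. Model-assisted estimation (cont.) • Sample size n: n* observed, n − n* missing (nonignorable)

  15. Model-assisted estimation (cont.) • [Slide content not preserved in the transcript; only the fragment "denotes the" survives]

  16. Model-assisted estimation (cont.) • Likelihood: [equation not preserved in the transcript]

  17. Model-assisted estimation (cont.) • Reparameterize the model parameters in terms of expected cell counts (Baker and Laird, 1988) [equation not preserved in the transcript]

  18. Model-assisted estimation (cont.) • Use the EM algorithm to estimate the expected counts of the missing cells, Mij • E-step: [equation not preserved in the transcript]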

  19. Model-assisted estimation (cont.) • M-step: iterative proportional fitting (IPF) (Bishop et al., 1975) • The algorithm is based on fitting marginal totals • The EM algorithm always converges to a solution when IPF is used in the M-step (Baker and Laird, 1988)
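A minimal sketch of the EM loop described on slides 18 and 19, under assumptions the transcript does not state: the data form a two-way table of an auxiliary category X (rows) by an outcome category Y (columns); respondents contribute cell counts `n[i][j]`; nonrespondents contribute only row totals `m[i]`; and the log-linear model is [XY][YR], meaning response depends on the outcome category. The function names and the choice of model are illustrative, not taken from the slides.

```python
def ipf_fit(table, cycles=5):
    """Fit the log-linear model [XY][YR] to an I x J x 2 table by IPF.

    table[i][j][0] holds respondent counts, table[i][j][1] nonrespondents.
    """
    I, J = len(table), len(table[0])
    fit = [[[1.0, 1.0] for _ in range(J)] for _ in range(I)]
    for _ in range(cycles):
        # Scale to match the X-by-Y margin.
        for i in range(I):
            for j in range(J):
                target = table[i][j][0] + table[i][j][1]
                current = fit[i][j][0] + fit[i][j][1]
                if current > 0:
                    for r in range(2):
                        fit[i][j][r] *= target / current
        # Scale to match the Y-by-R margin.
        for j in range(J):
            for r in range(2):
                target = sum(table[i][j][r] for i in range(I))
                current = sum(fit[i][j][r] for i in range(I))
                if current > 0:
                    for i in range(I):
                        fit[i][j][r] *= target / current
    return fit


def em_missing_cells(n, m, max_iter=5000, tol=1e-10):
    """Estimate expected nonrespondent cell counts M[i][j] by EM."""
    I, J = len(n), len(n[0])
    # Start by spreading each nonrespondent row total evenly over columns.
    M = [[m[i] / J for _ in range(J)] for i in range(I)]
    for _ in range(max_iter):
        # M-step: fit the model to the completed table with IPF.
        complete = [[[n[i][j], M[i][j]] for j in range(J)] for i in range(I)]
        fit = ipf_fit(complete)
        # E-step: split each row total m[i] in proportion to fitted counts.
        new_M = []
        for i in range(I):
            row_sum = sum(fit[i][j][1] for j in range(J))
            new_M.append([m[i] * fit[i][j][1] / row_sum for j in range(J)])
        change = max(abs(new_M[i][j] - M[i][j])
                     for i in range(I) for j in range(J))
        M = new_M
        if change < tol:
            break
    return M
```

As a check on the logic, take a hypothetical population table [[40, 60], [80, 20]] with response rates 0.5 and 0.8 in the two outcome columns. Respondent counts are then `n = [[20, 48], [40, 16]]` and nonrespondent row totals are `m = [32, 44]`; the loop should recover nonrespondent cells close to [[20, 12], [40, 4]]. The E-step always preserves the row totals `m[i]`, which is what ties the estimates to the auxiliary variable.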

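  20. Model-assisted estimation (cont.) • Possible estimators for the total of Y: • Cell adjustment, using an adjustment weight (Little and Rubin, 2002): [equation not preserved in the transcript]

  21. Model-assisted estimation (cont.) • Column adjustment: [equation not preserved in the transcript]

  22. Model-assisted estimation (cont.) • Row adjustment: [equation not preserved in the transcript]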
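The formulas for the three adjustments on slides 20 to 22 are missing from the transcript. One plausible reading, offered here only as an assumption, is that each respondent's weight is inflated by the ratio of estimated total units (respondents plus estimated nonrespondents) to respondents, computed at the cell, column, or row level. The sketch below implements that reading; `n` holds respondent counts and `M` the estimated nonrespondent counts, both as row-by-column tables, and the function name is hypothetical.

```python
def adjustment_weights(n, M, kind):
    """Nonresponse adjustment factors per cell of an I x J table.

    Assumed form (not confirmed by the slides):
    (respondents + estimated nonrespondents) / respondents,
    pooled over the cell, the row, or the column.
    Assumes every pooled respondent count is positive.
    """
    I, J = len(n), len(n[0])
    if kind == "cell":
        return [[(n[i][j] + M[i][j]) / n[i][j] for j in range(J)]
                for i in range(I)]
    if kind == "row":
        factors = [(sum(n[i]) + sum(M[i])) / sum(n[i]) for i in range(I)]
        return [[factors[i]] * J for i in range(I)]
    if kind == "column":
        factors = []
        for j in range(J):
            n_col = sum(n[i][j] for i in range(I))
            m_col = sum(M[i][j] for i in range(I))
            factors.append((n_col + m_col) / n_col)
        return [list(factors) for _ in range(I)]
    raise ValueError(f"unknown adjustment kind: {kind}")
```

Under this reading, an adjusted estimate of the total would multiply each respondent's design weight by the factor for that respondent's cell before summing. The row version depends only on the row totals of `M`, so it uses no more than the auxiliary variable, while the cell and column versions rely on the modeled split across outcome categories.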

  23. Model-assisted estimation (cont.) • Variance estimators are obtained using the bootstrap • The bootstrap produces asymptotically valid variance estimates (Efron, 1994)

  24. Illustration • We simulate a continuous multivariate normal spatial random process for y • Population: John Day Middle Fork stream reaches • 143 stream reaches divided into survey segments (~1 mile) • 6536 survey segments • Area of 785 mi²

  25. Illustration • The population of stream reaches was stratified into 6 strata based on the number of survey segments: “<10”, “10-20”, “20-30”, “30-50”, “50-100”, “>100” • Nonignorable missing data were generated as: [equation not preserved in the transcript] • Missing rates of 15%, 30% and 50% were created

  26. Population Summary

  27. Illustration • Sample size n = 100 • Allocation proportional to the number of survey segments in each stratum • Q1 = first sample quantile
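Proportional allocation assigns each stratum a share of the sample equal to its share of the population. Because the shares are rarely whole numbers, some rounding rule is needed; the sketch below uses a largest-remainder rule, which is an assumption and not necessarily what the authors did. The stratum sizes in the example are hypothetical (the population summary on slide 26 is not preserved); they were chosen only to sum to the 6536 survey segments mentioned on slide 24.

```python
def proportional_allocation(stratum_sizes, n):
    """Allocate n sample units across strata in proportion to size.

    Uses largest-remainder rounding so the allocations sum to n.
    """
    total = sum(stratum_sizes)
    exact = [n * size / total for size in stratum_sizes]
    alloc = [int(e) for e in exact]          # round down first
    leftover = n - sum(alloc)
    # Give the remaining units to the strata with the largest remainders.
    order = sorted(range(len(exact)),
                   key=lambda k: exact[k] - alloc[k],
                   reverse=True)
    for k in order[:leftover]:
        alloc[k] += 1
    return alloc


# Hypothetical segment counts for six strata (sum = 6536).
sizes = [200, 600, 900, 1400, 1900, 1536]
allocation = proportional_allocation(sizes, 100)
```

With very small strata this rule can allocate zero units, so a practical design would usually impose a minimum per stratum.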

  28. Modified Bootstrap • We draw 1000 random samples of size 100 from the observed sample: • Independently across strata • Maintaining proportional allocation • Maintaining the row totals by the auxiliary variable • For each of the 1000 samples, we estimate the total • We obtain a standard error and MSE for each estimate • We repeat this process 1000 times
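A simplified sketch of the resampling idea on slide 28: units are resampled with replacement independently within each stratum, keeping each stratum's sample size fixed so the allocation is preserved. This sketch does not implement the additional constraint of maintaining row totals by the auxiliary variable, and the estimator passed in is a placeholder; all names are illustrative.

```python
import random
import statistics


def stratified_bootstrap_se(strata, estimator, n_boot=1000, seed=0):
    """Bootstrap standard error of an estimator under stratified resampling.

    strata    -- dict mapping stratum label to the list of sampled values
    estimator -- function taking such a dict and returning a number
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        # Resample within each stratum, keeping its sample size fixed.
        resample = {label: [rng.choice(units) for _ in units]
                    for label, units in strata.items()}
        estimates.append(estimator(resample))
    return statistics.stdev(estimates)


def expansion_total(sample, sizes=None):
    """Placeholder estimator: sum of (stratum size x stratum sample mean).

    The default stratum sizes are hypothetical.
    """
    sizes = sizes or {"a": 100, "b": 50}
    return sum(sizes[label] * sum(units) / len(units)
               for label, units in sample.items())


sample = {"a": [1.0, 2.0, 3.0, 4.0], "b": [10.0, 20.0, 30.0]}
se = stratified_bootstrap_se(sample, expansion_total, n_boot=200)
```

Resampling within strata keeps the bootstrap replicates consistent with the stratified design; pooling all units before resampling would overstate the variance when strata differ in level.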

  29. Summary
