
Deana Crumbling, EPA/OSRTI/TIFSD crumbling.deana@epa 703-603-0643

Case Example: Using a Stratified Sampling Design & Field XRF to Reduce the 95% UCL for Residential Soil Lead. Deana Crumbling, EPA/OSRTI/TIFSD crumbling.deana@epa.gov 703-603-0643 2009 EPA Annual Quality Conference.


Presentation Transcript


  1. Case Example: Using a Stratified Sampling Design & Field XRF to Reduce the 95% UCL for Residential Soil Lead Deana Crumbling, EPA/OSRTI/TIFSD crumbling.deana@epa.gov 703-603-0643 2009 EPA Annual Quality Conference

  2. What things increase the interval between the sample mean & UCL?
     • High variability in the data set
     • Data set is from a non-normal or non-parametric distribution
     • Small number of physical samples in the statistical sample
     What creates high data variability?
     • True changes in matrix concentrations across space
     • Inadequate soil sample homogenization
     • Artifact caused by small analytical subsample mass

  3. Variability as an artifact of small analytical sample mass
     As analytical sample volumes increase, data variability decreases & the distribution goes from lognormal to normal (assumes the whole sample is measured).
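
The effect described on this slide can be illustrated with a small simulation. This is a sketch under assumed conditions, not something taken from the presentation: each unit of soil mass is modelled as an independent draw from a lognormal distribution with arbitrarily chosen parameters, and analyzing a larger subsample is modelled as averaging more such draws. The function names and the `units_per_subsample` parameter are hypothetical.

```python
import random
import statistics


def simulate_measurements(units_per_subsample, n_subsamples=2000, seed=0):
    """Simulate analytical results for one subsample size.

    Each result is the mean of `units_per_subsample` independent lognormal
    draws (a hypothetical stand-in for measuring a larger analytical mass).
    """
    rng = random.Random(seed)
    results = []
    for _ in range(n_subsamples):
        draws = [rng.lognormvariate(0.0, 1.0) for _ in range(units_per_subsample)]
        results.append(statistics.fmean(draws))
    return results


def skewness(values):
    """Population skewness; close to 0 for a symmetric (e.g. normal) distribution."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - mean) / sd) ** 3 for v in values])


def summarize(units_per_subsample):
    """Return (SD, skewness) of the simulated results for one subsample size."""
    results = simulate_measurements(units_per_subsample)
    return statistics.stdev(results), skewness(results)


if __name__ == "__main__":
    for units in (1, 10, 100):
        sd, skew = summarize(units)
        print(f"units={units:>3}  SD={sd:.3f}  skewness={skew:.2f}")
```

Under these assumptions, both the spread and the skewness of the simulated results shrink as more units are averaged, which is the pattern the slide describes.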

  4. Reduce the UCL by addressing:
     • Variability artifacts
     • Non-normal statistical distributions
     • Small number of physical samples in the statistical sample
     • High variability due to true variation
     By procedures that support:
     • Sample homogenization
     • Increased sample mass
     • True changes in matrix concentrations across space
     [Bracketed on slide: physical manipulation of sample, increased volume (MIS), and/or sufficient replicate analyses]

  5. Can anything be done about true spatial variations in concentration?
     (Statistical) Stratified Sampling Design
     • “Methods for Evaluating the Attainment of Cleanup Standards Volume 1: Soils and Solid Media”, 1989, section 6.4 http://www.cluin.org/download/stats/vol1soils.pdf
     • Guidance on Choosing a Sampling Design for Environmental Data Collection (EPA QA/G-5S), 2002, Chap 6. http://www.epa.gov/quality/qs-docs/g5s-final.pdf
     • Data Quality Assessment: Statistical Methods for Practitioners (EPA QA/G-9S), 2006, section 3.2.1.3 http://www.epa.gov/quality/qs-docs/g9s-final.pdf
     • Purpose: determine the overall mean & UCL for a decision unit (DU) when different sections of the DU have different means & standard deviations (SDs).

  6. What Makes a Stratified Design Different?
     Sample results shown on the figure (ppm): 1100, 1040, 18, 20, 25, 22, 16, 15, 21, 120, 184, 155
     To calculate the average over the entire area, routine practice is that data go straight into a database, and then: Sum(all) = 2736; 2736 ÷ 12 = 228 ppm
     “Dividing by 12” assumes equal weight is given to each sample (1/12th of the total area).

  7. But the CSM supports partitioning the site into 3 distinct portions based on similar populations:
     • High: 5% of area; samples 1100, 1040; ave = 1070
     • Mid: 20% of area; samples 120, 184, 155; ave = 153
     • Low: 75% of area; samples 18, 20, 25, 22, 16, 15, 21; ave = 20
     Area      High   Mid   Low   Routine       Stratified
     Mean      1070   153   20    228           99
     SD        42     32    4     398           80
     95% UCL                      434 (Δ=206)   143 (Δ=44)
     20(0.75) + 153(0.20) + 1070(0.05) = 99 ppm
     A spatially weighted mean makes a difference!
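
The comparison on slides 6 and 7 can be reproduced with a short script. This is a minimal sketch: the sample values and area fractions are read from the slides, while the function names are hypothetical. The routine UCL is assumed here to be a one-sided 95% Student's t UCL, which the slides do not state explicitly. The stratified UCL is not attempted, because the slides do not show how the stratum SDs were combined.

```python
import math
import statistics

# Sample results (ppm) and area fractions as read from slides 6 and 7.
STRATA = {
    "high": {"area_fraction": 0.05, "values": [1100, 1040]},
    "mid": {"area_fraction": 0.20, "values": [120, 184, 155]},
    "low": {"area_fraction": 0.75, "values": [18, 20, 25, 22, 16, 15, 21]},
}

# Tabulated one-sided 95% Student's t value for 11 degrees of freedom (n = 12).
# Assumption: the slide's routine UCL is a Student's t UCL.
T_95_DF11 = 1.796


def pooled_values(strata):
    """All results in one list, ignoring which stratum each sample came from."""
    return [v for s in strata.values() for v in s["values"]]


def routine_mean(strata):
    """Equal-weight mean: every sample counts as 1/n of the area."""
    return statistics.fmean(pooled_values(strata))


def routine_ucl95(strata, t_crit=T_95_DF11):
    """Mean plus t times the standard error of the pooled data."""
    values = pooled_values(strata)
    std_error = statistics.stdev(values) / math.sqrt(len(values))
    return statistics.fmean(values) + t_crit * std_error


def area_weighted_mean(strata):
    """Each stratum mean is weighted by that stratum's fraction of the area."""
    return sum(s["area_fraction"] * statistics.fmean(s["values"]) for s in strata.values())


if __name__ == "__main__":
    print(f"routine mean:       {routine_mean(STRATA):.0f} ppm")
    print(f"routine 95% UCL:    {routine_ucl95(STRATA):.0f} ppm")
    print(f"area-weighted mean: {area_weighted_mean(STRATA):.0f} ppm")
```

Rounded to whole ppm, the script gives 228, 434, and 99, matching the routine mean, routine UCL, and stratified mean on the slide.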

  8. Basic Principles of a Stratified Sampling Design
     The CSM is the basis for defining both the DU & its strata
     • Decision Unit (DU) = a unit for which a decision is made: a single drum, a batch of drums, risk exposure unit, remediation unit, etc.
     • The DU is the volume & dimensions over which an average conc is desired
     • Strata are created by different release or transport mechanisms, which cause different contaminant patterns within the DU
     • Target properties like conc level & variability differ from stratum to stratum w/in the DU

  9. Basic Principles (cont’d)
     • DU is delineated (stratified) into non-overlapping subsections according to the CSM
     • Each stratum’s area/volume is recorded as a fraction of the DU’s area/volume
     • Each stratum’s conc mean & SD are determined
     • The means & SDs are weighted and mathematically combined to give the overall mean & UCL for the DU
     • Stratification can be applied to data analysis even if not planned into sampling, but spatial info & a final CSM must be available
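
The weighting and combining step can be written out as follows. This is the standard textbook estimator for stratified random sampling, offered as a sketch. The symbols are introduced here and do not come from the slides, and the presenter's spreadsheet may combine the SDs differently. Here $W_h$ is the fraction of the DU's area/volume in stratum $h$ (the fractions sum to 1), $\bar{x}_h$ and $s_h$ are that stratum's sample mean and SD, $n_h$ is its number of samples, and $L$ is the number of strata.

```latex
\bar{x}_{st} = \sum_{h=1}^{L} W_h \, \bar{x}_h ,
\qquad
s_{\bar{x}_{st}} = \sqrt{\sum_{h=1}^{L} \frac{W_h^{2} \, s_h^{2}}{n_h}} ,
\qquad
\mathrm{UCL}_{95} = \bar{x}_{st} + t_{0.95,\,\mathrm{df}} \; s_{\bar{x}_{st}}
```

The degrees of freedom df for the t value depend on the guidance followed (an approximation is typically used), so they are left unspecified here.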

  10. Benefits of a Stratified Sampling Design
      • Small areas of very high or low conc do not bias the overall mean of the DU.
      • Reduces variability (SD) in the DU data set
      • Reduces statistical uncertainty (as distance between mean & UCL)
      • Preserves spatial information to identify source/transport mechanisms & support remedial design.

  11. Case Example: XRF with stratified sampling design
      Properties in old town near Pb battery recycling plant
      XRF Pb data from bagged soil samples (~300 gram)
      [Photo: plastic bag of soil]

  12. Data Collection Design
      • Property divided into 3 sections (strata)
        • Front yard (likely “same” conc within & own SD)
        • Side yard (ditto)
        • Back yard (ditto)
      • Each stratum divided into 5 ~equal subsections (sample units)
        • 1 grab (or MIS) sample (300-400 g) into plastic bag
        • 5 sample units/stratum or 15 sample units/DU (the EU)
      Decision Goals
      • Resolve confusion over past conflicting data.
      • Determine mean (95% UCL) for exposure unit (entire yard): 500 ppm risk-based A/L; if over, clean up high contamination areas
      • Pb source? Suggested by spatial contaminant pattern (does facility have liability?)

  13. Preliminary CSM of Simplified Property
      Action Level (entire yard) = 500 ppm
      Figure labels: House Footprint; Side Yard: 5 Bagged Samples; Front Yard: 5 Samples; Back Yard: 5 Samples
      Area fractions of the three strata: 0.25, 0.15, 0.60
      Stratum notes:
      • Potential release: Traffic (facility truck, Pb gasoline); Pb house paint; facility’s atmospheric deposition; combination. Expected Pb conc: Higher.
      • Potential release: Pb paint; atmos dep. Pb conc: Uncertain (near road, house?)
      • Potential release: Pb paint (near structures); atmos dep. Expected Pb conc: Lower.

  14. XRF Bag Analysis
      • 4 30-sec XRF readings on bag (2 on front & 2 on back)
      • Results entered real-time into pre-programmed spreadsheet
      • Spreadsheet immediately calculates:
        1. ave & SD for each bag
        2. ave & SD within each stratum (yard section)
        3. ave & UCL for the decision unit (entire property)
        4. the greater of within-bag vs. between-bag variability
      • IF statistical uncertainty interferes w/ desired decision confidence for DU:
        • Use #4 & a series of decision trees to reduce statistical uncertainty until a confident decision is possible
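
The per-bag and per-stratum calculations listed on this slide might look roughly like the sketch below. The presentation does not show the spreadsheet's formulas, so the definitions here are assumptions: within-bag variability is taken as the pooled SD of readings about their own bag mean, and between-bag variability as the SD of the bag means. The function names are hypothetical and the readings are made-up illustration values, not data from the case study.

```python
import math
import statistics


def bag_stats(readings):
    """Mean and SD of the XRF readings taken on one bag."""
    return statistics.fmean(readings), statistics.stdev(readings)


def stratum_stats(bags):
    """Mean and SD of the bag means within one stratum (yard section)."""
    bag_means = [statistics.fmean(readings) for readings in bags]
    return statistics.fmean(bag_means), statistics.stdev(bag_means)


def within_and_between_sd(bags):
    """Return (within-bag SD, between-bag SD) for one stratum.

    Assumed definitions (not from the slides):
    within-bag  = pooled SD of readings about their own bag mean
                  (simple average of variances; assumes equal readings per bag)
    between-bag = SD of the bag means
    """
    pooled_variance = statistics.fmean([statistics.variance(r) for r in bags])
    _, between_sd = stratum_stats(bags)
    return math.sqrt(pooled_variance), between_sd


def dominant_variability(bags):
    """Name the larger of the two variability sources."""
    within_sd, between_sd = within_and_between_sd(bags)
    return "within-bag" if within_sd > between_sd else "between-bag"


if __name__ == "__main__":
    # Hypothetical readings (ppm): 5 bags in one stratum, 4 readings per bag.
    front_yard = [
        [410, 450, 430, 470],
        [620, 600, 640, 580],
        [300, 320, 310, 330],
        [520, 500, 540, 560],
        [700, 690, 720, 710],
    ]
    for i, readings in enumerate(front_yard, start=1):
        mean, sd = bag_stats(readings)
        print(f"bag {i}: ave={mean:.0f} SD={sd:.1f}")
    print("stratum ave/SD:", stratum_stats(front_yard))
    print("larger source:", dominant_variability(front_yard))
```

Knowing which source dominates suggests where extra effort would pay off: more readings or better homogenization when within-bag variability is larger, more sample units when between-bag variability is larger.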

  15. Minimizing Variability Improves Statistical Confidence in EPCs
      NOTE: “Routine” calculation applies same weighting to data points & database loses their spatial representativeness
      Note: ½ CI width = mean-to-UCL width

  16. Data Used to Mature the CSM
      Preliminary CSM: an informed hypothesis about strata boundaries
      Mature CSM: data confirm or modify the hypothesis about strata boundaries
      [Figure label: House Footprint]

  17. Progressive Data Uncertainty Management
      * The normal z-distribution is used for the XRF instrument’s counting statistics; the remaining rows use the t-distribution.

  18. Questions?
      Deana M. Crumbling, M.S.
      U.S. EPA, Office of Superfund Remediation & Technology Innovation
      1200 Pennsylvania Ave., NW (5203P)
      Washington, DC 20460
      PH: (703) 603-0643
      crumbling.deana@epa.gov
      www.triadcentral.org
