How Many Samples do I Need? Part 1

DQO Training Course, Day 1, Module 4: How Many Samples do I Need? Part 1. Presenter: Sebastian Tindall. 60 minutes (15-minute 1st afternoon break). Topics to discuss in Module 4: how many samples, based on a census or on sampling; types of decision error; definitions of common statistical terms.


Presentation Transcript


  1. DQO Training Course, Day 1, Module 4: How Many Samples do I Need? Part 1. Presenter: Sebastian Tindall. 60 minutes (15-minute 1st afternoon break)

  2. Topics to Discuss in Module 4 • How many samples, based on: a census; sampling • Types of decision error • Definitions of common statistical terms

  3. How Many Samples do I Need? • Quick & Dirty Method: n = 5 • Budget Method: n = (total $) ÷ ($ per sample)

  4. How Many Samples do I Need? It depends! • How will the data be used? • What is the decision? • What is the tolerance for mistakes? • What is the underlying variation in the material being sampled?

  5. How Many Samples do I Need? (The Real Answer) Just Enough!

  6. How Many Samples do I Need? REMEMBER: HETEROGENEITY IS THE RULE!

  7. Decisions with Absolute Certainty • Requires knowing the “true condition” of the population in question • Perform a census • Collect and analyze every possible member of the population in question

  8. Decisions with Absolute Certainty (cont.) • Population: the universe of items (elements) within the spatial boundary. Examples: all the possible soil samples in the Smiths’ backyard; all the people in the U.S.A. • Translation: you have to count/measure (sample) EVERY single member of the population

  9. Football Field vs. One-Acre Field [Figure: a one-acre area shown against a football field for scale]

  10. Number of Samples in a One-Acre Field How many surface soil samples can I take from a one-acre field? A one-acre field measures 272.25 feet by 160 feet. If one surface soil sample = 2.5” x 2.5” x 6” deep, then there are approximately 1,000,000 possible surface soil samples in a one-acre field.
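The arithmetic behind the one-million figure can be checked with a short sketch. The field dimensions and sample footprint come from the slide; the function and constant names are illustrative only, and the 6-inch depth is ignored because only the surface footprint determines how many samples fit.

```python
# Sketch: how many 2.5" x 2.5" surface samples tile a one-acre field?
FIELD_LENGTH_FT = 272.25   # from the slide
FIELD_WIDTH_FT = 160.0     # from the slide
SAMPLE_SIDE_IN = 2.5       # footprint of one surface soil sample


def possible_surface_samples(length_ft, width_ft, side_in):
    """Return field area divided by the footprint of one sample."""
    area_sq_in = (length_ft * 12) * (width_ft * 12)
    return area_sq_in / (side_in ** 2)


n_samples = possible_surface_samples(FIELD_LENGTH_FT, FIELD_WIDTH_FT, SAMPLE_SIDE_IN)
print(f"Field area: {FIELD_LENGTH_FT * FIELD_WIDTH_FT:,.0f} sq ft")  # 43,560 sq ft = 1 acre
print(f"Possible surface samples: {n_samples:,.0f}")                # about 1,003,622
```

The exact quotient is a little over one million, which the slide rounds to 1,000,000.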

  11. Cost of Sampling Entire One-Acre Field How much would it cost to know the true condition of the one-acre field? If it costs $3,000 to test one surface soil sample, it would cost $3,000,000,000 to test all possible population units.
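As a rough sketch, the census cost can be set beside the "budget method" from slide 3. The unit count and per-sample cost are the slide's figures; the $60,000 budget is a hypothetical value chosen only for illustration, and the function names are not from the course.

```python
# Sketch: cost of a census versus what a fixed budget can buy.
POSSIBLE_SAMPLES = 1_000_000     # possible surface samples in one acre (slide 10)
COST_PER_SAMPLE_USD = 3_000      # cost to test one sample (slide 11)
HYPOTHETICAL_BUDGET_USD = 60_000 # assumption, not from the course


def census_cost(n_units, cost_per_unit):
    """Cost of testing every possible population unit."""
    return n_units * cost_per_unit


def affordable_samples(budget, cost_per_unit):
    """Budget method: n = (total $) / ($ per sample), rounded down."""
    return budget // cost_per_unit


print(f"Census cost: ${census_cost(POSSIBLE_SAMPLES, COST_PER_SAMPLE_USD):,}")
print(f"Samples affordable: {affordable_samples(HYPOTHETICAL_BUDGET_USD, COST_PER_SAMPLE_USD)}")
```

Under these assumptions the budget covers 20 of roughly one million possible samples, which is why the decision has to rest on an estimate.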

  12. Testing All Possible Samples CENSUS • Testing all possible population units (samples) is the ONLY way to know the true condition of the site with absolute certainty • However, time and money considerations usually prevent us from doing this

  13. Decisions with Absolute Certainty • Performing a census is totally impractical • Therefore, we can never make a decision with absolute certainty • So what’s left to do?

  14. Testing a Few Samples (from the larger population): ESTIMATION • Estimates of the true condition of the site are usually made from a few (representative) samples • Taking a few samples (making a few measurements) and using them to represent the site • Make inferences (even sweeping claims) about the population of interest based on these few samples

  15. The Process of Estimation • An estimate is just an educated guess based on incomplete information • Educated guesses will be wrong, to some degree • In other words, the process of estimation contains inherent errors

  16. Estimation Errors • Are unavoidable! • Are NOT mistakes. They do not suggest that anything was done improperly • Are an inherent part of the process of estimation • Are simply deviations from the true condition of the site • Introduce uncertainty into the decision-making process

  17. Consequences of Uncertainty: Estimation Errors lead to Decision Errors • Decision errors are true mistakes • Examples: walking away from a dirty site; cleaning up a clean site • Decision errors can be managed

  18. Decision Errors • Are acceptable or tolerable …within limits • We set tolerable limits on the percentage of time we are willing to: • Walk away from a dirty site • Clean up a clean site

  19. Where do errors occur? [Figure: Planning, Sampling, Analysis; Data vs. Decision]

  20. Definition of Terms • Population: everyone or everything of interest. Example: all the people in this class • Sample: some subset of the population. Example: five people randomly chosen from the class

  21. Definition of Terms • Population Parameter: the true value of the population characteristic (e.g., age) that can only be known if all possible samples are measured. Example: true mean age of all the people in the class, calculated using data from every member of the population • Sample Statistic: the estimated value of the population characteristic that is calculated from sample data. Example: estimate of the true mean age of all people in the class, calculated using data from a subset (sample) of the population

  22. Comparison • Population Parameter: represents the “true condition” of the population; decisions can be made with 100% certainty (0% uncertainty) • Sample Statistic: represents the “estimated condition” of the population; decisions cannot be made with 100% certainty

  23. Class Question • What is the true mean age in this class? • What is the estimated mean age in this class? Randomly select 5 ages • What is the 2nd estimated mean age in this class? Randomly select 15 ages (see Computer Age Demo)
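The class exercise can be imitated with a small simulation. This is only a sketch: the 30 ages below are randomly generated stand-ins (the actual class data and the "Computer Age Demo" are not part of this transcript), and the seed is fixed so the run is repeatable.

```python
# Sketch: population parameter (true mean) versus sample statistics
# (estimated means from 5 and 15 randomly chosen people).
import random
from statistics import mean

rng = random.Random(42)  # fixed seed, for repeatability only

# Hypothetical population: ages of 30 people in a class (made-up data).
class_ages = [rng.randint(22, 65) for _ in range(30)]

true_mean_age = mean(class_ages)               # census: uses every member
estimate_from_5 = mean(rng.sample(class_ages, 5))
estimate_from_15 = mean(rng.sample(class_ages, 15))

print(f"True mean age (census of all 30): {true_mean_age:.1f}")
print(f"Estimated mean age (n = 5):       {estimate_from_5:.1f}")
print(f"Estimated mean age (n = 15):      {estimate_from_15:.1f}")
```

Both estimates will generally differ from the true mean. Larger samples tend to land closer to it on average, although any single draw of 15 is not guaranteed to beat a draw of 5.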

  24. True Mean Age of All the People in This Class • In this case - where we are only interested in measuring a small group of people who are all in the same room at the same time - it is not too difficult to determine the true mean age with 100% certainty. But: • What if some people failed to respond? • What if some people “fudged” a little? • What if some of the response forms got lost?

  25. Types of Decision Errors • Before we can talk about acceptable limits for making decision errors, we must first understand what correct decisions and decision errors look like and define some terms • There are two types of correct decisions and two types of decision errors that can be made

  26. Graph of Perfect Decision Making [Figure: ideal decision rule. Vertical axis: chance of deciding site is dirty (0.0 to 1.0). Horizontal axis: true mean 226Ra concentration (low to high). Action level: 6 pCi/g.]

  27. Graph of Typical Decision Making [Figure: typical curve. Vertical axis: chance of deciding site is dirty (0.0 to 1.0). Horizontal axis: true mean 226Ra concentration (low to high). Action level: 6 pCi/g.]

  28. Decision Performance Goal Diagram. Null Hypothesis: The Site is dirty. [Figure: typical curve with the gray region. Vertical axis: probability of deciding that the site is dirty (0.0 to 1.0). Horizontal axis: true mean COPC concentration, with the lower bound of the gray region at 75 and the action level at 100. True state of site: site is clean; site is dirty. Alternative actions: walk away from site; clean up site.]

  29. Decision-Making Procedure: Apply Decision Rule • PSQ: Is site clean? Is site dirty? [Figure: COPC concentration axis from the DL to ∞, compared at the 95% UCL. Labeled points: X A, UCL 1A, UCL 1B, and the action level, with tick values 75, 95, 100, and 110. Alternative actions: walk away from site; clean up site.]

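  30. Decision-Making Procedure: Apply Decision Rule • PSQ: Is site clean? Is site dirty? [Figure: COPC concentration axis from the DL to ∞, compared at the 95% UCL. Labeled points: X B, UCL B, and the action level, with tick values 100, 110, and 120. Alternative actions: walk away from site; clean up site.]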
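The decision rule on these two slides compares the 95% upper confidence limit (UCL) of the sample mean with the action level. The following is a minimal sketch of that comparison, not the course's own procedure: it uses a normal (z) approximation for the UCL, whereas small data sets would normally call for Student's t or another method, and the measurement values are invented for illustration. The action level of 100 matches the slides.

```python
# Sketch: apply a UCL-based decision rule (null hypothesis: the site is dirty).
from math import sqrt
from statistics import NormalDist, mean, stdev

ACTION_LEVEL = 100.0                 # from the slides
Z_95 = NormalDist().inv_cdf(0.95)    # one-sided 95% normal quantile (assumption: z, not t)


def ucl95(measurements):
    """Approximate one-sided 95% UCL of the mean."""
    n = len(measurements)
    return mean(measurements) + Z_95 * stdev(measurements) / sqrt(n)


def apply_decision_rule(measurements, action_level=ACTION_LEVEL):
    """Walk away only if the UCL falls below the action level."""
    if ucl95(measurements) < action_level:
        return "walk away from site"
    return "clean up site"


# Two hypothetical data sets with the same sample mean (75) but different spread.
low_variability = [70, 75, 80, 72, 78]
high_variability = [20, 130, 40, 150, 35]

for label, data in (("low", low_variability), ("high", high_variability)):
    print(label, round(mean(data), 1), round(ucl95(data), 1), apply_decision_rule(data))
```

Both data sets have a sample mean of 75, below the action level, yet the more variable one produces a UCL above 100 and therefore a "clean up site" decision. This mirrors the slide's point that one sample mean can pair with UCLs on either side of the action level.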

  31. Decision-Making Procedure: Apply Decision Rule • PSQ: Is site clean? Is site dirty? • Conclusion: Site is dirty. • Action: Clean up a dirty site. A correct decision. [Figure: true mean, sample mean, UCL, and deviation on the COPC concentration axis (DL to ∞), compared at the 95% UCL with the action level of 100. Alternative actions: walk away from site; clean up site.]

  32. Decision-Making Procedure: Apply Decision Rule • PSQ: Is site clean? Is site dirty? • Conclusion: Site is clean. • Action: Walk away from a dirty site. An incorrect decision. [Figure: true mean, sample mean, UCL, and deviation on the COPC concentration axis (DL to ∞), compared at the 95% UCL with the action level of 100. Alternative actions: walk away from site; clean up site.]

  33. Decision-Making Procedure: Apply Decision Rule • PSQ: Is site clean? Is site dirty? • Conclusion: Site is clean. • Action: Walk away from a clean site. A correct decision. [Figure: true mean, sample mean, UCL, and deviation on the COPC concentration axis (DL to ∞), compared at the 95% UCL with the action level of 100. Alternative actions: walk away from site; clean up site.]

  34. Decision-Making Procedure: Apply Decision Rule • PSQ: Is site clean? Is site dirty? • Conclusion: Site is dirty. • Action: Clean up a clean site. An incorrect decision. [Figure: true mean, sample mean, UCL, and deviation on the COPC concentration axis (DL to ∞), compared at the 95% UCL with the action level of 100. Alternative actions: walk away from site; clean up site.]

  35. Null Hypothesis: The Site is dirty. When the True Mean is well above the Action Level, then there should be a high probability that the Sample Mean UCL will also be above the Action Level, and it is highly likely that we will correctly decide to clean up a dirty site. [Figure: true mean, sample mean, UCL, and deviation shown against the gray region. Vertical axis: probability of deciding that the True Mean is greater than or equal to the Action Level (0.0 to 1.0). Horizontal axis: true mean COPC concentration, with the lower bound of the gray region at 75 and the action level at 100. True state of site: site is clean; site is dirty. Alternative actions: walk away from site; clean up site.]

  36. Null Hypothesis: The Site is dirty. If the True Mean is well below the Lower Bound of the Gray Region, then there should be a very low probability that the Sample Mean UCL will be above the Action Level, and it is highly unlikely that we will incorrectly decide to clean up a clean site. [Figure: true mean, sample mean, UCL, and deviation shown against the gray region. Vertical axis: probability of deciding that the site is dirty (0.0 to 1.0). Horizontal axis: true mean COPC concentration, with the lower bound of the gray region at 75 and the action level at 100. True state of site: site is clean; site is dirty. Alternative actions: walk away from site; clean up site.]

  37. Null Hypothesis: The Site is dirty. When the True Mean is IN the gray region, then there is an increased probability that the Sample Mean UCL will be above the Action Level, and that we will agree to incorrectly decide to clean up a clean site. [Figure: true mean, sample mean, UCL, and deviation shown against the gray region. Vertical axis: probability of deciding that the site is dirty (0.0 to 1.0). Horizontal axis: true mean COPC concentration, with the lower bound of the gray region at 75 and the action level at 100. True state of site: site is clean; site is dirty. Alternative actions: walk away from site; clean up site.]

  38. Decision Performance Goal Diagram. Null Hypothesis: The Site is dirty. [Figure: typical curve with the gray region. Vertical axis: probability of deciding that the site is dirty (0.0 to 1.0). Horizontal axis: true mean COPC concentration, with the lower bound of the gray region at 75 and the action level at 100. True state of site: site is clean; site is dirty. Alternative actions: walk away from site; clean up site.]
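The shape of the "typical curve" in the preceding diagrams can be approximated by simulation. The sketch below rests on several assumptions that are not from the course: measurements are normally distributed with a standard deviation of 30, ten samples are taken per sampling event, and the UCL uses a normal (z) approximation. The action level of 100 and the gray-region lower bound of 75 match the slides.

```python
# Sketch: estimate the probability of deciding "site is dirty" (UCL >= action
# level) for several true mean concentrations.
import random
from math import sqrt
from statistics import NormalDist, mean, stdev

ACTION_LEVEL = 100.0                 # from the slides
Z_95 = NormalDist().inv_cdf(0.95)    # assumption: z-based 95% UCL


def prob_decide_dirty(true_mean, sd=30.0, n=10, trials=2000, seed=1):
    """Fraction of simulated sampling events whose 95% UCL reaches the action level."""
    rng = random.Random(seed)
    dirty = 0
    for _ in range(trials):
        data = [rng.gauss(true_mean, sd) for _ in range(n)]
        ucl = mean(data) + Z_95 * stdev(data) / sqrt(n)
        if ucl >= ACTION_LEVEL:
            dirty += 1
    return dirty / trials


# Well below the gray region, at its lower bound, inside it, at the action
# level, and well above the action level.
for true_mean in (50, 75, 90, 100, 120):
    print(f"true mean {true_mean:>3}: P(decide dirty) ~ {prob_decide_dirty(true_mean):.2f}")
```

Under these assumptions the estimated probability is near zero well below the gray region, rises through the gray region, and approaches one above the action level. Changing `n` or `sd` shows how sample count and site variability steepen or flatten the curve, which is the link back to the question of how many samples are needed.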

  39. Managing Uncertainty is a Balancing Act [Figure: two balance scales. PRP 1 Focus: unnecessary disposal and/or cleanup cost ($) weighed against sampling and analyses cost ($). Regulatory 1 Focus: threat to public health and environment ($) weighed against sampling and analyses cost ($).]

  40. Key Points • We will never know the true condition of the site - time and money prevent this • Therefore we must estimate the true condition through sampling • Estimates based on samples are not factual statements about the site. They are educated guesses • Estimates must be in error - because they use incomplete information

  41. Key Points (cont.) • Errors are not mistakes - just deviations from the truth • Errors (deviations) introduce uncertainty into the decision-making process • Errors and uncertainty can be managed so that you can still get the job done and prove that you did it

  42. Key Points (cont.) • The DQO Process is designed to help you manage uncertainty and: • Get the job done efficiently • Prove that you did it defensibly

  43. Primary Benefit of the DQO Process: Managing uncertainty through systematic planning. “FAILING TO PLAN….. IS PLANNING TO FAIL”

  44. How Many Samples do I Need? REMEMBER: HETEROGENEITY IS THE RULE!

  45. End of Module 4 Thank you Summary of Parts 1, 2, 3 will be at the end of Module 6 Questions? We will now take a 15 minute break. Please be back in 15 minutes.
