
Estimation and Weighting, Part I


gwyn

Presentation Transcript


  1. Estimation and Weighting, Part I

  2. Goal of Estimation
     Minimize a survey’s total error
     • Sampling error is error arising solely from the sampling process (measure: variance)
       • Mainly a function of sample size
     • Surveys are also subject to biases from nonsampling errors such as:
       • Coverage errors and non-probability sampling
       • Response errors
       • Nonresponse

  3. Typical Estimation Steps
     The estimation steps for a typical household survey avoid or help control some nonsampling errors:
     • Editing and Imputation are aimed at controlling response errors
     • Basic Weighting based on probabilities of selection produces essentially unbiased estimates when there is 100% response and no response error
     • Nonresponse Adjustment helps avoid some obvious biases that arise when nonrespondents are ignored
     • Population Controls help minimize some coverage problems

  4. Editing and Imputation
     Editing
     • Deleting or correcting unacceptable data values
     • Coding/combining data to classify respondents
     Imputation – insert values for missing data
     • For missing items (imputation is common)
     • For missing HH or persons (not used as often)
     • Modeling methods
     • Hot deck methods

  5. Item Nonresponse Imputation
     When a household is interviewed and a small amount of data is not obtained for a person, imputing for the missing data creates a complete data set.
     • Hot Deck Method: use answers from another similar unit (“nearest neighbor”) to impute answers for an item nonresponse
     • Modeling Method: mathematically impute an answer for an item nonresponse

  6. Example of Imputation
     Suppose a woman aged 29 was employed last month. This month, we were not able to obtain her labor force status.
     Construct a “transition matrix” using records of “similar” persons with labor force status coded in both months; here, use females aged 24-45.

  7. Example of Imputation Based on Frequencies, Compute Probabilities
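
The frequency table for this slide did not survive in the transcript. As a rough sketch of the step it names, the code below turns paired (last month, this month) status records into row-wise transition probabilities. The function name and the sample records are hypothetical and are not taken from the presentation.

```python
from collections import Counter, defaultdict


def transition_probabilities(pairs):
    """Convert (last_month, this_month) status pairs into row probabilities.

    Each row (last month's status) sums to 1 across this month's statuses.
    """
    counts = defaultdict(Counter)
    for last_month, this_month in pairs:
        counts[last_month][this_month] += 1

    probabilities = {}
    for last_month, row in counts.items():
        row_total = sum(row.values())
        probabilities[last_month] = {
            this_month: count / row_total for this_month, count in row.items()
        }
    return probabilities


# Hypothetical records for "similar" persons (illustrative only).
example_pairs = [
    ("employed", "employed"),
    ("employed", "employed"),
    ("employed", "employed"),
    ("employed", "unemployed"),
]
example_matrix = transition_probabilities(example_pairs)
```

In practice the pairs would come from the records of similar persons (females aged 24-45 in the example above) whose status was coded in both months.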

  8. Example of Imputation
     • Generate a random number (rn) between 0 and 1
     • If rn = .7221, for example, then rn falls in the range [0, .9449] and “employed” is imputed for this month
       • Will happen 94.49% of the time
     • No guarantee that this is right for the particular data item that is imputed
     • Imputed data set is complete and preserves known relationships
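
The random draw described above can be sketched as follows. Only the .9449 probability for "employed" and the draw rn = .7221 come from the slides; the split of the remaining .0551 between the other two statuses, and the function name, are assumptions made for illustration.

```python
import random


def impute_status(transition_row, rn=None):
    """Pick a status by locating rn within cumulative probability ranges.

    transition_row is an ordered list of (status, probability) pairs.
    """
    if rn is None:
        rn = random.random()  # uniform draw in [0, 1)
    cumulative = 0.0
    for status, probability in transition_row:
        cumulative += probability
        if rn < cumulative:
            return status
    # Guard against floating-point rounding in the cumulative sum.
    return transition_row[-1][0]


# Row for persons employed last month. 0.9449 is from the slides;
# 0.0200 and 0.0351 are hypothetical placeholders.
employed_last_month = [
    ("employed", 0.9449),
    ("unemployed", 0.0200),
    ("not in labor force", 0.0351),
]

imputed = impute_status(employed_last_month, rn=0.7221)
```

With rn = .7221 the draw lands in the first range, so "employed" is imputed, matching the slide.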

  9. Example of Imputation
     Would you impute a labor force status? Maybe not:
     • Usually a determination will be made concerning how much data is required for a response to be accepted by a survey
     • For a labor force survey, enough information to determine LF status will probably be required

  10. Purpose of Weighting
     Estimate the number of persons each person in a sample household represents.
     Each person interviewed helps represent
     • not-in-sample population of the area (geographic stratum) where the person lives
     • sample persons not interviewed
     • generally, persons of the same age, race, gender, and ethnic origin as the person interviewed

  11. Basic Weights
     • Applied at the household level (all persons in a HH have the same basic weight)
     • Inverse of the probability of selection
     • In a typical HH sample there are two stages of sampling and two probabilities
       • 1st stage probability for an EA: EAprob
       • 2nd stage probability for a HH in that EA: HHprob
       • TOTprob = EAprob * HHprob
       • Baseweight = 1/TOTprob
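
A minimal sketch of the two-stage base weight calculation above. The function name and the example probabilities (1 in 50 for the EA, 1 in 40 for the HH) are hypothetical and chosen only to illustrate the arithmetic.

```python
def two_stage_base_weight(ea_prob, hh_prob):
    """Return the base weight: the inverse of the overall selection probability."""
    if not (0 < ea_prob <= 1 and 0 < hh_prob <= 1):
        raise ValueError("selection probabilities must be in (0, 1]")
    tot_prob = ea_prob * hh_prob  # TOTprob = EAprob * HHprob
    return 1 / tot_prob           # Baseweight = 1 / TOTprob


# Hypothetical design: EA chosen with probability 1/50,
# then HH chosen within that EA with probability 1/40.
example_weight = two_stage_base_weight(1 / 50, 1 / 40)
```

Under these assumed probabilities each sample household represents roughly 2,000 households on the frame.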

  12. Base Weights
     • Self weighting samples are not common
     • Primary stratifier for HH surveys is geography, such as state
       • often the base weights in a state are all equal
       • OR nearly the same
     • For a self-weighting stratum use N/n:
       • N = number of HHs on the frame
       • n = number of HHs in the sample

  13. Example of Basic Weighting

  14. Example of Basic Weighting
     • Self-weighting within state
     • State A has N = 500,000 and sample n = 2,000
       • baseweight = N/n = 500,000/2,000 = 250
       • An estimate of employment is obtained by multiplying the sample count (EMP = 3,000) by the baseweight
       • 3,000 x 250 = 750,000
     • State B has N = 175,000 and sample n = 1,750
       • baseweight = N/n = 175,000/1,750 = 100
       • An estimate of unemployment is obtained by multiplying the sample count (UE = 250) by the baseweight
       • 250 x 100 = 25,000
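
The arithmetic for States A and B can be checked with a short sketch. The figures are the ones given on the slide; the function name is a hypothetical helper.

```python
def self_weighting_base_weight(frame_count, sample_count):
    """Base weight N/n for a self-weighting stratum."""
    if sample_count <= 0:
        raise ValueError("sample count must be positive")
    return frame_count / sample_count


# State A: N = 500,000 HHs on the frame, n = 2,000 HHs in the sample.
weight_a = self_weighting_base_weight(500_000, 2_000)
employment_a = 3_000 * weight_a      # sample count EMP = 3,000

# State B: N = 175,000 HHs on the frame, n = 1,750 HHs in the sample.
weight_b = self_weighting_base_weight(175_000, 1_750)
unemployment_b = 250 * weight_b      # sample count UE = 250

print(weight_a, employment_a)    # 250.0 750000.0
print(weight_b, unemployment_b)  # 100.0 25000.0
```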

  15. Simple Weighted Estimates
     Estimate x of a Total X
     • A Simple Weighted Estimate adds persons using their weights (wi = weight for the ith person)
     • Sum across all persons in the sample
     • xi is a data value for person i
       • for example, xi = 1 for employed, 0 otherwise
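
The formula itself did not carry over into the transcript. Based on the definitions above, it is presumably the weighted sum below, where s (a symbol introduced here, not on the slide) denotes the set of persons in the sample:

```latex
\[
  x \;=\; \sum_{i \in s} w_i \, x_i
\]
```

With xi = 1 for employed persons and 0 otherwise, x reduces to the sum of the weights of the employed sample persons.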

  16. Simple Weighted Estimates Example
     Continue the previous example for State A
     • Simple Weighted Estimate of employment: xi = 1 for employed, 0 otherwise
     • Can restrict the sum to the 3,000 employed
       • since xi = 0 for the other responding persons
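
As a sketch of the State A example, the code below sums weight times indicator over sample persons. The weight of 250 and the 3,000 employed persons come from the earlier slide; the count of 1,200 other respondents is a hypothetical figure, included only to show that persons with xi = 0 contribute nothing.

```python
def simple_weighted_estimate(persons):
    """Sum w_i * x_i over (weight, value) pairs for all sample persons."""
    return sum(weight * value for weight, value in persons)


BASE_WEIGHT_A = 250            # State A base weight from the earlier slide
employed = [(BASE_WEIGHT_A, 1)] * 3_000
others = [(BASE_WEIGHT_A, 0)] * 1_200   # hypothetical count of other respondents

estimate_all = simple_weighted_estimate(employed + others)
estimate_employed_only = simple_weighted_estimate(employed)

print(estimate_all, estimate_employed_only)  # 750000 750000
```

Both sums agree, which is why the slide notes that the sum can be restricted to the employed persons.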
