Using unequal probability sampling to limit anticipated variances of regression estimators

Using unequal probability sampling to limit anticipated variances of regression estimators. Anders Holmberg, Department of Research & Development, Statistics Sweden, SE-701 89 Örebro, Sweden. Tel: +46 19 176905. Fax: +46 19 177084. E-mail: Anders.Holmberg@scb.se. Anders Holmberg, ICES III 07.


Presentation Transcript


  1. Using unequal probability sampling to limit anticipated variances of regression estimators
     Anders Holmberg, Department of Research & Development, Statistics Sweden, SE-701 89 Örebro, Sweden
     Tel: +46 19 176905, Fax: +46 19 177084, E-mail: Anders.Holmberg@scb.se
     Anders Holmberg ICES III 07

  2. Outline
     • Background
     • The problem
     • Some theory
     • Auxiliary information
     • An application in a business survey
     • Comparisons and results
     • Comments

  3. Background (1)
     • Prepare the sampling frame
     • Derive and analyse diagnostic data
     • Decide on a sampling design, sampling scheme and estimator
     • Launch the survey

  4. Background (2)
     • Prerequisites:
       • A well-defined business population
       • Several parameters of interest
       • Design-based inference
       • An up-to-date frame from the business register
       • Admin. data available as auxiliary information
     • Attempt to find the most efficient (and robust) design

  5. Background (6)
     [Diagram: the same four variables at successive reference times]
     • Auxiliary variables, shown under the reference-time labels (t-1), (t-1) and (t-2): Number of employees (u1), Turnover (u2), Personnel expenses (u3), Investments (u4)
     • Study variables at time (t): Number of employees (y1), Turnover (y2), Personnel expenses (y3), Investments (y4)

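  6. Optimal design in the single variable case
     • A design that minimizes [formula missing in transcript] is such that [formula missing in transcript]
     • Minimum of [formula missing in transcript] is [formula missing in transcript]
     • Brewer, Hajek, Cassel et al., Rosén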
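The formulas on slide 6 did not survive in this transcript. As a hedged illustration only, the sketch below assumes the textbook form of the anticipated variance, ANV = sum over k of (1/pi_k - 1) * sigma_k^2, with model standard deviations sigma_k proportional to x_k**gamma. Under that assumption and a fixed expected sample size n, the minimizing inclusion probabilities are proportional to sigma_k. The function names, the data and the value of gamma are invented for the example and are not taken from the presentation.

```python
def brewer_probabilities(sigma, n):
    """Inclusion probabilities proportional to sigma_k, scaled to sum to n."""
    total = sum(sigma)
    pi = [n * s / total for s in sigma]
    if max(pi) > 1.0:
        # A real design would move such units to a take-all stratum.
        raise ValueError("some inclusion probabilities exceed 1")
    return pi


def anticipated_variance(pi, sigma):
    """Assumed form: sum over k of (1/pi_k - 1) * sigma_k**2."""
    return sum((1.0 / p - 1.0) * s ** 2 for p, s in zip(pi, sigma))


# Made-up auxiliary values and an assumed heteroscedasticity exponent.
x = [5.0, 10.0, 20.0, 40.0, 80.0, 160.0]
gamma = 0.5
n = 2  # expected sample size

sigma = [xk ** gamma for xk in x]          # sigma_k proportional to x_k**gamma
pi_opt = brewer_probabilities(sigma, n)    # probabilities proportional to sigma_k
pi_equal = [n / len(x)] * len(x)           # equal-probability benchmark

anv_opt = anticipated_variance(pi_opt, sigma)
anv_equal = anticipated_variance(pi_equal, sigma)

# Closed-form minimum implied by the assumed ANV: (sum sigma)^2 / n - sum sigma^2.
closed_form_min = sum(sigma) ** 2 / n - sum(s ** 2 for s in sigma)

print("pi_opt:", [round(p, 3) for p in pi_opt])
print("ANV, proportional-to-sigma design:", round(anv_opt, 3))
print("ANV, equal probabilities:", round(anv_equal, 3))
```

In this toy population the proportional-to-sigma design reaches the closed-form minimum, while equal probabilities give a clearly larger anticipated variance.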

  7. 'Guesstimate' to find size measures
     • Population plot [figure missing in transcript]
     • E.g. if [condition missing in transcript]: [formula missing in transcript]

  8. The multivariate case?
     [Diagram: the same four variables at successive reference times]
     • Auxiliary variables, shown under the reference-time labels (t-1), (t-1) and (t-2): Number of employees (u1), Turnover (u2), Personnel expenses (u3), Investments (u4)
     • Study variables at time (t): Number of employees (y1), Turnover (y2), Personnel expenses (y3), Investments (y4)

  9. The multivariate case
     • The least we should do is to analyse the various designs' possible effects on different estimators before we make the design choice.
     • Derive inclusion probabilities as a function of standardized (univariate) size measures
     • Maximal Brewer selection
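The transcript does not preserve how the maximal Brewer selection is defined. One plausible reading, assumed here and not confirmed by the slides, is that each unit receives the largest of its univariate Brewer-type inclusion probabilities, each computed from a standardized size measure. The sketch below illustrates that reading with invented size measures; the function names are hypothetical.

```python
def univariate_probabilities(sizes, n):
    """pi_k = n * s_k / sum(s): size measures standardized to sum to n."""
    total = sum(sizes)
    return [n * s / total for s in sizes]


def maximal_selection(size_measures, n):
    """Assumed rule: per unit, the maximum over the univariate probabilities."""
    per_variable = [univariate_probabilities(s, n) for s in size_measures]
    return [max(unit_probs) for unit_probs in zip(*per_variable)]


# Two made-up size measures for the same five units.
sizes_var1 = [2.0, 2.0, 3.0, 5.0, 8.0]
sizes_var2 = [6.0, 1.0, 5.0, 4.0, 4.0]
n = 2  # expected sample size targeted by each univariate design

pi_var1 = univariate_probabilities(sizes_var1, n)
pi_var2 = univariate_probabilities(sizes_var2, n)
pi_max = maximal_selection([sizes_var1, sizes_var2], n)

# Taking the maximum can only increase probabilities, so the expected
# sample size sum(pi_max) is at least n.
expected_sample_size = sum(pi_max)
print("pi_max:", [round(p, 3) for p in pi_max])
print("expected sample size:", round(expected_sample_size, 3))
```

Under this reading every estimator is at least as well served as under its own univariate design, at the price of a larger expected sample size.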

  10. The multivariate case
     • Try to find a design that in some sense is optimal for all important parameters?
     • There is no evident criterion of optimality, but some are better than others.
     • Minimize [formula missing in transcript] under the restrictions [formula missing in transcript]

  11. The multivariate case: some optimisation approaches
     • Minimizing a weighted sum of relative efficiency losses: [formula missing in transcript] is minimized when [formula missing in transcript]
     • Scale effects are neutralized; the relations between the ANVq:s and the corresponding single-parameter minimum values (the Brewer selection) are used.
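The criterion on slide 11 is lost in the transcript. As a sketch under stated assumptions, suppose the weighted sum of relative efficiency losses is sum over q of H_q * (ANV_q(pi) / ANV_q_min - 1), with ANV_q(pi) = sum over k of (1/pi_k - 1) * sigma_qk^2 and ANV_q_min the single-parameter minimum. A Lagrange argument with the constraint sum(pi) = n then gives pi_k proportional to sqrt(sum over q of H_q * sigma_qk^2 / ANV_q_min). The weights H_q, the sigma values and all function names below are invented for illustration.

```python
import math


def anv(pi, sigma):
    """Assumed anticipated variance: sum over k of (1/pi_k - 1) * sigma_k**2."""
    return sum((1.0 / p - 1.0) * s ** 2 for p, s in zip(pi, sigma))


def anv_min(sigma, n):
    """Minimum of the assumed ANV, reached when pi_k is proportional to sigma_k."""
    return sum(sigma) ** 2 / n - sum(s ** 2 for s in sigma)


def weighted_loss(pi, sigmas, weights, n):
    """Weighted sum of relative efficiency losses (assumed criterion)."""
    return sum(h * (anv(pi, s) / anv_min(s, n) - 1.0)
               for h, s in zip(weights, sigmas))


def compromise_probabilities(sigmas, weights, n):
    """pi_k proportional to sqrt(sum_q H_q * sigma_qk**2 / ANV_q_min)."""
    minima = [anv_min(s, n) for s in sigmas]
    n_units = len(sigmas[0])
    a = [math.sqrt(sum(h * s[k] ** 2 / m
                       for h, s, m in zip(weights, sigmas, minima)))
         for k in range(n_units)]
    total = sum(a)
    return [n * ak / total for ak in a]


# Made-up model standard deviations for two study variables, five units.
sigmas = [[2.0, 2.0, 3.0, 5.0, 8.0],
          [6.0, 1.0, 5.0, 4.0, 4.0]]
weights = [0.5, 0.5]  # hypothetical importance weights H_q
n = 2

pi_comp = compromise_probabilities(sigmas, weights, n)
loss_comp = weighted_loss(pi_comp, sigmas, weights, n)

# Benchmarks: each single-variable design and equal probabilities.
pi_q1 = [n * s / sum(sigmas[0]) for s in sigmas[0]]
pi_q2 = [n * s / sum(sigmas[1]) for s in sigmas[1]]
pi_equal = [n / 5.0] * 5
loss_q1 = weighted_loss(pi_q1, sigmas, weights, n)
loss_q2 = weighted_loss(pi_q2, sigmas, weights, n)
loss_equal = weighted_loss(pi_equal, sigmas, weights, n)

print("compromise pi:", [round(p, 3) for p in pi_comp])
print("losses (compromise, q1-optimal, q2-optimal, equal):",
      [round(v, 4) for v in (loss_comp, loss_q1, loss_q2, loss_equal)])
```

Dividing each ANV_q by its own minimum is what removes scale effects: a variable measured in large units does not dominate the criterion merely because its variances are numerically large.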

  12. The multivariate case: some optimisation approaches
     • If we want to put restrictions on certain parameters, e.g. [formula missing in transcript], then a design that minimizes ANOREL can be obtained through non-linear programming.
     • Optimization model: [formula missing in transcript]
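The optimization model itself is not preserved in the transcript. Purely as an illustration of what such a non-linear program could look like, and assuming that ANOREL is a weighted sum of relative efficiency losses with weights H_q, the problem might be written as below. The bounds c_q, the restricted index set R and the exact form of the objective are assumptions, not the author's stated model.

```latex
\begin{aligned}
\min_{\pi_1,\dots,\pi_N}\quad
  & \sum_{q=1}^{Q} H_q \left( \frac{ANV_q(\pi)}{ANV_q^{\min}} - 1 \right) \\
\text{subject to}\quad
  & ANV_q(\pi) \le c_q \, ANV_q^{\min}, \qquad q \in R, \\
  & \sum_{k=1}^{N} \pi_k = n, \qquad 0 < \pi_k \le 1, \quad k = 1,\dots,N .
\end{aligned}
```

Here R would be the set of parameters with an explicit precision requirement and c_q >= 1 the tolerated inflation relative to the single-parameter minimum. Without the inequality constraints the problem has a closed-form solution; with them, a general-purpose non-linear programming solver is needed.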

  13. An Application
     The 4 variables studied for three branches (strata):
     • SNI25: Manufacturers of food products & beverages, N=749
     • SNI28: Manufacturers of metal goods (except machines and devices), N=2292
     • SNI33: Manufacturers of optical instruments, N=323
     Analysis and comparisons made on admin. data from previous reference times: plots, estimated correlations and gamma coefficients.

  14. An Application
     • A common ratio model pictures the relationships reasonably well if the corresponding older variable is used as the regressor variable (strongest pairwise correlation over branches and time, although doubts exist for the investment variable).
     • Estimates of the gamma coefficient are sensitive.
     • Estimates ranged between 0.2 and 0.9 and sometimes deteriorated!?
     • For investments, very weak or no heteroscedasticity.
     • For the other three variables, [value missing in transcript] "cannot be ruled out" and is simple as a guesstimate.

  15. An Application: Strata

  16. An Application
     • Computation of inclusion probabilities and the anticipated variances using the Brewer selection (maximal Brewer selection)
     • Computation of the optimisation-based approaches, with the extra condition that [formula missing in transcript]

  17. Food & Beverages

  18. Optical Instruments

  19. Metal goods
     Maximal Brewer selection satisfies the criteria, but with a 25% larger sample.

  20. Does it work on the estimator variances?
     • In most cases we will never know.
     • However, for these variables we can check against admin. data (coming in 1.5 years later).
     • Using [formula missing in transcript]
     • Where [symbol missing in transcript] is the Taylor expanded variance of the ratio estimator under Poisson sampling
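The comparison formula on slide 20 is missing from the transcript. As a minimal sketch, the standard Taylor-linearized variance of a ratio estimator of a total under Poisson sampling is sum over k of (1/pi_k - 1) * (y_k - R * x_k)^2 with R = sum(y) / sum(x). The code below evaluates it for a few candidate designs and expresses each as a percentage of the smallest, which is one way the ratios on the following slide could have been formed. The data, the design labels and the function name are invented.

```python
def taylor_variance_ratio_poisson(y, x, pi):
    """Linearized variance of the ratio estimator under Poisson sampling:
    sum over k of (1/pi_k - 1) * (y_k - R*x_k)**2, with R = sum(y)/sum(x)."""
    r = sum(y) / sum(x)
    return sum((1.0 / p - 1.0) * (yk - r * xk) ** 2
               for yk, xk, p in zip(y, x, pi))


# Made-up population: x is the older register variable, y the later outcome.
x = [2.0, 2.0, 3.0, 5.0, 8.0]
y = [5.0, 3.0, 7.0, 9.0, 17.0]

# Hypothetical candidate designs, each with expected sample size 2.
designs = {
    "equal": [0.4, 0.4, 0.4, 0.4, 0.4],
    "proportional to x": [0.2, 0.2, 0.3, 0.5, 0.8],
    "proportional to sqrt(x)": [2 * v ** 0.5 / sum(w ** 0.5 for w in x) for v in x],
}

variances = {name: taylor_variance_ratio_poisson(y, x, pi)
             for name, pi in designs.items()}

# Each variance as a percentage of the smallest one.
smallest = min(variances.values())
ratios = {name: 100.0 * (v / smallest) for name, v in variances.items()}

for name in designs:
    print(f"{name}: variance = {variances[name]:.3f}, ratio = {ratios[name]:.1f}%")

# Sanity checks: a perfect ratio relationship or a census gives zero variance.
zero_if_proportional = taylor_variance_ratio_poisson([2.0 * v for v in x], x, designs["equal"])
zero_if_census = taylor_variance_ratio_poisson(y, x, [1.0] * len(x))
```

Because the formula needs y for the whole population, such a check is only possible once the later register data have arrived, which matches the point made on the slide.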

  21. Metal goods
     Ratios of the Taylor expanded variances to the smallest variance of each estimator (%) [table missing in transcript]

  22. Summary
     • Carefully choosing appropriate size measures limits the anticipated variances of regression estimators, and Brewer's results can be extended to a multivariate situation.
     • If there is a multivariate issue and you intend to use auxiliary information in the design, diagnostic computations are important.
     • With an optimization approach we know what we are aiming to minimize, and with the non-linear programming approach some practical troubles in designing a pps sample are avoided.
