Summary of the presentation

Learn about Genesees software, installation steps, data requirements, input data sets, output formats, and its use in structural business statistics surveys. Explore its evolution, objectives, and applications in various domains.

Presentation Transcript


  1. Summary of the presentation
     • Objectives and evolution of the software
     • Software installation pre-requisites
     • Data needed for Genesees
     • Input data sets (characteristics, controls)
     • Output: tables, file formats, data sets
     • Structural Business Statistics (SBS) surveys using Genesees
     • Population of interest - Business Register
     • Domains of interest
     • SME Sampling strategy (current)
     • Variables of interest
     • Case study
     GENEralised software for Sampling Estimates and Errors in Surveys

  2. Objectives and evolution of the software (1/2)
     • Need to estimate variables of interest for social and economic statistics
     • Guarantee coherence among estimates in time and space
     • Improve quality of data produced (for example, in accordance with the SBS Council Regulation)
     • Methodology (Deville and Särndal, 1992)
     • Implemented by Falorsi P.D. and Falorsi S.

  3. Objectives and evolution of the software (2/2)
     • Genesees prototype for social statistics
     • Genesees prototype for enterprises statistics (1992 as first reference year)
     • Several contributions to the development of the software have thereafter been provided by other Istat researchers
     • Delivery of the new releases is made regularly
     • Genesees is currently used for estimation in almost all Istat surveys

  4. Software installation pre-requisites
     • SAS for Windows
     • SAS Language, Macro, IML, Stat, Graph
     • HD ≥ 4 MB; RAM ≥ 64 MB
     How to download Genesees:
     • http://www.istat.it/Metodologi/index.htm
     • then select: “Metodi e Software per le indagini statistiche”
     • download and then unzip the file “Genesees3.zip” into the directory c:\Genesees
     • E-mail mts-f@istat.it for the starting password
     • you will then be informed about new releases of the software

  5. Data needed for Genesees
     • Frame (example: Business Register) → to get the known totals of auxiliary variables as a reference structure
     • Survey respondent units → to compute the initial sampling weight correction factor and then to assign the final sampling weight to each unit
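The relation between known totals, correction factor and final weight can be illustrated with a minimal sketch. It assumes the linear distance function of Deville and Särndal (1992), a single auxiliary variable and unit distance weights; the function name and the data are hypothetical, and Genesees itself may use other distance functions, several auxiliary variables and an iterative procedure.

```python
def calibrate_linear(initial_weights, aux_values, known_total):
    """Linear-distance calibration with one auxiliary variable (illustrative only).

    Returns (correction_factors, final_weights) such that the weighted sum of
    the auxiliary variable reproduces the known total from the frame.
    """
    # Direct estimate of the auxiliary total using the initial weights.
    direct_estimate = sum(d * x for d, x in zip(initial_weights, aux_values))
    denom = sum(d * x * x for d, x in zip(initial_weights, aux_values))
    if denom == 0:
        raise ValueError("auxiliary variable is zero for every unit")
    # Lagrange multiplier that closes the gap to the known total.
    lam = (known_total - direct_estimate) / denom
    factors = [1 + lam * x for x in aux_values]                   # correction factors
    final = [d * g for d, g in zip(initial_weights, factors)]     # final weights
    return factors, final


# Hypothetical respondents: initial weights and one auxiliary variable.
initial_weights = [10.0, 10.0, 20.0]
employees = [2.0, 5.0, 3.0]
known_total = 150.0  # assumed frame total of the auxiliary variable

factors, final_weights = calibrate_linear(initial_weights, employees, known_total)
calibrated_total = sum(w * x for w, x in zip(final_weights, employees))
```

After calibration, `calibrated_total` matches `known_total` up to floating-point rounding, which is the defining property of a calibration estimator.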

  6. Input data sets (characteristics)
     • Input SAS data sets: “Noti” and “Inp”
     • “Noti” (var. name ≤ 8 char.):
       • Planned population = domain of interest (alphanum. var.; var. ≤ 15 char.)
       • Totals of auxiliary variables (num. var.; at least 1 var.)
     • “Inp” (var. name ≤ 8 char.):
       • Id. Code (num. var.)
       • Planned population (as in “Noti”)
       • Auxiliary variables (num. var.); they have to be entered in the same order as in “Noti”
       • Coef = initial weight (adjusted for unit non response) (num. var.)
       • Ck = “distance weight” (num. var.); optional

  7. Input data sets (controls)
     • “Noti”:
       • Planned popul. = . → Procedure stops → data set “Noti-miss”
       • Totals of aux. var. = . → 0
     • “Inp”:
       • Id. Code = . → Procedure stops → data set “Missing”
       • Id. Code = duplicated → data set “Codici-doppi”
       • Auxiliary variables = . → 0
       • Coef = . → 1 (no controls)
       • Ck = . → 1
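The “Inp” controls listed above can be sketched as follows. This is not the Genesees code (which is written in SAS); it is an illustrative Python version in which missing values are represented by `None` instead of the SAS `.`, and the record layout and function name are assumptions.

```python
def check_inp(records):
    """Apply the 'Inp' controls from the slide to a list of dicts (illustrative only).

    Returns (clean_records, missing_id_records, duplicate_ids).
    """
    missing_id = [r for r in records if r.get("id") is None]
    if missing_id:
        # Id. Code missing: the procedure stops; offending rows go to "Missing".
        return [], missing_id, []

    seen, duplicates = set(), set()
    for r in records:
        if r["id"] in seen:
            duplicates.add(r["id"])  # would be reported in "Codici-doppi"
        seen.add(r["id"])

    clean = []
    for r in records:
        row = dict(r)
        # Missing auxiliary variables are set to 0.
        row["aux"] = [0 if v is None else v for v in r["aux"]]
        # Missing Coef and Ck are set to 1.
        row["coef"] = 1 if r.get("coef") is None else r["coef"]
        row["ck"] = 1 if r.get("ck") is None else r["ck"]
        clean.append(row)
    return clean, [], sorted(duplicates)


# Hypothetical input records.
records = [
    {"id": 1, "domain": "A", "aux": [3, None], "coef": None, "ck": None},
    {"id": 2, "domain": "A", "aux": [1, 4], "coef": 12.5, "ck": 2},
    {"id": 2, "domain": "B", "aux": [None, 2], "coef": 8.0},
]
clean, missing, duplicate_ids = check_inp(records)
```

The slide only states that duplicated codes are written to a data set, so the sketch reports them without dropping any record.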

  8. Output tables
     • Output tables (summary descriptive statistics related to the calibration estimators process):
       • Table 1: Statistics on estimates and final weights for planned popul.
       • Table 2: Statistics on initial weights correction factors
       • Table 3: Statistics on estimates and initial weights
       • Table 4: Prefixed parameters for the estimation iterative procedure
       • Table 5: Known totals, direct and final estimates, and differences
       • Tabulate 1: Controls on the domains: known totals, direct estimates, ratios between known totals and direct estimates, sample totals
       • Tabulate 2: Sample size (respondents) and population estimate with direct weights
       • Tabulate 3: Controls on domains without sample units

  9. Output file formats
     • “genesees.log” (SAS log)
     • “stampa1.txt” to “stampa6.txt” (Tables)
     • “stampe stime.htm” (Tables)
     • SAS data sets (“*.sas7bdat”)

  10. Output data sets (1/2)
     • Diagnostics (errors detected in the input step, if any):
       • “missing” (Id. Code = .)
       • “noti-miss” (Planned popul. = .)
       • “vuoti” (domain is present in “Noti” but is not present in “Inp”)
       • “codici-doppi” (Id. Code = duplicated)
       • “csenzat” (domain is present in “Inp” but is not present in “Noti”)
       • “savestime” (shows the parameters entered)
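The two domain diagnostics, “vuoti” and “csenzat”, amount to set differences between the domains in “Noti” and those in “Inp”. A minimal sketch, with a hypothetical function name and hypothetical domain codes:

```python
def domain_mismatches(noti_domains, inp_domains):
    """Return (vuoti, csenzat) as sorted lists of domain codes (illustrative only)."""
    noti, inp = set(noti_domains), set(inp_domains)
    vuoti = sorted(noti - inp)    # in "Noti" but with no units in "Inp"
    csenzat = sorted(inp - noti)  # in "Inp" but with no known totals in "Noti"
    return vuoti, csenzat


# Hypothetical domain codes; "Inp" has one row per unit, so codes may repeat.
vuoti, csenzat = domain_mismatches(["A", "B", "C"], ["B", "B", "D"])
```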

  11. Output data sets (2/2)
     • Statistics and final weights:
       • “Pesifin” (initial w.; corr. factor; final w.; id.; conta; domain)
       • “stat”:
         • conta
         • max; min; sum; mean; var; cv (with reference to initial weights, correction factor and final weights)
         • Iterations; maxiter; converge; constraints (c2); sample units in the domain (r2); dist. func.
       • “stimedir”: domain; aux. var. totals; conta
       • “stimefin”: known total; direct estimate; final estimate; conta; difference between final estimate and known total
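The summary statistics listed for “stat” can be sketched for a single series of weights. The slide does not say which variance definition Genesees uses or how cv is scaled, so the sample variance and cv = standard deviation / mean are assumptions here, as are the function name and the data.

```python
import statistics


def weight_stats(values):
    """Count, max, min, sum, mean, variance and cv of a list of weights (illustrative only)."""
    mean = statistics.fmean(values)
    # Assumption: sample variance (n - 1 denominator); 0.0 for a single value.
    var = statistics.variance(values) if len(values) > 1 else 0.0
    return {
        "conta": len(values),
        "max": max(values),
        "min": min(values),
        "sum": sum(values),
        "mean": mean,
        "var": var,
        # Assumption: cv expressed as a ratio, not a percentage.
        "cv": (var ** 0.5) / mean if mean else float("nan"),
    }


# Hypothetical final weights for one domain.
stats = weight_stats([10.0, 12.0, 14.0])
```

The same function could be applied in turn to the initial weights, the correction factors and the final weights of each domain.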

  12. Structural Business Statistics (SBS) Surveys using Genesees (1/2)
     • Small and Medium Enterprises (SME) Survey
     • Information and Communication Technologies (ICT) Survey
     • Structure of Earnings Survey (SES)
     • Labor Cost Survey (LCS)
     • Prodcom
     • SBS Preliminary Estimates
     • …

  13. Structural Business Statistics (SBS) Surveys using Genesees (2/2)
     Estimation of economic variables on enterprises according to:
     • Istat traditional data production on enterprises
     • Structural Business Statistics (SBS) EU Council Regulation No 58/97
       • Preliminary estimates (1 estimation domain; t + 10 months)
       • Final estimates (3 estimation domains; t + 18 months)
       • Quality indicators and specific reports (3 estim. domains; t + 24 months)
         • Coefficient of Variation - CV (3 domains)
         • Item and unit non response rate (1 domain)
         • Specific reports on survey strategy and principal economic activity
     t = year of reference

  14. Population of interest (1/2)

  15. Population of interest (2/2)

  16. Business Register ASIA
     • Data sources: Tax Register, Chambers of Commerce, Social Security, Work Accident Insurance, Electric Power Board, SEAT telephone directory
     • Statistical and probabilistic procedure for enterprises’ main economic activity detection
     • Variables in the register are the result of standardization, normalization and integration of information provided by administrative sources

  17. Domains of study (SBS final estimates)

  18. SME Sampling strategy (current)
     • N ≈ 3,723,000 enterprises (Business Register)
       (enterprises with <10 persons employed cover 94.8% of the total enterprises and 47.8% of the total employment)
     • Stratified simple random sample
     • H ≈ 26,000 strata (NACE Rev.1.1, Size class, Region)
     • n ≈ 120,000 (negative coordination with other SBS Surveys, multivariable and multidomain sample allocation)
     • Survey technique: postal questionnaire; 2 call-backs
     • Calibration estimators methodology (Deville and Särndal, 1992)
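For a stratified simple random sample, the textbook design weight in stratum h is N_h / n_h, and one common way to adjust it for unit non response is N_h / r_h, where r_h is the number of respondents. The slides do not state which adjustment Istat applies, so the sketch below is only an illustration of that standard approach, with hypothetical strata and names.

```python
def initial_weights(strata):
    """Design and non-response-adjusted weights per stratum (illustrative only).

    strata maps a stratum label to (N_h, n_h, r_h):
    population size, sample size, number of respondents.
    """
    weights = {}
    for name, (pop_size, sample_size, respondents) in strata.items():
        if respondents == 0:
            # No respondents: the stratum cannot be weighted on its own
            # and would need to be collapsed with another one.
            continue
        weights[name] = {
            "design": pop_size / sample_size,    # N_h / n_h
            "adjusted": pop_size / respondents,  # N_h / r_h
        }
    return weights


# Hypothetical strata, not taken from the SME survey.
w = initial_weights({"S1": (1000, 50, 40), "S2": (200, 20, 0)})
```

Under this assumption, the adjusted weight plays the role of the “Coef” variable of the “Inp” data set.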

  19. Variables of interest
     • Turnover
     • Value added at factor cost
     • Employment
     • Total purchases of goods and services
     • Personnel costs
     • Wages and salaries
     • Production value
     • …
     Totals of the variables of study are estimated for the subpopulations of interest (domains), as required by the SBS EU Regulation
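Estimating a domain total with final weights is a weighted sum of the variable over the respondents in that domain. A minimal sketch, with hypothetical field names and data:

```python
from collections import defaultdict


def domain_totals(units, variable):
    """Weighted total of `variable` for each domain (illustrative only)."""
    totals = defaultdict(float)
    for unit in units:
        totals[unit["domain"]] += unit["weight"] * unit[variable]
    return dict(totals)


# Hypothetical respondents with final weights.
units = [
    {"domain": "A", "weight": 10.0, "turnover": 5.0},
    {"domain": "A", "weight": 20.0, "turnover": 1.0},
    {"domain": "B", "weight": 4.0, "turnover": 2.5},
]
totals = domain_totals(units, "turnover")
```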

  20. Case study

  21. Starting picture

  22. Thank you!
