Summary of the presentation
• Objectives and evolution of the software
• Software installation pre-requisites
• Data needed for Genesees
• Input data sets (characteristics, controls)
• Output: tables, file formats, data sets
• Structural Business Statistics (SBS) surveys using Genesees
• Population of interest - Business Register
• Domains of interest
• SME Sampling strategy (current)
• Variables of interest
• Case study
GENEralised software for Sampling Estimates and Errors in Surveys

Objectives and evolution of the software (1/2)
• Need to estimate variables of interest for social and economic statistics
• Guarantee coherence among estimates over time and space
• Improve the quality of the data produced (for example, in accordance with the SBS Council Regulation)
• Methodology: calibration estimators (Deville and Särndal, 1992)
• Implemented by Falorsi P.D. and Falorsi S.

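For reference, the calibration problem of Deville and Särndal (1992) can be written as below. The notation is the usual one from the literature and is an assumption here, not taken from the slides: $d_k$ is the initial weight of respondent $k$ in the sample $s$, $w_k$ its final weight, $\mathbf{x}_k$ its vector of auxiliary variables, $\mathbf{X}$ the vector of known totals, and $G$ a distance function.

```latex
\min_{\{w_k\}} \; \sum_{k \in s} d_k \, G\!\left(\frac{w_k}{d_k}\right)
\quad \text{subject to} \quad
\sum_{k \in s} w_k \, \mathbf{x}_k = \mathbf{X}

% Linear (chi-square) distance, G(u) = (u - 1)^2 / 2, has a closed-form solution:
w_k = d_k \left( 1 + \mathbf{x}_k^{\top} \boldsymbol{\lambda} \right),
\qquad
\boldsymbol{\lambda} =
\left( \sum_{k \in s} d_k \, \mathbf{x}_k \mathbf{x}_k^{\top} \right)^{-1}
\left( \mathbf{X} - \sum_{k \in s} d_k \, \mathbf{x}_k \right)
```

Under this reading, the ratio $w_k / d_k$ plays the role of the correction factor applied to the initial weight. Other distance functions generally require an iterative solution.
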
Objectives and evolution of the software (2/2)
• Genesees prototype for social statistics
• Genesees prototype for enterprise statistics (1992 as first reference year)
• Other Istat researchers have since contributed to the development of the software
• New releases are delivered regularly
• Genesees is currently used for estimation in almost all Istat surveys

Software installation pre-requisites
• SAS for Windows
• SAS Language, Macro, IML, Stat, Graph
• HD ≥ 4 MB; RAM ≥ 64 MB
How to download Genesees:
• http://www.istat.it/Metodologi/index.htm
• then select: “Metodi e Software per le indagini statistiche” (Methods and Software for Statistical Surveys)
• download the file “Genesees3.zip” and unzip it into the directory c:\Genesees
• E-mail mts-f@istat.it to request the starting password; you will then be informed about new releases of the software

Data needed for Genesees
• Frame (example: Business Register) → provides the known totals of the auxiliary variables, used as the reference structure
• Survey respondent units → used to compute the correction factor of the initial sampling weight and then to assign the final sampling weight to each unit

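The two inputs above can be illustrated with a minimal Python sketch: known totals are taken from a frame, and respondents' initial weights are corrected so that the weighted sample reproduces those totals. This is not Genesees code (Genesees runs in SAS); it assumes the linear distance function, which has a closed-form solution, and all data and function names (`solve_linear_system`, `linear_calibration`, the tiny frame) are hypothetical.

```python
from typing import List, Sequence, Tuple


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: auxiliary variables are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def linear_calibration(
    d: Sequence[float], x: Sequence[Sequence[float]], totals: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Return (correction factors, final weights) for the linear distance."""
    p = len(totals)
    # t = sum_k d_k * x_k * x_k'
    t = [[sum(dk * xk[i] * xk[j] for dk, xk in zip(d, x)) for j in range(p)]
         for i in range(p)]
    # Direct estimates of the auxiliary totals, using the initial weights.
    direct = [sum(dk * xk[i] for dk, xk in zip(d, x)) for i in range(p)]
    lam = solve_linear_system(t, [totals[i] - direct[i] for i in range(p)])
    factors = [1.0 + sum(l * v for l, v in zip(lam, xk)) for xk in x]
    final = [dk * g for dk, g in zip(d, factors)]
    return factors, final


# Hypothetical frame: persons employed in each of 10 enterprises.
frame_employment = [1, 2, 2, 3, 3, 4, 5, 5, 7, 8]
# Known totals: number of enterprises and total persons employed.
known_totals = [float(len(frame_employment)), float(sum(frame_employment))]

# Hypothetical respondents: auxiliary vector (1, persons employed), weight 10/4.
respondent_aux = [(1.0, 2.0), (1.0, 3.0), (1.0, 5.0), (1.0, 8.0)]
initial_weights = [2.5, 2.5, 2.5, 2.5]

correction_factors, final_weights = linear_calibration(
    initial_weights, respondent_aux, known_totals
)
```

With the initial weights the sample estimates 45 persons employed against a known total of 40; after calibration the final weights reproduce both known totals exactly.
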
Input data sets (characteristics)
• Input SAS data sets: “Noti” and “Inp”
• “Noti” (variable names ≤ 8 characters):
  • Planned population = domain of interest (alphanumeric variable; length ≤ 15 characters)
  • Totals of auxiliary variables (numeric variables; at least 1 variable)
• “Inp” (variable names ≤ 8 characters):
  • Id. Code (numeric variable)
  • Planned population (as in “Noti”)
  • Auxiliary variables (numeric variables; must be entered in the same order as in “Noti”)
  • Coef = initial weight, adjusted for unit non-response (numeric variable)
  • Ck = “distance weight” (numeric variable); optional

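The layout rules above can be checked before the data reach SAS. The following is a minimal Python sketch under stated assumptions: the column names (`domain`, `tot_ent`, `ent`, `coef`, and so on) and the helper functions are hypothetical, and auxiliary variables are matched to their totals by position, as the "same order" rule suggests.

```python
MAX_NAME_LEN = 8     # variable names: at most 8 characters
MAX_DOMAIN_LEN = 15  # planned-population identifier: at most 15 characters


def check_layout(noti_columns, inp_columns, domain_var, id_var, weight_var,
                 ck_var=None):
    """Return a list of layout problems; an empty list means none were found."""
    problems = []
    # dict.fromkeys removes duplicate names while keeping their order.
    for name in dict.fromkeys(list(noti_columns) + list(inp_columns)):
        if len(name) > MAX_NAME_LEN:
            problems.append(
                f"variable name longer than {MAX_NAME_LEN} characters: {name}")
    noti_aux = [c for c in noti_columns if c != domain_var]
    reserved = {id_var, domain_var, weight_var, ck_var}
    inp_aux = [c for c in inp_columns if c not in reserved]
    if not noti_aux:
        problems.append("'Noti' needs at least one auxiliary total")
    if len(inp_aux) != len(noti_aux):
        problems.append(
            "'Inp' and 'Noti' must list the same number of auxiliary variables")
    return problems


def check_domain_values(values):
    """Return the domain identifiers that exceed the allowed length."""
    return [v for v in values if len(str(v)) > MAX_DOMAIN_LEN]


# Hypothetical layouts.
ok = check_layout(["domain", "tot_ent", "tot_emp"],
                  ["id", "domain", "ent", "emp", "coef"],
                  "domain", "id", "coef")
bad = check_layout(["domain", "tot_employment"],
                   ["id", "domain", "emp", "coef"],
                   "domain", "id", "coef")
too_long = check_domain_values(["NACE15_SIZE1_LAZIO", "D1"])
```

Here `ok` is empty, `bad` reports the 14-character name `tot_employment`, and `too_long` flags the 18-character domain identifier.
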
Input data sets (controls)
(“.” denotes a SAS missing value)
• “Noti”:
  • Planned population = . → procedure stops → data set “Noti-miss”
  • Totals of auxiliary variables = . → set to 0
• “Inp”:
  • Id. Code = . → procedure stops → data set “Missing”
  • Id. Code duplicated → data set “Codici-doppi”
  • Auxiliary variables = . → set to 0
  • Coef = . → set to 1 (no controls)
  • Ck = . → set to 1

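The control rules above can be mirrored in a short Python sketch. It is only an illustration of the rules as listed, not the SAS implementation: rows are plain dictionaries, `None` stands in for the SAS missing value, and the function and column names are hypothetical.

```python
from collections import Counter


def control_noti(rows, domain_var, total_vars):
    """Apply the 'Noti' controls: flag missing domains, set missing totals to 0."""
    noti_miss = [r for r in rows if r[domain_var] is None]
    cleaned = [
        {**r, **{v: 0 if r[v] is None else r[v] for v in total_vars}}
        for r in rows
    ]
    # A missing planned population stops the procedure.
    return {"stop": bool(noti_miss), "noti_miss": noti_miss, "rows": cleaned}


def control_inp(rows, id_var, aux_vars, weight_var="coef", ck_var="ck"):
    """Apply the 'Inp' controls listed on the slide."""
    missing = [r for r in rows if r[id_var] is None]
    counts = Counter(r[id_var] for r in rows if r[id_var] is not None)
    codici_doppi = [r for r in rows
                    if r[id_var] is not None and counts[r[id_var]] > 1]
    cleaned = []
    for r in rows:
        row = dict(r)
        for v in aux_vars:
            if row.get(v) is None:
                row[v] = 0           # missing auxiliary variable -> 0
        if row.get(weight_var) is None:
            row[weight_var] = 1      # missing initial weight -> 1
        if row.get(ck_var) is None:
            row[ck_var] = 1          # missing or absent Ck -> 1
        cleaned.append(row)
    # Only a missing Id. Code stops the procedure; duplicates are reported.
    return {"stop": bool(missing), "missing": missing,
            "codici_doppi": codici_doppi, "rows": cleaned}


# Hypothetical input rows.
inp_rows = [
    {"id": 1, "domain": "D1", "emp": 3, "coef": 2.5},
    {"id": 1, "domain": "D1", "emp": None, "coef": None},
    {"id": None, "domain": "D2", "emp": 4, "coef": 2.0},
]
inp_result = control_inp(inp_rows, "id", ["emp"])
noti_result = control_noti([{"domain": "D1", "tot_emp": None}],
                           "domain", ["tot_emp"])
```

In this example `inp_result["stop"]` is true because one Id. Code is missing, and the two rows sharing Id. Code 1 are collected in `codici_doppi`.
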
Output tables
Summary descriptive statistics on the calibration estimation process:
• Table 1: Statistics on estimates and final weights for the planned population
• Table 2: Statistics on the correction factors of the initial weights
• Table 3: Statistics on estimates and initial weights
• Table 4: Prefixed parameters for the iterative estimation procedure
• Table 5: Known totals, direct and final estimates, and differences
• Tabulate 1: Controls on the domains: known totals, direct estimates, ratios between known totals and direct estimates, sample totals
• Tabulate 2: Sample size (respondents) and population estimate with direct weights
• Tabulate 3: Controls on domains without sample units

Output file formats
• “genesees.log” (SAS log)
• “stampa1.txt” to “stampa6.txt” (tables)
• “stampe stime.htm” (tables)
• SAS data sets (“*.sas7bdat”)

Output data sets (1/2)
• Diagnostics (errors detected in the input step, if any):
  • “missing” (Id. Code = .)
  • “noti-miss” (Planned population = .)
  • “vuoti” (domain is present in “Noti” but not in “Inp”)
  • “codici-doppi” (duplicated Id. Code)
  • “csenzat” (domain is present in “Inp” but not in “Noti”)
  • “savestime” (shows the parameters entered)

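The two domain diagnostics, “vuoti” and “csenzat”, amount to set differences between the domains listed in “Noti” and those found in “Inp”. A minimal Python sketch, with a hypothetical function name and hypothetical domain codes:

```python
def domain_mismatches(noti_domains, inp_domains):
    """Return (vuoti, csenzat) as sorted lists of domain identifiers."""
    noti_set, inp_set = set(noti_domains), set(inp_domains)
    vuoti = sorted(noti_set - inp_set)    # in "Noti" but without sample units
    csenzat = sorted(inp_set - noti_set)  # in "Inp" but without known totals
    return vuoti, csenzat


# Hypothetical domains: D3 has no respondents, D9 has no known totals.
vuoti, csenzat = domain_mismatches(["D1", "D2", "D3"], ["D1", "D2", "D2", "D9"])
```
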
Output data sets (2/2)
• Statistics and final weights:
  • “Pesifin”: initial weight; correction factor; final weight; id.; conta; domain
  • “stat”:
    • conta
    • max; min; sum; mean; var; cv (for initial weights, correction factors and final weights)
    • iterations; maxiter; converge; constraints (c2); sample units in the domain (r2); distance function
  • “stimedir”: domain; auxiliary variable totals; conta
  • “stimefin”: known total; direct estimate; final estimate; conta; difference between final estimate and known total

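The summary statistics listed for “stat” and the difference reported in “stimefin” can be sketched in Python as follows. The exact definitions used by Genesees are not given on the slide, so this sketch assumes the sample variance (divisor n − 1) and a CV expressed as a percentage of the mean; the function names and the example weights are hypothetical.

```python
import math
import statistics


def weight_summary(values):
    """Summary statistics for a list of weights or correction factors."""
    mean = statistics.mean(values)
    var = statistics.variance(values)  # assumption: sample variance (n - 1)
    return {
        "conta": len(values),
        "max": max(values),
        "min": min(values),
        "sum": math.fsum(values),
        "mean": mean,
        "var": var,
        "cv": 100.0 * math.sqrt(var) / mean,  # assumption: CV in percent
    }


def estimate_difference(final_estimate, known_total):
    """Difference between the final estimate and the known total."""
    return final_estimate - known_total


# Hypothetical final weights for one domain.
summary = weight_summary([2.0, 4.0, 6.0])
```

After a successful calibration the difference returned by `estimate_difference` should be zero up to the convergence tolerance, which makes it a convenient check on each domain.
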
Structural Business Statistics (SBS) surveys using Genesees (1/2)
• Small and Medium Enterprises (SME) Survey
• Information and Communication Technologies (ICT) Survey
• Structure of Earnings Survey (SES)
• Labor Cost Survey (LCS)
• Prodcom
• SBS Preliminary Estimates
• …

Structural Business Statistics (SBS) surveys using Genesees (2/2)
Estimation of economic variables on enterprises according to:
• Istat traditional data production on enterprises
• Structural Business Statistics (SBS) EU Council Regulation No 58/97:
  • Preliminary estimates (1 estimation domain; t + 10 months)
  • Final estimates (3 estimation domains; t + 18 months)
  • Quality indicators and specific reports (3 estimation domains; t + 24 months):
    • Coefficient of Variation, CV (3 domains)
    • Item and unit non-response rate (1 domain)
    • Specific reports on survey strategy and principal economic activity
t = year of reference

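The coefficient of variation reported as a quality indicator is, in its standard definition, the estimated standard error of an estimate relative to the estimate itself. The notation below is an assumption, not taken from the slides: $\hat{Y}_d$ is the estimated total of a variable in domain $d$ and $\widehat{\mathrm{Var}}$ its estimated sampling variance.

```latex
\mathrm{CV}\!\left(\hat{Y}_d\right)
= \frac{\sqrt{\widehat{\mathrm{Var}}\!\left(\hat{Y}_d\right)}}{\hat{Y}_d}
```

The CV is often multiplied by 100 and reported as a percentage.
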
Population of interest (1/2)

Population of interest (2/2)

Business Register ASIA
• Data sources: Tax Register, Chambers of Commerce, Social Security, Work Accident Insurance, Electric Power Board, SEAT telephone directory
• Statistical and probabilistic procedure for detecting each enterprise's main economic activity
• Variables in the register result from the standardization, normalization and integration of information provided by administrative sources

Domains of study (SBS final estimates)

SME Sampling strategy (current)
• N ≈ 3,723,000 enterprises (Business Register); enterprises with fewer than 10 persons employed account for 94.8% of all enterprises and 47.8% of total employment
• Stratified simple random sample
• H ≈ 26,000 strata (NACE Rev.1.1, size class, region)
• n ≈ 120,000 (negative coordination with other SBS surveys, multivariable and multidomain sample allocation)
• Survey technique: postal questionnaire; 2 call-backs
• Calibration estimators methodology (Deville and Särndal, 1992)

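Under stratified simple random sampling, the design weight of a unit in stratum h is N_h / n_h. A minimal Python sketch of how an initial weight adjusted for unit non-response might be derived is given below. The adjustment used here (multiplying by n_h / r_h, where r_h is the number of respondents) is a common choice and an assumption, not a description of the SME survey's actual procedure; the function name and stratum figures are hypothetical.

```python
def initial_weights(strata):
    """Map each stratum to its non-response-adjusted initial weight.

    strata: mapping stratum -> (N_h, n_h, r_h), that is frame size,
    sample size and number of respondents.
    """
    weights = {}
    for h, (frame_size, sample_size, respondents) in strata.items():
        if respondents == 0:
            # No respondents: no weight can be assigned in this stratum.
            continue
        design_weight = frame_size / sample_size        # N_h / n_h
        nonresponse_adj = sample_size / respondents     # n_h / r_h
        weights[h] = design_weight * nonresponse_adj
    return weights


# Hypothetical strata: (N_h, n_h, r_h).
strata = {"A": (100, 10, 8), "B": (50, 5, 5), "C": (30, 3, 0)}
weights = initial_weights(strata)
```

In each stratum with respondents, the weight times the number of respondents returns the frame size N_h. With the figures on the slide, the overall sampling fraction is roughly 120,000 / 3,723,000 ≈ 3.2%.
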
Variables of interest
• Turnover
• Value added at factor cost
• Employment
• Total purchases of goods and services
• Personnel costs
• Wages and salaries
• Production value
• …
Totals of the variables of study are estimated for subpopulations of interest (domains), as required by the SBS EU Regulation

Case study

Starting picture

Thank you!