The use of administrative and accounts data for business statistics (ESSnet AdminData)
Cristina Casciano, Viviana De Giorgi, Filippo Oropallo
Istat, Division for Structural Business Statistics, Agriculture, Foreign Trade and Consumer Prices
First meeting ESSnet on Data Integration
Rome, 28 – 29 January 2010
Background
• ISTAT is about to start a major research project aimed at supporting the transition of Italian SBS statistics from a data collection system based extensively on direct reporting (involving around 120,000 companies) to a new survey system based largely on administrative data sources.
• A more intensive use of administrative data for compiling SBS statistics requires a careful evaluation of the most relevant administrative sources with respect to the different types of business population (large companies, small and medium-sized companies, micro-businesses), and a careful assessment of the impact of potential sources of bias in those data.
• Administrative data sources are currently used to produce preliminary SBS estimates, as requested by the SBS Regulation. They are also used as a complementary source of business data alongside direct reporting to produce definitive SBS data, and for the construction of the Italian Business Register (Asia), including business demography, and of the Oros database (STS on employment, wages and salaries, and social contributions).
Rome, January 29, 2010
ESSnet AdminData
WP5 - Plan of activities for the period 2010-2013
• Assessing the relevance of the data matching and definition inconsistency problems that arise when combining multiple administrative data sources (balance sheet data, VAT data, Fiscal Authority surveys) with survey data
• Designing and testing a methodological approach (and drafting related recommendations) to deal with inconsistencies in variables and data matching problems arising from the use of fiscal data in estimating SBS data for the smaller size classes (small and micro-businesses)
• Designing and testing a methodological approach (and drafting related recommendations) to statistically correct inconsistencies in variables and data matching problems arising from the use of balance sheet data in estimating SBS data for the larger size classes (medium and large businesses)
• Designing and testing a methodological approach (and drafting related recommendations) for using complementary fiscal and administrative data sources to estimate specific variables and segments of the SBS target population not covered by the above-mentioned data sources
• Preparing the final reports and comparing results with other EU countries
Italian Case – Proposed Actions (First Year)
• Integration issues:
  - Matching issues: detecting different subsets of the population (micro-small/medium-large; unincorporated/incorporated)
  - Metadata issues: comparison between SBS definitions and fiscal data
• Review of administrative sources useful for producing Structural Business Statistics (SBS EU Reg. 58/97, 410/98, 2700/98, 2056/02, 1670/03, 295/2008)
• First step in the reconstruction of the main economic variables for small firms by using fiscal sources
Integration issues (1)
• Advantages:
  - Reduced statistical burden
  - Reduced bias in estimates (due to TMR)
  - Reduced costs
  - More timely estimates
• Drawbacks:
  - Confidentiality problems related to access to administrative data
  - Administrative data are customarily collected for different purposes
  - No control over the data production process at the origin (to check missing values, outliers, etc.). Cooperation with the agencies that provide the data should be considered.
  - They may refer to legal units rather than statistical units
Integration issues (2)
Matching different data sources (statistical/administrative) means tackling a host of issues, e.g.:
• Identifying business units: finding an identifying variable that serves as a unique key and a natural join between different sources. In almost all firm databases we choose the fiscal code (available from Asia).
• Dealing with matching problems: whenever a key variable is unavailable or is not sufficient to identify the statistical unit, in case of mismatches, or when the sources do not contain the same units
• Identifying changes in business units:
  - Changes involving a single unit (changes in kind-of-business classification, legal form or location)
  - Changes in the number of units (deaths, births, break-ups and split-offs, mergers and acquisitions)
• Addressing sampling problems: when merging survey data with exhaustive data from a subset of the population
• Reconciling definitions and values among sources: whenever a variable does not have the same definition or value across different sources
• Handling data editing and data reconstruction issues: measurement errors, missing data, outliers, etc.
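The key-based matching step described on this slide can be sketched as follows. This is a minimal illustration, not the procedure used at Istat: the record layout, the function names `match_by_fiscal_code` and `duplicate_keys`, and the toy data are all hypothetical. It joins a register extract and an administrative extract on the fiscal code, lists the units found in only one source, and flags keys that are not unique.

```python
from collections import Counter


def match_by_fiscal_code(register, admin):
    """Join two sources keyed by fiscal code (hypothetical layout).

    register, admin: dicts mapping fiscal code -> dict of variables.
    Returns the merged records plus the codes found in only one source.
    """
    matched = {
        code: {**register[code], **admin[code]}
        for code in register.keys() & admin.keys()
    }
    only_register = sorted(register.keys() - admin.keys())
    only_admin = sorted(admin.keys() - register.keys())
    return matched, only_register, only_admin


def duplicate_keys(rows, key="fiscal_code"):
    """Return fiscal codes that occur more than once, i.e. keys that
    do not identify a single unit and need further treatment."""
    counts = Counter(row[key] for row in rows)
    return sorted(code for code, n in counts.items() if n > 1)


# Toy data, invented for illustration only.
register = {
    "A01": {"employees": 3},
    "B02": {"employees": 12},
    "C03": {"employees": 1},
}
admin = {
    "A01": {"turnover": 150_000.0},
    "B02": {"turnover": 900_000.0},
    "D04": {"turnover": 40_000.0},
}

matched, only_register, only_admin = match_by_fiscal_code(register, admin)
print(len(matched), only_register, only_admin)
```

Units listed in `only_register` or `only_admin` are the mismatches that would need a secondary matching rule or imputation; the sketch only identifies them.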
Review of sources
1) Fiscal Agency
• Fiscal Survey
  - Purpose: to enhance fiscal compliance
  - Not all firms
• Tax Return data
  - Unico (personal tax), 770 (withholding tax on employees and temporary workers)
  - More information for micro-firms with simplified bookkeeping, less information for other firms
• VAT data
  - Changes in legal unit and turnover data
2) Chambers of Commerce
• Balance sheet data
  - All corporate firms
  - Better coherence with SBS variables
3) Social Security Institute
• Data from the monthly declarations of enterprises on their employees
  - All firms with at least 1 employee in a month of the year
  - Number of employees, typology, wages and salaries, social contributions
First step in the reconstruction of the main economic variables for SME by using Administrative data (1)
• Purpose of administrative sources: to support the Tax Administration's control action on small and medium firms
• Population coverage:
  - Sole proprietorships, partnerships and corporate firms
  - Turnover greater than €30,000 and less than €7.5 million
  - Roughly 4 million records
• Variables:
  - More balance-sheet-comparable variables (Turnover, Value of Production, Intermediate costs, Value Added, Personnel costs, Gross and net operating surplus)
  - Different definitions of accounting variables (e.g. for freelancers)
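As a rough illustration of how the coverage rule and the listed variables relate, the sketch below filters records by the turnover thresholds stated on the slide and derives two of the variables. The record layout and function names are hypothetical, and the accounting identities used (value added as value of production minus intermediate costs, gross operating surplus as value added minus personnel costs) are the usual textbook definitions, assumed here rather than taken from the slides.

```python
from dataclasses import dataclass

# Turnover thresholds from the slide (euros).
MIN_TURNOVER = 30_000.0
MAX_TURNOVER = 7_500_000.0


@dataclass
class FiscalRecord:
    """Hypothetical layout of one fiscal-archive record."""
    fiscal_code: str
    turnover: float
    value_of_production: float
    intermediate_costs: float
    personnel_costs: float


def in_scope(rec):
    """True if turnover lies strictly between the two thresholds."""
    return MIN_TURNOVER < rec.turnover < MAX_TURNOVER


def derive_variables(rec):
    """Derive value added and gross operating surplus (assumed identities)."""
    value_added = rec.value_of_production - rec.intermediate_costs
    gross_operating_surplus = value_added - rec.personnel_costs
    return {
        "value_added": value_added,
        "gross_operating_surplus": gross_operating_surplus,
    }


# Toy records, invented for illustration only.
records = [
    FiscalRecord("A01", 200_000.0, 210_000.0, 120_000.0, 50_000.0),
    FiscalRecord("B02", 20_000.0, 21_000.0, 10_000.0, 0.0),
    FiscalRecord("C03", 9_000_000.0, 9_100_000.0, 6_000_000.0, 2_000_000.0),
]

derived = {r.fiscal_code: derive_variables(r) for r in records if in_scope(r)}
print(derived)
```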
First step in the reconstruction of the main economic variables for SME by using Administrative data (2)
Coverage analysis by legal type and size class
First step in the reconstruction of the main economic variables for SME by using Administrative data (3)
List of harmonized variables from various sources, defined according to the SBS regulation and international accounting standards
First step in the reconstruction of the main economic variables for SME by using a Fiscal archive (4)
Coverage of the initial sample of the SME survey by type of response and administrative data
First step in the reconstruction of the main economic variables for SME by using a Fiscal archive (5)
Integration scheme
First step in the reconstruction of the main economic variables for SME by using a Fiscal archive (6)
Calibration and bias estimation
• Final estimates on the subset of respondents (S1): [formula missing]
• Final estimates on the integrated sample (S2): [formula missing]
• The difference in the final estimation is equal to: [formula missing]
• If we add and subtract [term missing], where [symbol missing] is zero for all units of S2 not included in S1, we obtain: [formula missing]
In this way we can distinguish, in the final estimated difference, two possible biases due to:
• The source substitution effect for S1 = [formula missing]
• The difference originating from the calibration procedure for S2 = [formula missing]
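The formulas on this slide did not survive extraction. One decomposition consistent with the surrounding text is sketched below. The notation is an assumption, guided only by the symbols w, w*, Y and Y* that appear on later slides: $w_i$ are the final weights on S1, $w_i^*$ the recalibrated weights on S2, $y_i$ the survey value, and $y_i^*$ the value used in the integrated sample. It also assumes that S1 is a subset of S2. It should be read as a plausible reconstruction, not as the authors' original formulas.

```latex
\begin{align*}
\hat{Y}   &= \sum_{i \in S_1} w_i \, y_i
  && \text{(respondents only)} \\
\hat{Y}^* &= \sum_{i \in S_2} w_i^* \, y_i^*
  && \text{(integrated sample)} \\
\hat{Y}^* - \hat{Y}
  &= \sum_{i \in S_2} w_i^* \, y_i^* - \sum_{i \in S_1} w_i \, y_i \\
\intertext{Adding and subtracting $\sum_{i \in S_2} w_i \, y_i^*$,
with $w_i = 0$ for all $i \in S_2 \setminus S_1$
(so that this sum equals $\sum_{i \in S_1} w_i \, y_i^*$):}
\hat{Y}^* - \hat{Y}
  &= \underbrace{\sum_{i \in S_1} w_i \,(y_i^* - y_i)}_{\text{source substitution effect on } S_1}
   + \underbrace{\sum_{i \in S_2} (w_i^* - w_i)\, y_i^*}_{\text{calibration effect on } S_2}
\end{align*}
```

Under these assumptions the first term isolates the change in values with the weights held fixed, and the second isolates the change in weights with the values held fixed.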
First step in the reconstruction of the main economic variables for SME by using a Fiscal archive (7)
First step in the reconstruction of the main economic variables for SME by using a Fiscal archive (8)
First step in the reconstruction of the main economic variables for SME by using a Fiscal archive (7)
CDF of weights w and w*
First step in the reconstruction of the main economic variables for SME by using a Fiscal archive (7)
Distribution of old and new weights
First step in the reconstruction of the main economic variables for SME by using a Fiscal archive (8)
Bias and evaluation of errors on Y = Turnover, after a simulation of 1,000 resamplings
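The kind of resampling evaluation mentioned on this slide can be illustrated with a small Monte Carlo sketch. Everything here is an assumption made for illustration: the synthetic population, the simple-random-sampling design, the plain expansion estimator (the slides use calibrated weights), and the function names. It only shows how an absolute relative bias and a root mean square error are computed from replicated estimates of a known total.

```python
import math
import random


def absolute_relative_bias(estimates, true_total):
    """|mean of the estimates - true total| / |true total|."""
    mean_est = sum(estimates) / len(estimates)
    return abs(mean_est - true_total) / abs(true_total)


def rmse(estimates, true_total):
    """Root mean square error of the estimates around the true total."""
    mse = sum((e - true_total) ** 2 for e in estimates) / len(estimates)
    return math.sqrt(mse)


def simulate_expansion_estimates(population, n, replications=1000, seed=0):
    """Draw `replications` simple random samples without replacement and
    return the expansion estimate (N / n) * sample sum for each one."""
    rng = random.Random(seed)
    big_n = len(population)
    estimates = []
    for _ in range(replications):
        sample = rng.sample(population, n)
        estimates.append(big_n / n * sum(sample))
    return estimates


# Synthetic skewed "turnover" population, invented for illustration only.
rng = random.Random(42)
population = [rng.lognormvariate(11.0, 1.0) for _ in range(500)]
true_total = sum(population)

estimates = simulate_expansion_estimates(population, n=50, replications=1000, seed=1)
arb = absolute_relative_bias(estimates, true_total)
err = rmse(estimates, true_total)
print(f"ARB = {arb:.4f}, RMSE = {err:,.0f}")
```

To compare two estimators such as Y and Y*, the same two functions would be applied to each series of replicated estimates.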
Looking ahead
• Heavily reduced missing response rates and reduced impact of the calibration procedure
• Analysis of two types of estimation bias for other variables:
  - Source substitution effect
  - Total non-response effect
• Evaluation of the Absolute Relative Bias and the Root Mean Square Error of the two estimates Y and Y*
• Development of data imputation patterns to cover:
  - remaining variables not contained in administrative data (as PMR)
  - mismatches (as TMR)
• Restructuring SBS by integrating administrative sources into the statistical production process