The Future of Administrative Data ICES III End Panel Discussion Don Royce Statistics Canada June 2007
Outline • Basic issues in the use of tax data • A short history of the use of tax data in business surveys at Statistics Canada • Future challenges
Uses of tax data (Brackstone 1987) • Direct tabulation • Indirect estimation • Survey frames and sample design • Survey evaluation
Issues in using tax data (ibid) • Coverage • Content • Concepts/definitions • Small domain estimates/sampling • Quality control • Cost • Frequency • Timeliness • Stability • Respondent burden
A short history of tax data usage at Statistics Canada
• Pre-1985:
  • Tax data used to build sample frames (no centralized Business Register)
  • Trade statistics based on administrative data
  • Tax data program based on capturing two-phase samples of paper forms
• 1985 - 1988:
  • The Business Survey Redesign Project (BSRP)
  • Creation of centralized Business Register (BR) based on profiling, payroll deduction data and annual income tax data
  • Redesign of annual and sub-annual surveys
  • Enhanced use of tax data by survey programs – but still based on paper forms
A short history (cont’d)
• 1991 - 1993: Two new data sources
  • Federal government introduced the Goods and Services Tax (GST), a form of VAT
  • Two new variables became available from payroll data, which facilitated the redesign of the monthly Survey of Employment, Payrolls and Hours
• 1997 – the Big Bang:
  • Project to Improve Provincial Economic Statistics (PIPES) to allocate new Harmonized Sales Tax
  • Infusion of $42M in new funds
  • Provincial dimension implied additional response burden, so enhanced use of tax data became crucial
A short history (cont’d)
• 1997 – the Big Bang (cont’d):
  • Introduction of a single Business Number (BN) by Canada Revenue Agency
  • Introduction of mandatory filing in a standard format (GIFI) for corporations
  • Introduction of North American Industry Classification System (NAICS)
  • Conversion of the BR to a BN basis, coded to NAICS
  • Creation of a new Tax Data Division (TDD)
  • Completion of many of the goals of the BSRP
Today: Infrastructure • Most business surveys use the Business Register as their frame • Unified Enterprise Survey collects annual statistics • Common “Chart of Accounts” connecting annual income tax data, survey data and SNA concepts • A wide range of data (GST, payroll deduction, annual tax data) is available in (mostly) electronic form and linkable via the BN • Tax data are obtained and processed by TDD on an increasingly timely basis
Today: Uses • GST data replace significant portions of existing sub-annual samples for simple units (with ratio adjustment for delayed reporting) • Annual tax data replace significant portions of annual samples of simple units (straight substitution) • Payroll deduction data in combination with survey data produce estimates of average weekly earnings, etc. • Tax data are used exclusively for very small “Take-None” units • Tax data are used for editing and imputation of survey data • Extensive use of tax data by the System of National Accounts (e.g., benchmarking of labour income)
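The slides do not say how the "ratio adjustment for delayed reporting" is computed. As a minimal sketch of one common form, assume that units which have already filed for the current period are representative of the late filers, and scale the previous period's total for the whole domain by the growth observed among those early filers. The function name, the dictionary layout and the figures are hypothetical illustrations, not Statistics Canada's production method.

```python
def ratio_adjusted_total(previous, current):
    """Estimate a current-period domain total when some units have not yet filed.

    previous: dict of unit id -> previous-period value, for every unit in the domain
    current:  dict of unit id -> current-period value, only for units that have filed
    Assumption (not from the slides): late filers grow at the same rate as early filers.
    """
    # Use only units observed in both periods to measure period-to-period growth.
    reporters = [unit for unit in current if unit in previous]
    previous_reporters = sum(previous[unit] for unit in reporters)
    if previous_reporters == 0:
        raise ValueError("no usable previous-period values among current reporters")
    growth = sum(current[unit] for unit in reporters) / previous_reporters
    # Apply the observed growth to the previous-period total of the whole domain.
    return growth * sum(previous.values())


# Hypothetical figures: unit C has not filed yet for the current period.
previous_gst = {"A": 100.0, "B": 200.0, "C": 300.0}
current_gst = {"A": 110.0, "B": 220.0}
estimate = ratio_adjusted_total(previous_gst, current_gst)
```

With these figures the early filers grew by 10 percent, so the estimate is roughly 660 against a previous-period total of 600.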
Future challenges • Moving from retrofitting existing samples to an integrated survey/tax data design • Increasing the use of tax data for complex businesses • Use of tax data for specialized populations • Measuring data quality • Modeling/imputation – how much is too much?
Moving to an integrated survey / tax data design • Current approach (mostly) is to replace survey data with tax data for existing samples of simple units • Can we use tax data for the entire universe, in combination with a small sample survey to adjust for differences? • Much more data to process, change of culture for subject matter analysts
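One way to picture "tax data for the entire universe, in combination with a small sample survey to adjust for differences" is a classical ratio estimator: the tax total is known for every unit, and the sample supplies the ratio between the survey concept and the tax concept. The sketch below assumes that form; the slides do not specify an estimator, and the function name, weights and values are hypothetical.

```python
def survey_adjusted_total(tax_universe_total, sample):
    """Ratio estimate of a survey-concept total from universe-wide tax data.

    tax_universe_total: sum of the tax variable over every unit in the universe
    sample: iterable of (design_weight, survey_value, tax_value) for sampled units
    Assumption (not from the slides): survey and tax values are roughly proportional.
    """
    records = list(sample)  # materialize so the records can be traversed twice
    weighted_survey = sum(w * y for w, y, _ in records)
    weighted_tax = sum(w * x for w, _, x in records)
    if weighted_tax == 0:
        raise ValueError("weighted tax total in the sample is zero")
    # The sample estimates how much the survey concept differs from the tax concept.
    return tax_universe_total * weighted_survey / weighted_tax


# Hypothetical sample of two units, each with a design weight of 10.
sample_units = [(10.0, 52.0, 50.0), (10.0, 105.0, 100.0)]
adjusted = survey_adjusted_total(1000.0, sample_units)
```

Here the sampled units report survey values about 4.7 percent above their tax values, so the universe tax total of 1000 is scaled up by that ratio.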
Tax data for complex businesses • Complex structures must often be broken down into their components, e.g., industry, province • Tax data are often only available for legal entities, not operational units • How well can we allocate tax data to industry and province levels?
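The allocation question can be made concrete with a simple pro-rata rule: split a legal entity's tax value across its industry and province cells in proportion to some auxiliary measure available at the operational level. The choice of payroll as the auxiliary measure, the function name and the figures below are assumptions for illustration only; how well such a rule works is exactly the open question the slide raises.

```python
def allocate_pro_rata(entity_total, auxiliary):
    """Split a legal-entity total across cells in proportion to an auxiliary measure.

    entity_total: tax value reported for the whole legal entity
    auxiliary: dict of (province, industry) -> auxiliary measure, e.g. payroll
    Assumption (not from the slides): the tax variable is distributed like the auxiliary measure.
    """
    denominator = sum(auxiliary.values())
    if denominator <= 0:
        raise ValueError("auxiliary measure must have a positive total")
    return {cell: entity_total * share / denominator for cell, share in auxiliary.items()}


# Hypothetical enterprise with payroll in two province-by-industry cells.
payroll_by_cell = {("ON", "manufacturing"): 60.0, ("QC", "retail"): 40.0}
allocated = allocate_pro_rata(1000.0, payroll_by_cell)
```

In this example 600 of the entity's 1000 is assigned to the Ontario manufacturing cell and 400 to the Quebec retail cell, and the allocated values always sum back to the entity total.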
Specialized populations • Canada Revenue Agency creates numerous files which we have not yet fully explored, for example: • Files covering the non-profit sector • Files covering trusts • Audit files, which may be useful for measuring the underground economy • Work is just beginning on this
Measuring Quality • Need to recognize the particular characteristics of administrative data, for example: • Methods for calculating response rates for surveys combining survey and tax data • Methods for measuring the variance due to imputation • Methods that take into account the use of modeling • Difficult or impossible to follow up respondents to validate administrative data
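As one possible starting point for a response-rate measure that covers both survey and tax data, the weighted estimate can be broken down by the source of each unit's value. This is only a sketch under assumed conventions: the three source labels, the function name and the figures are hypothetical, and the slides do not prescribe any particular measure.

```python
from collections import defaultdict


def contribution_by_source(units):
    """Share of a weighted estimate contributed by each data source.

    units: iterable of (design_weight, value, source), where source is a label
           such as "survey", "tax" or "imputed" (labels are an assumption).
    """
    totals = defaultdict(float)
    for weight, value, source in units:
        totals[source] += weight * value
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("weighted estimate is zero; shares are undefined")
    return {source: total / grand_total for source, total in totals.items()}


# Hypothetical estimate: one large surveyed unit, small units from tax data or imputation.
records = [(1.0, 500.0, "survey"), (10.0, 30.0, "tax"), (10.0, 20.0, "imputed")]
shares = contribution_by_source(records)
```

In this example half of the estimate comes from reported survey data, 30 percent from tax data and 20 percent from imputation, which is more informative than a single unweighted response rate.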
Modeling and imputation - how much is too much? • Modeling is used in calendarization, estimation, imputation and allocation of tax data to adjust for various deficiencies (e.g., timeliness, coverage issues) • We often conduct studies, conclude that the results are not much different, then assume that modeling is okay – a different paradigm from survey sampling • We need to make sure that assumptions are explicit, that the models are robust, and that they are checked on a regular basis
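Calendarization is a good example of an assumption that should be explicit. A minimal sketch, assuming that a business's revenue flows evenly over each fiscal period, prorates every fiscal-period value by the number of its days that fall in the target calendar year. The uniform-flow assumption, the function name and the figures are illustrative and not taken from the slides.

```python
from datetime import date


def calendarize(periods, year):
    """Convert fiscal-period values to a calendar-year value by day-based proration.

    periods: iterable of (start_date, end_date, value), with both dates inclusive
    year: target calendar year
    Assumption (not from the slides): the value accrues uniformly over each period.
    """
    year_start, year_end = date(year, 1, 1), date(year, 12, 31)
    total = 0.0
    for start, end, value in periods:
        if end < start:
            raise ValueError("period ends before it starts")
        period_days = (end - start).days + 1
        overlap_start = max(start, year_start)
        overlap_end = min(end, year_end)
        if overlap_start <= overlap_end:
            overlap_days = (overlap_end - overlap_start).days + 1
            # Credit the calendar year with its share of the fiscal-period value.
            total += value * overlap_days / period_days
    return total


# Hypothetical business with fiscal years ending 31 March.
fiscal_periods = [
    (date(2005, 4, 1), date(2006, 3, 31), 365.0),
    (date(2006, 4, 1), date(2007, 3, 31), 730.0),
]
revenue_2006 = calendarize(fiscal_periods, 2006)
```

Here 90 of the first period's 365 days and 275 of the second period's 365 days fall in 2006, so the calendar-year figure is 90 + 550 = 640. A seasonal business would violate the uniform-flow assumption, which is the kind of model check the slide calls for.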
Conclusion • Tremendous progress in using tax data in business surveys in the past 20 years • The need to use tax data will increase as pressures to reduce costs and response burden continue • Many exciting methodological challenges ahead