Improving Quality in the Office for National Statistics’ Annual Earnings Statistics. Pete Brodie & Kevin Moore UK Office for National Statistics. Outline. What is being measured? Background Improvements introduced for the Annual Survey of Hours and Earnings (ASHE) weighting
Improving Quality in the Office for National Statistics’ Annual Earnings Statistics Pete Brodie & Kevin Moore UK Office for National Statistics
Outline • What is being measured? • Background • Improvements introduced for the Annual Survey of Hours and Earnings (ASHE): weighting, output statistic, coverage, variance estimation • Future: improved sample design • Conclusions
What is being measured? • The Annual Survey of Hours and Earnings (ASHE) is the main vehicle for measuring wage levels and working hours in the UK • For government taxation and wage policy • Measuring the effect of the National Minimum Wage • Measuring gender pay differences • Measuring compliance with maximum hours directives • For compensation cases to estimate loss of earnings or future care costs • For regional and local planning purposes • Also used for pay bargaining
Background • Formerly the “New Earnings Survey” (NES) • Survey largely unchanged since the early 1970s • Receive a 1 in 100 sample from the tax office • All employees who have one or more jobs as part of a Pay as You Earn (PAYE) scheme • No weighting carried out • Only crude sample variance estimates produced
Improvements introduced for ASHE • The Annual Survey of Hours and Earnings (ASHE) recently (2004) replaced the NES • Changes introduced • weighting • outputs • coverage • variance estimation
Weighting (1/2) • The Labour Force Survey (LFS), a household survey, also measures the labour market in the UK • The LFS is calibrated to the mid-year population estimates • Since wages, hours and response rates are highly correlated with demographic factors, we calibrate to LFS outputs
Weighting (2/2) • Analysis determined which factors were most associated with key ASHE variables. • The final factors included in the model, with their number of categories (most significant first): • Main occupational category: 9 • Sex: 2 • Age (less than 25, 25-49 and 50+): 3 • Region (London & South East, remainder): 2 • Giving a total of 9 × 2 × 3 × 2 = 108 cross groups
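The cross groups can be illustrated with a minimal sketch. It assumes the simplest form of calibration, post-stratification, in which every respondent in a cell gets the weight "LFS total for the cell divided by the number of ASHE respondents in the cell". The slides do not say which calibration method is used, and the category labels, the function name and the totals below are hypothetical; only the category counts (9, 2, 3, 2) come from the slide.

```python
from collections import Counter
from itertools import product

# Hypothetical labels; only the number of categories per factor is from the slides.
OCCUPATIONS = [f"occ{i}" for i in range(1, 10)]   # 9 main occupational categories
SEXES = ["F", "M"]                                # 2
AGE_BANDS = ["<25", "25-49", "50+"]               # 3
REGIONS = ["London & South East", "Remainder"]    # 2

# Every combination of the four factors is one cross group (cell).
CELLS = list(product(OCCUPATIONS, SEXES, AGE_BANDS, REGIONS))


def calibration_weights(respondent_cells, lfs_totals):
    """Post-stratification: weight per cell = LFS total / respondent count.

    respondent_cells: one cell tuple per responding employee.
    lfs_totals: mapping from cell tuple to the LFS employee estimate.
    """
    counts = Counter(respondent_cells)
    return {cell: lfs_totals[cell] / n for cell, n in counts.items()}


# Made-up example: 4 respondents in one cell, 2 in another.
respondents = [CELLS[0]] * 4 + [CELLS[1]] * 2
totals = {CELLS[0]: 400.0, CELLS[1]: 300.0}
weights = calibration_weights(respondents, totals)
print(len(CELLS), weights[CELLS[0]], weights[CELLS[1]])
```

Cells with no respondents get no weight in this sketch; a production system would need a rule for collapsing such cells.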
Outputs (1/2) • The NES output focussed on means • We are actually interested in distributions • The ASHE output focusses on medians and includes ten other percentile outputs • Means are also published • Year on year change is also published for every variable • Every output has a sampling variance estimate published
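A median or decile from weighted data can be read off the cumulative weights. The sketch below uses one simple convention (the smallest value whose cumulative weight share reaches the target); it is not necessarily the rule used for the published ASHE figures, and the function name and data are illustrative.

```python
def weighted_percentile(values, weights, p):
    """Smallest value whose cumulative weight share reaches p (0 < p <= 1)."""
    pairs = sorted(zip(values, weights))
    total = sum(weight for _, weight in pairs)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= p * total:
            return value
    return pairs[-1][0]  # guard against floating-point shortfall


# Made-up weekly pay values and weights.
pay = [100, 200, 300, 400]
weights = [1, 1, 1, 3]
median = weighted_percentile(pay, weights, 0.5)
lower_decile = weighted_percentile(pay, weights, 0.1)
print(median, lower_decile)
```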
Outputs (2/2) • Number of low paid: 148,605 (c.v. = 8.74%) • Average wage: £387.19 (c.v. = 0.37%) • Lower decile: £80.80 (c.v. = 1.50%) • Median: £317.64 (c.v. = 0.36%) • Upper decile: £713.96 (c.v. = 0.50%) • Average pay of females in Wales: £257.18 (c.v. = 1.43%)
Coverage (1/3) • Initial sample drawn in January • Questionnaires sent out in April • Use responses to this first questionnaire and updated admin data for a second phase • those who have changed employer (the movers) • those who have recently joined a PAYE scheme (joiners) • To compensate for the difference in coverage between the LFS and ASHE, we also took a sample of companies outside the PAYE scheme
Coverage (2/3) • [Diagram of coverage; labels: PAYE employees, Stayers, LFS, Movers, Joiners, Employees of VAT-only companies]
Coverage (3/3) • First phase ≈ 250,000 employees • returned details ≈ 160,000 employees • number of employees no longer with the same employer but still working ≈ 26,000
Variance estimation (1/3) • We use GES software to calculate simple outputs with variance estimates • We treat our calibration totals as fixed (this underestimates the variance slightly) • For the percentile outputs we use indicator variables to estimate approximate variances
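One common way to use indicator variables for a percentile is a Woodruff-style calculation: estimate the percentile, form an indicator for "at or below that value", estimate the variance of the proportion of indicators, and map the resulting interval back through the empirical distribution. The sketch below assumes equal weights and simple random sampling so that it stays self-contained; the slides do not give the exact method, and the function names are hypothetical.

```python
import math


def empirical_quantile(sorted_values, p):
    """Smallest value with at least a share p of observations at or below it."""
    n = len(sorted_values)
    index = min(n - 1, max(0, math.ceil(p * n) - 1))
    return sorted_values[index]


def percentile_standard_error(values, p, z=1.96):
    """Return (percentile estimate, approximate standard error)."""
    xs = sorted(values)
    n = len(xs)
    q = empirical_quantile(xs, p)
    # Indicator variable: 1 if the observation is at or below the estimate.
    indicators = [1 if x <= q else 0 for x in xs]
    p_hat = sum(indicators) / n
    # Assumption: simple random sampling, so var(p_hat) = p(1 - p) / n.
    se_p = math.sqrt(p_hat * (1 - p_hat) / n)
    # Map the interval for the proportion back to the pay scale.
    lower = empirical_quantile(xs, max(p_hat - z * se_p, 0.0))
    upper = empirical_quantile(xs, min(p_hat + z * se_p, 1.0))
    return q, (upper - lower) / (2 * z)


estimate, se = percentile_standard_error(list(range(1, 101)), 0.5)
print(estimate, round(se, 3))
```

With calibrated weights, the proportion and its variance would instead be computed with the survey weights and the design-based variance estimator.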
Variance estimation (3/3) • For year on year changes we use a repeated sampling method • Have to be careful when sampling, because employees fall into three groups: • year one only • both years • year two only
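The slides do not name the repeated sampling method. As an illustration only, the sketch below uses a plain bootstrap that resamples employees rather than individual returns, so that someone present in both years keeps both values together and the correlation between years is preserved. The record layout and function names are assumptions.

```python
import random
from statistics import mean, stdev


def change_in_mean(records):
    """records: one (pay_year1, pay_year2) pair per employee; None if absent."""
    year1 = [a for a, _ in records if a is not None]
    year2 = [b for _, b in records if b is not None]
    return mean(year2) - mean(year1)


def bootstrap_se_of_change(records, replicates=500, seed=0):
    """Bootstrap standard error of the year on year change in mean pay."""
    rng = random.Random(seed)
    n = len(records)
    estimates = []
    while len(estimates) < replicates:
        # Resample whole employees, keeping their two years paired.
        resample = [records[rng.randrange(n)] for _ in range(n)]
        has_year1 = any(a is not None for a, _ in resample)
        has_year2 = any(b is not None for _, b in resample)
        if has_year1 and has_year2:  # skip replicates missing a whole year
            estimates.append(change_in_mean(resample))
    return stdev(estimates)


# Made-up data: two employees in both years, one leaver, one joiner.
records = [(100, 110), (200, 210), (300, None), (None, 250)]
print(change_in_mean(records), bootstrap_se_of_change(records, replicates=200))
```

Resampling the two years independently would ignore the overlap and would generally overstate the variance of the change when pay is positively correlated across years.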
Future (1/7) • Sample design is unchanged and so still quite inefficient • No auxiliary information used • Simple random sample everywhere • Looked at options for using extra information • Information about the rest of the frame • Additional auxiliary information • Options for sub-sampling
Future (2/7) • Currently have a simple Bernoulli 1% sample • Details of their current employer only • We have additional information from our own business register (the IDBR), which holds details of the size and industry of the employing business • Sample variance of returned values is correlated with the industry • Too much sample in some industries and too little in others!
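The point about too much sample in some industries can be made concrete by comparing the proportional allocation that a flat 1% sample gives on average with a Neyman allocation, which puts sample in proportion to stratum size times stratum standard deviation. Neyman allocation is a textbook technique used here for illustration; the slides do not say it was the option examined, and the industry names and figures are invented.

```python
def proportional_allocation(strata, total_sample):
    """Expected allocation under a flat sampling rate: n_h proportional to N_h."""
    population = sum(size for size, _ in strata.values())
    return {name: total_sample * size / population
            for name, (size, _) in strata.items()}


def neyman_allocation(strata, total_sample):
    """Neyman allocation: n_h proportional to N_h * S_h (unrounded)."""
    products = {name: size * sd for name, (size, sd) in strata.items()}
    total = sum(products.values())
    return {name: total_sample * value / total
            for name, value in products.items()}


# Hypothetical strata: (number of employees, standard deviation of pay).
strata = {"industry A": (1000, 10.0), "industry B": (1000, 30.0)}
print(proportional_allocation(strata, 400))
print(neyman_allocation(strata, 400))
```

In this made-up case the flat rate splits the sample evenly, while the Neyman allocation moves sample towards the industry with the more variable pay.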
Future (3/7) • Easy to reduce sample sizes • Looked at the effect on the overall variance when we removed sample from the “good” industries • Stratified the returned sample by industry and removed sample from the “worst” industry, then the second worst, and so on, until the full reduction was achieved • Could impose restrictions too • Compared with removal at random
Future (5/7) • Considered increasing sample in some industries • Postulated that we start with a 2% sample of admin data
Future (7/7) • There is the possibility of getting auxiliary information • One of the opportunities arising out of Independence for National Statistics is more sharing of administrative data within government • There may be a suitable variable
Conclusions • Substantial improvements have been made to UK earnings statistics • Costs could be cut substantially with little loss in quality • There is some scope for improving the quality of high level outputs while reducing sample sizes • There is a possibility of making vast improvements with access to more detailed administrative data (Independence might bring this)
Questionnaire Issues • Talk by Jacqui Jones of the ONS • Improved Questionnaire Design yields better data: Experiences from the UK ASHE • Tomorrow morning (Wednesday) Session 36: A Global Path to Standards in Questionnaire Design
Any questions? Contact details: Pete.Brodie@ons.gsi.gov.uk