Learn about the Norwegian CPI data validation and editing process, including techniques for survey systems and special surveys. Understand the steps involved in ensuring accurate and reliable data capture.
The Norwegian CPI – Data Validation and Editing, 8-9 May 2008, Tom Langer, Statistics Norway
Survey systems – data capture
• The CPI survey systems
  • The regular surveys (40 pct)
  • The special surveys (60 pct)
• Regular surveys
  • Some 2000 outlets every month – 39 000 observations
  • No price collectors involved in data capture
  • Qualitative information added by respondent
  • Internet system: 20 pct of respondents
  • Postal survey (questionnaire): 80 pct
Regular survey – validation
• Step 1: Initial cleaning
  • Likely decimal errors and key punch errors
  • Check against the questionnaire (electronic)
• Step 2: Automatic flagging of observations
  • HB method combined with a normalised test
  • Decision criterion: an observation is flagged for further inspection only if both methods flag it
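The two steps above can be sketched in Python. This is a minimal illustration, not Statistics Norway's implementation: the function names, the power-of-ten heuristic for decimal errors and the `tolerance` value are all assumptions. The slide only states that likely decimal errors are cleaned first and that an observation needs a flag from both methods.

```python
import math


def looks_like_decimal_error(price, previous_price, tolerance=0.05):
    """Heuristic (assumed): the price ratio is close to 10, 100, 0.1, ...

    A misplaced decimal point multiplies or divides the price by a power
    of ten, so log10 of the ratio lands near a non-zero integer.
    """
    if price <= 0 or previous_price <= 0:
        return False
    log_ratio = math.log10(price / previous_price)
    nearest_power = round(log_ratio)
    return nearest_power != 0 and abs(log_ratio - nearest_power) <= tolerance


def flag_for_inspection(hb_flag, normalised_flag):
    """Decision criterion from the slide: both tests must flag the observation."""
    return hb_flag and normalised_flag
```

Requiring agreement between the two tests keeps the manual inspection list short, since an observation that only one method considers extreme is accepted.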
HB method set up
• Basic test level: regional product group (8 regions)
  • Fairly homogeneous
  • Sufficient number of observations for robust estimation of median and quartiles
• In some cases the number of observations is too low
  • In that case the system expands the data set to cover all observations at the national product level
• Test variables:
  • T1 = p_t / p_{t-1}
  • T2 = p_t / p_{July}
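A short Python sketch of the set up above. It assumes each observation is a dict with hypothetical keys `region` and `product_group`, and uses a made-up threshold `min_obs` for "too low"; the slides do not give the actual cut-off.

```python
def price_relatives(p_t, p_prev, p_july):
    """Return the two test variables T1 = p_t / p_{t-1} and T2 = p_t / p_{July}."""
    return p_t / p_prev, p_t / p_july


def select_test_group(observations, region, product_group, min_obs=10):
    """Pick the observations used to estimate median and quartiles.

    Start at the regional product group; if it holds fewer than min_obs
    observations (assumed threshold), fall back to every observation of
    the product group at national level.
    """
    national = [o for o in observations if o["product_group"] == product_group]
    regional = [o for o in national if o["region"] == region]
    return regional if len(regional) >= min_obs else national
```

The fallback trades some homogeneity for a sample large enough to give stable quartiles.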
Validation set up
Transformation of the price relative distributions in 2 steps:
• 1: Make the distributions symmetric around the median relative price
• 2: Allow for the influence of price levels, U = 0.5
• Leads to the effect distributions (E) for T1 and T2
Accept intervals according to the HB method:
• Lower level = E_m - C · max(E_m - E_q1, A · E_m)
• Upper level = E_m + C · max(E_q3 - E_m, A · E_m)
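The transformation and accept interval can be sketched as follows. The interval formulas are taken from the slide; the form of the symmetric transformation, the use of one price level per observation, and the default values of `c` and `a` are assumptions made for illustration, since the slides do not state them.

```python
from statistics import median, quantiles


def hb_effects(relatives, price_levels, u=0.5):
    """Transform price relatives into effects E.

    Step 1 (assumed form): centre each relative on the median so that
    decreases and increases are treated symmetrically.
    Step 2: scale by price_level ** u to allow for the price level.
    """
    r_med = median(relatives)
    effects = []
    for r, level in zip(relatives, price_levels):
        s = 1 - r_med / r if r < r_med else r / r_med - 1
        effects.append(s * level ** u)
    return effects


def hb_accept_interval(effects, c, a):
    """Accept interval from the slide, built on the quartiles of E."""
    e_q1, e_m, e_q3 = quantiles(effects, n=4, method="inclusive")
    lower = e_m - c * max(e_m - e_q1, a * e_m)
    upper = e_m + c * max(e_q3 - e_m, a * e_m)
    return lower, upper


def hb_flags(relatives, price_levels, c=4.0, a=0.05, u=0.5):
    """True for every observation whose effect falls outside the interval."""
    effects = hb_effects(relatives, price_levels, u)
    lower, upper = hb_accept_interval(effects, c, a)
    return [not (lower <= e <= upper) for e in effects]
```

With U = 0.5 a given relative change counts for more on an expensive item than on a cheap one, and the A · E_m term keeps the interval from collapsing when the quartiles lie very close to the median.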
Editing
Data received are edited in several steps
• Initial cleaning of data
• A second round based on flagged extremes
• Treatment of non-response – automatic imputation
Macro controls
• Product level by region (8 regions)
• COICOP level
• A final impact control – top-down principle
Special surveys – data capture
• Cover 60 pct of the total CPI weight
• Respondent burden
  • Respondents have well-developed computer-based systems and are positive to sharing data
• Surveys based on scanner data: 30 pct of CPI weight
  • Food and beverages (300 000 obs)
  • Alcoholic beverages (14 000 obs)
  • New cars (1 750 obs)
• Other surveys – administrative data