The Edit
Anders Norberg, Statistics Sweden (SCB)
Work Session on Statistical Data Editing, Ljubljana, Slovenia, 9-11 May 2011
The environment of SELEKT
Input, throughput, output, use
Suspicion
SELEKT 1.1
• Raw + edited past (cold) survey data
• SAS data set
• Input (hot) survey data
• Survey specific cold adapter (SAS code): Data preparation
• Edits
• Survey specific hot adapter (SAS code): Data preparation
• PRE-SELEKT: Parameter specifications, Analysis of cold data
• Table of Parameters
• SNOWDON-X: analysis of edits
• SAS data set
• AUTOSELEKT: Score calculation & record flagging
• Table of Estimates
• CLAN estimation software
• Accepted records
• Records to FOLLOW-UP
• Records to IMPUTATION
• Process data and reports
Glossary of Terms on Statistical Data Editing (1)
“EDIT RULE SPECIFICATION / CHECK RULE SPECIFICATION: A set of check rules that should be applied in the given editing task.”
Glossary of Terms on Statistical Data Editing (2)
“CHECKING RULE: A logical condition or a restriction to the value of a data item or a data group which must be met if the data is to be considered correct. In various connections other terms are used, e.g. edit rule.”
Recommended Practices for Editing and Imputation in Cross-sectional Business Surveys
“EDIT: A logical condition or a restriction to the value of a data item or a data group which must be met if the data is to be considered correct. Also known as edit rule or checking rule.”
Example 1
if Occupation = 'Doctor' and not (29000 < Salary < 71000) then Errcode_A01 = 'Flag'
• The test variable: Salary
• The edit group: Occupation = 'Doctor'
• The acceptance region: 29000 < Salary < 71000
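Example 1 can be read as three components: a test variable, an edit group and an acceptance region. The following is a minimal Python sketch of that reading, not SELEKT code; the class `Edit`, its field names and the method `flags` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Mapping

Record = Mapping[str, Any]


@dataclass(frozen=True)
class Edit:
    """Hypothetical model of one edit with one edit group and one acceptance region."""
    edit_id: str
    test_variable: str                    # variable whose value is tested
    edit_group: Callable[[Record], bool]  # which records the edit applies to
    lower: float                          # acceptance region is the open
    upper: float                          # interval (lower, upper)

    def flags(self, record: Record) -> bool:
        """Return True when the record belongs to the edit group and its
        test variable falls outside the acceptance region."""
        if not self.edit_group(record):
            return False
        value = record[self.test_variable]
        return not (self.lower < value < self.upper)


# Example 1 expressed with the hypothetical model above.
a01 = Edit(
    edit_id="A01",
    test_variable="Salary",
    edit_group=lambda r: r["Occupation"] == "Doctor",
    lower=29000,
    upper=71000,
)

print(a01.flags({"Occupation": "Doctor", "Salary": 80000}))  # True
print(a01.flags({"Occupation": "Doctor", "Salary": 50000}))  # False
```

Records outside the edit group are never flagged by this edit, whatever their salary. Missing values are not handled in this sketch.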
Example 2
if (Occupation = 'Doctor' and not (29000 < Salary < 71000)) or (Occupation = 'Nurse' and not (23300 < Salary < 43800)) then Errcode_A02 = 'Flag'
• The test variable: Salary
• The edit groups: Occupation = 'Doctor'; Occupation = 'Nurse'
• The acceptance regions: 29000 < Salary < 71000 (Doctor); 23300 < Salary < 43800 (Nurse)
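Example 2 pairs each edit group with its own acceptance region under a single error flag. One way to picture this is as rows of a small table, one row per edit group. The sketch below assumes that layout; `GroupRegion`, `A02_ROWS` and `flag_a02` are hypothetical names, and the edit groups are taken to be mutually exclusive values of Occupation.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Sequence


@dataclass(frozen=True)
class GroupRegion:
    """Hypothetical row: one edit group with its acceptance region."""
    occupation: str  # edit group, identified by the value of Occupation
    lower: float     # acceptance region is the open interval (lower, upper)
    upper: float


# Example 2 as two rows belonging to the same edit (A02).
A02_ROWS = (
    GroupRegion("Doctor", 29000, 71000),
    GroupRegion("Nurse", 23300, 43800),
)


def flag_a02(record: Mapping[str, Any],
             rows: Sequence[GroupRegion] = A02_ROWS) -> bool:
    """Return True when the record is in one of the edit groups and its
    Salary is outside that group's acceptance region."""
    for row in rows:
        if record["Occupation"] == row.occupation:
            return not (row.lower < record["Salary"] < row.upper)
    return False  # record belongs to no edit group of this edit


print(flag_a02({"Occupation": "Nurse", "Salary": 50000}))   # True
print(flag_a02({"Occupation": "Doctor", "Salary": 50000}))  # False
```

The same salary is flagged for a nurse and accepted for a doctor, because the acceptance region depends on the edit group.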
Edits
• EDIT: Edit identification, Type of edit, Active, Section, Internal error message, External error message, Instruction for data review, Un-edited test variable, Error flag
• EDIT GROUP AND ACCEPTANCE REGION: Edit identification, Edit group, Acceptance region
• EDIT PRACTICAL SUPPORT: Edit identification, Standard edit rule, Edited test variable, Suspicion probability value produced by the SELEKT system
• LINK: Edit identification, Survey variable
• IMPACT ON STATISTICS: Survey variable, Potential impact on statistics
• FLAGGING EDITS, VARIABLES AND UNITS
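If edits are stored as metadata along the lines of the EDIT and EDIT GROUP AND ACCEPTANCE REGION tables, the edit statements could in principle be generated from those tables. The sketch below illustrates the idea only and is not how SELEKT does it: the dictionary keys, the sample rows and the function `generate_statement` are all assumptions, and the output imitates the statement style of the examples in this presentation.

```python
# Hypothetical rows of the EDIT table (only the columns needed here).
edit_table = [
    {"edit_id": "A02", "test_variable": "Salary",
     "error_flag": "Errcode_A02", "active": True},
]

# Hypothetical rows of the EDIT GROUP AND ACCEPTANCE REGION table.
region_table = [
    {"edit_id": "A02", "edit_group": "Occupation = 'Doctor'",
     "acceptance_region": "29000 < Salary < 71000"},
    {"edit_id": "A02", "edit_group": "Occupation = 'Nurse'",
     "acceptance_region": "23300 < Salary < 43800"},
]


def generate_statement(edit, regions):
    """Build one edit statement from an EDIT row and its matching
    edit group / acceptance region rows."""
    clauses = []
    for row in regions:
        if row["edit_id"] == edit["edit_id"]:
            clauses.append(
                "(" + row["edit_group"]
                + " and not (" + row["acceptance_region"] + "))"
            )
    condition = " or ".join(clauses)
    return "if " + condition + " then " + edit["error_flag"] + " = 'Flag'"


# Generate statements for active edits only.
for edit in edit_table:
    if edit["active"]:
        print(generate_statement(edit, region_table))
```

Keeping one row per edit group means an edit with many groups needs no special handling in the generator: the rows are simply joined with `or`.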
My questions (1)
• Can most edits be described as consisting of the components
  • test variable
  • edit group
  • acceptance region?
• What types of edits cannot?
My questions (2)
If the edits can be described this way, what arguments are there for saying that
• one edit has only one edit group and one acceptance region
• one edit can be composed of many edit groups with one acceptance region each?
My questions (3)
Can you give me examples of
• similar modeling of edits
• metadata storage for edits
• edit script generator using a standard metadata storage for edits?