Computer Edit Specifications for Pilot Census 2001 Data Processing Project
Christopher S. Corlett, Data Processing Adviser, U.S. Census Bureau
Editing examples: • Young heads of household • Population group • Access to telephones • Same-sex marriages • Fertility Source for all data: Pilot Census 2001
Young heads of household
• V.3 (relationship for head) and V.5 (age of head)
• Related issue: each household (HH) must have one and only one head.
• For heads with invalid ages, try to obtain the age via:
  • spouse (impute from deck based on spouse's age and head's sex)
  • otherwise, children (child's age and head's sex)
  • otherwise, impute from deck (household size and head's sex)
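The fallback order above can be sketched as follows. This is only an illustration under stated assumptions, not the project's edit program: the `HotDeck` class, the deck names, the 5-year age grouping, the household-size cap, the seed values, and the choice of the oldest child as the donor key are all hypothetical.

```python
class HotDeck:
    """Holds the most recent valid donor value for each key (hypothetical helper)."""

    def __init__(self, default):
        self.default = default      # seed value used before any donor is seen
        self.cells = {}

    def update(self, key, value):
        """Record a value taken from a record that passed the edit."""
        self.cells[key] = value

    def draw(self, key):
        """Return the last donor value for this key, or the seed value."""
        return self.cells.get(key, self.default)


def age_group(age, width=5):
    """Collapse single years into groups; the 5-year width is an assumption."""
    return age // width


def impute_head_age(head_sex, spouse_age, child_ages, hh_size, decks):
    """Try spouse, then children, then household size, in that order."""
    if spouse_age is not None:
        return decks["spouse"].draw((age_group(spouse_age), head_sex))
    if child_ages:
        # Assumption: the oldest child is used; the slide does not say which child.
        return decks["child"].draw((age_group(max(child_ages)), head_sex))
    return decks["size"].draw((min(hh_size, 10), head_sex))
```

In a real run the decks would be updated from every head whose age passes the edit, so that each imputed value comes from a recent, similar household.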
Young heads
• Skepticism about young heads; if younger than 12 then confirm:
  • if someone else older is present, then make them the head (V.3)
  • can't be married (must be 12+ years to be married)
  • has to be 12 years older than biological children
  • confirm consistency of age and educational level
  • confirm consistency of age and educational institution
  • can't have economic activity responses if younger than 10
  • can't have fertility (for girls)
• If head doesn't pass these age tests, then impute (based on head's sex and household size).
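A minimal sketch of some of these age tests, assuming a record held as a dictionary. The field names (`ever_married`, `has_econ_activity`, `ceb`) and the sex code 2 for female are assumptions; the education checks are left out because the slides do not give the age thresholds for them.

```python
MIN_HEAD_AGE = 12     # heads younger than this are questioned
MIN_PARENT_GAP = 12   # head must be this many years older than biological children
MIN_ECON_AGE = 10     # no economic activity responses below this age
FEMALE = 2            # assumed sex code


def young_head_problems(head, child_ages):
    """Return the list of age tests a young head fails (empty if none apply)."""
    age = head["age"]
    if age >= MIN_HEAD_AGE:
        return []
    problems = []
    if head.get("ever_married"):
        problems.append("married under 12")
    if any(age - child < MIN_PARENT_GAP for child in child_ages):
        problems.append("fewer than 12 years older than a child")
    if age < MIN_ECON_AGE and head.get("has_econ_activity"):
        problems.append("economic activity reported under age 10")
    if head.get("sex") == FEMALE and head.get("ceb"):
        problems.append("fertility reported under 12")
    return problems
```

A non-empty result would send the head's age to imputation, as the last bullet describes.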
Young heads • Effect: number of heads younger than 12 years old drops from 1296 (1.3%) to 627 (0.6%)
Case 1: Notes:
PN = Person number
SEX = Sex
DOB = Day of birth
MOB = Month of birth
YOB = Year of birth
REL = Relationship to head
MAR = Marital status
SPN = Spouse person number
CEB = Children ever born (total)
CS = Children surviving (total)
MPN = Mother person number
FPN = Father person number
Case 1:
Before edit:
PN SEX DOB MOB YOB AGE REL MAR SPN CEB CS MPN FPN
01 1 11 01 1950 051 01 1 99 99 01
02 1 17 07 1977 023 03 5 01
03 2 04 04 1985 005 03 5 00 09 01
04 1 24 10 1987 011 03 5 53 01
05 1 01 07 1990 010 03 5 49 01
06 1 20 02 1994 007 01 5 99 01
07 1 20 02 1994 007 5 99 01
Edit log:
V.2b4b: age and DOB inconsistent, age <= DOB, Age = 005 Date = 04/04/1985
V.2b4b: age and DOB inconsistent, age <= DOB, Age = 011 Date = 24/10/1987
V.3: either no heads or > 1= 0002
V.3h: more than 1 head =
V.3i: multiple heads, making oldest= 0051
V.3k: multiple heads, making excess other rel
V.9g: Relation invalid, has a dad, impute Rela
After edit:
PN SEX DOB MOB YOB AGE REL MAR SPN CEB CS MPN FPN
01 1 11 01 1950 051 01 1 99 99 01
02 1 17 07 1977 023 03 5 01
03 2 04 04 1985 015 03 5 00 09 01
04 1 24 10 1987 013 03 5 53 01
05 1 01 07 1990 010 03 5 49 01
06 1 20 02 1994 007 11 5 99 01
07 1 20 02 1994 007 03 5 99 01
Case 2:
Before edit:
PN SEX DOB MOB YOB AGE REL MAR SPN CEB CS MPN FPN
01 2 01 09 1986 015 01 5 00 90
02 2 09 06 1990 011 06 5
03 1 01 09 1991 010 06 5 99
04 2 01 09 1994 007 06 5 99
Edit log:
V.2b4b: age and DOB inconsistent, age <= DOB, Age = 015 Date = 01/09/1986
V.2b4b: age and DOB inconsistent, age <= DOB, Age = 011 Date = 09/06/1990
V.2b4b: age and DOB inconsistent, age <= DOB, Age = 010 Date = 01/09/1991
V.2b4b: age and DOB inconsistent, age <= DOB, Age = 007 Date = 01/09/1994
V.3a1: head is younger than 16, Age = 014
V.3a3: no older relatives found; keep young head
After edit:
PN SEX DOB MOB YOB AGE REL MAR SPN CEB CS MPN FPN
01 2 01 09 1986 014 01 5 00 90
02 2 09 06 1990 010 06 5
03 1 01 09 1991 009 06 5 99
04 2 01 09 1994 006 06 5 99
Case 3:
Before edit:
PN SEX DOB MOB YOB AGE REL MAR SPN CEB CS MPN FPN
01 1 12 01 1998 003 09 05
02 2 008 09 05
Edit log:
V.3b: no head of household!
V.3e: no head, making oldest person the head
V.5: head is younger than 12, about to confirm this
V.5e1: young head, but age consistent with educ lvl
V.5i1: young head, but age consistent with educ inst
V.5k: imputing young head's age from AHEADAGE for econ activity inconsistency
After edit:
PN SEX DOB MOB YOB AGE REL MAR SPN CEB CS MPN FPN
01 1 99 99 1908 092 01 05
02 2 12 01 1998 003 09 05
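The head-count repairs that appear in the edit logs of these cases (V.3e for no head; V.3i and V.3k for several heads) can be sketched roughly as below. The relationship codes are assumptions read off the Case 1 listing (01 for head, 11 for the recoded excess head), the function name is hypothetical, and ages are assumed to have been edited already.

```python
HEAD = 1             # relationship code for head, as it appears in the listings
OTHER_RELATIVE = 11  # assumed code given to excess heads (Case 1, PN 06)


def enforce_single_head(household):
    """Leave exactly one head in a non-empty household, editing it in place."""
    heads = [p for p in household if p["rel"] == HEAD]
    if len(heads) == 1:
        return
    # No head: promote the oldest person. Several heads: keep the oldest of them.
    keeper = max(heads or household, key=lambda p: p["age"])
    for person in heads:
        if person is not keeper:
            person["rel"] = OTHER_RELATIVE
    keeper["rel"] = HEAD
```

The young-head tests would then run on whoever ends up as head, which is what the Case 3 log shows happening after V.3e.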
Editing examples: • Young heads of household • Population group • Access to telephones • Same-sex marriages • Fertility
Population Group (V.13) • For invalid population group, try to obtain via: • Head of household • Someone else in the household • Otherwise, impute from deck (age by household size) • Effects: • Removes 2.9% blank/invalid responses
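One way to picture this cascade, as a sketch only: the record layout, the `deck_draw` callback (standing in for the age by household size deck), and the set of valid codes are assumptions, and the relationship code 1 for head is read off the case listings.

```python
HEAD = 1  # relationship code for head, as it appears in the listings


def edit_pop_group(household, valid, deck_draw):
    """Fill blank/invalid population groups in place: head, then anyone, then deck."""
    head = next((p for p in household if p["rel"] == HEAD), None)
    if head is not None and head["grp"] in valid:
        donor = head["grp"]
    else:
        # Fall back to the first other member with a valid group, if any.
        donor = next((p["grp"] for p in household if p["grp"] in valid), None)
    for person in household:
        if person["grp"] in valid:
            continue                      # reported values are never overwritten
        if donor is None:
            # Nobody in the household has a valid group: draw once from the deck.
            donor = deck_draw(person["age"], len(household))
        person["grp"] = donor
```

When the whole household is blank, the first person draws from the deck and the rest copy that value, so the household stays homogeneous.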
Population Group
• Parts of the current edit might need refinement for South Africa
• Issues to explore:
  • Imputations in HHs with multiple pop groups
  • Tolerances and household size:
    • Case where whole HH has blank/invalid pop group
    • Case where all but 1 HH member has same pop group
    • Situations between these two extremes
  • Effect on planning/data use of leaving the variable “not stated”
Case 1:
Before edit:
PN SEX AGE REL GRP LAN RGN RSA PRV CNT CTZ URS PERMPLAC SM
01 1 073 01 1 06 55 1 09 1 1
02 2 063 02 1 06 55 1 09 1 1
03 2 025 11 1 06 55 1 09 1 1
04 1 016 09 1 06 55 1 09 1 1
05 1 014 09 1 06 55 1 09 1 1
06 2 011 09 1 06 55 1 09 1 1
07 2 000 11 1 09 1 1
Edit log:
V.13e: Pop group invalid, impute from head PN=07 Group=Head Group= 1
After edit:
PN SEX AGE REL GRP LAN RGN RSA PRV CNT CTZ URS PERMPLAC SM
01 1 073 01 1 06 55 1 09 1 1
02 2 063 02 1 06 55 1 09 1 1
03 2 025 11 1 06 55 1 09 1 1
04 1 016 09 1 06 55 1 09 1 1
05 1 014 09 1 06 55 1 09 1 1
06 2 011 09 1 06 55 1 09 1 1
07 2 000 11 1 06 55 1 09 1 1
Case 2:
Before edit:
PN SEX AGE REL GRP LAN RGN RSA PRV CNT CTZ URS PERMPLAC SM
01 1 032 01 3 02 39 1 08 710 1 1
02 2 028 02 1 08 1 1
03 1 068 07 1 08 1 1
04 2 057 07 1 08 1 1
05 2 007 03 1 06 1 1
06 1 006 03 1 08 1 1
07 1 001 03 1 08 1 1
08 2 030 12 1 07 09 1 1
Edit log:
V.13e: Pop group invalid, impute from head (SIX TIMES)
After edit:
PN SEX AGE REL GRP LAN RGN RSA PRV CNT CTZ URS PERMPLAC SM
01 1 032 01 3 02 39 1 08 1 1
02 2 028 02 3 02 39 1 08 1 1
03 1 068 07 3 02 39 1 08 1 1
04 2 057 07 3 02 39 1 08 1 1
05 2 007 03 3 02 39 1 06 1 1
06 1 006 03 3 02 39 1 08 1 1
07 1 001 03 3 02 39 1 08 1 1
08 2 030 12 1 07 39 1 09 1 1
Case 3:
Before edit:
PN SEX AGE REL GRP LAN RGN RSA PRV CNT CTZ URS PERMPLAC SM
01 1 045 01 01 32 1 01 1
02 2 048 02 01 32 1 01 1
Edit log:
V.13b: Pop group invalid, impute from deck
V.13e: Pop group invalid, impute from head
After edit:
PN SEX AGE REL GRP LAN RGN RSA PRV CNT CTZ URS PERMPLAC SM
01 1 045 01 4 01 32 1 01 1 1
02 2 048 02 4 01 32 1 01 1 1
Editing examples: • Young heads of household • Population group • Access to telephones • Same-sex marriages • Fertility
Telephones and cell phones (IV.16) • Telephone access is not applicable for households that have telephones or cell phones. • Households with responses to the telephone access question should not have telephones or cell phones. • Impute these variables from hot decks (based on dwelling type and tenure status) if necessary.
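These consistency rules might be sketched as follows. The codes 1 = yes and 2 = no are assumptions (the Case 1 log "impute cell phone = no" is followed by a 2 in the output), `draw` is a hypothetical stand-in for the hot deck keyed on dwelling type and tenure status, and letting a reported phone win over a conflicting access answer is a choice made here, not something the slides state.

```python
YES, NO = 1, 2  # assumed response codes


def edit_phones(tel, cll, acc, draw):
    """Return a consistent (telephone, cell phone, access) triple.

    None means blank on input and "not applicable" for access on output.
    draw(field) is a hypothetical hot-deck lookup for the named field.
    """
    if acc is not None:
        # An access answer implies no phone of either kind, so blanks become "no".
        tel = NO if tel is None else tel
        cll = NO if cll is None else cll
    if tel is None:
        tel = draw("tel")
    if cll is None:
        cll = draw("cll")
    if YES in (tel, cll):
        acc = None            # access is not applicable when the household has a phone
    elif acc is None:
        acc = draw("acc")     # no phone at all, so access must be answered
    return tel, cll, acc
```

Called with a telephone of "no", a blank cell phone, and an access answer, the sketch sets the cell phone to "no" without touching the deck, which is how Case 1 reads.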
Telephones and cell phones
• Many records have all of these questions blank
• Problems with capture of continuation questions
• Confusion of “blank” and “no” (also seen in disabilities section)
Case 1:
Before edit:
DWL MLT RMS SHR TEN WAT SRC TLT COK HET LIT RAD TV CMP FRG TEL CLL ACC RFS
01 2 006 1 4 7 4 1 5 1 1 1 2 1 2 2 4
Edit log:
IV.16c: impute cell phone = no Phone2 Cell= Access= 2
After edit:
DWL MLT RMS SHR TEN WAT SRC TLT COK HET LIT RAD TV CMP FRG TEL CLL ACC RFS
01 2 006 1 4 7 4 1 5 1 1 1 2 1 2 2 2 4
Case 2:
Before edit:
DWL MLT RMS SHR TEN WAT SRC TLT COK HET LIT RAD TV CMP FRG TEL CLL ACC RFS
01 006 1 4 1 4 1 1 1 1 1 2 1 1 4
Edit log:
IV.16h: imputed cell = 1 from deck
After edit:
DWL MLT RMS SHR TEN WAT SRC TLT COK HET LIT RAD TV CMP FRG TEL CLL ACC RFS
01 2 006 1 4 1 4 1 1 1 1 1 2 1 1 1 4
Case 3:
Before edit:
DWL MLT RMS SHR TEN WAT SRC TLT COK HET LIT RAD TV CMP FRG TEL CLL ACC RFS
02 2 005 1 4 1 4 4 4 1 1
Edit log:
IV.13c: imputed television
IV.14c: imputed computer
IV.15c: imputed refrigerator
IV.16f: imputed telephone
IV.16h: imputed cell
IV.16j: imputed access
IV.17c: imputed rubbish
After edit:
DWL MLT RMS SHR TEN WAT SRC TLT COK HET LIT RAD TV CMP FRG TEL CLL ACC RFS
02 2 005 1 4 1 4 4 4 1 1 1 2 2 2 2 1 4
Editing examples: • Young heads of household • Population group • Access to telephones • Same-sex marriages • Fertility
Same-sex marriages (V.7, V.8, and V.12) • Treated as part of the marital status edits for heads and rest of household • Imputations for invalid sex never result in a same-sex marriage • No polygamous combinations of same-sex allowed
Same-sex marriages • Skepticism about same-sex marriages; only allowable if: • Both partners 12 years or older; • Both sexes valid; • Relationships to head consistent (for sub-families); • Both partners’ marital statuses reported as “living together” (4).
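The four conditions can be written as a single predicate. This is a sketch under assumptions: the record fields and the valid sex codes 1 and 2 are taken from the case listings, and the sub-family relationship check is passed in as a flag because the slides do not spell it out.

```python
MIN_PARTNER_AGE = 12
LIVING_TOGETHER = 4   # marital status code named on the slide
VALID_SEX = {1, 2}    # assumed valid sex codes


def same_sex_couple_allowed(a, b, relationships_consistent=True):
    """True only if a reported same-sex couple passes every condition listed."""
    return (
        a["age"] >= MIN_PARTNER_AGE and b["age"] >= MIN_PARTNER_AGE
        and a["sex"] in VALID_SEX and b["sex"] in VALID_SEX
        and relationships_consistent
        and a["mar"] == LIVING_TOGETHER and b["mar"] == LIVING_TOGETHER
    )
```

A couple that fails the predicate is treated as a reporting error and goes through the ordinary marital status edits.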
Same-sex marriages • Investigation shows that almost all of the reported same-sex marriages are erroneous. • Enumerator’s manual contains instructions that add bias against accurate collection. • Social situation in SA means that this might become a contentious issue.
Same-sex marriages • Enumerator’s Manual, pg 38: “Question P-05: Marital Status … Couples who are not married to each other but live together as if they are married, belong to category 4. This category is for people who live in every respect as a married couple except that they have not undergone a marriage ceremony. Only male/female couples should indicate this category – the census does not collect data on gay couples.”
Case 1:
Before edit:
PN SEX DOB MOB YOB AGE REL MAR SPN CEB CS MPN FPN
01 1 28 12 1930 071 01 1 02
02 1 17 06 1937 064 02 1 01
03 1 06 02 1935 066 06 8
04 2 06 03 1984 007 09 5 00 99
Edit log:
V.2b4b: age and DOB inconsistent, age <= DOB, Age=071 Date=28/12/1930
V.2b4b: age and DOB inconsistent, age <= DOB, Age=064 Date=17/06/1937
V.2b4b: age and DOB inconsistent, age <= DOB, Age=007 Date=06/03/1984
V.7i: same sex marriage w/ MSs not both 4
After edit:
PN SEX DOB MOB YOB AGE REL MAR SPN CEB CS MPN FPN
01 1 28 12 1930 070 01 1 02
02 2 17 06 1937 063 02 1 01
03 1 06 02 1935 066 06 8
04 2 06 03 1984 016 09 5 00 99
Case 2:
Before edit:
PN SEX DOB MOB YOB AGE REL MAR SPN CEB CS MPN FPN
01 2 16 01 1956 044 01 5 01 01
02 2 09 05 1991 009 02 5
Edit log:
V.2b4b: age and DOB inconsistent, age <= DOB, Age=044 Date=16/01/1956
V.7a: imputing SPN for head to point to spouse SPN=Spouse= 0002
V.7e: imputing head MS from female head MS= 5 SPN= 02
V.7g: spouse too young ... impute from age Head Age = 045 Sp Age= 009
V.7i: same sex marriage w/ MSs not both 4
V.7m: imputing sp MS from hot deck
V.7n: making spouse SPN point to head
After edit:
PN SEX DOB MOB YOB AGE REL MAR SPN CEB CS MPN FPN
01 2 16 01 1956 045 01 1 02 01 01
02 1 09 05 1991 026 02 1 01
Case 3:
Before edit:
PN SEX DOB MOB YOB AGE REL MAR SPN CEB CS MPN FPN
01 2 03 03 1976 025 01 4 01 01 99 99
02 2 14 08 1979 021 02 4 99
03 1 03 08 1995 005 03 5 02 01
Edit log:
V.2b4b: age and DOB inconsistent, age <= DOB, Age=025 Date=03/03/1976
V.7a: imputing SPN for head to point to spouse
V.7h: same sex marriage, both head & spouse MS = 4
V.7n: making spouse SPN point to head
After edit:
PN SEX DOB MOB YOB AGE REL MAR SPN CEB CS MPN FPN
01 2 03 03 1976 024 01 4 02 01 01 99 99
02 2 14 08 1979 021 02 4 01 99
03 1 03 08 1995 005 03 5 02 01
Editing examples: • Young heads of household • Population group • Access to telephones • Same-sex marriages • Fertility
Fertility (V.27)
• Fertility is not applicable for men, or for women outside ages 12 to 49.
• For women aged 12 to 49, blanks in the fertility section are treated as zeros.
• Handle common enumerator and reporting errors:
  • Lines switched when turning to the next page;
  • Husband reports fertility instead of wife;
  • Last child info recorded with the child, not the mother.
Notes:
TCEB = Total children ever born
MCEB = Male children ever born
FCEB = Female children ever born
TCS = Total children surviving
MCS = Male children surviving
FCS = Female children surviving
SXLAST = Sex of last child born
VSLAST = Vital status of last child born (still alive?)
YRLAST = Year of birth of last child born
MOLAST = Month of birth of last child born
Fertility
• Fertility is valid if all of the following are true:
  • TCEB = MCEB + FCEB, and
  • TCS = MCS + FCS, and
  • TCEB >= TCS, and
  • MCEB >= MCS, FCEB >= FCS, and
  • number of boys in the household who declared this person as their mother (using mother person number) ≤ MCS, and
  • number of girls in the household who declared this person as their mother (using mother person number) ≤ FCS, and
  • woman's age ≥ (11 + TCEB), and
  • FCEB > 0 if SXLAST = female, and
  • MCEB > 0 if SXLAST = male, and
  • FCS > 0 if SXLAST = female and VSLAST = alive, and
  • MCS > 0 if SXLAST = male and VSLAST = alive, and
  • all responses for last child born information (YRLAST, MOLAST, SXLAST, VSLAST) are complete and valid, or else they are all blank (indicating no births).
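The validity conditions translate almost line for line into code. The sketch below assumes a woman's record stored as a dictionary keyed by lower-case versions of the variable names, with blank counts already set to zero; the codes 1 = male, 2 = female and 1 = alive are assumptions. It checks only that the last-child fields are all present or all blank, not that each value is in range.

```python
MALE, FEMALE = 1, 2   # assumed SXLAST codes
ALIVE = 1             # assumed VSLAST code


def fertility_is_valid(w, boys_in_hh, girls_in_hh):
    """Apply the listed consistency conditions to one woman's fertility record.

    boys_in_hh / girls_in_hh: children in the household whose mother person
    number points at this woman.
    """
    last = [w["yrlast"], w["molast"], w["sxlast"], w["vslast"]]
    all_blank = all(v is None for v in last)
    all_present = all(v is not None for v in last)
    last_f_alive = w["sxlast"] == FEMALE and w["vslast"] == ALIVE
    last_m_alive = w["sxlast"] == MALE and w["vslast"] == ALIVE
    checks = [
        w["tceb"] == w["mceb"] + w["fceb"],
        w["tcs"] == w["mcs"] + w["fcs"],
        w["tceb"] >= w["tcs"],
        w["mceb"] >= w["mcs"],
        w["fceb"] >= w["fcs"],
        boys_in_hh <= w["mcs"],
        girls_in_hh <= w["fcs"],
        w["age"] >= 11 + w["tceb"],
        w["sxlast"] != FEMALE or w["fceb"] > 0,
        w["sxlast"] != MALE or w["mceb"] > 0,
        not last_f_alive or w["fcs"] > 0,
        not last_m_alive or w["mcs"] > 0,
        all_blank or all_present,
    ]
    return all(checks)
```

For example, a 22-year-old with TCEB = 71, MCEB = 1 and FCEB = 0 fails the first and eighth checks; with TCEB corrected to 1 the same record passes.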
Fertility
• Also enforce a maximum number of children (24 total and 12 per sex).
• When a bad CEB or CS value can be calculated from the other values, calculate it.
• When fertility is not valid, impute a consistent set of fertility responses from a deck (based on age, marital status, education level); then confirm last child born info from the woman's children in the household.
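The "calculate when possible" step can be illustrated for one total/male/female triple (the same logic applies to CEB and to CS). This is a sketch: the function name is hypothetical, and preferring the two sex-specific counts when all three values are in range but disagree is an assumption, since the slides do not say which value wins.

```python
MAX_TOTAL = 24     # maximum children in total
MAX_PER_SEX = 12   # maximum children per sex


def in_range(value, limit):
    return value is not None and 0 <= value <= limit


def reconcile(total, male, female):
    """Return a consistent (total, male, female), or None if a deck is needed."""
    ok_t = in_range(total, MAX_TOTAL)
    ok_m = in_range(male, MAX_PER_SEX)
    ok_f = in_range(female, MAX_PER_SEX)
    if ok_t and ok_m and ok_f and total == male + female:
        return total, male, female
    if ok_m and ok_f:
        return male + female, male, female        # total = male + female
    if ok_t and ok_m and in_range(total - male, MAX_PER_SEX):
        return total, male, total - male          # female = total - male
    if ok_t and ok_f and in_range(total - female, MAX_PER_SEX):
        return total, total - female, female      # male = total - female
    return None                                   # not recoverable: impute from deck
```

With the values reported in the fertility cases, `reconcile(71, 1, 0)` gives `(1, 1, 0)` and `reconcile(1, 1, None)` gives `(1, 1, 0)`.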
Case 1:
Before edit:
PN SEX AGE CEB MCB FCB CS MCS FCS MLB YRLB SLB VLB
01 1 041
02 2 038 04 02 02 04 02 02 08 1991 2 1
03 2 022 71 01 00 01 01 00 06 1999 1 1
04 1 012
05 2 009
06 1 001
Edit log:
V.27: problems detected in fertility info ... PN= 03
V.27b: imputing TCEB = MCEB+FCEB PN= 03 TCEB=71 MCEB=01 FCEB=00
After edit:
PN SEX AGE CEB MCB FCB CS MCS FCS MLB YRLB SLB VLB
01 1 041
02 2 038 04 02 02 04 02 02 08 1991 2 1
03 2 022 01 01 00 01 01 00 06 1999 1 1
04 1 012
05 2 009
06 1 001
Case 2:
Before edit:
PN SEX AGE CEB MCB FCB CS MCS FCS MLB YRLB SLB VLB
01 2 054
02 2 035 02 01 01 02 01 01 2 1
03 2 020 00
04 2 014 00
05 1 012
06 2 005
Edit log:
V.27: problems detected in fertility info ... PN=02
V.27POST: LAST info blank, imputing from youngest child PN= 02 (updates FCEB, TCEB, FCS, TCS)
V.27e: imputing fertility data from AFERTILITY PN=03
V.27e: imputing fertility data from AFERTILITY PN=04
After edit:
PN SEX AGE CEB MCB FCB CS MCS FCS MLB YRLB SLB VLB
01 2 054
02 2 035 02 01 01 02 01 01 11 1995 2 1
03 2 020 00 00 00 00 00 00
04 2 014 00 00 00 00 00 00
05 1 012
06 2 005
Case 3:
Before edit:
PN SEX AGE CEB MCB FCB CS MCS FCS MLB YRLB SLB VLB
01 2 042 05 02 03 05 02 03 09 1997 2 1
02 2 021 01 01 01 01 04 1994 1 1
03 2 018 01 00 00 01 00 00 01 2001 1
04 1 020
05 1 014
06 2 003
07 1 002
08 2 000
Edit log:
V.27: problems detected in fertility info ... PN= 02
V.27c: imputing FCEB = TCEB-MCEB PN= 02
V.27g: imputing FCS = TCS-MCS PN= 02
V.27: problems detected in fertility info ... PN= 03
V.27b: imputing TCEB = MCEB+FCEB PN= 03
V.27j: imputing fertility from hot deck PN= 03
After edit:
PN SEX AGE CEB MCB FCB CS MCS FCS MLB YRLB SLB VLB
01 2 042 05 02 03 05 02 03 09 1997 2 1
02 2 021 01 01 00 01 01 00 04 1998 1 1
03 2 018 01 00 01 01 00 01 01 2001 2 1
04 1 020
05 1 014
06 2 003
07 1 002
08 2 000
Fertility • Issues: • If woman reports zero TCEB and leaves rest blank, does that mean “no fertility” or “error”? • See if last child born can be handled separately from rest of fertility, so that full set is not imputed when last child born has problems and rest is valid