Housekeeping
• Fire alarm: LOUD continuous ringing
  • Turn right down corridor
  • Down stairs
  • Gather on Oxford Road side of building
• Men’s and Women’s toilets
  • Turn right, toilets at end of corridor
Using the hierarchy of the government surveys
Jo Wathan
Centre for Census and Survey Research
Economic and Social Data Service (Government Data)
ESDS Government
• Part of the wider Economic and Social Data Service, an ESRC-funded data dissemination and support service
• ESDS is headed by the UK Data Archive and also involves MIMAS and CCSR at the University of Manchester and ISER at the University of Essex
• ESDS Government is headed by CCSR
• Supports the large-scale, continuous, cross-sectional surveys collected by ONS and NatCen
• Data dissemination is carried out by UKDA
• Value-added services and user support are carried out by CCSR
ESDS Using Hierarchy: v.06/04
This afternoon…
• What is hierarchical data?
• What is the research purpose of hierarchical data?
• What hierarchy is available in ESDS Government datasets?
• Working with hierarchy in SPSS and Stata
• Practical exercise
What is hierarchy?
• Data which can be analysed at more than one level, where lower levels are nested within higher levels
• Most commonly seen in the form of household data, where information is collected on all individuals within the household
• Data contains a variable indicating which household an individual lives in
• Data can be analysed at the household level or the individual level
• Often possible to analyse at the family level too
• Other forms of hierarchy are available, e.g. a sub-individual level (information per hospital stay, per crime reported)
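A minimal sketch of what nesting means in practice, written in Python rather than SPSS or Stata. The records and the field names `hid`, `persno` and `age` are invented for illustration and do not come from any survey file.

```python
from collections import defaultdict

# Hypothetical person-level records: each row carries the id of its household.
people = [
    {"hid": 1, "persno": 1, "age": 44},
    {"hid": 1, "persno": 2, "age": 41},
    {"hid": 1, "persno": 3, "age": 9},
    {"hid": 2, "persno": 1, "age": 70},
]

# Nest individuals within households using the household identifier.
households = defaultdict(list)
for person in people:
    households[person["hid"]].append(person)

# The same data now supports analysis at two levels.
n_individuals = len(people)      # individual-level unit count
n_households = len(households)   # household-level unit count
mean_age_by_household = {
    hid: sum(p["age"] for p in members) / len(members)
    for hid, members in households.items()
}
```

The household identifier is the only thing linking the two levels, which is why a flat file without it cannot be analysed this way.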
Compared with flat files…
• Contextual information may be present, e.g. individual asked about size of household, but:
  • Information collected from only one level
  • Not usually appropriate to use data at other levels
  • Not usually possible to create additional derived variables at other levels
• E.g. information collected from one individual within household
Hierarchical data: conceptually
More complex hierarchy…
What does the data look like? Flattened data (GHS)
What does the data look like? (2) Multiple tables (FES)
Household.por Jobmain.por
Use the hierarchy to…
• Better describe the household
• Describe the household context of an individual
• Look at intra-household differences (& sameness)
Describing the household
e.g. Is the household deprived / in poverty?
• Equivalising income (e.g. FRS)
  • Need information on total income (all members, not just the Household Reference Person)
  • Need information on household composition
• Identifying workless households
  • E.g. Gregg and Wadsworth (1999)
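A rough Python sketch of deriving two household-level descriptors from person-level rows: total household income and a workless flag. The data, the field names and the adult age threshold of 16 are all assumptions made for illustration; they are not the definitions used by the FRS or by Gregg and Wadsworth.

```python
from collections import defaultdict

# Hypothetical person-level data; names and values are illustrative only.
people = [
    {"hid": 1, "age": 40, "working": True,  "income": 300.0},
    {"hid": 1, "age": 38, "working": False, "income": 50.0},
    {"hid": 2, "age": 29, "working": False, "income": 80.0},
    {"hid": 2, "age": 3,  "working": False, "income": 0.0},
]

ADULT_AGE = 16  # assumed threshold, not a survey definition

total_income = defaultdict(float)     # income summed over ALL members
any_adult_working = defaultdict(bool) # True once any adult is in work
for p in people:
    total_income[p["hid"]] += p["income"]
    if p["age"] >= ADULT_AGE and p["working"]:
        any_adult_working[p["hid"]] = True

# A household is flagged workless when no adult member is working.
workless = {hid: not any_adult_working[hid] for hid in total_income}
```

Both measures need every member of the household, which is the point of the slide: the reference person's record alone is not enough.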
Source: Richard Dickens, Paul Gregg and Jonathan Wadsworth (2000) ‘New Labour and the Labour Market’, CMPO Working Paper Series 00/19, Table 5
The effect of partnership on employment (mothers)
Ethnic homogeneity: % of household members in same ethnic group as head of household (HOH)
Source: 1991 Household SAR
Hierarchy in some key datasets
Main Levels
• Household
  • Group who have the accommodation as their only or main residence and who either share one meal a day or share the living accommodation
  • Useful for coresidence or policy-related issues
• Family Unit
  • An individual plus partner plus any unmarried children
  • The census definition of family unit excludes single childless individuals
  • Useful for identifying partnership and parenthood relationships
• Benefit Unit
  • Adult children in separate unit from parents
  • Useful when considering income and benefits
• Check your definitions (despite harmonisation)
Identifying the units
• You will need a unique identifier for the unit at each level
• Several variables may need to be used in combination
• You may need to compute a unique identifier
• You will need to read the documentation to assess this
Straightforward: GHS 00-01
• To identify a household use HSERIAL
• To identify an individual within the household use PERSNO
• To identify a family unit use FSERIAL
• To identify a family unit within a household use AFAM
• To identify the household reference person test for PERSNO = HRP (the HRP variable gives the person number of the household reference person)
• Similarly, to locate the Family Unit head test for FUH = PERSNO
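The identifier tests can be sketched in Python as follows. The variable names HSERIAL, PERSNO, HRP and FUH are the ones named on the slide; the rows themselves are invented, and the assumption that FUH holds the person number of the family unit head follows the slide's wording rather than the GHS documentation.

```python
# Hypothetical rows using the GHS variable names from the slide; values invented.
rows = [
    {"HSERIAL": 1001, "PERSNO": 1, "HRP": 2, "FUH": 2},
    {"HSERIAL": 1001, "PERSNO": 2, "HRP": 2, "FUH": 2},
    {"HSERIAL": 1002, "PERSNO": 1, "HRP": 1, "FUH": 1},
]

# PERSNO repeats across households, so a person is unique only
# in combination with the household serial number.
person_ids = [(r["HSERIAL"], r["PERSNO"]) for r in rows]

# One row per household: keep the household reference person.
hrp_rows = [r for r in rows if r["PERSNO"] == r["HRP"]]

# Family unit heads, selected with the analogous test.
fu_heads = [r for r in rows if r["FUH"] == r["PERSNO"]]
```

Selecting on the reference person is a common way to get one case per household without building a separate household file.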
Complex e.g. QLFS 2003
• If interested in using household information use the Household File
• Information about identifiers is in the read file
• Household identifier is Remserno; however, this is not present in all LFS datasets
• To compute use:
  • Week x 10000000 +
  • W1yr x 1000000 +
  • Qrtr x 100000 +
  • Add x 1000 +
  • Wafnd x 100 +
  • Hhd
• This has to be used together with either CASEID or QUOTA (which are identical); could combine this with Remserno to derive an easier to use household ID
• To identify a person in the household use person
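The computed identifier is plain positional arithmetic, sketched here in Python. The multipliers are those listed on the slide; the function name `household_serial`, the component values and the quota number 412 are invented, and pairing the result with QUOTA in a tuple is just one assumed way of using the two together.

```python
def household_serial(week, w1yr, qrtr, add, wafnd, hhd):
    """Combine the component variables using the multipliers from the slide."""
    return (week * 10_000_000
            + w1yr * 1_000_000
            + qrtr * 100_000
            + add * 1_000
            + wafnd * 100
            + hhd)

# Invented component values, for illustration only.
serial = household_serial(week=7, w1yr=3, qrtr=2, add=15, wafnd=1, hhd=1)
print(serial)  # 73215101

# The slide says the serial must be used together with CASEID or QUOTA;
# a tuple key is one hypothetical way to do that.
household_key = (412, serial)
```

Each component occupies its own block of digits, so the original values can be read back off the serial as long as no component exceeds the width its multiplier allows.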
Working with hierarchical data
• Which level should I analyse at?
• Manipulating data in SPSS
  • Menu driven approach
  • Syntax
• Manipulating data in Stata
Which level should I analyse at?
Understanding the data
• What is the default case/unit of analysis in the dataset?
• How many cases are in the data?
• How many households are in the data?
• How many family units are in the data?
• How many households have more than one family unit?
• How large is the largest household?
• How many lone families are in the data?
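Several of these counts can be obtained with a few lines of Python, sketched here on invented person-level rows. The field names `hid` (household identifier) and `famunit` (family unit number within the household) are hypothetical stand-ins for whatever identifiers the dataset documents.

```python
from collections import Counter

# Hypothetical person-level rows: household id and family unit within household.
rows = [
    {"hid": 1, "famunit": 1},
    {"hid": 1, "famunit": 1},
    {"hid": 1, "famunit": 2},
    {"hid": 2, "famunit": 1},
    {"hid": 3, "famunit": 1},
    {"hid": 3, "famunit": 1},
]

n_cases = len(rows)                                   # one case per person
household_sizes = Counter(r["hid"] for r in rows)     # persons per household
family_units = {(r["hid"], r["famunit"]) for r in rows}
units_per_household = Counter(hid for hid, _ in family_units)

n_households = len(household_sizes)
n_family_units = len(family_units)
n_multi_unit_households = sum(1 for n in units_per_household.values() if n > 1)
largest_household = max(household_sizes.values())
```

Note that a family unit is identified by the pair of household and family unit number, since the family unit number on its own repeats across households.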
Using the data
• What unit of analysis would you use to answer the following questions?
• Would you need to create variables at different levels of analysis to answer the question?
  • What is the mean income per adult?
  • What proportion of children live with 2 parents?
  • What is the mean income per adult-equivalent household member (where children count as half a household member)?
  • Does your partner’s health affect your own?
  • How is total household income related to tenure?
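As a worked illustration of the adult-equivalent question, here is a small Python sketch. The half weight for children comes from the slide; the data, the field names and the cut-off of 16 for counting someone as a child are assumptions, and this simple scale is not one of the official equivalence scales.

```python
from collections import defaultdict

CHILD_AGE_LIMIT = 16  # assumed cut-off; check the survey's own definition

# Hypothetical person-level rows.
people = [
    {"hid": 1, "age": 45, "income": 300.0},
    {"hid": 1, "age": 43, "income": 100.0},
    {"hid": 1, "age": 8,  "income": 0.0},
    {"hid": 2, "age": 30, "income": 200.0},
]

total_income = defaultdict(float)  # household-level total
equiv_size = defaultdict(float)    # adults count 1, children count 0.5
for p in people:
    total_income[p["hid"]] += p["income"]
    equiv_size[p["hid"]] += 0.5 if p["age"] < CHILD_AGE_LIMIT else 1.0

# Income per adult-equivalent member, one value per household.
income_per_equiv = {hid: total_income[hid] / equiv_size[hid] for hid in total_income}
```

The calculation needs variables created at the household level (total income, equivalent size) even if the final analysis is reported per person.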
Working with hierarchy in SPSS
• SPSS is not good at data manipulation!
• To generate a household variable from individual data you need to use the aggregate command
• The aggregate command creates a household-level file, which:
  • Has 1 case per household
  • Contains the household ID variable specified plus any aggregate variables defined
• Slow, memory intensive, unnecessarily complicated compared with some other packages…
Aggregation at the household level
• You can work at the level of the household
  • Use the aggregate outfile
  • Remember to carry across other household level variables that you will need into the aggregate file as part of the aggregate procedure
• Or match the household level variable back to the original individual level dataset…
Aggregate and match back to individual file
• Usually it is best to match back your aggregated variable to the master file
  • The household variable is distributed to each individual
  • You can then select on household head or family head to work at the level of household or family
  • Or you can link information about the household to the individual
SPSS syntax used
* Compute a variable which is a low value, but which takes the (higher) value for health when respondent is hrp.
compute hlthrep = -9.
if (reltohrp = 1) hlthrep = health.
crosstabs hlthrep by health by reltohrp.
* Aggregate to one case per household.
sort cases by hid.
aggregate outfile = "c:\work\esds\aggfile.sav"
  /break = hid
  /nperhh = n(hid)
  /oldest = max(age)
  /hrphlth = max(hlthrep).
execute.
* Match the household-level variables back onto each individual.
match files /file = *
  /table = "c:\work\esds\aggfile.sav"
  /by hid.
execute.
Working with hierarchy in Stata
• Stata is much better at data manipulation than SPSS
• Not necessary to create an additional file
• Simply run the appropriate procedure for each household separately
  • Sort the data by the household identifier first
  • Use the by option with the household identifier
The equivalent Stata commands:
sort hid
egen nperhh = count(hid), by(hid)
egen oldest = max(age), by(hid)
gen hlthrep = -9
replace hlthrep = health if reltohrp == 1
egen hrphlth = max(hlthrep), by(hid)
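For readers without SPSS or Stata, the same aggregate-and-match-back logic can be sketched in plain Python. The variable names mirror those in the syntax (`hid`, `age`, `reltohrp`, `health`, `nperhh`, `oldest`, `hrphlth`); the records are invented, and this is an illustrative re-expression rather than a translation endorsed by the course.

```python
# Hypothetical person-level records; reltohrp == 1 marks the reference person.
people = [
    {"hid": 1, "age": 52, "reltohrp": 1, "health": 2},
    {"hid": 1, "age": 50, "reltohrp": 2, "health": 1},
    {"hid": 2, "age": 33, "reltohrp": 1, "health": 3},
]

# Step 1: aggregate to one record per household.
agg = {}
for p in people:
    # Low placeholder value unless this person is the reference person.
    hlthrep = p["health"] if p["reltohrp"] == 1 else -9
    a = agg.setdefault(p["hid"], {"nperhh": 0, "oldest": p["age"], "hrphlth": -9})
    a["nperhh"] += 1                           # number of persons in household
    a["oldest"] = max(a["oldest"], p["age"])   # age of oldest member
    a["hrphlth"] = max(a["hrphlth"], hlthrep)  # health of reference person

# Step 2: match the household-level values back onto every individual.
merged = [{**p, **agg[p["hid"]]} for p in people]
```

Taking the maximum of a variable that is -9 for everyone except the reference person is what carries that person's health value to the whole household; it relies on valid health codes all being greater than -9.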
Some issues…
• Is the data representative for your choice of unit?
  • Looking at individuals in a household survey will generally omit individuals not living in households
  • Weighting may be necessary to counteract survey design
  • If the survey was not designed to analyse using the units you use, will it still be representative?
• Will there be any clustering effects?
  • Individuals within households will be more alike than individuals in general
  • This could affect the accuracy of the estimates