Introducing the WageIndicator dataset
Kea Tijdens
University of Amsterdam
Amsterdam Institute for Advanced Labour Studies (AIAS)
15 April 2008, Bussum, NL
What is the web-survey?
• WageIndicator web-survey on work and wages
  • web visitors in all WageIndicator countries are asked to complete the web-survey
  • most of the questions are similar across countries
• Questionnaire topics
  • occupation, education, industry
  • workplace characteristics, firm size, MNE, working conditions
  • employment history, job future
  • working hours, wages, benefits
  • personal questions
  • factual and attitudinal questions
• Data of the web-survey
  • used for the Salary Check
  • used for research
Data of the web survey
• Data releases
  • survey data are released quarterly in SPSS format (or STATA)
  • each country receives a download email
• Large sample sizes: 2007
    Argentina      12,000    Brazil         18,000
    Belgium         9,000    Denmark         1,000
    Finland         6,000    Germany        29,000
    India           2,000    Italy             300
    Hungary         1,500    Mexico          5,000
    Netherlands    40,000    Poland          5,000
    Russian Fed.    5,000    South Africa    2,000
    Spain           8,000    UK              8,000
    US              1,500
Representative data - NO
• Disadvantage
  • volunteer survey, thus not a representative sample, though the higher the Internet access rate, the more likely it is that the Internet population reflects the national population
• Advantage
  • comparable across countries
  • detailed information not available elsewhere
• Weights
  • only available for a few countries
  • and for past years
Data cleaning
• Per release
  • incomplete data are not included
  • observations missing on all six critical variables are not included
  • if two observations have identical scores, the data are not included
  • if out of range or unreliable, wage data are set to missing value
• Per annual release
  • string variables are cleaned
• Check the variable RELEASE
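The per-release rules can be sketched in Python roughly as follows. This is an illustration only, not the project's cleaning procedure. The slide does not name the six critical variables, so the names in `CRITICAL`, the bounds in `WAGE_RANGE`, the use of `None` for a missing value, and the choice to keep the first of two identical observations are all assumptions. The "incomplete data" rule is left out because its definition is not given.

```python
# Illustrative sketch; CRITICAL, WAGE_RANGE and the record layout are
# assumptions, not taken from the WageIndicator documentation.
CRITICAL = ["v1", "v2", "v3", "v4", "v5", "v6"]  # hypothetical stand-ins for the six critical variables
WAGE_RANGE = (1.0, 1000.0)  # hypothetical plausible range for an hourly wage


def clean_release(records):
    """Apply the per-release rules to a list of dict records."""
    seen = set()
    cleaned = []
    for rec in records:
        # Rule: drop the observation if all critical variables are missing.
        if all(rec.get(name) is None for name in CRITICAL):
            continue
        # Rule: drop an observation whose scores duplicate an earlier one
        # (the case identifier is ignored in the comparison).
        key = tuple(sorted((k, v) for k, v in rec.items() if k != "idkey"))
        if key in seen:
            continue
        seen.add(key)
        # Rule: out-of-range wage data is set to missing; the case is kept.
        out = dict(rec)
        wage = out.get("WAGEGRHR")
        if wage is not None and not (WAGE_RANGE[0] <= wage <= WAGE_RANGE[1]):
            out["WAGEGRHR"] = None
        cleaned.append(out)
    return cleaned
```

The function returns new records and leaves the input untouched, in line with the advice to work on a copy of the original dataset.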
Meta data
• locale (per country and language)
  • en_IN en_US es_US
• COUNTRY (ISO code)
  • code label
  • 32 Argentina
  • 276 Germany
• date & time of survey
  • SURVEDAT Date of survey (Date format)
  • SURVEWW Week of survey (Num)
  • SURVEYY Year of survey (Num)
  • survetb Time of survey (begin) (Time format)
  • survete Time of survey (end) (Time format)
• case numbers
  • idkey (unique, automatically assigned)
  • IDNR (unique, referring to survey number + follow-up nr)
The variables
• the number of variables
  • more than 900 variables in the worldwide dataset
  • more than 500 variables in the country datasets
• the variable format
  • numeric variables: almost all variables
  • a few string variables -> occtext, sectext, wagetext
  • a few time & date format variables -> survedat
• text variables delivered separately
  • the text variables are separated from the numerical data
  • delivered separately
  • can be merged using IDNR
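Merging the separately delivered text variables back onto the numeric data by IDNR could look like the sketch below. It assumes both files have already been read into lists of dicts. The function name and the record layout are hypothetical; only IDNR and occtext come from the slides.

```python
def merge_text_variables(numeric_rows, text_rows, key="IDNR"):
    """Left-join text variables onto the numeric records by case number."""
    # Index the text records by case number for quick lookup.
    text_by_id = {row[key]: row for row in text_rows}
    merged = []
    for row in numeric_rows:
        combined = dict(row)
        # Cases without a text record keep their numeric data unchanged.
        extra = text_by_id.get(row[key], {})
        for name, value in extra.items():
            if name != key:
                combined[name] = value
        merged.append(combined)
    return merged


numeric = [{"IDNR": 101, "WAGEGRHR": 18.5}, {"IDNR": 102, "WAGEGRHR": 22.0}]
text = [{"IDNR": 102, "occtext": "nurse"}]
merged = merge_text_variables(numeric, text)
```

A left join keeps every numeric case, so the number of observations does not change when text is missing for some respondents.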
Variable names
• variable names and labels
  • variable names have up to 8 characters, e.g. commsat
  • all variable names have labels, e.g. Satisfaction with commuting time
  • variable labels refer to the question in the questionnaire
• grouped variables start with the same letters
  • break: labour market spell
  • cao: collective bargaining agreement
  • chld: child
  • cob: country of birth
  • comm: commuting
  • comp: working with computer
  • con: working conditions (health, safety)
  • cont: contract
  • dep: department
  • wa: amount in case of fringe benefit
Variables in CAPITALS
• capitals are used for
  • data from parallel questions, such as
    • FOR employees, self-employed, apprentices: How much time is needed to become fully effective in your job for someone with your qualifications? (edujobtr)
    • FOR school pupil, student: How much time is needed to become fully effective in your job? (edujobt1)
    • -> EDUJOBT
  • data from search trees, such as
    • NACE4NUM, for the question 'Please select the main business activity of the organisation where you work'
  • computed data, such as
    • WAGEGRHR, hourly wage computed from the reported wage, working hours and wage period
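Combining parallel questions into one capitalised variable can be pictured as taking whichever version the respondent answered. The sketch below assumes that each respondent saw at most one of the two questions and that an unanswered question is stored as `None`. The function name is hypothetical and this is not necessarily how EDUJOBT is built in the dataset.

```python
def combine_parallel(record, sources=("edujobtr", "edujobt1")):
    """Return the first non-missing answer among the parallel questions."""
    for name in sources:
        value = record.get(name)
        if value is not None:
            return value
    # None of the parallel questions was answered.
    return None


employee = {"edujobtr": 3, "edujobt1": None}
student = {"edujobtr": None, "edujobt1": 5}
employee["EDUJOBT"] = combine_parallel(employee)
student["EDUJOBT"] = combine_parallel(student)
```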
Wage variables
• WAGES
  • WAGEGRHR Hourly gross wage in national currency
  • WAGENEHR Hourly net wage in national currency
  • WAGEGRHL Log hourly gross wage in national currency
  • WAGEGRWE Weekly gross wage in national currency (NOT standardised for hrs)
  • WAGEGRMO Monthly gross wage in national currency (NOT standardised for hrs)
  • WAGEGRAN Annual gross wage in national currency (NOT standardised for hrs)
• BENEFITS
  • franperf Performance bonus (y/n)
  • WAPERFOM Amount of performance allowance
  • -> etc. for all allowances
  • WASUM1 Sum benefits
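An hourly wage such as WAGEGRHR is described as computed from the reported wage, the working hours and the wage period, and WAGEGRHL as its log. The slides give no formula, so the sketch below is only one plausible reading. The weeks-per-period factors, the period labels, the use of weekly hours and the natural logarithm are all assumptions.

```python
import math

# Assumed conversion factors: number of weeks covered by one wage period.
WEEKS_PER_PERIOD = {"week": 1.0, "month": 52.0 / 12.0, "year": 52.0}


def hourly_wage(wage, period, hours_per_week):
    """Standardise a reported wage to an hourly wage; None if not computable."""
    if wage is None or hours_per_week is None or hours_per_week <= 0:
        return None
    weeks = WEEKS_PER_PERIOD.get(period)
    if weeks is None:
        return None
    return wage / (weeks * hours_per_week)


def log_wage(hourly):
    """Natural log of the hourly wage; None for missing or non-positive wages."""
    if hourly is None or hourly <= 0:
        return None
    return math.log(hourly)
```

The weekly, monthly and annual wage variables are marked as not standardised for hours, so comparisons across respondents with different working hours are better made on the hourly variables.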
Industrial relations
• Collective bargaining coverage
  • CAOFIRM Covered by a collective agreement
  • caofirm8 If covered, what kind of agreement
• Membership
  • memproff Member of a professional organisation
  • memstaff Member of a staff association
  • MEMTRAD Member of a labor union
  • memdele1 Member of a works council
  • memnone Member of none of these
  • MEMTRAD4 Member of which trade union (not all countries)
• Blue versus white collar workers
  • CAOCATE (not applicable for all countries)
Occupation & industry
• Occupation
  • we used to have various occupation search trees:
    • OCCUP_DK OCCUPA_D OCCUPA_I OCCUPA_N
  • these are recoded into ISCO4NUM, reflecting the 4-digit ISCO-1988
• Industry
  • NACE4NUM Business activity NACE4
Education
• EDUCAT -> 1 variable for all educational categories
  • code label with country reference
  • 208081 DK Grundskole
  • 208082 DK Almen gymnasial
  • 208083 DK Erhvervsgymnasial
  • 826021 GB Nursery school, playgroups and reception classes
  • 826022 GB Adult literacy / numeracy qualification
  • 826023 GB Commercial / professional qualification
  • etc.
• EDUISCED
  • Education level according to the international ISCED classification
  • (not available for all countries)
• EDUJOBT
  • Training needed for the job
  • 8 categories, ranging from no training required to > 1 year
Region
• REGIHOME
  • -> 1 variable measuring region for all countries
  • code label with country reference
  • 48131000 MX Yucatán
  • 48132000 MX Zacatecas
  • 52811 NL11 Groningen
  • 528111 NL111 Oost-Groningen
  • 528112 NL112 Delfzijl en omgeving
  • 528113 NL113 Overig Groningen
• 5 region variables
  • REGIHOME Region home address (all countries)
  • REGIWORK Region workplace (some countries)
  • REGIBIRT Region of birth (some countries)
  • POSTCOD2 Postcode 2 digit (NL and HU)
  • POSTCUSA Postcode (USA)
Ethnic group
• Country of birth
  • COBSELF Country of birth self (measured in most countries)
  • COBMOTHE Country of birth mother (measured in some countries)
  • COBFATHE Country of birth father (measured in some countries)
• Ethnic group (measured in some countries)
  • code label with country reference
  • 156021 CN Hezhe
  • 156045 CN Tartar
  • 156018 CN Gaoshan
  • 356001 IN SC
  • 356002 IN ST
  • 356003 IN OBC
  • 356099 IN Other
  • 826001 GB White
  • 826002 GB Mixed
  • 826003 GB Asian / Asian British
  • etc.
• Language spoken at home
  • COBLANGU Language spoken at home (measured in most countries)
Households
• Household composition
  • hhpartn1 Living with partner
  • hhchild Living with one or more children
  • hhgchild Living with grandchildren
  • etc.
• Children
  • chld Do you have any children? y/n
  • CHLDHOME Number of children living at home
  • CHLDOUT Number of children not living at home
  • YYOLCHLD Year oldest child born
  • YYYOCHLD Year youngest child born
• Civil status
  • HHSTAT Married, never married, widowed, divorced
Missing values
• missing values
  • 9 User missing
  • 8 Not applicable
  • 7 I don't know
  • 6 Not asked (random item)
  • 5 Not asked (release/country specific)
  • 3 Out of range
  • 2 Missing reason unknown
  • 1 Not asked (skipped)
• system missing values
  • a variable is system missing if the survey question was not asked in the country in question
• always check with RELEASE
  • whenever using a variable, always make a cross-table of the variable with RELEASE, for a better understanding of the variable
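The recommended cross-table of a variable with RELEASE can also be produced outside SPSS. The sketch below counts the values of one variable within each release and treats `None` as system missing. The function name, the record layout and the release values in the example are hypothetical.

```python
from collections import Counter


def crosstab_with_release(records, variable):
    """Count values of `variable` within each RELEASE; None counts as system missing."""
    table = {}
    for rec in records:
        release = rec.get("RELEASE")
        value = rec.get(variable)
        label = "sysmis" if value is None else value
        table.setdefault(release, Counter())[label] += 1
    return table


rows = [
    {"RELEASE": 1, "commsat": 4},
    {"RELEASE": 1, "commsat": 4},
    {"RELEASE": 2, "commsat": None},
    {"RELEASE": 2},
]
table = crosstab_with_release(rows, "commsat")
```

A release in which every case is system missing indicates that the question was not asked in that release, which is what the check is meant to reveal.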
Using SPSS … 1
• Commands
  • frequencies
  • cross tables
  • multiple cross tables, basic tables
  • means
  • regressions (linear, binary)
• Save syntax files
  • use the drop-down menus for commands
  • save the commands using the 'paste' button
  • commands are pasted into a syntax file
Using SPSS … 2
• Create datasets
  • whenever starting an analysis, use 'save as' to create a copy of the original dataset and use this copy only
  • use it for making selections
  • and for creating new variables
  • save the new data file regularly
Thanks
• For many notes, see
  • the research lab at www.wageindicator.org
• Thank you for your attention!