Quantifying Data
Data Entry • Define variables, enter case data, conduct runs • Coding and Recoding • If numeric values are not pre-assigned, decide on a coding system • If there are open-ended data, decide how to deal with the responses • Defining your variables
Data Cleaning • Read each set of responses back (immediately) to confirm accuracy • “Possible-code cleaning” • easiest way to check is to run a frequency distribution and look for codes that should not exist • Contingency cleaning • On the “if” questions • “Sort” by response • do you recycle… then check the “what do you recycle” variable • Can also run cross tabs and make sure the cells that should be impossible are empty
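The two cleaning checks above can be sketched in Python. This is only an illustration under stated assumptions: the variable names `recycles` and `what_recycled`, the code values, and the four sample cases are hypothetical and do not come from the slides.

```python
from collections import Counter

# Hypothetical codebook: the only legitimate codes for each variable.
VALID_CODES = {
    "recycles": {1, 2},             # 1 = yes, 2 = no
    "what_recycled": {0, 1, 2, 3},  # 0 = not applicable, 1 = paper, 2 = glass, 3 = cans
}

# Hypothetical case data. Case 3 carries an impossible code;
# case 4 says "no" to recycling but still names an item.
cases = [
    {"id": 1, "recycles": 1, "what_recycled": 2},
    {"id": 2, "recycles": 2, "what_recycled": 0},
    {"id": 3, "recycles": 7, "what_recycled": 0},
    {"id": 4, "recycles": 2, "what_recycled": 1},
]


def frequency_distribution(cases, variable):
    """Count how many cases fall under each code of one variable."""
    return Counter(case[variable] for case in cases)


def impossible_codes(cases, variable):
    """Possible-code cleaning: IDs of cases whose code is not in the codebook."""
    return [c["id"] for c in cases if c[variable] not in VALID_CODES[variable]]


def contingency_errors(cases):
    """Contingency cleaning: non-recyclers (2) must be 'not applicable' (0) on the follow-up."""
    return [c["id"] for c in cases if c["recycles"] == 2 and c["what_recycled"] != 0]


print(frequency_distribution(cases, "recycles"))
print(impossible_codes(cases, "recycles"))
print(contingency_errors(cases))
```

The flagged case IDs point back to the questionnaires that need to be pulled and rechecked.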
Basic Analysis – Measures of Central Tendency • Mean: sum of values divided by the number of cases • simple average • Median: middle attribute in a ranked list of observed attributes • not distorted by extreme cases • Mode: most frequently occurring attribute • used with nominal variables, e.g., sex • most respondents were women • usually report with percentage, 60% were women
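A small Python sketch of the three measures, using the standard `statistics` module. The `ages` and `sex` lists are made-up data chosen so that one extreme case pulls the mean but not the median.

```python
from statistics import mean, median, mode

# Hypothetical data: the value 85 is an extreme case.
ages = [19, 20, 20, 21, 22, 23, 85]
sex = ["F", "F", "M", "F", "M"]

# Mean: sum of values divided by the number of cases.
age_mean = mean(ages)        # 210 / 7 = 30

# Median: middle value of the ranked list; the extreme case does not move it.
age_median = median(ages)    # 21

# Mode: most frequent attribute; suitable for a nominal variable such as sex.
sex_mode = mode(sex)         # "F"

# Report the mode with its percentage.
mode_percent = 100 * sex.count(sex_mode) / len(sex)   # 60.0

print(age_mean, age_median, sex_mode, mode_percent)
```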
Cross Tabs • Often used with bivariate data • Convention usually places • “independent variables” across top in columns • “dependent variables” in rows below
Coding and data entry options • Transfer sheets are special forms ruled off in 80 columns • Edge coding involves recording code numbers in the margins of questionnaires • Direct data entry involves entering data directly into the computer, eliminating transfer sheets • Data entry by interviewer (CATI, computer-assisted telephone interviewing) • Optical scan sheets
Coding • What is it? • It is the assignment of numerical values to information or responses gathered by a research instrument • Codebook: describes the locations of variables and lists the codes assigned to the attributes of the variables
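One way to picture a codebook is as a lookup table that gives each variable a location and a set of codes. The sketch below is an assumption-laden illustration: the variables, column positions, code values, and the missing-data code `9` are all hypothetical.

```python
# Hypothetical codebook: for each variable, its column location in the
# data file and the numeric code assigned to each attribute.
CODEBOOK = {
    "sex": {"column": 1, "codes": {"male": 1, "female": 2}},
    "recycles": {"column": 2, "codes": {"yes": 1, "no": 2}},
}

MISSING = 9  # hypothetical code for blank or unusable answers


def code_case(raw_answers):
    """Turn one respondent's raw answers into numeric codes, ordered by column."""
    ordered = sorted(CODEBOOK.items(), key=lambda item: item[1]["column"])
    record = []
    for variable, spec in ordered:
        answer = raw_answers.get(variable, "").strip().lower()
        record.append(spec["codes"].get(answer, MISSING))
    return record


print(code_case({"sex": "Female", "recycles": "yes"}))
print(code_case({"sex": "male"}))  # no answer on "recycles"
```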
Data Management Process • Concerned with the process by which raw data gathered by some instrument are converted into numbers for analysis purposes
Collect information with data gathering instrument • Use codebook to transfer this information to a transfer sheet or code sheet (optional) • Create data file from information on code sheet by entering data from a computer keyboard • Check/clean up data file for accuracy • Data cleaning done by • Computer edit programs • Examine distributions • Contingency cleaning
What about open-ended items? • Read through responses and create a preliminary code based on responses • If more than 10% of responses fall into the "other" category, the code needs to be revised to include many of these responses
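The 10% rule can be checked mechanically once a preliminary code exists. In the sketch below, assigning codes by keyword match is only a stand-in for a researcher reading each response; the keywords, code numbers, and sample responses are hypothetical.

```python
from collections import Counter

# Hypothetical preliminary coding scheme: keyword -> numeric code.
PRELIMINARY_CODES = {"cost": 1, "time": 2, "location": 3}
OTHER = 99  # hypothetical code for responses that fit no category


def code_response(text):
    """Assign the first matching code, or OTHER when nothing matches."""
    lowered = text.lower()
    for keyword, code in PRELIMINARY_CODES.items():
        if keyword in lowered:
            return code
    return OTHER


def other_share(responses):
    """Proportion of responses that landed in the 'other' category."""
    codes = [code_response(r) for r in responses]
    if not codes:
        return 0.0
    return Counter(codes)[OTHER] / len(codes)


def needs_revision(responses, threshold=0.10):
    """True when more than 10% of responses are coded 'other'."""
    return other_share(responses) > threshold


responses = ["Too much cost", "No time", "Bad location", "I forgot", "Cost again"]
print(other_share(responses), needs_revision(responses))
```

Here one response in five is "other", so the scheme would be revised, for example by adding a category for forgetting.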
Elementary Quantitative Analyses • To understand the meaning of univariate, bivariate, and multivariate analysis • To become familiar with the meaning of several univariate and bivariate statistics
Analysis Strategies • Why do we have to have them? • People who read our ‘research’ are interested in the highlights • Should try to communicate findings in an understandable and ‘painless fashion’
Three types of analysis • Univariate analysis • the examination of the distribution of cases on only one variable at a time (e.g., college graduation) • Bivariate analysis • the examination of two variables simultaneously (e.g., the relation between gender and college graduation) • Multivariate analysis • the examination of more than two variables simultaneously (e.g., the relationship between gender, race, and college graduation)
“Purpose” • Univariate analysis • Purpose: description • Bivariate analysis • Purpose: determining the empirical relationship between the two variables • Multivariate analysis • Purpose: determining the empirical relationship among the variables
Types of Statistics • Techniques that summarize and describe characteristics of a group or make comparisons of characteristics between groups are known as descriptive statistics. • Inferential statistics are used to make generalizations or inferences about a population based on findings from a sample. • The choice of a type of analysis is based on the evaluation questions, the type of data collected, and the audience who will receive the results.
Univariate Analysis • Involves examination of the distribution of cases on only ONE variable at a time • Frequency distributions are listings of the number of cases in each attribute of a variable • Ungrouped frequency distribution • Grouped frequency distribution • Proportions express number of cases of the criterion variable as part of the total population; frequency of criterion variable divided by N
Percentages are simply 100 × proportion • Or [100 × (frequency of criterion variable divided by N)] • Rates make comparisons more meaningful by controlling for population differences
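The univariate summaries above can be sketched with a few short Python functions. All data here are hypothetical, and the "per 1,000" base for rates is an assumed convention, not one given in the slides.

```python
from collections import Counter


def frequency_distribution(values):
    """Ungrouped: number of cases in each attribute."""
    return Counter(values)


def grouped_frequency_distribution(values, width):
    """Grouped: number of cases in each interval of the given width."""
    return Counter((v // width * width, v // width * width + width - 1) for v in values)


def proportion(values, criterion):
    """Frequency of the criterion attribute divided by N."""
    return values.count(criterion) / len(values)


def percentage(values, criterion):
    """100 x proportion."""
    return 100 * proportion(values, criterion)


def rate(events, population, per=1000):
    """Events per `per` people, so places of different size can be compared."""
    return events * per / population


ages = [18, 19, 19, 22, 25, 31, 34]
graduated = ["yes", "no", "yes", "yes"]

print(frequency_distribution(ages))
print(grouped_frequency_distribution(ages, 10))
print(proportion(graduated, "yes"), percentage(graduated, "yes"))

# Town B has more events in total, but town A has the higher rate.
print(rate(50, 20_000), rate(100, 100_000))
```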
Measures of Central Tendency • Measures of central tendency reflect the central tendencies of a distribution • Mode reflects the attribute with the greatest frequency • Median reflects the attribute that cuts the distribution in half • Mean reflects the average; sum of attributes divided by # of cases
Measures of Dispersion • Measures of dispersion reflect the spread of values in a distribution • Range is the difference between largest & smallest scores; high – low • Variance is the average of the squared differences between each observation and the mean • Standard deviation is the square root of variance
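A short worked example in Python, with hypothetical scores. It follows the definition given above, where variance is the average of the squared differences (dividing by N), which corresponds to `statistics.pvariance`. Sample variance, `statistics.variance`, divides by N − 1 instead and would give a slightly larger value.

```python
from math import sqrt
from statistics import mean, pstdev, pvariance

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores

# Range: high minus low.
score_range = max(scores) - min(scores)          # 9 - 2 = 7

# Variance: average of squared differences from the mean (mean is 5).
m = mean(scores)
variance_by_hand = sum((x - m) ** 2 for x in scores) / len(scores)   # 32 / 8 = 4.0

# Standard deviation: square root of the variance.
sd_by_hand = sqrt(variance_by_hand)              # 2.0

# The standard library gives the same results.
print(score_range, variance_by_hand, sd_by_hand)
print(pvariance(scores), pstdev(scores))
```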
Types of Variables • Continuous: increase steadily in tiny fractions • Discrete: jump from category to category
Subgroup Comparisons • Subgroup comparisons fall somewhere between univariate & bivariate analysis • Present descriptive univariate data for each of several subgroups • Ratios: compare the number of cases in one category with the number in another
Bivariate Analysis • Bivariate analysis focuses on the relationship between two variables
Contingency Tables • Format: attributes of independent variable are used as column headings and attributes of the dependent variable are used as row headings • Guidelines for presenting & interpreting contingency tables • Contents of table described in title • Attributes of each variable clearly described • Base on which percentages are computed should be shown • Norm is to percentage down & compare across • Table should indicate # of cases omitted from analysis
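The "percentage down, compare across" convention can be sketched as follows. The variables `gender` and `graduated` echo the earlier example, but the nine cases are invented for illustration, and treating `None` as a missing answer is an assumption.

```python
from collections import Counter


def contingency_table(cases, independent, dependent):
    """Column percentages: independent variable in columns, dependent in rows.

    Returns (table, bases, omitted), where table[row][column] is a percentage
    of the column total, bases holds each column's N, and omitted counts
    cases dropped because either variable was missing.
    """
    complete = [c for c in cases
                if c.get(independent) is not None and c.get(dependent) is not None]
    omitted = len(cases) - len(complete)

    cell_counts = Counter((c[dependent], c[independent]) for c in complete)
    bases = Counter(c[independent] for c in complete)

    rows = sorted({c[dependent] for c in complete})
    columns = sorted(bases)
    table = {
        row: {col: 100 * cell_counts[(row, col)] / bases[col] for col in columns}
        for row in rows
    }
    return table, dict(bases), omitted


# Hypothetical cases; the last one has no answer on the dependent variable.
cases = (
    [{"gender": "men", "graduated": "yes"}] * 2
    + [{"gender": "men", "graduated": "no"}] * 2
    + [{"gender": "women", "graduated": "yes"}] * 3
    + [{"gender": "women", "graduated": "no"}]
    + [{"gender": "women", "graduated": None}]
)

table, bases, omitted = contingency_table(cases, "gender", "graduated")
for row, cells in table.items():
    print(row, cells)          # compare across each row
print("Base N:", bases, "Omitted:", omitted)
```

Each column sums to 100%, so reading across the "yes" row compares the graduation percentage of one group with the other.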
Multivariate Analysis • Multivariate analysis allows the separate and combined effects of the independent variables to be examined