Action Research Review
INFO 515, Glenn Booker
Lecture #10
Why do we do this?
• Measurements are needed to understand a system and predict its future behavior
• Statistical techniques provide a commonly accepted means of analyzing measurements
• Statistics is based on recognizing that measurements tend to fall over a range of values, not just one precise number
Types of Research
• Historical (what happened?)
• Descriptive (what is happening?)
• Developmental (over time)
• Case and Field (study an organization)
• Correlational (does A affect B?)
• Causal Comparative (what caused it?)
• True Experimental (single / double blind)
• Quasi-Experimental
• Action Research
Data Analysis
• Raw data, such as one survey result
• Refined data, such as the distribution of ages of Philadelphia residents
• Derived data, such as comparing the age distribution of Philadelphia residents to that of the country
Population vs. Sample
• Often the subject of interest (the population) is so big that it isn't feasible to measure all of it
• Then a sample of measurements can be made, and we want to relate the sample measurements to the population
Sampling
• Sampling can be done using probabilistic techniques (e.g. various random samples):
• Simple or stratified random,
• Cluster (geographic), or
• Systematic (every Nth) samples
• Or using non-probabilistic methods (whoever's convenient, specific groups, or experts)
Customer Satisfaction Surveys
• A special case of sampling, customer satisfaction surveys are often done using:
• In-person interview
• Telephone interview
• Questionnaire by mail
• Sample sizes are based on the allowable error, the population size, and the result obtained
Measurement Scales
• Measurements can use four major types of scales; the types of analysis possible depend strongly on the type of measurements used
• Nominal (named buckets, without sequence)
• Ordinal (ordered buckets)
• Interval (intervals are meaningful; can add and subtract)
• Ratio (ratios are meaningful; can add, subtract, multiply, and divide)
Discrete versus Continuous
• Discrete (nonparametric) measurements use nominal or ordinal scales; only specific values are allowed
• Car make = Chevy, or cost = High
• Continuous (parametric) measurements use interval or ratio scales, and generally have integer or real number values
• Temperature = 98.6 deg F, Height = 172.1 cm
Descriptive Statistics
• Many common statistics can describe the central tendency and spread of a set of measurements (see the sketch below)
• Average (arithmetic mean)
• Minimum, Maximum, Range
• Median (middle value)
• Mode (most common value)
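As a minimal illustration (these slides use SPSS, but the idea is tool-independent; the data values below are invented), Python's standard statistics module computes each of these directly:

```python
import statistics

data = [4, 8, 6, 5, 8, 7, 9, 8, 5, 6]  # hypothetical measurements

print("Mean:  ", statistics.mean(data))    # arithmetic mean
print("Median:", statistics.median(data))  # middle value
print("Mode:  ", statistics.mode(data))    # most common value (8 here)
print("Range: ", max(data) - min(data))    # maximum minus minimum
```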
Normal Distribution
• Many measurements can be described by a "normal" distribution, which is summarized by an average value and a standard deviation (σ for a population, s for a sample)
• We can predict how likely any range of values is to occur for a normal distribution (how often is X between 5 and 8?)
Z Score
• Z scores measure how far from the mean a single measurement is:
z = (Xi − μ) / σ
• The same formula is used for finding "t" too
• This does not apply only to a normal distribution, but if the distribution is normal, we can predict the probability of that value or higher/lower occurring (see the sketch below)
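A minimal sketch of both ideas (the probability of a range for a normal distribution, and the z score) in Python with SciPy; the mean, standard deviation, and data value are assumed purely for illustration:

```python
from scipy.stats import norm

mu, sigma = 6.5, 1.2   # hypothetical population mean and standard deviation
x = 8.0                # a single measurement

# z score: how many standard deviations x lies from the mean
z = (x - mu) / sigma
print(f"z = {z:.2f}")

# For a normal distribution, probability that X falls between 5 and 8
p = norm.cdf(8, loc=mu, scale=sigma) - norm.cdf(5, loc=mu, scale=sigma)
print(f"P(5 < X < 8) = {p:.3f}")

# Probability of seeing this value or higher
print(f"P(X >= {x}) = {1 - norm.cdf(z):.3f}")
```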
Standard Error
• A sample of N measurements will have a standard error:
SEx = s / sqrt(N)
• The standard error allows us to define the confidence interval, CI:
CI = mean ± crit · SEx
where "crit" is the critical z score for a large sample, or the critical t score for a small sample
Critical z and t
• The critical z score is only a function of the desired confidence level of the results (zc = 1.96 for a 95% confidence level)
• The critical t score is a function of the sample size (degrees of freedom, df = n − 1) and the desired confidence level
• As df gets very large, critical t approaches critical z (see the sketch below)
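A minimal sketch of the confidence interval calculation with SciPy; the sample data is invented:

```python
import numpy as np
from scipy import stats

sample = np.array([5.1, 6.3, 5.8, 6.0, 5.5, 6.7, 5.9, 6.2])  # hypothetical data
n = len(sample)
mean = sample.mean()
se = stats.sem(sample)                # standard error: s / sqrt(N)

# Critical t for a 95% confidence level, two-tailed, df = n - 1
t_crit = stats.t.ppf(0.975, df=n - 1)

ci_low, ci_high = mean - t_crit * se, mean + t_crit * se
print(f"mean = {mean:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")

# For a large sample the critical t approaches the critical z:
print(stats.norm.ppf(0.975))          # 1.96
print(stats.t.ppf(0.975, df=10_000))  # ~1.96
```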
Confidence Level
• We have to accept some level of uncertainty in a statistical analysis; our conclusion might be wrong!
• Generally, a 95% level of confidence is used, unless life is on the line; then a 99% level of confidence is required
• Use 95% typically; hence the critical significance is 0.050
Confidence Level
• The level of confidence of your results plus the critical significance always equals exactly one
• For practically every statistical test, having the significance of the result less than the critical value means to reject the null hypothesis
• If Sig(actual) < Sig(critical), reject the null hypothesis
Frequency and Percentage
• Frequency graphs and crosstabs can provide a lot of information just from counts of a nominal or ordinal measurement occurring, possibly given with the percentage of each event's occurrence
• Histograms can provide similar charts for interval or ratio scaled data
Scatterplots
• Scatter plots or diagrams show the relationship between two or more measures
• The horizontal axis is generally the independent variable (X), sometimes also called a factor or grouping variable
• The vertical axis is generally the dependent variable (Y), which is the measure you're trying to understand
Hypothesis Testing
• Some statistics are used in the context of testing a hypothesis: a statement whose truth you wish to determine
• Are Philadelphians more likely to be Nobel Prize winners?
• The null hypothesis is the opposite of the hypothesis, and generally says there is no difference or no effect observed
• Philadelphians are no more likely to be Nobel Prize winners than any other group
Hypothesis Testing
• We can't truly PROVE anything; we can only determine whether the differences observed are "not likely to be due to chance"
• Select one or more "Tests of Significance" to determine if there is a statistically significant difference (Yes/No); if Yes, then you can:
• Select one or more "Measures of Association" to describe the strength of the difference, and possibly its direction
One versus Two Tailed Tests
• A null hypothesis which tests for "no difference" uses a two-tailed test
• A null hypothesis which specifically tests for "greater than" uses a one-tailed test
• A null hypothesis which specifically tests for "less than" also uses a one-tailed test
• The choice changes the critical z or t score; a one-tailed test generally makes it easier to show significance, which is why the more conservative two-tailed test is usually preferred (see the sketch below)
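A quick check of the two critical values in Python (a 95% confidence level is assumed):

```python
from scipy.stats import norm

alpha = 0.05
print(norm.ppf(1 - alpha / 2))  # two-tailed critical z: 1.96
print(norm.ppf(1 - alpha))      # one-tailed critical z: 1.645 (a lower bar)
```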
Z or T Test
• The z or t tests can be used to compare two distribution means, or to compare one distribution mean to a fixed value (interval or ratio data)
• Compare the actual z or t score to the critical z or t score
• If the actual z or t score is closer to zero than the critical value, accept (fail to reject) the null hypothesis
Z or T Test (Two Tailed)
• Notice this is for the z or t value, NOT the significance of that value
Z or T Test (One Tailed)
• (The case here tests whether the actual value is greater than the mean; for a "less than" case, use only the negative critical value.)
Is My Sample Normal?
• Boxplots and stem-and-leaf diagrams can help show graphically whether a sample has a fairly normal distribution
• The skewness and kurtosis of a data set can help identify non-normality, if their values are more than two times their own standard errors (see the sketch below)
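A sketch of that rule of thumb in Python. Note one assumption: the simple large-sample approximations sqrt(6/n) and sqrt(24/n) are used for the standard errors of skewness and kurtosis, rather than the exact formulas SPSS reports:

```python
import numpy as np
from scipy.stats import skew, kurtosis

data = np.random.default_rng(42).normal(size=200)  # hypothetical sample

n = len(data)
se_skew = (6 / n) ** 0.5    # approximate standard error of skewness
se_kurt = (24 / n) ** 0.5   # approximate standard error of excess kurtosis

# Non-normality is suspected if either statistic exceeds 2x its standard error
print("skewness ok:", abs(skew(data)) < 2 * se_skew)
print("kurtosis ok:", abs(kurtosis(data)) < 2 * se_kurt)
```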
T Tests
• T tests compare means for ratio or interval data
• The independent t test is for two different strata within one data set
• The paired t test compares measures of the same group before and after some event (e.g. a drug test), or when the samples are otherwise believed to be dependent on each other
• The one-sample t test compares one sample to a fixed value
T Tests
• The null hypothesis is that there is no difference between the means
• Results (e.g. significance) may differ if the variances are not equal, since df changes
• The Levene test checks for equal variances
• The null hypothesis for the Levene test is that the variances are equal
• If the Levene significance < 0.050, the variances are not equal (reject the null hypothesis)
Independent T Test Evaluation
• Three ways to check the results of a t test (see the sketch below):
• If the t test's significance < 0.050, reject the null hypothesis
• Check the stated t value against the critical t value for this df level; if |t(actual)| > |t(critical)|, reject the null hypothesis
• If the confidence interval for the difference between the means does not include zero, reject the null hypothesis
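A minimal sketch of the Levene-then-t-test workflow in Python with SciPy; the two groups are invented data, and the significance check is the first of the three evaluation methods above:

```python
import numpy as np
from scipy import stats

group_a = np.array([5.1, 6.2, 5.8, 6.5, 5.9, 6.1, 5.7])  # hypothetical
group_b = np.array([6.8, 7.1, 6.5, 7.4, 6.9, 7.2, 6.6])  # hypothetical

# Levene test: null hypothesis is that the variances are equal
lev_stat, lev_p = stats.levene(group_a, group_b)
equal_var = lev_p >= 0.050          # keep the null if Sig >= 0.050

# Independent t test, using the Levene result to pick the variant
t_stat, p_value = stats.ttest_ind(group_a, group_b, equal_var=equal_var)
if p_value < 0.050:
    print(f"t = {t_stat:.2f}, Sig = {p_value:.3f}: reject the null hypothesis")
else:
    print(f"t = {t_stat:.2f}, Sig = {p_value:.3f}: fail to reject")
```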
Evaluating Significance
Paired T Test Evaluation
• Checks before and after test cases
• Includes a correlation factor (like 'r')
• Can use the paired test if its significance < 0.050
• A larger correlation factor means a stronger relationship between the variables
• Test evaluation is the same as for the independent t test: significance, 't' value, and confidence interval (see the sketch below)
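A short sketch with SciPy; the before/after values are invented:

```python
from scipy import stats

before = [72, 75, 70, 78, 74, 73, 77]  # hypothetical pre-treatment scores
after  = [70, 72, 69, 74, 71, 70, 73]  # same subjects, post-treatment

# Correlation factor between the paired measures (like 'r')
r, r_sig = stats.pearsonr(before, after)

# Paired t test: null hypothesis is no difference between the means
t_stat, p_value = stats.ttest_rel(before, after)
print(f"r = {r:.2f}, t = {t_stat:.2f}, Sig = {p_value:.3f}")
```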
One-Sample T Test
• Compares a sample mean to a fixed value (example below)
• The test shows the actual values of the means, with their standard deviation and standard error
• Same interpretation of results: significance, 't' value, and confidence interval
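A minimal sketch; both the sample readings and the fixed comparison value (98.6) are invented:

```python
from scipy import stats

sample = [98.2, 98.9, 98.4, 98.7, 98.5, 99.0, 98.3]  # hypothetical readings

# Null hypothesis: the sample mean equals the fixed value 98.6
t_stat, p_value = stats.ttest_1samp(sample, popmean=98.6)
print(f"t = {t_stat:.2f}, Sig = {p_value:.3f}")
```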
F Test and ANOVA
• Compare several means against each other using Analysis of Variance (ANOVA) and the F test
• Like extending the t tests to many variables
• Want data from random samples of normal populations with equal variances
F Test and ANOVA
• Output includes the Levene test
• Want the significance for Levene > 0.050, so that equal variances can be assumed
• Otherwise, ANOVA should not be used
• Evaluate F by its significance
• If Sig. < 0.050, reject the null hypothesis (there is a significant difference among the means); see the sketch below
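A minimal sketch of the same two-step check (Levene first, then the one-way ANOVA F test) in Python with SciPy; the three groups are invented:

```python
from scipy import stats

g1 = [5.2, 5.8, 6.1, 5.5, 5.9]   # hypothetical group data
g2 = [6.4, 6.9, 6.2, 6.8, 6.5]
g3 = [5.0, 5.3, 5.6, 5.1, 5.4]

# Levene first: want Sig > 0.050 so that equal variances can be assumed
_, lev_p = stats.levene(g1, g2, g3)
print(f"Levene Sig = {lev_p:.3f}")

# One-way ANOVA F test: Sig < 0.050 means the means differ somewhere
f_stat, p_value = stats.f_oneway(g1, g2, g3)
print(f"F = {f_stat:.2f}, Sig = {p_value:.3f}")
```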
Additional ANOVA Tests
• Once the F test shows there is some difference in the means across a subset, additional ANOVA tests can help identify more specific trends and differences
• Types of tests (see the end of lecture 6) include:
• Pairwise Multiple Comparisons
• Post Hoc Range Tests
Pairwise Multiple Comparisons
• Pairwise Multiple Comparisons check two subsets of data at a time
• The Bonferroni test is better for a small number of subsets
• The Tukey test is better for many subsets (see the sketch below)
• Both assume subset variances are equal
• For each pair of subset values, Sig < 0.050 means the difference in means is significant
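SciPy (version 1.8 or later) provides a Tukey HSD implementation; a sketch continuing the invented groups from the ANOVA example:

```python
from scipy import stats

g1 = [5.2, 5.8, 6.1, 5.5, 5.9]   # hypothetical group data
g2 = [6.4, 6.9, 6.2, 6.8, 6.5]
g3 = [5.0, 5.3, 5.6, 5.1, 5.4]

# Tukey pairwise comparisons (requires SciPy 1.8+)
result = stats.tukey_hsd(g1, g2, g3)
print(result)   # pairwise mean differences with a Sig (p-value) for each pair
```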
Post Hoc Range Tests
• Post Hoc Range Tests look for groups within each subset which all have statistically similar means
• The Tukey and Tukey's-b tests include Post Hoc Range Tests
• Each column of the output is a subset with statistically similar means
• Subsets may overlap substantially
Contrasts Across Means
• Look across subset means to see if there is a trend, such as a linear increase or decrease across subsets
• Can check for Linear, Quadratic, or Cubic relationships (i.e. first, second, or third order polynomials)
• Check the Significance of F for the Unweighted version of each relationship (Linear, etc.); if Sig. < 0.050, reject the null hypothesis
Determine Linearity
• An option under Compare Means / Means allows checking just for linearity
• This confirms the ANOVA test result for Linearity
• It also gives the R and Eta parameters, which are Measures of Association
R and Eta
• Pearson's R measures how well the data fits the regression (-1 is a perfect negative correlation, 0 is no relationship, +1 is a perfect positive correlation); R squared describes the amount of shared variance between the variables (see the sketch below)
• Eta squared gives how much of the variance in one variable is explained by changes in the other variable
• (Pearson's R is named for English statistician Karl Pearson, 1857-1936, per http://human-nature.com/nibbs/03/kpearson.html)
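A minimal sketch of R and shared variance in Python; the x and y values are invented:

```python
from scipy import stats

x = [1, 2, 3, 4, 5, 6, 7, 8]                   # hypothetical independent variable
y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.8, 8.2, 8.9]   # hypothetical dependent variable

r, sig = stats.pearsonr(x, y)
print(f"r = {r:.3f}, r^2 (shared variance) = {r**2:.3f}, Sig = {sig:.4f}")
```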
Regression Analysis
• Regression Analysis looks at two interval or ratio-scaled variables (generically X and Y) and tries to fit an equation between them
• A dozen different equations are available: Linear, Power, Logarithmic, Exponential, etc.
• Significance is checked by the ANOVA F and the Sig. of the regression coefficients; association is measured with R Squared
Regression Analysis
• For a regression to have any significance, we must have ANOVA's Sig. F < 0.050
• Then each variable's coefficient (b0, b1, etc.) must have significance < 0.050; otherwise the coefficient might be zero
• Then the better regression equations are ranked in order of strength by R Square, which is confirmed visually by plotting (see the sketch below)
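A minimal sketch of the linear case in Python with SciPy (the other equation forms would need a general curve fitter); the data is invented, and the intercept_stderr attribute requires SciPy 1.6 or later:

```python
from scipy import stats

x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.8, 8.2, 8.9]   # hypothetical data

# Linear regression: y = b0 + b1 * x
fit = stats.linregress(x, y)
print(f"b1 = {fit.slope:.3f} +/- {fit.stderr:.3f}")              # slope and its std error
print(f"b0 = {fit.intercept:.3f} +/- {fit.intercept_stderr:.3f}")
print(f"R^2 = {fit.rvalue**2:.3f}, Sig = {fit.pvalue:.4f}")
```

The coefficient standard errors printed here are what let you form confidence intervals and report results at a sensible precision, as the next slide describes.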
Regression Analysis
• The standard error of the coefficients is given, so confidence intervals can be formed
• This also helps report them meaningfully, so you don't report a value as 4.861435 if it has a standard error of 0.92
• Depending on the accuracy of the source data, you could report that result as 5 +/- 1, or 4.9 +/- 0.9, or 4.86 +/- 0.92
Crosstabs
• Crosstabs display data sorted by two or more variables in table form
• Often just counts of each category, and/or the percentage of counts
• Recoding data allows interval or ratio scale data to be put into groups (e.g. age 18-25)
Pearson's Chi Square
• Measures how much the actual (observed) data differs from an even (expected) distribution of data
• The "expected" data can be a random distribution (same number of counts per cell), or adjusted for the actual total counts for each row and column
Pearson's Chi Square Evaluation
• When chi square is larger than the critical value, reject the null hypothesis
• Or, if the significance of chi square is < 0.050, reject the null hypothesis (see the sketch below)
• Chi square can also be generated for a single variable
• Beware that chi square is less meaningful for large matrices; it's too easy for large matrices to show significance falsely using chi square
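A minimal sketch of the chi square test on a crosstab in Python with SciPy; the table counts are invented:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical crosstab: rows = group, columns = response category
observed = np.array([[30, 10, 20],
                     [20, 25, 15]])

# Expected counts are adjusted for the actual row and column totals
chi2, p_value, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.2f}, df = {dof}, Sig = {p_value:.4f}")
# Sig < 0.050 -> reject the null hypothesis of no association
```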
Residuals
• A residual is the difference between the Observed and Expected (estimated) values for a cell
• Residuals can be plotted to look for outliers
• Residuals can be standardized by dividing by their standard deviation
• Cells with a standardized residual magnitude > 2 contribute a lot to chi square (see the sketch below)
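Continuing the crosstab sketch above. One assumption: the simple Pearson form, (observed − expected) / sqrt(expected), is used as the standardization here; SPSS also offers an adjusted variant:

```python
import numpy as np
from scipy.stats import chi2_contingency

observed = np.array([[30, 10, 20],
                     [20, 25, 15]])
_, _, _, expected = chi2_contingency(observed)

# Pearson (standardized) residuals: (O - E) / sqrt(E)
residuals = (observed - expected) / np.sqrt(expected)
print(np.round(residuals, 2))

# Cells contributing heavily to chi square (|residual| > 2)
print("large contributors:", np.argwhere(np.abs(residuals) > 2))
```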
Measures of Association
• Measures of Association between two variables can be symmetric or directional
• Dozens of measures have been developed to work with the chi square test
• Interpret them like 'r': zero means no correlation, larger values mean a stronger correlation
• Some can be > 1
Measures of Association
• Symmetric measures don't care which variable is dependent (Y)
• Directional measures DO care which variable is dependent (A = f(B) is not B = f(A))
• Some directional measures have a "symmetric" value, the weighted average of the other two
Symmetric Measures
• The "Contingency Coefficient" is the main symmetric measure with a Chi Square test
• Works even with nominal data
• Evaluated like Pearson's r
• Phi and Cramer's V are other symmetric measures
Directional Measures
• Directional measures range from 0 to 1
• Lambda is the recommended directional measure; it tells what proportion of the dependent variable is predicted by the independent variable (like Eta)
• Eta can be applied here if one variable is interval or ratio scaled