Analysing and Presenting Quantitative Data: Inferential Statistics
Objectives
After this session you will be able to:
• Choose and apply the most appropriate statistical techniques for exploring relationships and trends in data (correlation and inferential statistics).
Stages in hypothesis testing
• Hypothesis formulation.
• Specification of significance level (to see how safe it is to accept or reject the hypothesis).
• Identification of the probability distribution and definition of the region of rejection.
• Selection of appropriate statistical tests.
• Calculation of the test statistic and acceptance or rejection of the hypothesis.
Hypothesis formulation
Hypotheses come in essentially three forms. Those that:
• Examine the characteristics of a single population (and may involve calculating the mean, median and standard deviation and the shape of the distribution).
• Explore contrasts and comparisons between groups.
• Examine associations and relationships between groups.
Specification of significance level – potential errors
• The significance level is not about importance. It is the threshold probability below which a result is treated as unlikely to have arisen by chance alone.
• Typical significance levels:
  • p = 0.05 (a result this extreme would occur by chance alone 5% of the time if there were no real effect)
  • p = 0.01 (a result this extreme would occur by chance alone 1% of the time if there were no real effect)
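To make the meaning of the significance level concrete, the following is a minimal simulation sketch, not part of the original slides. It assumes a simple one-sample z-test with a known standard deviation, and the function names (`z_test_p_value`, `false_positive_rate`), sample size and number of trials are all illustrative choices. It repeatedly tests a null hypothesis that is actually true and counts how often it is rejected anyway.

```python
import random
from statistics import NormalDist, mean


def z_test_p_value(sample, mu0, sigma):
    """Two-tailed p-value for a one-sample z-test with known sigma."""
    z = (mean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rate(alpha, trials=2000, n=30, seed=1):
    """Share of tests that reject a null hypothesis that is actually true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # The null hypothesis is true here: the data really have mean 0, sd 1.
        sample = [rng.gauss(0, 1) for _ in range(n)]
        if z_test_p_value(sample, mu0=0, sigma=1) < alpha:
            rejections += 1
    return rejections / trials


rate_05 = false_positive_rate(0.05)
rate_01 = false_positive_rate(0.01)
print(f"alpha = 0.05: true null rejected in {rate_05:.1%} of trials")
print(f"alpha = 0.01: true null rejected in {rate_01:.1%} of trials")
```

The rejection rates should come out close to the chosen significance levels, which is the sense in which a stricter level (0.01 rather than 0.05) makes a false alarm less likely.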
Nominal groups and quantifiable data (normally distributed)
To compare the performance/attitudes of two groups, or to compare the performance/attitudes of one group over a period of time using quantifiable variables such as scores. Use a paired t-test, which compares the two sets of scores to see whether the difference between their means is significant. The paired form applies when the same group is measured twice (or the cases are matched); for two independent groups, the independent-samples t-test is the counterpart. Assumption: data are normally distributed.
Data outputs: test for normality
[Output tables: Case Processing Summary; Tests of Normality (with Lilliefors Significance Correction)]
Statistical output
[Output tables: Paired Samples Statistics; Paired Samples Test]
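As an illustration of what the paired t-test calculates, here is a minimal sketch using only the Python standard library. The scores are hypothetical, invented for this example, and the function name `paired_t` is an assumption rather than anything from the slides. The critical value 2.262 is the standard two-tailed value for 9 degrees of freedom at the 0.05 level.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for paired observations."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    # t = mean difference divided by the standard error of the differences
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Hypothetical scores for the same ten people, measured twice.
before = [62, 70, 58, 65, 74, 68, 60, 71, 66, 63]
after = [66, 73, 61, 64, 79, 72, 63, 75, 70, 65]

t, df = paired_t(before, after)

# Two-tailed critical value of t for df = 9 at the 0.05 level.
T_CRITICAL_DF9 = 2.262
significant = abs(t) > T_CRITICAL_DF9
print(f"t = {t:.3f}, df = {df}, significant at 0.05: {significant}")
```

Comparing against a tabulated critical value keeps the sketch dependency-free; a statistics package would normally report an exact p-value instead.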
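Nominal groups and quantifiable data (not normally distributed)
To compare the performance/attitudes of two groups, or to compare the performance/attitudes of one group over a period of time using quantifiable variables such as scores. Use Mann-Whitney U. Mann-Whitney U applies to two independent groups; for one group measured twice, the Wilcoxon signed-rank test is the non-parametric counterpart. Assumption: data are not normally distributed.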
Statistical output
[Output tables: Tests of Normality (with Lilliefors Significance Correction); Ranks; Test Statistics (grouping variable: Sex)]
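The Mann-Whitney U test works on ranks rather than raw scores. The sketch below is a simplified illustration under stated assumptions: the data are hypothetical, the helper names are invented, and the p-value uses the large-sample normal approximation without tie or continuity corrections, so it is only approximate for small groups like these.

```python
from math import sqrt
from statistics import NormalDist


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U, approximate two-tailed p) for two independent groups."""
    n1, n2 = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n1])
    u1 = rank_sum_a - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    u = min(u1, u2)
    # Normal approximation (no tie or continuity correction).
    mu = n1 * n2 / 2
    sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma  # z <= 0 because u is the smaller of u1 and u2
    p = 2 * NormalDist().cdf(z)
    return u, p


# Hypothetical scores for two independent groups.
group_a = [12, 15, 11, 18, 14, 13, 16, 10]
group_b = [22, 19, 25, 17, 21, 24, 20, 23]

u, p = mann_whitney_u(group_a, group_b)
print(f"U = {u}, approximate p = {p:.4f}")
```

A small U means the two groups barely overlap in rank order, which is what the "Ranks" output table summarises through mean ranks.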
Association between two nominal variables
We may want to investigate relationships between two nominal variables – for example:
• Educational attainment and choice of career.
• Type of recruit (graduate/non-graduate) and level of responsibility in an organization.
Use chi-square when you have two or more variables, each of which contains two or more categories.
Statistical output
[Output tables: Chi-Square Tests; Symmetric Measures]
Chi-Square Tests notes: (a) Computed only for a 2x2 table. (b) 0 cells (.0%) have expected count less than 5. The minimum expected count is 33.08.
Symmetric Measures notes: (a) Not assuming the null hypothesis. (b) Using the asymptotic standard error assuming the null hypothesis.
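For a 2x2 table, the chi-square statistic can be computed by hand from observed and expected counts. The following sketch uses hypothetical counts for the graduate/non-graduate example; the function name and the figures are assumptions made for illustration. It applies no continuity correction, and the p-value formula is valid only for one degree of freedom (a 2x2 table).

```python
from math import erfc, sqrt


def chi_square_2x2(table):
    """Return (chi-square, p-value) for a 2x2 table of observed counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, chi-square is a squared standard normal,
    # so the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p = erfc(sqrt(chi2 / 2))
    return chi2, p


# Hypothetical counts: rows = graduate / non-graduate recruits,
# columns = high / low level of responsibility.
observed = [[40, 60],
            [25, 75]]

chi2, p = chi_square_2x2(observed)
print(f"chi-square = {chi2:.3f}, p = {p:.4f}")
```

The check on expected counts reported in the output notes matters here too: the chi-square approximation is usually considered unreliable when expected counts fall below 5.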
Correlation analysis
Correlation analysis is concerned with associations between variables, for example:
• Does the introduction of performance management techniques to specific groups of workers improve morale compared to other groups? (Relationship: performance management/morale.)
• Is there a relationship between size of company (measured by size of workforce) and efficiency (measured by output per worker)? (Relationship: company size/efficiency.)
• Do measures to improve health and safety inevitably reduce output? (Relationship: health and safety procedures/output.)
Strength of association based upon the value of a coefficient
Calculating a correlation for a set of data
We may wish to explore a relationship when:
• The subjects are independent and not chosen from the same group.
• The values for X and Y are measured independently.
• X and Y values are sampled from populations that are normally distributed.
• Neither X nor Y is controlled by the researcher (if one of them is controlled, linear regression, not correlation, should be calculated).
Associations between two ordinal variables
For data that are ranked, or in circumstances where relationships are non-linear, Spearman’s rank-order correlation (Spearman’s rho) can be used.
Statistical output
[Output table: Correlations]
** Correlation is significant at the 0.01 level (2-tailed).
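Spearman's rho can be sketched in a few lines once the data are converted to ranks. The version below is a simplified illustration that assumes there are no tied values, so the shortcut formula based on squared rank differences applies; the function names and data are hypothetical.

```python
def ranks(values):
    """Rank from 1 (smallest) to n. Assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman_rho(x, y):
    """Spearman's rho via the rank-difference formula (no ties)."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical data: y rises with x, but along a curve rather than a line.
x = [1, 2, 3, 4, 5, 6]
y = [1, 4, 9, 16, 25, 36]

rho_curved = spearman_rho(x, y)
rho_reversed = spearman_rho(x, list(reversed(y)))
print(f"rho (curved, increasing) = {rho_curved}")
print(f"rho (reversed order)     = {rho_reversed}")
```

Because only the rank order matters, a relationship that rises along a curve still gives a perfect coefficient, which is why rho suits ranked data and non-linear but consistently rising or falling relationships.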
Association between numerical variables
We may wish to explore a relationship when there are potential associations between, for example:
• Income and age.
• Spending patterns and happiness.
• Motivation and job performance.
Use Pearson Product-Moment (if the relationships between variables are linear). If the relationship is curved rather than linear, but still consistently rising or falling, use Spearman’s rho.
Statistical output
[Output tables: Descriptive Statistics; Correlations]
** Correlation is significant at the 0.01 level (2-tailed).
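The Pearson product-moment coefficient can be computed directly from deviations around the two means. This is a minimal sketch with hypothetical age and income figures; the function name `pearson_r` and the data are assumptions for illustration only, and no significance test is included.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient for paired values."""
    mx, my = mean(x), mean(y)
    # Sum of cross-products of deviations from each mean.
    cross = sum((a - mx) * (b - my) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mx) ** 2 for a in x))
    spread_y = sqrt(sum((b - my) ** 2 for b in y))
    return cross / (spread_x * spread_y)


# Hypothetical data: age in years, income in thousands.
age = [25, 30, 35, 40, 45, 50]
income = [22, 27, 30, 36, 41, 44]

r = pearson_r(age, income)
print(f"r = {r:.3f}, r squared = {r ** 2:.3f}")
```

The coefficient ranges from -1 to +1; squaring it gives the proportion of variance in one variable that is accounted for by a linear relationship with the other.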
Summary
• Inferential statistics are used to draw conclusions from the data and involve the specification of a hypothesis and the selection of appropriate statistical tests.
• Some of the inherent danger in hypothesis testing is in making Type I errors (rejecting a null hypothesis when it is, in fact, true) and Type II errors (failing to reject a null hypothesis when it is false).
• For categorical data, non-parametric statistical tests can be used; for quantifiable data, more powerful parametric tests can be applied, provided their assumptions hold. Parametric tests usually require that the data are normally distributed.