Statistics Review

Statistics Review Levels of Measurement

Levels of Measurement Nominal scale • Nominal measurement consists of assigning items to groups or categories. • No quantitative information is conveyed and no ordering of the items is implied. • Nominal scales are therefore qualitative rather than quantitative. • Examples: Religious preference, race, and gender are all examples of nominal scales • Statistics: Sum, Frequency Distributions

Ordinal Scale • Measurements with ordinal scales are ordered: higher numbers represent higher values. • However, the intervals between the numbers are not necessarily equal. • There is no "true" zero point for ordinal scales since the zero point is chosen arbitrarily. • For example, on a five-point Likert scale, the difference between 2 and 3 may not represent the same difference as the difference between 4 and 5. • Also, lowest point was arbitrarily chosen to be 1. It could just as well have been 0 or -5.

Interval & Ratio Scales • On interval measurement scales, one unit on the scale represents the same magnitude on the trait or characteristic being measure across the whole range of the scale. • For example, on an interval/ratio scale of anxiety, a difference between 10 and 11 would represent the same difference in anxiety as between 50 and 51.

Histograms Statistics Review

What can histograms tell you A convenient way to summarize data (especially for larger datasets) Shows the distribution of the variable in the population Gives an approximate idea of the summary and spread of the variable

400 300 200 100 0 0.0 10.0 20.0 30.0 40.0 50.0 60.0 70.0 80.0 90.0 5.0 15.0 25.0 35.0 45.0 55.0 65.0 75.0 85.0 95.0 Distribution of No of Graphics on web pages (N=1873) Mean = 17.93 Median = 16.00 Std. Dev = 17.92 N = 1873 Graphic Count

Statistics Review Mean and Median

Mean and Median Mean is arithmetic average, median is 50% point Mean is point where graph balances • Mean shifts around, • Median does not shift much, is more stable • Computing Median: • for odd numbered N • find middle number • For even numbered N • interpolate between middle 2, • e.g. if it is 7 and 9, then 8 is the median

800 600 400 200 0 WORDCNT2 0.0 800.0 400.0 1200.0 2000.0 3200.0 1600.0 2400.0 2800.0 3600.0 4000.0 Instability of meanDistribution of word count (N=1897) Mean = 368.0 Median = 223 Minimum = 0 Maximum = 4132

500 400 300 200 100 0 0.0 800.0 400.0 600.0 200.0 1600.0 1000.0 2200.0 2400.0 1400.0 1800.0 1200.0 2000.0 Distribution of word count (N=1873), top 1% removed Mean = 333.4 Median = 220 Minimum = 0 Maximum = 4132 WORDCNT2

Statistics Review Standard Deviation

Standard Deviation: a measure of spread The SD says how far away numbers on a list are from their average. Most entries on the list will be somewhere around one SD away from the average. Very few will be more than two or three SD’s away.

Properties of the standard deviation • The standard deviation is in the same units as the mean • The standard deviation is inversely related to sample size (therefore as a measure of spread it is biased) • In normally distributed data 68% of the sample lies within 1 SD

Statistics Review Normal Probability Curve

Properties of the Normal Probability Curve • The graph is symmetric about the mean (the part to the right is a mirror image of the part to the left) • The total area under the curve equals 100% • Curve is always above horizontal axis • Appears to stop after a certain point (the curve gets really low)

1 SD= 68% 2 SD = 95% 3 SD= 99.7% • The graph is symmetric about the mean = • The total area under the curve equals 100% • Mean to 1 SD = +- 68% • Mean to 2 SD = +- 95% • Mean to 3 SD = +- 99.7% • You can disregard rest of curve

It is a remarkable fact that many histograms in real life tend to follow the Normal Curve. For such histograms, the mean and SD are good summary statistics. The average pins down the center, while the SD gives the spread. For histogram which do not follow the normal Curve, the mean and SD are not good summary statistics. What when the histogram is not normal ...

Use inter quartile range 75th percentile - 25th percentile Can be used when SD is too influenced by outliers Note. A percentile is a score below which a certain % of sample is

Statistics Review Population and Sample

An investigator usually wants to generalize about a class of individuals/things (the population)For example: in forecasting the results of elections, population is all eligible voters

Usually there are some numerical facts about the population (parameters) which you want to estimate • You can do that by measuring the same aspect in the sample (statistic) • Depending on the accuracy of your measurement, and how representative your sample is, you can make inferences about the population

Statistics Review Scatter Plots and Correlations

4 y x x x x x x x x x x x x x x x x x x x x x x Example Scatterplots High correlation Low correlation y x x x x x x x x x x x x x x x x x x x x x x

What is a Correlation Coefficient • A measure of degree of relationship between two variables. • Sign refers to direction. • Based on covariance • Measure of degree to which large scores go with large scores, and small scores with small scores • Pearson’s correlation coefficient is most often used

Factors Affecting r • Range restrictions • Outliers • Nonlinearity • e.g. anxiety and performance • Heterogeneous subsamples • Everyday examples

80 60 40 20 0 -20 Pros -40 -40 -20 0 20 40 60 80 The effect of outliers on correlations Dataset: 20 cases selected from darts and pros DARTS r = .80

Effect of linear transformations of data • no effect on Pearson's correlation coefficient. • Example: r between height and weight is the same regardless of whether height is measured in inches, feet, centimeters or even miles. • This is a very desirable property since choice of measurement scales that are linear transformations of each other is often arbitrary.

13 Non linear relationships Example: Anxiety and Performance r = .07

The interpretation of a correlation coefficient • Ranges from –1 to 1 • No correlation in the data means you will get a is 0 r or near it • Suffers from sampling error (like everything else!). So you need to estimate true population correlation from the sample correlation.

Statistics Review Hypothesis Testing

Null and Alternative Hypothesis • Sampling error implies that sometimes the results we obtain will be due to chance (since not every sample will accurately resemble the population) • The null hypothesis expresses the idea that an observed difference is due to chance. • For example: There is no difference between the norms regarding the use of email and voice mail

The alternative hypothesis • The alternative hypothesis (the experimental hypothesis) is often the one that you formulate: • For example: There is a correlation between people’s perception of a website’s reliability and the probability of their buying something on the site • Why bother to have a null hypothesis? • Can you reject the null hypothesis

One Tailed and Two Tailed tests One tailed tests: Based on a uni-directional hypothesis Hypothesis: Training will reduce number of problems users have with Powerpoint Two tailed tests: Based on a bi-directional hypothesis Hypothesis: Training will change the number of problems with PP

Sampling Distribution Population for usability of Powerpoint 1400 1200 1000 800 Frequency 600 400 Std. Dev = .45 200 Mean = 5.65 N = 10000.00 0 3.75 4.75 5.25 6.25 7.00 7.25 4.00 4.25 4.50 5.00 5.50 5.75 6.00 6.50 6.75 Mean Usability Index Implications of one and two tailed tests Identify region Unidirectional hypothesis: .05 level Bidirectional hypothesis: .05 level

Significance levels • PowerPoint example: • If we set significance level at .05 level, • 5% of the time we will find a difference by chance • 95% of the time the difference will be real • If we set significance level at .01 level • 1% of the time we will find a difference by chance • 99% of time difference will be real • What happens if we decrease our significance level from .01 to .05 • Probability of finding differences that don’t exist goes up (criteria becomes more lenient)

Effect of decreasing significance level from .01 to .05 • Probability of finding differences that don’t exist goes up • Also called Type I error (Alpha) • Effect of increasing significance from .01 to .001 • Probability of not finding differences that exist goes up • Also called Type II error (Beta)

Significance levels for usability • For usability, if you are set out to find problems: setting lenient criteria might work better (you will identify more problems)

Degree of Freedom • The number of independent pieces of information remaining after estimating one or more parameters • Example: List= 1, 2, 3, 4 Average= 2.5 • For average to remain the same three of the numbers can be anything you want, fourth is fixed • New List = 1, 5, 2.5, __ Average = 2.5

Statistics Review Comparing Means: t tests

Major Points • T tests: are differences significant? • One sample t tests, comparing one mean to population • Within subjects test: Comparing mean in condition 1 to mean in condition 2 • Between Subjects test: Comparing mean in condition 1 to mean in condition 2

One sample t test • Mean of population known, but standard deviation (SD) not known • Compute t statistic • Compare t to tabled values (for relevant degree of freedom) which show critical values of t

Factors Affecting t • Difference between sample and population means • Magnitude of sample variance • Sample size

Factors Affecting Decision • Significance level • One-tailed versus two-tailed test

Within subjects/ Repeated Measures / Related Samples t test • Correlation between before and after scores • Causes a change in the statistic we can use • Advantages of within subject designs • Eliminate subject-to-subject variability • Control for extraneous variables • Need fewer subjects

Disadvantages of Within Subjects • Order effects • Carry-over effects • Subjects no longer naïve • Change may just be a function of time • Sometimes not logically possible

Between subjects t test • Distribution of differences between means • Heterogeneity of Variance • Nonnormality

Assumptions of Between Subjects t tests • Two major assumptions • Both groups are sampled from populations with the same variance • “homogeneity of variance” • Both groups are sampled from normal populations • Assumption of normality • Frequently violated with little harm.

Statistics Review Analysis of Variance

Analysis of Variance • ANOVA is a technique for using differences between sample means to draw inferences about the presence or absence of differences between populations means. • Similar to t tests in two sample case • Can handle cases where there are more than two samples

Statistics Review