Statistics Workshop 2011

Statistics Workshop 2011 Ramsey A. Foty, Ph.D. Department of Surgery UMDNJ-RWJMS

“An unsophisticated forecaster uses statistics as a drunkard uses lamp-posts: for support rather than for illumination.” Andrew Lang (1844-1912), Scottish poet and novelist. “Then there is the man who drowned crossing a stream with an average depth of six inches.” W.I.E. Gates, German author. “Statistics: the only science that enables different experts using the same figures to draw different conclusions.” Evan Esar, American humorist.

Topics • Why do we need statistics? • Sample vs population. • Gaussian/normal distribution. • Descriptive Statistics. • Measures of location. • Mean, Median, Mode. • Measures of dispersion. • Range, Variance, Standard Deviation. • Precision of the mean. • Standard Error, Confidence Interval. • Outliers. • Grubbs’ test. • The null hypothesis. • Significance testing. • Variability. • Comparing two means. • t-test • Group exercise • Comparing 3 or more groups. • ANOVA • Group exercise • Linear Regression. • Power Analysis.

Why do we need statistics? • Variability can obscure important findings. • We naturally assume that observed differences are real and not due to natural variability. • Variability is the norm. • Statistics allow us to draw conclusions about the general population from the sample.

Sample vs Population • Taking samples of information can be an efficient way to draw conclusions when gathering all the data is impractical or too costly. • If you measure the concentration of factor X in the blood of 10 people, does that accurately reflect the concentration of factor X in the human race in general? How about from 100, 1000, or 10,000 people? How about if you sampled everyone on the planet?

Statistical methods were developed based on a simple model: • Assume that an infinitely large population of values exists and that your sample was randomly selected from a large subset of that population. Now, use the rules of probability to make inferences about the general population.

The Gaussian Distribution If samples are large enough, the sample distribution will be bell-shaped. The Gaussian function describing this shape is defined as follows: $f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$, where $\mu$ represents the population mean and $\sigma$ the standard deviation.
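
As a rough illustration of the Gaussian function, the sketch below evaluates the curve for a given mean and standard deviation. The function name `gaussian_pdf` and the parameter values (mean 0, standard deviation 1) are assumptions chosen only for this example; they are not taken from the workshop.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Height of the Gaussian curve at x for mean mu and standard deviation sigma."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coeff * math.exp(exponent)

# Hypothetical parameters chosen only for illustration.
mu, sigma = 0.0, 1.0
peak = gaussian_pdf(mu, mu, sigma)            # the curve is highest at the mean
left = gaussian_pdf(mu - sigma, mu, sigma)    # one SD below the mean
right = gaussian_pdf(mu + sigma, mu, sigma)   # one SD above the mean (same height)

# Crude numerical check that the area under the curve is close to 1.
step = 0.001
area = sum(gaussian_pdf(-8.0 + i * step, mu, sigma) * step for i in range(16000))
print(peak, left, right, area)
```

The curve is symmetric about the mean, so the heights one standard deviation to either side match, and the total area under it is 1.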

An example of a Gaussian distribution

Descriptive Statistics
Measures of Location: A typical or central value that best describes the data. • Mean • Median • Mode
Measures of Dispersion: Describe the spread (variation) of the data around that central value. • Range • Variance • Standard Deviation • Standard Error • Confidence Interval
No single parameter can fully describe the distribution of data in the sample. Most statistics software will provide a comprehensive table describing the distribution.

Measures of Location: Mean (migration assay) • More commonly referred to as “the average”. • It is the sum of the data points divided by the number of data points. • For the migration assay data: mean = 76.78 microns ≈ 77 microns.

Measures of Location: Median for odd sample size (migration assay) • The value which has half the data smaller than that point and half the data larger. • For an odd number of data points, first rank the values in order, then pick the middle one. • For the migration assay, the 5th value in the ranked sequence is the median: 62 microns.

Measures of Location: Median for even sample size • Rank the values, find the middle two numbers, then find the value that lies halfway between them. • Add the two middle values together and divide by 2. • Here the median is (7+13)/2 = 10. • The median is less sensitive to extreme scores than the mean and is useful for skewed data.

Measures of Location: Mode • The value of the sample which occurs most frequently. • It can be a good measure of central tendency, but it is only useful in very limited situations. • The mode for this data set is 72, since this is the value with the highest frequency in the data set. • Not all data sets have a single mode; data sets can be bi-modal.
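
The three measures of location can be sketched with Python's standard `statistics` module. The first data set is the 13 values listed on the Boxplots slide; the four-value `even_sample` is hypothetical, chosen only so that its two middle values are 7 and 13, as in the even-sample-size example.

```python
import statistics

# The 13 values listed on the Boxplots slide of this workshop.
data = [12, 13, 5, 8, 9, 20, 16, 14, 14, 6, 9, 12, 12]

mean = sum(data) / len(data)          # sum of the data points / number of data points
median = statistics.median(data)      # middle value after ranking (odd n)
mode = statistics.mode(data)          # most frequent value

# Even sample size: the median is the average of the two middle values.
# These four values are hypothetical, chosen so the middle pair is 7 and 13.
even_sample = [2, 7, 13, 40]
even_median = statistics.median(even_sample)

print(round(mean, 2), median, mode, even_median)
```

Note how the extreme value 40 in the hypothetical sample pulls its mean up (to 15.5) while leaving the median at 10.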

Boxplots
From top to bottom, a boxplot marks: • Largest observed value that is not an outlier • 75th percentile • Median • 25th percentile • Smallest observed value that is not an outlier
Unranked data: 12, 13, 5, 8, 9, 20, 16, 14, 14, 6, 9, 12, 12
Ranked data: 5, 6, 8, 9, 9, 12, 12, 12, 13, 14, 14, 16, 20
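
The five values a boxplot displays can be computed from the data on this slide, as sketched below. Two choices here are assumptions rather than anything the slide specifies: the percentile convention (the standard library's default "exclusive" method) and the outlier rule (Tukey's 1.5 × IQR fences).

```python
import statistics

data = [12, 13, 5, 8, 9, 20, 16, 14, 14, 6, 9, 12, 12]
ranked = sorted(data)

# Quartiles via the standard library's default ("exclusive") method.
# Other software may use a slightly different percentile convention.
q1, median, q3 = statistics.quantiles(ranked, n=4)
iqr = q3 - q1

# Assumed outlier rule (Tukey's 1.5 x IQR fences); the slide does not name one.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
inside = [x for x in ranked if lower_fence <= x <= upper_fence]
outliers = [x for x in ranked if x < lower_fence or x > upper_fence]

# Whiskers end at the most extreme values that are not outliers.
whisker_low, whisker_high = min(inside), max(inside)
print(whisker_low, q1, median, q3, whisker_high, outliers)
```

Under these assumptions the data set has no outliers, so the whiskers run to the smallest and largest observed values.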

Boxplots are used to display summary statistics

Measures of Location… do not provide information on spread or variability of the data

Measures of Dispersion • Describe the spread or variability within the data. • Two distinct samples can have the same mean but completely different levels of variability. • Which mean has a higher level of variability? 110 ± 5 or 110 ± 25 • Typical measures of dispersion include Range, Variance, and Standard Deviation.

Measures of Dispersion: Range • The difference between the largest and smallest sample values. • It depends only on extreme values and provides no information about how the remaining data is distributed. • For the cell migration data: largest distance = 200 microns, smallest distance = 24 microns, range = 200 - 24 = 176 microns. • NOT a reliable measure of dispersion of the whole data set.

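Measures of Dispersion: Variance • Defined as the average of the squared distance of each value from the mean. • To calculate variance, it is first necessary to calculate the mean score, then measure the amount that each score deviates from the mean. • The formula for calculating the sample variance is: $s^2 = \frac{\sum_{i=1}^{N}(x_i - \bar{x})^2}{N-1}$, where $x_i$ are the data points, $\bar{x}$ is their mean, and $N$ is the number of data points.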

Why Square? • Squaring makes all the differences positive (it eliminates negatives, which would otherwise reduce the variance). • It makes the bigger differences stand out: 100² (10,000) is a lot bigger than 50² (2,500).

N vs N-1 • Divide by N when the data are the entire population. • Divide by N-1 when the data are a sample of the population.
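
The difference between dividing by N and by N-1 can be seen in a short sketch. The data are the 13 values from the Boxplots slide, used here only as a convenient example; the variable names are illustrative.

```python
import math
import statistics

data = [12, 13, 5, 8, 9, 20, 16, 14, 14, 6, 9, 12, 12]  # values from the Boxplots slide
n = len(data)
mean = sum(data) / n

# Sum of squared distances of each value from the mean.
sum_sq = sum((x - mean) ** 2 for x in data)

population_variance = sum_sq / n        # divide by N: data are the whole population
sample_variance = sum_sq / (n - 1)      # divide by N-1: data are a sample

# Cross-check against the standard library's implementations.
assert math.isclose(population_variance, statistics.pvariance(data))
assert math.isclose(sample_variance, statistics.variance(data))

print(round(population_variance, 2), round(sample_variance, 2))
```

Dividing by N-1 always gives the slightly larger value, and the gap shrinks as the sample size grows.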

For the cell migration data, the sample variance is roughly 57² ≈ 3,250 square microns (the square of the 57-micron standard deviation): NOT a very user-friendly statistic.

Measures of Dispersion: Standard Deviation • The most common and useful measure of dispersion. • Tells you how tightly the samples are clustered around the mean. When the samples are tightly bunched together, the Gaussian curve is narrow and the standard deviation is small. • When the samples are spread apart, the Gaussian curve is flat and the standard deviation is large. • SD = square root of the variance. The formula to calculate standard deviation is: $s = \sqrt{\frac{\sum_{i=1}^{N}(x_i - \bar{x})^2}{N-1}}$.

For this data set, the mean and standard deviation are: 77 ± 57 microns Conclusion: There’s lots of scatter in this data set.

But then again… • This is a fairly small sample (n=9). • What if we were to count the migration of 90, or 900, or 9000 cells? • Would this give us a better sense of what the average migration distance is? • In other words, how can we determine whether our mean is precise?

Precision of the Mean: Standard Error • A measure of how far the sample mean is likely to be from the population mean. • $SEM = SD / \sqrt{N}$. For our data set: SEM = 57/√9 = 19 microns. • Increasing sample size does not systematically change the scatter in the data; the SD may increase or decrease. • Increasing sample size will, however, predictably reduce the standard error. • SEM gets smaller as sample size increases since the mean of a larger sample is likely to be closer to the population mean.
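
The effect of sample size on the standard error can be sketched as follows. The SD of 57 microns and n = 9 come from the slides; the larger sample sizes are the hypothetical ones posed earlier, and holding the SD fixed at 57 for them is an assumption made only for illustration.

```python
import math

def standard_error(sd, n):
    """Standard error of the mean: SD divided by the square root of the sample size."""
    return sd / math.sqrt(n)

sd = 57.0  # SD of the migration data, in microns (from the slides)

# n = 9 is the workshop's sample; the larger sizes are hypothetical,
# with the SD assumed to stay at 57.
for n in (9, 90, 900, 9000):
    print(n, round(standard_error(sd, n), 2))
```

Each tenfold increase in sample size shrinks the standard error by a factor of √10, roughly 3.2.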

Should we show standard deviation or standard error?
Use standard deviation: • If the scatter is caused by biological variability and you want to show that variability. • For example: You aliquot 10 plates, each with a different cell line, and measure integrin expression of each.
Use standard error: • If the variability is caused by experimental imprecision and you want to show the precision of the calculated mean. • For example: You aliquot 10 plates of the same cell line and measure integrin expression of each.

Precision of the Mean: Confidence Intervals • Combines the scatter in any given sample with the size of that sample. • Generates an interval that has a high probability of containing the population mean. • The formula for calculating CI: CI = X ± (SEM × Z), where X is the sample mean and Z is the critical value for the normal distribution. For the 95% CI, Z = 1.96. • For our data set: 95% CI = 77 ± (19 × 1.96) = 77 ± 37, so the 95% CI is 40-114 microns. • This means that there’s a 95% chance that the CI you calculated contains the population mean.
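
The confidence interval formula can be checked with a few lines of code. The mean (77) and SEM (19) are the slide values; the function name `confidence_interval` is illustrative. Note that 19 × 1.96 evaluates to about 37.

```python
def confidence_interval(mean, sem, z=1.96):
    """Return (lower, upper) bounds of mean +/- sem * z."""
    margin = sem * z
    return mean - margin, mean + margin

# Mean and SEM of the migration data as given on the slides (microns).
lower, upper = confidence_interval(77.0, 19.0)
print(round(lower, 1), round(upper, 1))
```

The `z` parameter is left adjustable so that a different critical value can be substituted for other confidence levels.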

CI: A Practical Example Between these two data sets, which mean do you think best reflects the population mean and why?

SD/SEM/95% CI error bars (figure panels: SD, SEM, 95% CI)

Outliers • An observation that is numerically distant from the rest of the data. • Can be caused by systematic error, flaw in the theory that generated the data point, or by natural variability.

How to deal with outliers? Grubbs’ test • In general, we first quantify the difference between the mean and the outlier, then we divide by the scatter (usually SD): $Z = \frac{\text{mean} - \text{value}}{SD}$. • For the cell migration data set: the mean is 77 microns, the sample furthest from the mean is the 200 micron point, and the SD is 57. So: Z = (77 - 200)/57 = -2.16.

What does a Z value of -2.16 mean? • To answer this, we must relate the Z value to a probability value (P) that answers the following question: • “If all the values were really sampled from a normal population, what is the chance of randomly obtaining an outlier so far from the other values?” • To do this, we compare the absolute value of Z with a table listing the critical value of Z at the 95% probability level. • If the computed Z is larger in magnitude than the critical value of Z in the table, then the P value is less than 5% and you can delete the outlier.

For our data set: • Z calc (2.16) is less than Z Tab (2.21), so P is greater than 5% and the outlier must be retained.
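
The outlier check above amounts to a one-line calculation and a comparison, sketched here with the slide values (mean 77, suspect point 200, SD 57, tabulated critical value 2.21). The function name `grubbs_z` is illustrative.

```python
def grubbs_z(mean, value, sd):
    """Distance of a suspected outlier from the mean, in units of SD."""
    return (mean - value) / sd

z_calc = grubbs_z(77.0, 200.0, 57.0)   # migration data from the slides
z_tab = 2.21                           # critical value quoted on the slide

# Compare magnitudes: the sign only says which side of the mean the point is on.
is_outlier = abs(z_calc) > z_tab
print(round(z_calc, 2), is_outlier)
```

Because the magnitude of the computed Z falls short of the tabulated value, the 200 micron point is kept.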

The Null Hypothesis • Appears in the form $H_0: \mu_1 = \mu_2$, where $H_0$ = null hypothesis, $\mu_1$ = mean of population 1, and $\mu_2$ = mean of population 2. • An alternate form is $H_0: \mu_1 - \mu_2 = 0$. • The null hypothesis is presumed true until statistical evidence in the form of a hypothesis test indicates otherwise.

Statistical Significance • When a statistic is significant, it simply means that the result is reliable, i.e. unlikely to be due to chance alone. • It does not mean that it is biologically important or interesting. • When testing the relationship between two parameters we might be sure that the relationship exists, but is it weak or strong?

Strong vs weak relationships • Weak: r² = 0.2381 • Strong: r² = 1.000

Significance Testing: Type I and Type II errors • Type I error: a true null hypothesis can be incorrectly rejected. • False positive • Type II error: a false null hypothesis can fail to be rejected. • False negative

A Practical Example
Type I error: • A pregnancy test has produced a "positive" result (indicating that the woman taking the test is pregnant); if the woman is actually not pregnant, then we say the test produced a "false positive".
Type II error: • A Type II error, or a "false negative", is the error of failing to reject a null hypothesis when the alternative hypothesis is the true state of nature, i.e. if a pregnancy test reports "negative" when the woman is, in fact, pregnant.
In significance testing we must be able to reduce the chance of rejecting a true null hypothesis to as low a value as desired. The test must be so devised that it will reject the hypothesis tested when it is likely to be false.

Sources of Variability
Random Error: • Caused by inherently unpredictable fluctuations in the readings of a measurement apparatus or in the experimenter's interpretation of the instrumental reading. • Can occur in either direction.
Systematic Error: • Is predictable, and typically constant or proportional to the true value. • Caused by imperfect calibration of measurement instruments or imperfect methods of observation. • Typically occurs in only 1 direction.

Some Examples

Repeatability/Reproducibility
Repeatability: • The variation in measurements taken by a single person or instrument on the same item and under the same conditions. • An experiment, if performed by the same person, using the same equipment, reagents, and supplies, must yield the same result.
Reproducibility: • The ability of a test or experiment to be accurately reproduced or replicated by someone else working independently. • Cold fusion is an example of an un-reproducible experiment.

Hypothesis Testing Observe Phenomenon → Propose Hypothesis → Design Study → Collect and Analyze Data → Interpret Results → Draw Conclusions. Statistics are an important part of the study design.

Comparing Two Means • Are these two means significantly different? • Variability can strongly influence whether the difference between the means is significant. Consider these 3 scenarios: Which of these will likely yield significant differences?

Comparing Two Means: Student t-test
• Introduced in 1908 by William Sealy Gosset. • Gosset was a chemist working for the Guinness Brewery in Dublin. • He devised the t-test as a way to cheaply monitor the quality of stout. • He was forced to use a pen name by his employer; he chose the name Student.
• N < 30. • Independent data points, except when using a paired t-test. • Normal distribution, for equal and unequal variance. • Random sampling. • Equal sample size. • Degrees of freedom are important. • Most useful when comparing 2 sample means.

The Student t-test • Given two data sets, each characterized by its mean, standard deviation, and number of samples, we can determine whether the means are significantly different by using a t-test. Note below that the difference between the means is the same but the variability is very different. A t-test is nothing more than a signal:noise ratio.
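
The signal:noise idea can be made concrete with a small sketch that computes t from summary statistics alone. The unpooled standard error used for the "noise" term is an assumption, since the slide does not show its formula, and the two pairs of summaries (same means, SD 5 vs SD 25, n = 10 each) are hypothetical.

```python
import math

def t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """t = difference between means (signal) / standard error of that difference (noise).

    Uses the unpooled standard error sqrt(sd1^2/n1 + sd2^2/n2); this form is an
    assumption, since the slide does not show its formula.
    """
    signal = mean1 - mean2
    noise = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return signal / noise

# Hypothetical summaries: same difference between means, different variability.
t_low_scatter = t_from_summary(110.0, 5.0, 10, 100.0, 5.0, 10)
t_high_scatter = t_from_summary(110.0, 25.0, 10, 100.0, 25.0, 10)
print(round(t_low_scatter, 2), round(t_high_scatter, 2))
```

With the same 10-unit difference between means, the low-scatter pair gives a t-value five times larger than the high-scatter pair.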

An Example • The null hypothesis states that there is no difference in the means between samples. • 1) Calculate means. • 2) Calculate SDs. • 3) Calculate SEs. • 4) Calculate t-value. • 5) Compare t calc to t tab. • 6) Reject or fail to reject Ho.
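
The six steps above can be sketched end to end. Everything specific here is an assumption for illustration: the two groups of five measurements are hypothetical (not the workshop's data), the groups are taken to be independent and of equal size, and the critical value 2.306 is the two-tailed entry for α = 0.05 and 8 degrees of freedom from a standard t table.

```python
import math
import statistics

# Hypothetical measurements for two groups of equal size (not workshop data).
group_a = [10.0, 12.0, 11.0, 13.0, 14.0]
group_b = [15.0, 17.0, 16.0, 18.0, 19.0]

# 1) Means
mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
# 2) Standard deviations (N-1 in the denominator)
sd_a, sd_b = statistics.stdev(group_a), statistics.stdev(group_b)
# 3) Standard errors
se_a = sd_a / math.sqrt(len(group_a))
se_b = sd_b / math.sqrt(len(group_b))
# 4) t-value: difference between means over the combined standard error
t_calc = (mean_a - mean_b) / math.sqrt(se_a ** 2 + se_b ** 2)
# 5) Tabulated two-tailed critical t for alpha = 0.05 and
#    df = n_a + n_b - 2 = 8, taken from a standard t table.
t_tab = 2.306
# 6) Decision: reject Ho when the magnitude of t calc exceeds t tab.
reject_h0 = abs(t_calc) > t_tab
print(round(t_calc, 2), reject_h0)
```

For these hypothetical groups the magnitude of t calc is well above t tab, so the null hypothesis of equal means would be rejected.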

Plot Data Box Plot Bar Graph

1) Calculate Mean

2) Calculate SD
