Modify—use bio. IB book IB Biology Topic 1: Statistical Analysis http://www.patana.ac.th/Secondary/Science/c4b/1/stat1.htm
Experimental design: an investigation of shell length variation in a mollusc species • A marine gastropod (Thersites bipartita) has been sampled from two different locations: • Sample A: shells found in full marine conditions • Sample B: shells found in brackish water conditions • Sample size = 10 shells • Length of each shell measured as shown
Analysis of Gastropod Data • Measured height of shells with a ruler • Units: mm, ± 1 mm (measurement error) • Significant digits and uncertainty apply to all measuring devices • Uncertainty reflects the precision of the measurement • The precision of the raw data must be consistent: there should be no variation in precision between measurements
1.1.1 Error bars and the representation of variability in data. • Biological systems are subject to a genetic program and environmental variation • Any collected set of data therefore shows variation • Graphs show this variation using error bars • Error bars can show the range of the data or the standard deviation
Mean & Range for each group • Marine • Brackish
Graph Mean & Range for each group • Quick comparison of the 2 data sets
1.1.2 Calculation of Mean and Std Dev • 3 classes of data • Mean • arithmetic mean (avg): measure of the central tendency (the typical, central value) • Std Dev • Measures spread around the mean • Measure of variation, and so of the precision of repeated measurements
1.1.2 Calculation of Mean and Std Dev • Std Dev of a sample = s • s describes the sample, not the total population • Pop 1: Mean = 31.4, s = 5.7 • Pop 2: Mean = 41.6, s = 4.3
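As a rough illustration of the calculation, the sketch below computes the mean, range and sample standard deviation with Python's standard `statistics` module. The ten shell lengths are hypothetical values made up for this example; they are not the data behind the slides.

```python
from statistics import mean, stdev

# Hypothetical shell lengths in mm (illustrative only, NOT the slide data).
sample_mm = [30, 32, 35, 28, 41, 33, 29, 38, 25, 34]

sample_mean = mean(sample_mm)                 # arithmetic mean
sample_range = max(sample_mm) - min(sample_mm)  # largest minus smallest value
sample_s = stdev(sample_mm)                   # sample std dev (divides by n - 1)

print(f"n = {len(sample_mm)}")
print(f"mean = {sample_mean:.1f} mm, range = {sample_range} mm, s = {sample_s:.1f} mm")
```

`stdev` uses the n - 1 denominator, which matches s for a sample rather than the standard deviation of a whole population (`pstdev`).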
Graphing Mean and Std Dev: Error Bars • Error bars show mean ± 1 std dev • The error bars of the two populations do not overlap • The question being considered is: is there a significant difference between the two samples from different locations, or are the differences between the two samples just due to chance selection?
Graphing Mean and Std Dev: Error Bars • The StdDev graph compares the central 68% of each population and begins to show that they look different • The range graph misleads us to think the data may be similar
1.1.3 Standard deviation and the spread of values around the mean. • StdDev is a measure of how spread out the data values are from the mean. • Assume: • a normal distribution of values around the mean • data not skewed to either end • Then about 68% of all the data values in a sample can be found between the mean + 1s and the mean - 1s
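The "no overlap" observation can be checked with simple arithmetic on the means and standard deviations quoted for the two populations. The sketch below is one way to do it; the function names are made up for this example.

```python
def one_sd_interval(mean, s):
    """Return the (lower, upper) ends of a mean +/- 1 std dev error bar."""
    return (mean - s, mean + s)

def intervals_overlap(a, b):
    """True if two (lower, upper) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

# Means and standard deviations (mm) as quoted on the slides.
pop1 = one_sd_interval(31.4, 5.7)   # roughly 25.7 to 37.1
pop2 = one_sd_interval(41.6, 4.3)   # roughly 37.3 to 45.9

print("Pop 1 error bar:", pop1)
print("Pop 2 error bar:", pop2)
print("Error bars overlap:", intervals_overlap(pop1, pop2))
```

The bars miss each other by only about 0.2 mm, which is why a formal test is still needed before calling the difference significant.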
Animation of mean and standard deviation: http://www.patana.ac.th/Secondary/Science/c4b/1/stat1.htm#gastro
1.1.3 Standard deviation and the spread of values around the mean. • About 95% of all the data values in a sample can be found between the mean + 2s and the mean - 2s.
1.1.4 Comparing means and standard deviations of 2 or more samples. • A sample with a small StdDev suggests narrow variation • A sample with a larger StdDev suggests wider variation • Example: molluscs • Pop 1: Mean = 31.4, standard deviation (s) = 5.7 • Pop 2: Mean = 41.6, standard deviation (s) = 4.3
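The 68% and 95% figures follow from the normal distribution itself. A minimal check, assuming an ideal normal distribution and using `statistics.NormalDist` from the Python standard library (Python 3.8 or later):

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)

def fraction_within(k):
    """Fraction of values lying within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

within_1s = fraction_within(1)   # about 0.68
within_2s = fraction_within(2)   # about 0.95

print(f"within mean +/- 1s: {within_1s:.3f}")
print(f"within mean +/- 2s: {within_2s:.3f}")
```

Real samples, especially small ones such as 10 shells, will only approximate these fractions.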
1.1.4 Comparing means and standard deviations of 2 or more samples. Pop 2 has a greater mean shell length but slightly narrower variation. Why this is the case would require further observation and experiment on environmental and genetic factors.
1.1.5 Comparing 2 samples with t-Test • Null hypothesis: There is no significant difference between the two samples except as caused by chance selection of data. • Alternative hypothesis: There is a significant difference between the height of shells in sample A and sample B.
1.1.5 Comparing 2 samples with t-Test • For the examples you'll use in biology, tails is always 2 • Type can be: • 1 = paired • 2 = two samples, equal variance • 3 = two samples, unequal variance
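To show what the "equal variance" and "unequal variance" types compute, the sketch below works out the t statistic both ways from the means and standard deviations quoted for the two populations. It assumes n = 10 shells in each sample, which the slides do not state explicitly, and the function names are made up for this example. It stops at the t statistic and degrees of freedom; turning these into P needs a t table or a statistics package.

```python
from math import sqrt

def pooled_t(mean1, s1, n1, mean2, s2, n2):
    """Two-sample t statistic assuming equal variances (type 2)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df

def welch_t(mean1, s1, n1, mean2, s2, n2):
    """Two-sample t statistic without assuming equal variances (type 3)."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    t = (mean1 - mean2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df

# Slide values; n = 10 per sample is an ASSUMPTION for this sketch.
t_pooled, df_pooled = pooled_t(31.4, 5.7, 10, 41.6, 4.3, 10)
t_welch, df_welch = welch_t(31.4, 5.7, 10, 41.6, 4.3, 10)

# Standard two-tailed critical value of t for P = 0.05 and 18 degrees of freedom.
T_CRITICAL_DF18 = 2.101

print(f"equal variance:   t = {t_pooled:.2f}, df = {df_pooled}")
print(f"unequal variance: t = {t_welch:.2f}, df = {df_welch:.1f}")
print("|t| exceeds critical value:", abs(t_pooled) > T_CRITICAL_DF18)
```

With equal sample sizes the two t statistics coincide; only the degrees of freedom differ.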
Good idea to graph it • Bar chart • Error bars • Stats
T-test: Are the mollusc shells from the two locations significantly different? • The t-test gives the probability (P) of seeing a difference at least this large if the 2 sets are basically the same (the null hypothesis). • P varies from 0 to 1. • Higher P = the data are consistent with the two sets being the same, with any differences just due to random chance. • Lower P = the difference is unlikely to be due to chance alone, so the two sets are more likely to be significantly different.
T-test: Are the mollusc shells from the two locations significantly different? • In biology the critical P is usually 0.05 (5%) (biology experiments are expected to produce quite varied results) • If P > 5% then there is no significant difference between the two sets (i.e. do not reject the null hypothesis). • If P < 5% then the two sets are significantly different (i.e. reject the null hypothesis). • For a t-test, use as many replicates as possible • More than 5 at the very least
Drawing Conclusions 1. State the null hypothesis & alternative hypothesis (based on the research question). 2. Set the critical P level at P = 0.05 (5%). 3. Write the decision rule: if P > 5% then there is no significant difference between the two sets (i.e. do not reject the null hypothesis); if P < 5% then the two sets are significantly different (i.e. reject the null hypothesis). 4. Write a summary statement based on the decision: the null hypothesis is rejected since calculated P = 0.003 (< 0.05; two-tailed test). 5. Write a statement of results in standard English: there is a significant difference between the height of shells in sample A and sample B.
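The decision rule in steps 2 to 4 can be written as a small function. This is only a sketch: the function name and the wording of the returned statements are made up for this example, and the P value passed in is the 0.003 quoted in step 4.

```python
def t_test_decision(p_value, critical_p=0.05):
    """Apply the decision rule: compare the calculated P with the critical P."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("P must lie between 0 and 1")
    if p_value < critical_p:
        return "reject the null hypothesis: the difference is significant"
    return "do not reject the null hypothesis: no significant difference shown"

# P = 0.003 is the calculated value quoted on the slide.
decision = t_test_decision(0.003)
print(decision)
```

A P exactly equal to the critical level is treated here as not significant, which is a choice made for this sketch.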
1.1.6 Correlation & Causation • Sometimes you're looking for an association between variables. • Correlations see if 2 variables vary together • +1 = perfect positive correlation • 0 = no correlation • -1 = perfect negative correlation • Relations see how 1 variable affects another
Pearson correlation (r) • Data are continuous & normally distributed
Spearman's rank-order correlation (r_s) • Data are not continuous or not normally distributed • A scatterplot is usually drawn for either type of correlation • In the example, both correlation coefficients indicate a strong positive correlation • Large females pair with large males • We don't know why, but it shows there is a correlation to investigate further.
Causative: Use linear regression • Fits a straight line to data • Gives slope & intercept • These are m and c in the equation y = mx + c • Doesn't PROVE causation, but may suggest it, so further investigation is needed!
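Both coefficients can be computed by hand in a few lines. The sketch below uses only the Python standard library and treats Spearman's coefficient as Pearson's r applied to the ranks. The paired female and male sizes are hypothetical values made up for this example, not the data behind the slide.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def average_ranks(values):
    """Rank values from 1 to n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def spearman_rs(xs, ys):
    """Spearman rank-order correlation: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Hypothetical sizes (mm) of paired females and males (illustrative only).
female_mm = [10, 12, 15, 18, 20, 25]
male_mm = [9, 13, 14, 19, 18, 24]

r = pearson_r(female_mm, male_mm)
rs = spearman_rs(female_mm, male_mm)
print(f"Pearson r = {r:.3f}, Spearman r_s = {rs:.3f}")
```

Values close to +1 for both coefficients would correspond to the "large females pair with large males" pattern.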
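A least-squares fit of y = mx + c can be sketched in plain Python as follows. The function name and the data points are made up for this example; the points lie exactly on a line so the result is easy to check by eye.

```python
def fit_line(xs, ys):
    """Least-squares straight line: returns slope m and intercept c of y = mx + c."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx              # slope
    c = mean_y - m * mean_x    # intercept: the line passes through the two means
    return m, c

# Hypothetical data lying exactly on y = 2x + 1 (illustrative only).
x_values = [0, 1, 2, 3]
y_values = [1, 3, 5, 7]

m, c = fit_line(x_values, y_values)
print(f"y = {m:.2f}x + {c:.2f}")
```

With real measurements the points scatter around the line, and the fitted slope describes the trend rather than proving that x causes y.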