STATISTICAL ANALYSIS I DP Biology
Error Bars: Biological systems are subject to a genetic program and environmental variation. Consequently, when we collect a set of data for a given variable, it shows variation. When displaying data in graphical formats, we can show this variation using error bars.
Error bars are a graphical representation of the variability of data: bars on a graph extending above and below the mean value. They can be used to show either the range of the data or the standard deviation.
Calculate the mean and standard deviation of a set of values: Students will not be expected to know the formulas for calculating these statistics. They will be expected to use the standard deviation function of a graphic display or scientific calculator. Students could also be taught how to calculate standard deviation using a spreadsheet program. Students should specify the sample standard deviation (s), not the population standard deviation (σ).
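As a minimal sketch of what a calculator or spreadsheet function does here, the Python standard library offers both versions of the statistic. The leaf lengths are hypothetical values chosen for illustration, not data from these notes.

```python
import statistics

# Hypothetical leaf lengths in mm (illustrative values only)
leaf_lengths = [42, 45, 47, 50, 51, 53, 55, 57]

mean = statistics.mean(leaf_lengths)
sample_sd = statistics.stdev(leaf_lengths)       # s: divides by n - 1
population_sd = statistics.pstdev(leaf_lengths)  # sigma: divides by n

print(f"mean = {mean:.2f} mm")
print(f"sample standard deviation (s) = {sample_sd:.2f} mm")
print(f"population standard deviation = {population_sd:.2f} mm")
```

The sample value is always slightly larger than the population value for the same data, which is why it matters which calculator function is used.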
The standard deviation summarizes the spread of values around the mean. For normally distributed data, about 68% of all values lie within ±1 standard deviation (s or σ) of the mean. This rises to about 95% for ±2 standard deviations.
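The 68% and 95% figures can be checked against the normal distribution itself. The sketch below assumes a perfectly normal distribution, and the helper name fraction_within is introduced here for illustration only.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1
standard_normal = NormalDist(mu=0, sigma=1)

def fraction_within(k):
    """Fraction of a normal distribution within +/- k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

within_1_sd = fraction_within(1)
within_2_sd = fraction_within(2)

print(f"within 1 standard deviation:  {within_1_sd:.1%}")
print(f"within 2 standard deviations: {within_2_sd:.1%}")
```

Real biological samples are only approximately normal, so observed percentages will usually differ a little from these values.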
Explain how the standard deviation is useful for comparing the means and the spread of data between two or more samples: A small standard deviation indicates that the data is clustered closely around the mean value. Conversely, a large standard deviation indicates a wider spread around the mean.
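To illustrate, the sketch below compares two hypothetical samples of plant heights (invented values) that share the same mean but differ in spread.

```python
import statistics

# Hypothetical plant heights in cm (illustrative values only)
sample_a = [19, 20, 20, 21, 20]  # clustered closely around the mean
sample_b = [10, 15, 20, 25, 30]  # widely spread around the mean

mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
sd_a, sd_b = statistics.stdev(sample_a), statistics.stdev(sample_b)

print(f"Sample A: mean = {mean_a:.1f} cm, s = {sd_a:.2f} cm")
print(f"Sample B: mean = {mean_b:.1f} cm, s = {sd_b:.2f} cm")
```

The means alone would suggest the samples are alike. Only the standard deviations reveal that sample B is far more variable.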
The t-test is a test of the null hypothesis that the means of two normally distributed populations are equal. Take two data sets and find the mean, standard deviation and number of data points for each. Use a t-test to determine whether the means differ significantly, provided that the underlying distributions can be assumed to be normal.
T-test assumptions: the two samples are unpaired and independent of each other (e.g., individuals randomly assigned into two groups, measured after an intervention and compared with the other group). If the calculated P value is below the threshold chosen for statistical significance (commonly 0.05), then the null hypothesis is rejected in favour of an alternative hypothesis, which states that the groups do differ. Equivalently, the null hypothesis is rejected when the calculated value of t exceeds the critical value given in the table.
The t-test assesses whether the means of two groups are statistically different from each other. This analysis is appropriate whenever you want to compare the means of two groups, and especially appropriate as the analysis for the post-test-only two-group randomized experimental design.
Deduce the significance of the difference between two sets of data using calculated values for t and the appropriate tables. For the t-test to be applied, the data must have a normal distribution and a sample size of at least 10. The t-test can be used to compare two sets of data and measure the amount of overlap. Students will not be expected to calculate values of t. Only a two-tailed, unpaired t-test is expected.
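Although students are not expected to calculate t, a short sketch may help show where the value comes from and how it is compared with a table. It assumes a two-tailed, unpaired t-test with pooled variance, that is, roughly equal variances in the two groups. The shell widths are hypothetical, and the function name unpaired_t is introduced here for illustration.

```python
import math
import statistics

def unpaired_t(sample_1, sample_2):
    """Return (t, degrees of freedom) for an unpaired t-test with pooled variance."""
    n1, n2 = len(sample_1), len(sample_2)
    mean_1, mean_2 = statistics.mean(sample_1), statistics.mean(sample_2)
    var_1, var_2 = statistics.variance(sample_1), statistics.variance(sample_2)
    df = n1 + n2 - 2
    # Pooled variance: average of the two sample variances, weighted by n - 1
    pooled_variance = ((n1 - 1) * var_1 + (n2 - 1) * var_2) / df
    standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return (mean_1 - mean_2) / standard_error, df

# Hypothetical shell widths in mm from two sites (illustrative values only)
site_a = [20, 22, 23, 24, 25, 25, 26, 27, 28, 30]
site_b = [15, 17, 18, 19, 20, 20, 21, 22, 23, 25]

t, df = unpaired_t(site_a, site_b)

# Critical value read from a standard t table: two-tailed, P = 0.05, df = 18.
# This constant only applies to 18 degrees of freedom.
CRITICAL_T = 2.101

significant = abs(t) > CRITICAL_T
print(f"t = {t:.2f}, degrees of freedom = {df}")
print("Reject the null hypothesis" if significant else "Do not reject the null hypothesis")
```

With different sample sizes the degrees of freedom change, so the critical value must be looked up again in the table.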
Correlation v Causation: There is a positive correlation between the lengths of the right hands and right feet of teenage boys: boys with larger hands tend to have larger feet as well. Although there is a positive correlation between hand and foot length, we know that an increase in the length of the hand does not cause an increase in the length of the foot. Instead, both are due to the factors that control growth in teenage boys.
This mistake is often made in the analysis of data: a correlation between two variables is assumed to show that there is a causal link. It is important to remember that correlation is not proof of cause. Explain that the existence of a correlation does not establish that there is a causal relationship between two variables.
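The strength of a correlation can be expressed as a number. One common measure, which these notes do not name, is Pearson's correlation coefficient r. The sketch below computes it for hypothetical hand and foot lengths. The values are invented for illustration, and the function name pearson_r is an assumption of this example.

```python
import math
import statistics

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length lists of values."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariation / (spread_x * spread_y)

# Hypothetical measurements in cm for six teenage boys (illustrative values only)
hand_length_cm = [17.0, 17.5, 18.0, 18.5, 19.0, 19.5]
foot_length_cm = [24.0, 25.0, 25.5, 26.5, 27.0, 28.0]

r = pearson_r(hand_length_cm, foot_length_cm)
print(f"r = {r:.3f}")
```

A value of r close to +1 indicates a strong positive correlation, but the calculation uses only the paired numbers. It cannot show whether one variable causes the other or whether both depend on a third factor such as growth.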