
Statistics!


quang


  1. Statistics! • Used to make sense of data. Monty Python Clip: http://www.youtube.com/watch?v=rzcLQRXW6B0&NR=1 • What is the velocity of a European Swallow?

  2. Pictures from: http://www.style.org/unladenswallow/ • European Swallow (Hirundo rustica) • South African Swallow (Hirundo spilodera)

  3. Air speed of European Swallow • Data set 1: • 15. m/s • 7.0 m/s • 6.0 m/s • 15. m/s • 12. m/s • Data derived from Jonathan Coram, http://www.style.org/unladenswallow/ • The mean is the average value for the data set • Calculate the mean value for data set 1…

  4. Data set 2 • 10. m/s • 11. m/s • 12. m/s • 11. m/s • 11. m/s • Calculate the mean value for data set 2.

  5. Mean • The mean is the average value for the data set • Both data set 1 and 2 have a mean value of: 11 m/s • Are the data sets the same? • NO! Data set 1 is much more variable than 2.
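As a quick check of the slide above, here is a minimal Python sketch that computes both means. The variable names are illustrative only; the values are the air speeds listed for data sets 1 and 2.

```python
from statistics import mean

# Air speeds in m/s, copied from data sets 1 and 2
data_set_1 = [15, 7, 6, 15, 12]
data_set_2 = [10, 11, 12, 11, 11]

# The mean is the sum of the values divided by how many there are
mean_1 = mean(data_set_1)
mean_2 = mean(data_set_2)

print("Mean of data set 1:", mean_1, "m/s")
print("Mean of data set 2:", mean_2, "m/s")
```

Both lines report 11 m/s, which is why the mean alone cannot tell the two data sets apart.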

  6. 1.1.1 State that error bars are a graphical representation of the variability of data. • Error bars can be used to show either the range of the data or the standard deviation. • The range indicates the spread from the lowest value to the highest for a set of data. • Graph: http://www.csupomona.edu/~jcclark/classes/bio542l/graphics/g-error2.gif

  7. Range • Data set 1 includes values from 6 – 15 m/s (range = 9 m/s) • Data set 2 includes values from 10 – 12 m/s (range = 2 m/s)
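The range calculation above can be sketched in a few lines of Python. The helper name `data_range` is an assumption made for this example, not something from the slides.

```python
# Air speeds in m/s, copied from data sets 1 and 2
data_set_1 = [15, 7, 6, 15, 12]
data_set_2 = [10, 11, 12, 11, 11]

def data_range(values):
    """Spread from the lowest value to the highest value."""
    return max(values) - min(values)

range_1 = data_range(data_set_1)
range_2 = data_range(data_set_2)

print("Range of data set 1:", range_1, "m/s")
print("Range of data set 2:", range_2, "m/s")
```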

  8. Standard Deviation • 1.1.3 State that the term standard deviation is used to summarize the spread of values around the mean, and that 68% of the values fall within one standard deviation of the mean. • For normally distributed data, about 68% of all values lie within ±1 standard deviation of the mean. This rises to about 95% for ±2 standard deviations. • Picture: http://blog.home-account.com/wp-content/uploads/2009/06/deviation.jpg

  9. 1.1.2 Calculate the mean and standard deviation of a set of values. • Students should specify the standard deviation (s), not the population standard deviation. Students will not be expected to know the formulas for calculating these statistics. They will be expected to use the standard deviation function of a graphic display or scientific calculator. • TI 83: http://www.csis.ysu.edu/~chang/class/TIcalculatorStat.pdf

  10. You don't need to know this: • Standard deviation: $s = \sqrt{\dfrac{\sum (X - M)^2}{N - 1}}$, where Σ = sum of, X = individual score, M = mean of all scores, N = sample size (number of scores) • Variance: $s^2$ • Population standard deviation: $\sigma = \sqrt{\dfrac{\sum (X - M)^2}{N}}$ • Pictures and text from: http://easycalculation.com/statistics/learn-standard-deviation.php

  11. Here is an online SD calculator • The standard deviation for data set 1 is 4.3 m/s • The standard deviation for data set 2 is 0.71 m/s • Even though the means are identical, the data is very different in regard to its variability.
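If no calculator is at hand, the same values can be reproduced with Python's standard library. This is a minimal sketch: `statistics.stdev` gives the sample standard deviation (s), while `statistics.pstdev` gives the population standard deviation and is shown only for contrast.

```python
from statistics import stdev, pstdev

# Air speeds in m/s, copied from data sets 1 and 2
data_set_1 = [15, 7, 6, 15, 12]
data_set_2 = [10, 11, 12, 11, 11]

# Sample standard deviation (divides by N - 1)
sd_1 = stdev(data_set_1)
sd_2 = stdev(data_set_2)

# Population standard deviation (divides by N), for comparison only
pop_sd_1 = pstdev(data_set_1)

print(f"s for data set 1: {sd_1:.1f} m/s")
print(f"s for data set 2: {sd_2:.2f} m/s")
print(f"population SD for data set 1: {pop_sd_1:.1f} m/s")
```

The population value comes out smaller than s, so it matters which calculator function is used.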

  12. Why Calculate Standard Deviation? • 1.1.4 Explain how the standard deviation is useful for comparing the means and the spread of data between two or more samples. • A small standard deviation indicates that the data is clustered closely around the mean value. Conversely, a large standard deviation indicates a wider spread around the mean.

  13. Differences between data sets • http://www.youtube.com/watch?v=y2R3FvS4xr4 • Let’s say that you measure the velocity of African Swallows and get the following data: • 14 m/s, 18 m/s, 12 m/s, 14 m/s, 17 m/s • What are the mean and standard deviation? • Mean = 15 m/s • Standard deviation = 2.4 m/s • Is this statistically different from that of the European?

  14. What do error bars suggest? • If the bars show extensive overlap, it is likely that there is not a significant difference between those values
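One way to check for overlap without drawing the graph is sketched below. It assumes the error bars show the mean ± one sample standard deviation (the slides also allow bars that show the range), and the function names are made up for this example.

```python
from statistics import mean, stdev

# Air speeds in m/s from the slides
european_1 = [15, 7, 6, 15, 12]
european_2 = [10, 11, 12, 11, 11]
african = [14, 18, 12, 14, 17]

def error_bar(values):
    """Return (low, high) for mean ± one sample standard deviation."""
    centre = mean(values)
    spread = stdev(values)
    return centre - spread, centre + spread

def bars_overlap(bar_a, bar_b):
    """Two intervals overlap when each one starts before the other ends."""
    return bar_a[0] <= bar_b[1] and bar_b[0] <= bar_a[1]

overlap_1 = bars_overlap(error_bar(european_1), error_bar(african))
overlap_2 = bars_overlap(error_bar(european_2), error_bar(african))

print("European set 1 vs African overlap:", overlap_1)
print("European set 2 vs African overlap:", overlap_2)
```

Under this assumption the bars for European data set 1 overlap the African bars, while those for data set 2 do not.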

  15. T-Tests • Used to tell whether there is a statistically significant difference between two sets of data. • If your error bars overlap it is a good idea to perform a t-test, but you need to have enough data points to do so (min 10 each set of data). • Generally differences are considered statistically different if there is a 95% or greater chance that the data sets are different. • Run t-tests comparing whether the data for the African Swallow is statistically different from data set 1 and then data set 2 for the European Swallow.

  16. Special Disclaimer • In reality, both the African and the European Swallows have a velocity of approximately 11 m/s. • T test on TI 83: • http://www.csc.villanova.edu/~ysp/Teacher/Webpages/mstp_payne/Web_project/TI-83/2t-test.html

  17. T-test of data sets 1, 2, and 3 • European 1 vs. African • p = 0.1084 • T = 1.807 • df = 8 • European 2 vs. African • p = 0.0080 • T = 3.508 • df = 8
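The t and df values can be recomputed by hand with a short Python sketch. It assumes an unpaired, two-tailed test with pooled (equal) variances, which is consistent with df = 8 for two samples of five. The function name `pooled_t` is hypothetical, and 2.306 is the standard table value of t for df = 8 at the 5% level (two-tailed). Computing the exact p-value needs a t-distribution function, which the standard library does not provide, so the sketch compares t with the table value instead.

```python
from math import sqrt
from statistics import mean, variance

# Air speeds in m/s from the slides
european_1 = [15, 7, 6, 15, 12]
european_2 = [10, 11, 12, 11, 11]
african = [14, 18, 12, 14, 17]

CRITICAL_T_DF8 = 2.306  # two-tailed, 5% level, df = 8 (from a t table)

def pooled_t(sample_a, sample_b):
    """Unpaired t statistic with pooled variance; returns (t, df)."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / df
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (mean(sample_a) - mean(sample_b)) / std_err
    return t, df

t_1, df_1 = pooled_t(african, european_1)
t_2, df_2 = pooled_t(african, european_2)

for label, t, df in [("European 1", t_1, df_1), ("European 2", t_2, df_2)]:
    significant = abs(t) > CRITICAL_T_DF8
    print(f"{label} vs. African: t = {t:.3f}, df = {df}, significant = {significant}")
```

Only the comparison with European data set 2 exceeds the table value, which agrees with the p-values on the slide (0.1084 is above 0.05, 0.0080 is below it).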

  18. 1.1.5 Deduce the significance of the difference between two sets of data using calculated values for t and the appropriate tables. • For the t-test to be applied, the data must have a normal distribution and a sample size of at least 10. The t-test can be used to compare two sets of data and measure the amount of overlap. • Students will not be expected to calculate values of t. Only a two-tailed, unpaired t-test is expected.

  19. Correlation Vs. Causation • Correlation: there is a statistically significant similarity between two sets of data. • Causation: changes in one variable cause the other variable to change.

  20. 1.1.6 Explain that the existence of a correlation does not establish that there is a causal relationship between two variables. • Aim 7: While calculations of such values are not expected, students who want to use r and r² values in their practical work could be shown how to determine such values using a spreadsheet program.

  21. r = linear correlation coefficient • Measures the strength and direction of the linear correlation between two variables. • r = +1 or −1 is a perfect linear relationship; 0 is no linear correlation • A negative r is a negative correlation; a positive r is a positive correlation • A correlation with |r| > 0.8 is considered strong; |r| < 0.5 weak. • Formula: http://mathbits.com/mathbits/tisection/statistics2/correlation.htm

  22. Pictures: http://www.math.upenn.edu/~estorm/115s08/bestfitline/bestfitline.html A B C D E F

  23. r²: coefficient of determination • Tells how much of the fluctuation in one variable can be predicted by the other variable. • If r = 0.922, then r² = 0.850, which means that 85% of the total variation in y can be explained by the linear relationship between x and y (as described by the regression equation). The other 15% of the total variation in y remains unexplained. • The coefficient of determination is a measure of how well the regression line represents the data. If the regression line passes exactly through every point on the scatter plot, it would be able to explain all of the variation. The further the line is away from the points, the less it is able to explain. • Points above quoted directly from: http://mathbits.com/mathbits/tisection/statistics2/correlation.htm
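For anyone who would rather compute r and r² than use a spreadsheet, here is a minimal Python sketch. The x and y values are hypothetical numbers invented purely for illustration (they are not from the slides), and `pearson_r` is a name chosen for this example.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Linear correlation coefficient r for paired values."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

# Hypothetical paired measurements, for illustration only
x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 5, 4, 5]

r = pearson_r(x_values, y_values)
r_squared = r ** 2
print(f"r = {r:.3f}, r^2 = {r_squared:.3f}")

# The worked example from the slide: r = 0.922
slide_r_squared = 0.922 ** 2
print(f"0.922 squared = {slide_r_squared:.3f}")
```

For the made-up data, r is about 0.775 and r² is 0.600, so roughly 60% of the variation in y would be explained by the linear relationship with x.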

  24. According to this graph, is there an apparent correlation between number of pirates and global temperature? Do pirates cause the Earth to be cooler? From: http://statfail.com/wp-content/uploads/2010/03/graph_pirates_gw.png
