This talk provides a clear explanation of the central limit theorem, its importance, and how to incorporate it into statistics classes. It includes in-class simulations and examples to make the concept more intuitive.
The Central Limit Theorem: “What is it and how do I present it?”
Outline
• National Standards and the Design of My Talk
• Available resources
• Description of central limit theorem
• In-class simulations
• Examples
• Why is the central limit theorem so important?
• Conclusions
Dr. Derek Webb, Bemidji State University
National Standards and the Design of My Talk
• The central limit theorem is not specifically addressed outside the AP course.
• It is encountered and discussed indirectly in many contexts, including simulations, discussions of distributions, and inferential statistics.
• My talk gives teachers background and enables them to incorporate the central limit theorem into their classes if they wish.

National Standards and the Design of My Talk
• I attended the National Council of Teachers of Mathematics Academy on Probability and Statistics in Feb. 2004.
• The central limit theorem was brought up many times by facilitators and teachers.
• Most had a vague (sometimes correct, sometimes incorrect) understanding of the central limit theorem.
What is out there?
• Books emphasize the theoretical.
• Examples are brief and unmotivating.
• A Google search for “central limit theorem” yielded 69,700 hits.
• The pages I visited are technical, complicated, and confusing.
• Examples are not very intuitive.
Historical Perspective: Pierre-Simon Laplace
• 1749-1827
• Lived in France
• First person to study the central limit theorem in depth and give a proof of its validity.

Historical Perspective: Carl Friedrich Gauss
• 1777-1855
• Lived in Germany
• Did extensive work with the normal distribution and probability theory.
What is the Central Limit Theorem?
The Central Limit Theorem is not a result about individual observations
Individual observations of a random sample: $x_1, x_2, \ldots, x_n$

The Central Limit Theorem describes the properties of the following two quantities as n gets larger
The sample mean or average of a random sample of size n: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
The sum or total of the sample of size n: $T = \sum_{i=1}^{n} x_i$
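As a minimal sketch of these two quantities, the snippet below computes the sample sum and the sample mean for one small sample. The five values are made up for illustration and do not come from the talk.

```python
# Hypothetical sample of n = 5 observations (illustrative values only).
sample = [21, 19, 24, 20, 26]

n = len(sample)
sample_sum = sum(sample)       # T: the total of the n observations
sample_mean = sample_sum / n   # x-bar: the total divided by n

print(f"n = {n}, T = {sample_sum}, sample mean = {sample_mean}")
```

The central limit theorem says nothing about the five individual values; it concerns how `sample_mean` and `sample_sum` behave across many such samples as n grows.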
In-Class Simulations
• Can take lots of time
• Many samples necessary
• Students may lose track of “big picture”
• Multiple class days may be necessary
Example Using Sample Mean
• Population: 65 introductory statistics students.
• Variable: age
17 18 18 18 19 19 19 19 20 20 20 20 20 20 20 20 20 20 21 21 21 21 21 21 21 21 21 21 22 22 22 22 22 22 22 22 22 23 23 23 23 23 24 24 24 24 24 25 25 25 26 26 27 27 29 29 30 32 32 33 33 34 38 43 50
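For anyone who wants to work with this population on a computer, here is a short sketch that enters the 65 ages and computes their mean and median. The variable names are my own; the ages are transcribed from the slide as value-and-count pairs.

```python
from statistics import mean, median

# Ages from the slide, stored as {age: number of students with that age}.
age_counts = {
    17: 1, 18: 3, 19: 4, 20: 10, 21: 10, 22: 9, 23: 5, 24: 5, 25: 3,
    26: 2, 27: 2, 29: 2, 30: 1, 32: 2, 33: 2, 34: 1, 38: 1, 43: 1, 50: 1,
}
ages = [age for age, count in age_counts.items() for _ in range(count)]

population_mean = mean(ages)
population_median = median(ages)

print(f"population size = {len(ages)}")
print(f"mean = {population_mean:.1f}, median = {population_median}")
```

The mean sitting above the median is one quick numerical sign of the right skew that the histogram and dot plot show.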
Example Using Sample Mean
• Displaying the population using a histogram
• Population skewed right

Example Using Sample Mean
• Displaying the population using a dot plot
• Population skewed right
Sample of Size 1
• Take a random sample of size n=1 from the population of 65 ages.
• The sample mean of a sample of size n=1 is the single observed value: $\bar{x} = x_1$.
Distribution of $\bar{x}$
• There are 65 possible samples of size n=1.
• Therefore, there are 65 values of $\bar{x}$.
• The distribution of all possible values of $\bar{x}$ is called the sampling distribution of $\bar{x}$.
Sample of Size 2
• Take a random sample of size n=2 from the population of 65 ages.
• The sample mean of a sample of size n=2 is $\bar{x} = (x_1 + x_2)/2$.
Distribution of $\bar{x}$
• There are 2,080 possible samples of size n=2.
• Therefore, there are 2,080 values of $\bar{x}$.
• The distribution of all possible values of $\bar{x}$ is called the sampling distribution of $\bar{x}$.
Sample of Size 3
• Take a random sample of size n=3 from the population of 65 ages.
• The sample mean of a sample of size n=3 is $\bar{x} = (x_1 + x_2 + x_3)/3$.
Distribution of $\bar{x}$
• There are 43,680 possible samples of size n=3.
• Therefore, there are 43,680 values of $\bar{x}$.
• The distribution of all possible values of $\bar{x}$ is called the sampling distribution of $\bar{x}$.

Distribution of $\bar{x}$
• The number of possible samples increases very quickly as n increases.
• For n=5 there are 8,259,888 possible samples.
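The sample counts quoted on these slides are the number of ways to choose n students out of 65 when order does not matter and no student is chosen twice. A short sketch using Python's built-in `math.comb` reproduces them; the n=10 line is an extra case added for comparison.

```python
from math import comb

population_size = 65

# Number of distinct unordered samples of size n drawn without replacement.
for n in (1, 2, 3, 5, 10):
    print(f"n = {n:2d}: {comb(population_size, n):,} possible samples")
```

Because the counts grow this fast, listing every sample by hand stops being practical almost immediately, which is why simulations rely on a large random selection of samples instead.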
Distribution of $\bar{x}$
• Sampling distribution of $\bar{x}$ for n=10.
• Note: shape starting to look very symmetric.
• Note: range of sample mean is decreasing.
• Shape of the normal distribution appearing.
Distribution of $\bar{x}$ - Summary
• The original distribution of ages is very skewed and disjoint.
• The distribution of possible $\bar{x}$ values is called the sampling distribution of $\bar{x}$.
• As sample size n increases, the sampling distribution of $\bar{x}$ becomes symmetrical.
• The larger the sample size, the more closely the distribution resembles the normal distribution.
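One way to see this summary in action is a small simulation. The sketch below is my own illustration, not the software used in the talk: it draws 5,000 random samples (without replacement) for each of several sample sizes and reports the center and spread of the resulting sample means. The function name, the seed, the number of samples, and the choice of sample sizes are all assumptions.

```python
import random
from statistics import mean, pstdev

# The 65 ages from the example, stored as {age: count}.
age_counts = {
    17: 1, 18: 3, 19: 4, 20: 10, 21: 10, 22: 9, 23: 5, 24: 5, 25: 3,
    26: 2, 27: 2, 29: 2, 30: 1, 32: 2, 33: 2, 34: 1, 38: 1, 43: 1, 50: 1,
}
ages = [age for age, count in age_counts.items() for _ in range(count)]


def simulate_sample_means(population, n, num_samples, seed=0):
    """Draw num_samples random samples of size n without replacement
    and return the list of their sample means."""
    rng = random.Random(seed)  # fixed seed so a class sees the same run
    return [mean(rng.sample(population, n)) for _ in range(num_samples)]


for n in (1, 5, 10, 15):
    means = simulate_sample_means(ages, n, num_samples=5000)
    print(f"n = {n:2d}: mean of sample means = {mean(means):.2f}, "
          f"SD of sample means = {pstdev(means):.2f}")
```

Plotting a histogram of `means` for each n shows the skew fading and the spread shrinking as n grows.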
Distribution of $\bar{x}$ - Summary
• The mean of the sampling distribution of $\bar{x}$ equals the mean of the original distribution: $\mu_{\bar{x}} = \mu$.
• In each of the four distributions shown: Mean = 23.9.
Distribution of $\bar{x}$ - Summary
• The standard deviation (spread) of the sampling distribution of $\bar{x}$ equals the standard deviation of the original distribution divided by the square root of the sample size: $\sigma_{\bar{x}} = \sigma / \sqrt{n}$.
• The standard deviation of the sampling distribution decreases as n increases. Across the four distributions shown: Standard Deviation = 5.88, 2.63, 1.86, 1.52.
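The four standard deviations can be checked against the square-root rule. The sketch below assumes that 5.88 is the standard deviation of the original distribution and that the other three values belong to n = 5, 10, and 15. Those sample sizes are not stated in the slide text; they are inferred because they reproduce the quoted numbers.

```python
from math import sqrt

population_sd = 5.88  # first value quoted, assumed to be the original SD

# Assumed sample sizes (inferred, not stated in the slide text).
for n in (1, 5, 10, 15):
    sd_of_mean = population_sd / sqrt(n)
    print(f"n = {n:2d}: SD of sample mean = {sd_of_mean:.2f}")
```

Note that quadrupling the sample size only halves the spread, since the divisor is the square root of n rather than n itself.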
Example Using the Sample Sum: T
• Recall: $T = \sum_{i=1}^{n} x_i$
• Population: U.S. adults
• Sample: Full flight on a Boeing 747
• Boeing 747 holds 358 passengers
• Average weight of U.S. male: 172 lbs
• Average weight of U.S. female: 143 lbs
• Percent of U.S. population that is male: 49.1%
• Percent of U.S. population that is female: 50.9%
Distribution of Population
• Combination of two distributions, one for females and one for males.

Distribution of T
• Distribution of total passenger weight for n=5
• Mean of distribution is 787.35 pounds
• Red lines indicate middle 95% of distribution, which is (681 lbs, 894 lbs)

Distribution of T
• Distribution of total passenger weight for n=10
• Mean of distribution is 1574.8 pounds
• Red lines indicate middle 95% of distribution, which is (1424 lbs, 1725 lbs)

Distribution of T
• Distribution of total passenger weight for n=100
• Mean of distribution is 7.8 tons
• Red lines indicate middle 95% of distribution, which is (7.6 tons, 8.1 tons)

Distribution of T
• Distribution of total passenger weight for n=358
• Mean of distribution is 28.2 tons
• Red lines indicate middle 95% of distribution, which is (27.7 tons, 28.6 tons)
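The four intervals above can be used to check how the spread of T grows. The sketch below takes the half-width of each middle-95% interval and divides it by the square root of n; if the spread of T grows like the square root of n, the result should be roughly constant. It assumes the tons on the slides are short tons of 2,000 pounds, which the slides do not state.

```python
from math import sqrt

POUNDS_PER_TON = 2000  # assumption: short tons

# n: (lower, upper) bounds of the middle 95% of T, converted to pounds.
intervals = {
    5: (681, 894),
    10: (1424, 1725),
    100: (7.6 * POUNDS_PER_TON, 8.1 * POUNDS_PER_TON),
    358: (27.7 * POUNDS_PER_TON, 28.6 * POUNDS_PER_TON),
}

scaled_half_widths = {}
for n, (lower, upper) in intervals.items():
    half_width = (upper - lower) / 2
    scaled_half_widths[n] = half_width / sqrt(n)
    print(f"n = {n:3d}: half-width = {half_width:6.1f} lb, "
          f"half-width / sqrt(n) = {scaled_half_widths[n]:.1f}")
```

The scaled values stay within a few pounds of each other, with the small differences explained by the rounding to a tenth of a ton on the last two slides.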
Distribution of T - Summary
• The original distribution of potential adult passengers is complex and non-normal.
• The distribution of possible T values is called the sampling distribution of T.
• The larger the sample size, the more closely the distribution of T resembles the normal distribution.
Distribution of T - Summary
• The mean of the sampling distribution of T equals n times the mean of the original distribution: $\mu_T = n\mu$.
• The mean of the original distribution is $\mu$ pounds.
• The mean of the distribution of T for n=358 is $\mu_T = 358\mu$.
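As a rough check of this rule, the sketch below estimates the mean of the original distribution as a weighted average of the two group averages given earlier, then multiplies by n=358. This is my own back-of-the-envelope calculation under the assumption that the population is a simple male/female mixture and that a ton is 2,000 pounds; it is not necessarily how the figures on the slides were produced.

```python
# Figures quoted in the example.
share_male, share_female = 0.491, 0.509
mean_male_lb, mean_female_lb = 172, 143

POUNDS_PER_TON = 2000  # assumption: short tons

# Mean of a two-group mixture: weight each group mean by its share.
population_mean_lb = share_male * mean_male_lb + share_female * mean_female_lb

n = 358  # passengers on a full flight
mean_total_lb = n * population_mean_lb
mean_total_tons = mean_total_lb / POUNDS_PER_TON

print(f"estimated population mean = {population_mean_lb:.1f} lb")
print(f"estimated mean of T for n = {n}: {mean_total_lb:,.0f} lb "
      f"(about {mean_total_tons:.1f} tons)")
```

The estimate lands close to the 28.2 tons shown for n=358; a small gap is expected if the slide values come from a simulated population.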
Distribution of T - Summary
• The standard deviation (spread) of the sampling distribution of T equals the square root of the sample size times the standard deviation of the original distribution: $\sigma_T = \sqrt{n}\,\sigma$.
• The standard deviation of the original distribution is $\sigma$.
• The standard deviation of the distribution of T for n=358 is $\sigma_T = \sqrt{358}\,\sigma$.
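For teachers who want to see where the square root comes from, here is a short derivation. It assumes the n observations are independent and share the same standard deviation $\sigma$, which is the usual setting for the central limit theorem.

```latex
\[
T = \sum_{i=1}^{n} x_i
\quad\Longrightarrow\quad
\mathrm{Var}(T) = \sum_{i=1}^{n} \mathrm{Var}(x_i) = n\sigma^2
\quad\Longrightarrow\quad
\sigma_T = \sqrt{n}\,\sigma .
\]
\[
\bar{x} = \frac{T}{n}
\quad\Longrightarrow\quad
\sigma_{\bar{x}} = \frac{\sigma_T}{n} = \frac{\sigma}{\sqrt{n}} .
\]
```

The same fact therefore explains both summaries: the spread of the total grows like $\sqrt{n}$, while the spread of the mean shrinks like $1/\sqrt{n}$. For n=358 the multiplier is $\sqrt{358} \approx 18.9$.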
Why is the Central Limit Theorem so Important?
• Most populations we study are not normally distributed.
• Elementary hypothesis tests and confidence interval methods require a normally distributed population OR a large sample size (a common rule of thumb is $n \geq 30$).
• These methods are frequently used and are often the only methods students learn.
Why is the Central Limit Theorem so Important?
• Teachers and students may notice that a simulation exercise in class results in a histogram that looks normal.
• Teachers need to be aware of why it looks normal; that is, they need to understand the central limit theorem and know when it shows up in simulation exercises.
Questions?