1. Intro to LSP 121
Normal Distributions LSP 121
2. Welcome to LSP 121 Quantitative Reasoning and Technological Literacy II
Continuation of quantitative reasoning concepts from LSP 120
We know (from LSP 120) whether data is linear, near-linear, or exponential; now we'll discuss Normal distributions of data
Given a data set, what is its mean, median, standard deviation, and more (descriptive statistics)?
Given two sets of data, is there a correlation?
3. Welcome to LSP 121 Continuation of quantitative reasoning concepts from LSP 120
Given a set of data, can we calculate probability and risk?
How do we properly store data so that it can be easily retrieved? (databases)
How do we manipulate data, compress it, or check for errors? (algorithms)
If you feel you know this material, take the test
Let's get started!
4. What is a Normal Distribution? A very common, yet very special, type of data set
Most data values are clustered near the mean (a single peak)
Distribution is symmetric
Tapering tails as you move away from the mean
Looks like a bell curve
5. What are the mean and standard deviation? The mean (or average) is the sum of all values divided by the number of values (mean is not the same as the median)
The standard deviation is the widely used measure of variability, or dispersion of data points around the mean. It shows how much variation there is from the mean.
For example, given a set of values (35, 52, 65, 80, 84, 91), the mean is 67.83 and the (sample) standard deviation is 21.37. (Let's open Excel, enter these values, and compute mean and median.)
So if the mean is 67.83, what is one standard deviation above the mean? One standard deviation below the mean?
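The Excel steps on this slide can also be sketched in Python. This is a minimal sketch using only the standard library; the variable names are illustrative, and `statistics.stdev` is used on the assumption that the slide's 21.37 is the sample standard deviation (dividing by n - 1, as Excel's STDEV does).

```python
import statistics

# The six values from the slide.
values = [35, 52, 65, 80, 84, 91]

mean = statistics.mean(values)      # sum of all values / number of values
median = statistics.median(values)  # middle of the sorted values; not the same as the mean
stdev = statistics.stdev(values)    # sample standard deviation (divides by n - 1)

# One standard deviation above and below the mean.
one_sd_above = mean + stdev
one_sd_below = mean - stdev

print(f"mean = {mean:.2f}, median = {median}, stdev = {stdev:.2f}")
print(f"one SD above = {one_sd_above:.2f}, one SD below = {one_sd_below:.2f}")
```

Swapping in `statistics.pstdev` would give the population standard deviation instead, which is smaller for the same data.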
6. The 68-95-99.7 Rule About 68% (68.3%), or just over 2/3, of the data points fall within 1 standard deviation (+ or -) of the mean
About 95% (95.4%) of the data points fall within 2 standard deviations of the mean
About 99.7% of the data points fall within 3 standard deviations of the mean
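The three percentages in the rule can be checked numerically. The sketch below assumes Python 3.8 or later, where the standard library provides `statistics.NormalDist`; the helper name `share_within` is made up for illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.1%}")
```

Because the rule is stated in units of standard deviations, the same fractions hold for any normal distribution, whatever its mean and standard deviation.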
8. Example SAT exams were designed to produce normal distributions with a mean of 500 and a standard deviation of 100.
Thus, 68% of the students scored between 400 and 600
95% of the students scored between 300 and 700
99.7% scored between 200 and 800
What if someone scored 720 on the SAT? What percentage of students scored less than or equal to 720?
You try this. Use Excel's NORMDIST function
=NORMDIST(X, mean, stdev, true)
For our problem: =NORMDIST(720, 500, 100, TRUE)
Answer = 0.986097, or 98.6097%
What percentage scored <= 650?
What percentage scored > 760?
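For readers without Excel, the same lookups can be sketched in Python. This assumes `statistics.NormalDist` (Python 3.8+) as a stand-in for NORMDIST with the cumulative flag set to TRUE; the variable names are illustrative. A "greater than" question is answered by subtracting the cumulative value from 1.

```python
from statistics import NormalDist

# SAT scores as described on the slide: mean 500, standard deviation 100.
sat = NormalDist(mu=500, sigma=100)

# Counterpart of =NORMDIST(720, 500, 100, TRUE): share scoring <= 720.
at_or_below_720 = sat.cdf(720)

# Share scoring <= 650.
at_or_below_650 = sat.cdf(650)

# Share scoring > 760: everything that is not <= 760.
above_760 = 1 - sat.cdf(760)

print(f"<= 720: {at_or_below_720:.4%}")
print(f"<= 650: {at_or_below_650:.4%}")
print(f" > 760: {above_760:.4%}")
```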
9. Another Example A survey finds that prices paid for two-year-old Ford Explorers are normally distributed with a mean of $16,500 and a standard deviation of $500. Consider a sample of 10,000 people who bought two-year-old Ford Explorers. How many people paid between $16,000 and $17,000?
=NORMDIST(16000,16500,500,true) yields 0.158655
=NORMDIST(17000, 16500, 500, true) yields 0.841345
Subtract: 0.841345 - 0.158655 yields 0.682689
10000 x 0.682689 = 6826.89, or about 6,827 people
Or use the graph two slides back
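The "between two values" calculation is a difference of two cumulative probabilities. A minimal Python sketch, again assuming `statistics.NormalDist` in place of NORMDIST and using illustrative variable names:

```python
from statistics import NormalDist

# Prices for two-year-old Ford Explorers: mean $16,500, standard deviation $500.
prices = NormalDist(mu=16_500, sigma=500)
buyers = 10_000

# P(16,000 <= price <= 17,000) = P(price <= 17,000) - P(price <= 16,000)
share_between = prices.cdf(17_000) - prices.cdf(16_000)

# Expected number of buyers in that range.
expected_buyers = buyers * share_between

print(f"share between $16,000 and $17,000: {share_between:.6f}")
print(f"expected buyers: {expected_buyers:.0f}")
```

Since $16,000 and $17,000 are exactly one standard deviation below and above the mean, the share matches the 68% figure from the 68-95-99.7 rule.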
10. Another Example How many paid less than or equal to $16,000?
=NORMDIST(16000, 16500, 500, true) yields 0.158655, or 15.8655 %
10000 x 0.158655 = 1586.55, or about 1,587 people
What is another way of saying "What percentage of values are less than or equal to some value X?" (see next slide)
11. Percentiles The nth percentile of a data set is the smallest value in the set with the property that n% of the data values are less than or equal to it.
At the mean, 50% (or 0.50) of all the values are less than or equal to the mean. The mean is the 50th percentile.
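The data-set definition above can be turned into a small function. This is a sketch under one stated assumption: "n% of the data values are less than or equal to it" is read as "at least n%", since an exact match is not always possible in a small data set. The function name `nth_percentile` is hypothetical, and the sample values are the six numbers from the earlier mean and standard deviation slide.

```python
def nth_percentile(data, n):
    """Smallest value in data such that at least n% of the values are <= it."""
    if not data:
        raise ValueError("data must not be empty")
    if not 0 <= n <= 100:
        raise ValueError("n must be between 0 and 100")
    ordered = sorted(data)
    for value in ordered:
        # Count how many data values are less than or equal to this one.
        count = sum(1 for v in ordered if v <= value)
        # Integer comparison avoids floating-point surprises: count/len >= n/100.
        if count * 100 >= n * len(ordered):
            return value

values = [35, 52, 65, 80, 84, 91]
print(nth_percentile(values, 50))
print(nth_percentile(values, 90))
```

Other percentile conventions exist (spreadsheets, for example, often interpolate between data values), so results for small data sets can differ from this definition.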
12. Example Cholesterol levels in men 18 to 24 years of age are normally distributed with a mean of 178 and a standard deviation of 41.
In what percentile is a man with a cholesterol level of 190?
Using Excel's normdist function:
=normdist(190,178,41,true) returns 0.61, or 61st percentile
13. Standard Scores The number of standard deviations a data value lies above or below the mean is called its Standard Score, or z-score, or simply z.
The standard score of the mean is z=0
The standard score of a data value 1.5 standard deviations above the mean is z=1.5
14. Standard Scores The standard score of a data value 2.4 standard deviations below the mean is z = -2.4
In general:
z = (data value - mean) / standard deviation
15. Example The Stanford-Binet IQ test is designed so that scores are normally distributed with a mean of 100 and a standard deviation of 16. What are the z-scores for IQ scores of 95 and 125?
z = (95 - 100) / 16 = -0.31
z = (125 - 100) / 16 = 1.56
Thus, an IQ score of 125 lies 1.56 standard deviations above the mean.
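The standard-score formula translates directly into code. A minimal sketch, with `z_score` as an illustrative function name; the values are printed unrounded, whereas the slide rounds them to two decimals.

```python
def z_score(value, mean, stdev):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / stdev

# Stanford-Binet IQ: mean 100, standard deviation 16.
for iq in (95, 125):
    print(f"IQ {iq}: z = {z_score(iq, 100, 16)}")
```

A negative result means the value is below the mean, and a value equal to the mean gives z = 0.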
16. Inverse Normal Distribution Function What if you know the mean, standard deviation, and percentile, and want to know the X value?
You can guess using NORMDIST, or better yet use Excel's NORMINV
For example, if a set of scores has a mean of 76 and a standard deviation of 12, and some value is at the 86th percentile, what is that value? =NORMINV(0.86, 76, 12) returns about 88.96
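The inverse lookup can be sketched in Python as well. This assumes `statistics.NormalDist.inv_cdf` (Python 3.8+) as the counterpart of NORMINV; the variable names are illustrative. Feeding the result back into `cdf` is a quick way to confirm the two functions undo each other.

```python
from statistics import NormalDist

# Scores with mean 76 and standard deviation 12.
scores = NormalDist(mu=76, sigma=12)

# The value at the 86th percentile (percentile given as a fraction, 0.86).
value_at_86th = scores.inv_cdf(0.86)

# Round trip: the cumulative probability at that value should be 0.86 again.
round_trip = scores.cdf(value_at_86th)

print(f"value at the 86th percentile: {value_at_86th:.2f}")
print(f"cdf at that value: {round_trip:.4f}")
```

The result is roughly 89, in line with the answer given on the slide.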
Let's try the first activity. You may work alone or in pairs. Homework assignments should be completed on your own. Homework and activity are due no later than 1 week from now.