Applying the Normal Distribution: Z-Scores Chapter 3.5 – Tools for Analyzing Data Mathematics of Data Management (Nelson) MDM 4U
Comparing Data • Consider the following two students: • Student 1 • MDM 4U, Mr. Lieff, Semester 1, 2004-2005 • Mark = 84%, • Student 2 • MDM 4U, Mr. Lieff, Semester 2, 2005-2006 • Mark = 83%, • Can we compare the two students fairly when the mark distributions are different?
Mark Distributions for Each Class • [Figure: histograms of the mark distributions for Semester 1, 2004-05 and Semester 2, 2005-06]
Comparing Distributions • It is difficult to compare two distributions when they have different characteristics • For example, the two histograms have different means and standard deviations • z-scores allow us to make the comparison
The Standard Normal Distribution • A distribution with a mean of zero and a standard deviation of one, X~N(0,1²) • Each element of any normal distribution can be translated to the same place on a Standard Normal Distribution using the z-score of the element • The z-score is the number of standard deviations the piece of data is below or above the mean • If the z-score is positive, the data lies above the mean; if negative, below
Standardizing • The process of reducing the normal distribution to a standard normal distribution N(0,1²) is called standardizing • Remember that a standardized normal distribution has a mean of 0 and a standard deviation of 1
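Written out as a formula, standardizing a value x looks like the line below. The symbols μ and σ for the mean and standard deviation of X are standard notation assumed here; the slides state the rule only in words.

```latex
z = \frac{x - \mu}{\sigma},
\qquad
X \sim N(\mu, \sigma^2) \;\Longrightarrow\; Z = \frac{X - \mu}{\sigma} \sim N(0, 1^2)
```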
Example 1 • For the distribution X~N(10,2²) determine the number of standard deviations each value lies above or below the mean: • a. x = 7: z = (7 – 10)/2 = -1.5 • 7 is 1.5 standard deviations below the mean • b. x = 18.5: z = (18.5 – 10)/2 = 4.25 • 18.5 is 4.25 standard deviations above the mean (anything beyond 3 is an outlier)
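The arithmetic in Example 1 can be checked with a few lines of Python. This is a minimal sketch; the function name `z_score` is chosen here for illustration and does not come from the textbook.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


# Example 1: X ~ N(10, 2²), so mean = 10 and standard deviation = 2
for x in (7, 18.5):
    z = z_score(x, mean=10, sd=2)
    side = "above" if z > 0 else "below"
    print(f"x = {x}: z = {z}, which is {abs(z)} standard deviations {side} the mean")
```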
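Example continued… • [Figure: normal curve for X~N(10,2²) with the values 7 and 18.5 marked; 34% of the data lies within one standard deviation on each side of the mean, 13.5% between one and two, and 2.35% between two and three, so 95% lies within two standard deviations and 99.7% within three]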
Standard Deviation • A recent math quiz produced the following data: • mean = 68.0 • standard deviation = 10.9 • The z-scores offer a way to compare scores among members of the class, find out how many had a mark greater than yours, indicate position in the class, etc.
Example 2: • Suppose your mark was 64 • Compare your mark to the rest of the class • z = (64 – 68.0)/10.9 = -0.37 • Using the z-score table on page 398, we get 0.3557 or 35.6% • So 35.6% of the class has a mark less than or equal to yours
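If the table is not at hand, the same value can be computed directly. The sketch below assumes the table lists the cumulative area P(Z ≤ z) under the standard normal curve, which is what the 0.3557 figure suggests; it uses the error function from Python's `math` module, and the helper name `standard_normal_cdf` is made up for this example.

```python
import math


def standard_normal_cdf(z):
    """P(Z <= z) for Z ~ N(0, 1²), i.e. the area a z-score table lists."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


mean, sd = 68.0, 10.9
z = round((64 - mean) / sd, 2)  # rounded to 2 decimals, as for a table lookup
p = standard_normal_cdf(z)
print(f"z = {z}, area to the left = {p:.4f}")
```

Rounding z before the lookup mirrors how the printed table is used; without that rounding the area differs slightly in the third decimal place.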
Example 3: Percentiles • The kth percentile is the data value that is greater than k% of the population • If another student has a mark of 75, what percentile is this student in? • z = (75 - 68)/10.9 = 0.64 • From the table on page 398 we get 0.7389 or 73.9%, so the student is in the 74th percentile – their mark is greater than 74% of the others
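The percentile question can be asked in both directions: from a mark to its percentile, and from a percentile back to a mark. A minimal sketch using `statistics.NormalDist` from the Python standard library (3.8 or later), assuming the quiz marks are normally distributed as in the example:

```python
from statistics import NormalDist

# Quiz distribution from the example (assumed normal): mean 68.0, sd 10.9
marks = NormalDist(mu=68.0, sigma=10.9)

share_below_75 = marks.cdf(75)      # fraction of the class with a mark below 75
mark_at_74th = marks.inv_cdf(0.74)  # mark that is greater than 74% of the class

print(f"{share_below_75:.1%} of marks are below 75")
print(f"the 74th percentile is a mark of about {mark_at_74th:.1f}")
```

No z-score is rounded here, so the results can differ from the table values in the last digit.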
Example 4: Ranges • Now find the percent of data between a mark of 60 and 80 • For 60: • z = (60 – 68)/10.9 = -0.73 gives 23.3% • For 80: • z = (80 – 68)/10.9 = 1.10 gives 86.4% • 86.4% - 23.3% = 63.1% • So 63.1% of the class is between a mark of 60 and 80
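The range calculation is a difference of two cumulative areas. The sketch below wraps that in a small helper (the name `share_between` is hypothetical) and again assumes normally distributed marks.

```python
from statistics import NormalDist


def share_between(low, high, mean, sd):
    """Fraction of a normal population with values between low and high."""
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(high) - dist.cdf(low)


share = share_between(60, 80, mean=68.0, sd=10.9)
print(f"{share:.1%} of the class is between 60 and 80")
```

Because nothing is rounded along the way, this gives roughly 63.3% rather than the 63.1% obtained from the table, where both the z-scores and the percentages are rounded before subtracting.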
Back to the two students... • Student 1 • Student 2 • Student 2 has the lower mark, but a higher z-score!
Exercises • read through the examples on pages 180-185 • try page 186 #2-5, 7, 8, 10
Mathematical Indices Chapter 3.6 – Tools for Analyzing Data Mathematics of Data Management (Nelson) MDM 4U
What is an Index? • An index is an arbitrarily defined number that provides a measure of scale • Indices are used to indicate a value so that we can make comparisons, but they do not represent an actual measurement or quantity • Interval Data
1) BMI – Body Mass Index • A mathematical formula created to determine whether a person’s mass puts them at risk for health problems • BMI = m / h², where m = mass (kg), h = height (m) • Standard / Metric BMI Calculator http://nhlbisupport.com/bmi/bmicalc.htm • Underweight: below 18.5 • Normal: 18.5 - 24.9 • Overweight: 25.0 - 29.9 • Obese: 30.0 and above
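As a quick illustration, the index and its categories can be computed as follows. This is a sketch only: the person (70 kg, 1.75 m) is hypothetical, and treating each printed range as a half-open interval (so that, say, 24.95 counts as Normal) is an assumption, since the slide does not say how values between the printed ranges are classified.

```python
def bmi(mass_kg, height_m):
    """Body Mass Index: mass in kilograms divided by height in metres squared."""
    return mass_kg / height_m ** 2


def bmi_category(value):
    # Cut-offs taken from the slide; the handling of values that fall
    # between the printed ranges (e.g. 24.95) is an assumption.
    if value < 18.5:
        return "Underweight"
    if value < 25.0:
        return "Normal"
    if value < 30.0:
        return "Overweight"
    return "Obese"


value = bmi(70, 1.75)  # hypothetical person: 70 kg, 1.75 m
print(f"BMI = {value:.1f} ({bmi_category(value)})")
```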
2) Slugging Percentage • Baseball is the most statistically analyzed sport in the world • A number of indices are used to measure the value of a player • Batting Average (AVG) measures how often a player gets a hit (hits / at bats) • Slugging percentage (SLG) also takes into account the number of bases that a player earns (total bases / at bats) • SLG = TB / AB, where TB = 1B + 2B*2 + 3B*3 + HR*4 and AB = at bats, 1B = singles, 2B = doubles, 3B = triples, HR = homeruns
Slugging Percentage Example • e.g. 1B Miguel Cabrera, Detroit Tigers http://sports.yahoo.com/mlb/players/7163 • 2008 Statistics: 616 AB, 180 H, 36 2B, 2 3B, 37 HR • Every hit (H) already counts one base, so TB = H + 2B + 2*3B + 3*HR • SLG = TB / AB = (180 + 36 + 2*2 + 3*37) / 616 = 331 / 616 = 0.537 (3 decimal places) • This means Miggy attained 0.537 bases per AB
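The calculation can be reproduced from the stat line by first recovering the number of singles (hits that were not doubles, triples or home runs) and then applying the total-bases definition. A minimal sketch; the function names are illustrative.

```python
def total_bases(hits, doubles, triples, home_runs):
    """TB = 1B + 2*2B + 3*3B + 4*HR, with singles recovered from total hits."""
    singles = hits - doubles - triples - home_runs
    return singles + 2 * doubles + 3 * triples + 4 * home_runs


def slugging(at_bats, hits, doubles, triples, home_runs):
    """Slugging percentage: total bases per at bat."""
    return total_bases(hits, doubles, triples, home_runs) / at_bats


# Miguel Cabrera, 2008: 616 AB, 180 H, 36 2B, 2 3B, 37 HR
slg = slugging(at_bats=616, hits=180, doubles=36, triples=2, home_runs=37)
print(f"SLG = {slg:.3f}")
```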
3) Moving Average • Used when time-series data show a great deal of fluctuation (e.g. long term trend of a stock) • Takes the average of the previous n values • e.g. 5-Day Moving Average • cannot be calculated until the 5th day • value for Day 5 is the average of Days 1-5 • value for Day 6 is the average of Days 2-6 • e.g. Look up a stock symbol at http://ca.finance.yahoo.com • Click Charts → Technical chart • n-Day Moving Average
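The Day 5 and Day 6 rule above can be sketched in a few lines. The closing prices below are made-up numbers for illustration, not real stock data, and the function name is hypothetical.

```python
def moving_average(values, n):
    """n-value moving average; the first result corresponds to the n-th value."""
    return [sum(values[i - n:i]) / n for i in range(n, len(values) + 1)]


closing_prices = [10, 12, 11, 13, 14, 15]  # made-up prices for Days 1-6
averages = moving_average(closing_prices, 5)
print(averages)  # first entry is for Day 5 (Days 1-5), second for Day 6 (Days 2-6)
```

With six days of data and n = 5 there are only two values, which is why the moving-average line on a chart starts later than the price line.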
MSIP / Homework • read pp. 189-192 • Complete pp. 193-195 #1a (odd), 2-3 ac, 4 (alt: calculate SLG for 3 players on your favourite team for 2008), 8, 9, 11