Statistics Math 416
Game Plan • Introduction • Census / Poll / Survey • Population – Sample – Bias • Sample Proportion • Mean Median Mode • Box and Whisker Plot • Box and Whisker Interpretation
Stats Intro • There are lies, there are damn lies and then there are statistics - Mark Twain • The goal is to use numbers to describe a characteristic of a population. • The idea is to win your argument by providing facts, and too many people consider statistics to be absolute facts.
Stats Intro • In general, most people do not understand statistics. • Hypothesis: Student A has a school average of 10% • Conclusion: Student A is a bad person. • The statistic does not measure the person’s goodness or badness. • What does that statistic mean? • If the student’s marks were the same in every course, each mark would be 10%
Statistics • Life is a continual battle to get your ideas across while other people try to get their ideas across to you. • You are constantly being bombarded by arguments and statistics. • Commercials • Teachers • To understand the world around you, you need to be aware of what statistics mean and how reliable they are. • Where do statistics come from?
Population • First we establish the population. • Population: the complete group that we are investigating • Characteristic: a particular identifying trait exhibited by the population, e.g. hair colour, favorite colour, math knowledge, political opinion, etc.
Population • The next problem is deciding how to measure a characteristic and obtain the data. • Obtaining the data: three methods (census, poll, survey)
Census • Method #1: Ask the whole population • Called a census • Problem: hard to do, depending on the population
Poll • Method #2: Ask a representative “sample” of the population • Called a poll • Problem: making the sample representative may be tricky
Survey • Method #3: Ask only experts within the population • Called a survey • Problems: who is an expert? Is the sample representative?
Bias • If data is obtained or presented in an unfair manner, then none of the conclusions can be trusted. The results are said to be biased (or unfair). • In collection: how you ask and who you ask are the main sources of bias • There are 4 types of bias (bad sampling, non-pertinence, wording of question & attitude of pollster).
Bias • Bad sampling: Eg asking 5-year-olds their favorite beer • Non-pertinence: Eg asking “Do you like to play an instrument?” to find a favorite colour • Wording of question: Eg “Man, I hate Bush. Are you in favor of war?” • Attitude of pollster: Eg a policeman asking “Were you speeding?”
Presentation Bias • In presentation, imagine you disregard a grade level and claim that it does not matter in a school’s decision. • “I need to prove my product is the best. How can I get these numbers to show that?”
Buy This Stock! • [Chart: stock price ($0 to $500) by month, Jan to April] • Not! • A statistical presentation is always biased • Stencil #1-3
Representative Sample • Creating a representative sample can be an art form in itself. Ideally the sample would match the population in every proportion, which is impossible. • You must focus on the characteristics that the poll or survey is measuring!
Representative Sample • Consider a school with 50 boys and 25 girls; a representative sample of 10 needs to be created. • The population is described in terms of boys and girls, so we need to build our sample on that basis • Three steps
Representative Sample • 1) Relative (by percent): n = 75; boys 50/75 = 67%; girls 25/75 = 33% • 2) Theory (sample = 10): boys 10 × 0.67 = 6.7; girls 10 × 0.33 = 3.3. Difficult to get 0.7 or 0.3 of a person! • 3) Reality: 7 boys and 3 girls, for a total of 10. Rounding has added bias
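The three steps (relative, theory, reality) can be sketched in a few lines of Python. This is only an illustration: the function name `allocate_sample` is made up here, and it assumes "reality" means rounding each theoretical count to the nearest whole person.

```python
def allocate_sample(population, sample_size):
    """Return the relative, theory and reality figures for each category."""
    n = sum(population.values())
    # Step 1, relative: each category's share of the population
    relative = {name: count / n for name, count in population.items()}
    # Step 2, theory: the share applied to the sample size (may be fractional)
    theory = {name: share * sample_size for name, share in relative.items()}
    # Step 3, reality: whole people only (assumption: round to nearest)
    reality = {name: round(value) for name, value in theory.items()}
    return relative, theory, reality


relative, theory, reality = allocate_sample({"boys": 50, "girls": 25}, 10)
print(reality)  # {'boys': 7, 'girls': 3}
```

Rounded counts do not always add back up to the sample size, so the total should be checked after step 3.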
Some Rules • If a category starts at zero, it stays at zero • If a category appears to be zero, be careful! • Make decisions category by category, not overall
Creating a Sample • Given the following, create a sample of 10 (n = 109)
Population     Hudson   Non-Hudson
Young            0          1
Middle Aged     20         24
Old             31         33
Relative       Hudson   Non-Hudson
Y                0          0    (Is it really 0 people?)
MA              18%        22%
O               28%        30%
Creating a Sample • Stencil 4, 5, 6: do relative, theory and reality for #4; in #5 & 6 put theory & relative together
Theory         Hudson   Non-Hudson
Y                0          0
MA               1.8        2.2
O                2.8        3
Reality        Hudson   Non-Hudson
Y                0          0    (open here)
MA               2          2
O                3          3
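As a rough check on the Hudson table, the sketch below recomputes theory and reality for each cell and lists any cell that exists in the population but rounds down to nobody in the sample. The variable names are illustrative, and rounding to the nearest whole number is an assumption.

```python
# Population counts from the table: (age group, town) -> number of people
counts = {
    ("Young", "Hudson"): 0,
    ("Young", "Non-Hudson"): 1,
    ("Middle Aged", "Hudson"): 20,
    ("Middle Aged", "Non-Hudson"): 24,
    ("Old", "Hudson"): 31,
    ("Old", "Non-Hudson"): 33,
}
sample_size = 10
n = sum(counts.values())  # 109

# Theory: fractional number of people each cell should contribute
theory = {cell: sample_size * count / n for cell, count in counts.items()}
# Reality: whole people (assumption: round to nearest)
reality = {cell: round(value) for cell, value in theory.items()}

# Cells that are not truly zero but vanish from the sample after rounding
vanished = [cell for cell, count in counts.items()
            if count > 0 and reality[cell] == 0]

print(sum(reality.values()))  # 10
print(vanished)               # [('Young', 'Non-Hudson')]
```

The one young non-Hudson person is the "appears to be zero" case: the category is real, but a sample of 10 is too small to show it.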
Statistics Central Tendency - Mean • Mean means the average • Symbol: $\bar{x}$ • Found by dividing the sum $\sum x_i$ by the number of elements $n$, i.e. $\bar{x} = \frac{\sum x_i}{n}$ • It answers: which value would all values be equal to if they were all the same? • Eg (5,9,3,6): $\bar{x} = \frac{\sum x_i}{n} = \frac{5+9+3+6}{4} = 5.75$
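A minimal Python sketch of the mean, using the slide's own example. The function name `mean` is just a label chosen here, and it assumes a non-empty list.

```python
def mean(values):
    """Sum of the values divided by how many there are."""
    return sum(values) / len(values)


data = [5, 9, 3, 6]
x_bar = mean(data)
print(x_bar)  # 5.75

# If every value were replaced by the mean, the total would not change
print(x_bar * len(data) == sum(data))  # True
```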
Mode • Symbol M • It is the number that appears the most • It is possible to have no mode or to have more than one mode • Eg (1,2,5): no mode (nothing repeats) • Eg (1,6,6,8): M = 6 • Eg (1,3,3,4,4,8): M = 3 & 4
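The three mode cases can be reproduced with a short sketch. The helper name `modes` is made up here; it returns a list so that "no mode" and "more than one mode" both fit, and it assumes a non-empty sample.

```python
from collections import Counter


def modes(values):
    """Return every value tied for most frequent, or [] if nothing repeats."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # nothing repeats, so there is no mode
    return sorted(value for value, count in counts.items() if count == highest)


print(modes([1, 2, 5]))           # []
print(modes([1, 6, 6, 8]))        # [6]
print(modes([1, 3, 3, 4, 4, 8]))  # [3, 4]
```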
Median • Symbol M • The median is the middle value • Note the sample must be in order! • There are two possibilities (odd & even) • Odd: consider (1,5,7), n = 3; there is only 1 middle value, so M = 5 • Even: consider (1,5,7,8), n = 4; you must find the mean of both middle values, (5 + 7)/2 = 6 • Do #7
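The odd and even cases can be sketched as follows. The function name `median` is chosen for illustration, and the sample is sorted first because it must be in order.

```python
def median(values):
    """Middle value of the ordered sample (mean of the two middles if n is even)."""
    ordered = sorted(values)  # the sample must be in order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]  # odd: a single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even: mean of both middles


print(median([1, 5, 7]))     # 5
print(median([1, 5, 7, 8]))  # 6.0
```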
Box & Whisker Plot • Sample: (2,5,1,6,9,8) • The Construction • 1) Make sure your sample is in order: (1,2,5,6,8,9) • 2) Find the min, max & median: min = 1; max = 9; median = 5.5 = Q2 • These three points will serve as part of the box and whisker diagram. Draw it on board…
Box & Whisker Plot (1,2,5,6,8,9) • [Number line from -1 to 11] • 3) Create a number line with a vertical line (a hinge) at each of the three points • 4) Find the median between min and Q2, called Q1. It is 2; make another hinge • 5) Find the median between Q2 and max, called Q3. It is 8; make another hinge. Complete it!
Words & Facts • We have broken the data into four parts called quartiles. • [Diagram labels: Min, Q1, Q2, Q3, Max; the box runs from Q1 to Q3, with a whisker on each side] • Interquartile range = Q3 - Q1
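The construction steps and the interquartile range can be sketched in Python. The names are illustrative, and one point is an assumption: when n is odd, the middle value is left out of both halves before Q1 and Q3 are found (the slides only show an even-sized sample). The sketch also assumes at least two values.

```python
def median(ordered):
    """Median of a list that is already in order."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def five_number_summary(values):
    """Min, Q1, Q2, Q3 and max, the five hinges of a box and whisker plot."""
    ordered = sorted(values)  # step 1: put the sample in order
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]  # values between min and Q2
    # Assumption: for odd n the middle value belongs to neither half
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return {
        "min": ordered[0],
        "Q1": median(lower),
        "Q2": median(ordered),
        "Q3": median(upper),
        "max": ordered[-1],
    }


summary = five_number_summary([2, 5, 1, 6, 9, 8])
print(summary)  # {'min': 1, 'Q1': 2, 'Q2': 5.5, 'Q3': 8, 'max': 9}

iqr = summary["Q3"] - summary["Q1"]
print(iqr)  # 6
```

Other textbooks and software use slightly different quartile rules, so results on other samples may differ from a calculator or spreadsheet.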
Words & Facts • Each quartile should hold about ¼ of the data BUT you cannot be sure • You cannot tell the mean or the mode • Do not jump to conclusions! • A box and whisker plot gives you an idea about the spread (concentration or dispersion) of the data
Example #1 • [Box and whisker plot on a number line from 1 to 12] • A general view: the data is very close together below 4. There is more of a spread between 4 and 11, and the data is close together once again between 11 and 12. • Some Questions… • What is the mean? No idea
Questions • What is the mode? No idea • What is the median? Q2 = 4 • What is the interquartile range? Q3 - Q1 = 11 - 3 = 8 • How many are below 11? 75%, but no idea of the number • Where does the lowest concentration of numbers lie? Between 4 and 11 • Lowest concentration vs. highest concentration
Example #2 • [Box and whisker plots for Class A (n = 20) and Class B (n = 40) on a scale from 30 to 100] • a) Which class did better? Hard to tell, but Class A • b) What are the means? No idea • c) All together, approximately how many were over 60%? ¾ × 20 + 2/4 × 40 = 35
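The estimate in part c) relies on each quartile section holding about a quarter of the class. A small sketch of that arithmetic is below. The function name is made up, and the values 3 and 2 (the number of quartile sections above 60%) are taken from the slide's own calculation, since the plot itself is not reproduced here.

```python
from fractions import Fraction


def estimated_count_above(n, quarters_above):
    """Approximate count, assuming each quartile section holds about n/4 values."""
    return Fraction(quarters_above, 4) * n


class_a = estimated_count_above(20, 3)  # 3/4 of 20 students
class_b = estimated_count_above(40, 2)  # 2/4 of 40 students
total = class_a + class_b
print(class_a, class_b, total)  # 15 20 35
```

This is only approximate: a box and whisker plot shows where the quartiles fall, not the exact number of marks in each section.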
Example #2 – More Questions • Which class and which mark was the highest? Class B at approximately 97% • Which class has the lowest range? Class A: 87 - 55 = 32; Class B: 97 - 40 = 57 • Answer: Class A • Finish Stencil