Ways to look at the data: box plot, dot plot, histogram. Example data: the number of hurricanes that occurred each year from 1944 through 2000, as reported by Science magazine.
3 characteristics of data: shape, center, spread.
Shape of the data – Symmetric. Example: the age of all US Presidents at the time they took office. Notice that this distribution has only one mode.
Shape of the data – Bimodal. Example: the winning times in the Kentucky Derby from 1875 to the present. Why two modes? The length of the track was reduced from 1.5 miles to 1.25 miles in 1896. The race officials thought that 1.5 miles was too far.
Shape of the data – Skewed (left or right). Data for two different variables for all female heart attack patients in New York state in one year. One is skewed left; the other is skewed right. Which is which?
Center and spread of data
• Maximum = 100th percentile
• Q3 = 75th percentile
• Median = 50th percentile
• Q1 = 25th percentile
• Minimum = 0th percentile
These numbers are called the 5-number summary. The median measures the center of the data. The interquartile range (IQR), Q3 – Q1, measures the spread.
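As a rough illustration of the 5-number summary and the IQR, here is a minimal Python sketch. It assumes the common textbook convention in which Q1 and Q3 are the medians of the lower and upper halves of the sorted data (with the middle value left out of both halves when the count is odd); other quartile conventions exist and can give slightly different answers. The function name `five_number_summary` and the example values are made up for this sketch.

```python
from statistics import median


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a non-empty list.

    Assumption: quartiles are the medians of the lower and upper halves;
    the middle value is excluded from both halves when the count is odd.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]       # values below the median position
    upper = data[n - half:]   # values above the median position
    q1 = median(lower) if lower else data[0]
    q3 = median(upper) if upper else data[-1]
    return data[0], q1, median(data), q3, data[-1]


# Hypothetical data, not the hurricane counts from the slides.
example = [1, 2, 3, 4, 5, 6, 7]
minimum, q1, med, q3, maximum = five_number_summary(example)
iqr = q3 - q1  # interquartile range: Q3 minus Q1
```

For the example list, the median is 4, the quartiles are 2 and 6, and the IQR is 4.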
Symbols:
• s² = sample variance
• s = sample standard deviation
• σ² = population variance (population standard deviation squared)
• σ = population standard deviation (square root of the variance)
• x̄ = mean
Remember: the variance is the SD squared, and the SD is the square root of the variance.
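To make the sample and population symbols concrete, the sketch below uses Python's standard `statistics` module, where `variance` and `stdev` are the sample versions (dividing by n - 1) and `pvariance` and `pstdev` are the population versions (dividing by n). The data values are hypothetical.

```python
from statistics import mean, pstdev, pvariance, stdev, variance

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values, not from the slides

x_bar = mean(scores)          # mean

sigma_sq = pvariance(scores)  # population variance (divides by n)
sigma = pstdev(scores)        # population standard deviation

s_sq = variance(scores)       # sample variance (divides by n - 1)
s = stdev(scores)             # sample standard deviation

# The SD is the square root of the variance in both cases.
sd_matches_variance = abs(s * s - s_sq) < 1e-9 and abs(sigma * sigma - sigma_sq) < 1e-9
```

With these values the mean is 5, the population variance is 4, and the population standard deviation is 2; the sample variance is a little larger (32/7) because it divides by 7 instead of 8.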
The normal distribution and standard deviations
In a normal distribution:
• The total area under the curve is 1.
• Approximately 68% of scores will fall within one standard deviation of the mean.
• Approximately 95% of scores will fall within two standard deviations of the mean.
[Figure: normal curve divided at each standard deviation, with band areas 2.35%, 13.5%, 34%, 34%, 13.5%, 2.35%]
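The 68% and 95% figures can be checked numerically. This is a small sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `share_within` is made up for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


within_one = share_within(1)  # roughly 0.68
within_two = share_within(2)  # roughly 0.95
```

Because the shares are expressed in standard deviation units, the same numbers hold for any normal distribution, whatever its mean and standard deviation.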
The number of points that one standard deviation equals varies from distribution to distribution.
• On one math test, a standard deviation may be 7 points. If the mean were 45, then we would know that about 68% of the students scored from 38 to 52.
• On another test, a standard deviation may equal 5 points. If the mean were 45, then about 68% of the students would score from 40 to 50 points.
[Figures: normal curves over "Points on Math Test" and "Points on a Different Test" (axis from 30 to 60 in steps of 5), each with band areas 2.35%, 13.5%, 34%, 34%, 13.5%, 2.35%]
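The arithmetic behind the two test examples is just the mean minus and plus one standard deviation. A minimal sketch (the function name `one_sd_range` is made up here):

```python
def one_sd_range(mean, sd):
    """Return the (low, high) scores one standard deviation either side of the mean."""
    return mean - sd, mean + sd


math_test = one_sd_range(45, 7)   # (38, 52)
other_test = one_sd_range(45, 5)  # (40, 50)
```

In a normal distribution, about 68% of scores fall inside each of these ranges.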
Using standard deviation units to describe individual scores
Here is a distribution with a mean of 100 and a standard deviation of 10. On the score axis, 80, 90, 100, 110, and 120 correspond to -2 sd, -1 sd, the mean, 1 sd, and 2 sd.
• What score is one sd below the mean? 90
• What score is two sd above the mean? 120
• How many standard deviations below the mean is a score of 90? 1
• How many standard deviations above the mean is a score of 120? 2
• What percent of your data points are < 80? About 2.5%
• What percent of your data points are > 90? About 84%
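The questions about the mean-100, sd-10 distribution can be worked out in code. The sketch below assumes the scores are exactly normally distributed; `z_score` is a made-up helper name, and `statistics.NormalDist` is from the Python standard library (3.8 or later).

```python
from statistics import NormalDist

MEAN, SD = 100, 10
scores = NormalDist(mu=MEAN, sigma=SD)


def z_score(x, mean=MEAN, sd=SD):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


z_90 = z_score(90)    # -1.0: one sd below the mean
z_120 = z_score(120)  # 2.0: two sd above the mean

below_80 = scores.cdf(80)      # share of scores under 80
above_90 = 1 - scores.cdf(90)  # share of scores over 90
```

The exact normal model gives a share below 80 of about 2.3%, slightly under the 2.5% obtained from the rounded 95% rule, while the share above 90 comes out at about 84% either way.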
Types of Sampling: Self-selected Sample
• This method allows members of the sample to choose themselves by responding to a general appeal (volunteering to be surveyed).
• Examples of self-selected samples: a call-in radio poll, an internet poll on a website.
• Problem with self-selected samples: bias, because people with strong opinions on the topic (especially negative opinions) are most likely to respond.
Convenience Sampling
• In a convenience sample, individuals are chosen because they are easy to reach.
• Example: People conducting a survey go to the mall and stop people who are shopping. This is convenient for the person doing the survey but does not guarantee that the sample is representative of the population of the study.
• Convenience sampling also involves bias on the part of the interviewer.
Random Samples
• A random sample of size n consists of n individuals from the population, chosen in such a way that every set of n individuals has an equal chance to be the sample selected.
• Example: Putting everyone's name in a hat and drawing 3 names to participate in the study.
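The "names in a hat" example can be sketched with `random.sample` from the Python standard library, which draws without replacement so that every set of n names is equally likely. The function name `simple_random_sample`, the `seed` parameter, and the list of names are all hypothetical.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct individuals so that every set of n is equally likely.

    The optional seed (an assumption of this sketch) makes a draw repeatable.
    """
    rng = random.Random(seed)
    return rng.sample(list(population), n)


# Hypothetical population: the names in the hat.
names = ["Ana", "Ben", "Cam", "Dee", "Eli", "Fay", "Gus", "Hal"]
chosen = simple_random_sample(names, 3, seed=1)
```

Leaving `seed` as `None` gives a fresh draw each time, which is the behavior wanted for a real study.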
Systematic Sample
• A systematic sample uses a rule to select members of the population.
• Example: Every third person on an alphabetized list.
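The "every third person on an alphabetized list" rule can be sketched with list slicing. The function name `systematic_sample`, its `start` parameter, and the names are assumptions made for this illustration.

```python
def systematic_sample(population, step, start=None):
    """Pick every step-th member of the alphabetized population.

    By default the first pick is the step-th person on the list; passing a
    different start index (0-based) is an option added for this sketch.
    """
    ordered = sorted(population)
    if start is None:
        start = step - 1
    return ordered[start::step]


# Hypothetical list of names.
names = ["Gus", "Ana", "Fay", "Ben", "Hal", "Cam", "Eli", "Dee"]
every_third = systematic_sample(names, 3)  # ['Cam', 'Fay']
```

With eight names in alphabetical order, the third and sixth people are selected.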
Stratified Random Sample
To select a stratified random sample, first divide the population into groups of similar individuals, called STRATA. Then choose a separate sample in each stratum and combine these to form the full sample. A common example is separating by gender or race first, then selecting samples from each group.
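The two steps of stratified random sampling (split into strata, then sample within each) can be sketched as follows. This is a minimal sketch, not a definitive implementation: the function name `stratified_sample`, its parameters, and the student data (grouped by a hypothetical grade level) are all assumptions, and it takes the same number of individuals from every stratum.

```python
import random
from collections import defaultdict


def stratified_sample(population, stratum_of, per_stratum, seed=None):
    """Randomly sample per_stratum individuals from each stratum and combine them.

    stratum_of is a function that maps an individual to its stratum label.
    """
    rng = random.Random(seed)

    # Step 1: divide the population into strata.
    strata = defaultdict(list)
    for person in population:
        strata[stratum_of(person)].append(person)

    # Step 2: take a separate random sample in each stratum, then combine.
    sample = []
    for members in strata.values():
        k = min(per_stratum, len(members))  # a small stratum contributes everyone
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical population of (name, stratum) pairs.
students = [
    ("Ana", "grade 9"), ("Ben", "grade 9"), ("Cam", "grade 9"),
    ("Dee", "grade 10"), ("Eli", "grade 10"), ("Fay", "grade 10"),
]
picked = stratified_sample(students, stratum_of=lambda p: p[1], per_stratum=2, seed=0)
```

Here `picked` contains two students from each grade, four in total.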