Introduction to Statistics
Problems in Statistics
• A company took the blood pressure of 1000 people of various ages to see if blood pressure increases with age.
• The weather forecast predictions were compared with the actual weather to see how accurate weather predictions are.
• A pollster interviews a certain number of voters to predict who will win an upcoming election.
• A city planning employee records the number of cars that pass through an intersection every hour to determine if a light should be placed there.
What is Statistics?
• Statistics is the science of collecting, simplifying, and describing data, as well as making inferences (drawing conclusions) based on the analysis of data.
• Data values or observations are the raw materials of statistics. They are numbers in context, e.g. the number of those polled ages 30-49 with blood pressure 91, or the number of cars passing through the intersection at 3:00 pm.
Viewing Data
For all intents and (intensive?) purposes, data is meaningless if it cannot be interpreted. We present several ways to “see” the data. Depending on the data, some ways of displaying it are more useful than others.
Consider the following “data”
• No context, no units: the data is meaningless.
This gives a context to the data, but it might not give any kind of insight.
Things to look for
• Shape (symmetric, skewed to the right or left)
• Center
• Spread
• Outliers
Not all of these will be applicable to all graphical displays.
Stem and Leaf Plots
With bar graphs and pie charts, we were interested in both the value and the identity of the object that gave that value. This information may sometimes be either superfluous or confidential. Consider the midterm grades of a class I taught years ago: 81, 89, 82, 82, 79, 85, 76, 54, 75, 75, 78, 71, 83, 88, 52, 86, 89, 89, 84, 79, 80, 85.
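A stem-and-leaf display for grades like these can be built by splitting each value into its tens digit (the stem) and its ones digit (the leaf). The following is a minimal Python sketch, assuming two-digit whole-number grades; the function names `stem_and_leaf` and `format_plot` and the `high_stem` parameter are illustrative and not part of the original slides.

```python
from collections import defaultdict

# Midterm grades from the slide.
grades = [81, 89, 82, 82, 79, 85, 76, 54, 75, 75, 78, 71,
          83, 88, 52, 86, 89, 89, 84, 79, 80, 85]

def stem_and_leaf(values, high_stem=None):
    """Map each stem (tens digit) to its sorted list of leaves (ones digits)."""
    leaves = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        leaves[stem].append(leaf)
    # Optionally extend the display past the largest stem in the data.
    top = max(leaves) if high_stem is None else max(high_stem, max(leaves))
    # Keep empty stems so that gaps in the data stay visible.
    return {s: leaves.get(s, []) for s in range(min(leaves), top + 1)}

def format_plot(plot):
    """Render one 'stem | leaves' row per stem."""
    return "\n".join(
        f"{stem} | {' '.join(map(str, ls))}".rstrip() for stem, ls in plot.items()
    )

plot = stem_and_leaf(grades, high_stem=9)
print(format_plot(plot))
```

Because the rows list only digits, the display shows the distribution of grades without revealing which student earned which grade.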
Stem and Leaf
Stem | Leaf
5 | 2 4
6 |
7 | 1 5 5 6 8 9 9
8 | 0 1 2 2 3 4 5 5 6 8 9 9 9
9 |
This data skews to the left (the tail runs toward the low grades) and clusters in the 70-89 range.
Should 52 and 54 both be considered outliers?
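The slides leave the outlier question open. One common convention, which is an assumption here and not something the slides specify, is the 1.5 × IQR rule: flag any value more than 1.5 interquartile ranges below the first quartile or above the third. The sketch below applies it to the midterm grades and also compares the mean with the median as a rough check on the direction of skew (a mean below the median points to a longer tail on the low side). The helper name `iqr_outliers` is illustrative, and quartile conventions vary between textbooks and software.

```python
import statistics

# Midterm grades from the slide.
grades = [81, 89, 82, 82, 79, 85, 76, 54, 75, 75, 78, 71,
          83, 88, 52, 86, 89, 89, 84, 79, 80, 85]

def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # statistics.quantiles (Python 3.8+) returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return sorted(v for v in values if v < low or v > high)

mean = statistics.mean(grades)
median = statistics.median(grades)
outliers = iqr_outliers(grades)

print(f"mean={mean:.2f}, median={median}, flagged={outliers}")
```

Under this particular rule both 52 and 54 fall below the lower fence, so both would be flagged. A different rule, or a judgment call based on context, could reasonably give a different answer.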
Histogram
Unlike a bar graph, which displays categorical data, a histogram displays numerical data. Consider the GPA distribution of 20 students with GPAs 3.1, 2.7, 3.2, 2.9, 2.8, 3.1, 3.3, 2.8, 2.9, 3.2, 2.5, 3.9, 3.8, 2.4, 2.7, 2.8, 3.9, 2.6, 3.1, and 3.1.
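A histogram groups numerical values into adjacent intervals (bins) and shows how many values fall in each. The sketch below counts the GPAs per bin and prints a text version of the chart. The bin choice (width 0.5, starting at 2.0) is an assumption made for illustration, since the original figure is not available, and the function name `histogram_counts` is hypothetical.

```python
# GPAs of the 20 students from the slide.
gpas = [3.1, 2.7, 3.2, 2.9, 2.8, 3.1, 3.3, 2.8, 2.9, 3.2,
        2.5, 3.9, 3.8, 2.4, 2.7, 2.8, 3.9, 2.6, 3.1, 3.1]

def histogram_counts(values, start=2.0, width=0.5, bins=4):
    """Count values in each bin [start + i*width, start + (i+1)*width)."""
    # Work in integer tenths so bin edges are not affected by float rounding.
    start_t, width_t = round(start * 10), round(width * 10)
    counts = [0] * bins
    for v in values:
        i = (round(v * 10) - start_t) // width_t
        if 0 <= i < bins:  # values outside the covered range are skipped
            counts[i] += 1
    return counts

counts = histogram_counts(gpas)
for i, c in enumerate(counts):
    lo = 2.0 + 0.5 * i
    print(f"[{lo:.1f}, {lo + 0.5:.1f}) | {'*' * c}")
```

Changing the bin width changes how the shape looks, so it is worth trying more than one width before drawing conclusions about the distribution.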
Time Plots
A time plot graphs each observation against the time at which it was measured. A pattern that repeats itself at regular intervals is a seasonal variation. We can graph the working hours per week over the years in the United States (www.gapminder.org).