
Exploring Data with Graphs and Summaries

Learn about different types of data, graphs, and numerical summaries essential for comprehensive statistical analysis and interpretation. Understand variables, variation, and methods to summarize data effectively.

creech

Presentation Transcript


  1. Chapter 2: Exploring Data with Graphs and Numerical Summaries • Learn: • The Different Types of Data • The Use of Graphs to Describe Data • The Numerical Methods of Summarizing Data

  2. Section 2.1 What are the Types of Data?

  3. In Every Statistical Study: • Questions are posed • Characteristics are observed

  4. Characteristics are Variables A Variable is any characteristic that is recorded for subjects in the study

  5. Variation in Data • The term "variable" highlights the fact that data values vary.

  6. Example: Students in a Statistics Class • Variables: • Age • GPA • Major • Smoking Status • …

  7. Data values are called observations • Each observation can be: • Quantitative • Categorical

  8. Categorical Variable • Each observation belongs to one of a set of categories • Examples: • Gender (Male or Female) • Religious Affiliation (Catholic, Jewish, …) • Place of residence (Apt, Condo, …) • Belief in Life After Death (Yes or No)

  9. Quantitative Variable • Observations take numerical values • Examples: • Age • Number of siblings • Annual Income • Number of years of education completed

  10. Graphs and Numerical Summaries • Describe the main features of a variable • For Quantitative variables: key features are center and spread • For Categorical variables: key feature is the percentage in each of the categories

  11. Quantitative Variables • Discrete Quantitative Variables and • Continuous Quantitative Variables

  12. Discrete • A quantitative variable is discrete if its possible values form a set of separate numbers such as 0, 1, 2, 3, …

  13. Examples of discrete variables • Number of pets in a household • Number of children in a family • Number of foreign languages spoken

  14. Continuous • A quantitative variable is continuous if its possible values form an interval

  15. Examples of Continuous Variables • Height • Weight • Age • Amount of time it takes to complete an assignment

  16. Frequency Table • A method of organizing data • Lists all possible values for a variable along with the number of observations for each value
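A frequency table of this kind can be sketched in a few lines of Python. The observations below are hypothetical, made up only for illustration, and are not taken from the slides. The sketch also assumes the usual definitions: a proportion is a count divided by the total number of observations, and a percentage is that proportion times 100.

```python
from collections import Counter

# Hypothetical observations of a categorical variable (not data from the slides)
observations = ["Yes", "No", "Yes", "Yes", "No", "Yes", "Yes", "No", "Yes", "Yes"]

counts = Counter(observations)   # number of observations for each value
total = sum(counts.values())     # total number of observations

# Each row holds (value, count, proportion, percentage)
table = [(value, n, n / total, 100 * n / total) for value, n in counts.items()]

for value, n, proportion, percent in table:
    print(f"{value:<5} {n:>3} {proportion:.2f} {percent:.0f}%")
```

The proportions in such a table sum to 1 and the percentages sum to 100, which is a quick check on the arithmetic.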

  17. Example: Shark Attacks

  18. Example: Shark Attacks • What is the variable? • Is it categorical or quantitative? • How is the proportion for Florida calculated? • How is the % for Florida calculated?

  19. Example: Shark Attacks • Insights – what the data tells us about shark attacks

  20. Identify the following variable as categorical or quantitative: Choice of diet (vegetarian or non-vegetarian): • Categorical • Quantitative

  21. Identify the following variable as categorical or quantitative: Number of people you have known who have been elected to political office: • Categorical • Quantitative

  22. Identify the following variable as discrete or continuous: The number of people in line at a box office to purchase theater tickets: • Continuous • Discrete

  23. Identify the following variable as discrete or continuous: The weight of a dog: • Continuous • Discrete

  24. Section 2.2 How Can We Describe Data Using Graphical Summaries?

  25. Graphs for Categorical Data • Pie Chart: A circle having a “slice of pie” for each category • Bar Graph: A graph that displays a vertical bar for each category

  26. Example: Sources of Electricity Use in the U.S. and Canada

  27. Pie Chart

  28. Bar Chart

  29. Pie Chart vs. Bar Chart • Which graph do you prefer? • Why?

  30. Graphs for Quantitative Data • Dot Plot: shows a dot for each observation • Stem-and-Leaf Plot: portrays the individual observations • Histogram: uses bars to portray the data

  31. Example: Sodium and Sugar Amounts in Cereals

  32. Dotplot for Sodium in Cereals • Sodium Data: 0 210 260 125 220 290 210 140 220 200 125 170 250 150 170 70 230 200 290 180
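The plot itself is not part of the transcript, but a rough text version can be sketched from the sodium values listed on the slide. This is only an approximation of a dot plot: it is drawn sideways, with one row per distinct value and one dot per observation, and it does not space the rows along a number line.

```python
from collections import Counter

# Sodium values as listed on the slide
sodium = [0, 210, 260, 125, 220, 290, 210, 140, 220, 200,
          125, 170, 250, 150, 170, 70, 230, 200, 290, 180]

dots = Counter(sodium)  # how many observations share each value

# One row per distinct value, one dot per observation
for value in sorted(dots):
    print(f"{value:>3} | {'.' * dots[value]}")
```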

  33. Stem-and-Leaf Plot for Sodium in Cereal Sodium Data: 0 210 260 125 220 290 210 140 220 200 125 170 250 150 170 70 230 200 290 180
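A stem-and-leaf plot for the same values can be sketched as follows. The split used here is an assumption, since the slide's plot is not in the transcript: the final digit is the leaf and the remaining leading digits are the stem, so 125 has stem 12 and leaf 5.

```python
from collections import defaultdict

# Sodium values as listed on the slide
sodium = [0, 210, 260, 125, 220, 290, 210, 140, 220, 200,
          125, 170, 250, 150, 170, 70, 230, 200, 290, 180]

stems = defaultdict(list)
for value in sorted(sodium):
    stem, leaf = divmod(value, 10)  # e.g. 125 -> stem 12, leaf 5
    stems[stem].append(leaf)

# Print every stem from smallest to largest, including empty ones,
# so that gaps in the data stay visible
for stem in range(min(stems), max(stems) + 1):
    leaves = "".join(str(leaf) for leaf in stems.get(stem, []))
    print(f"{stem:>2} | {leaves}")
```

Because each leaf is an actual digit of an observation, the original values can be read back from the plot.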

  34. Frequency Table Sodium Data: 0 210 260 125 220 290 210 140 220 200 125 170 250 150 170 70 230 200 290 180
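For a quantitative variable, the frequency table counts observations within intervals rather than for single values. The sketch below assumes intervals of width 50 starting at 0; the intervals used on the original slide are not in the transcript, so this width is an illustrative choice only.

```python
# Sodium values as listed on the slide
sodium = [0, 210, 260, 125, 220, 290, 210, 140, 220, 200,
          125, 170, 250, 150, 170, 70, 230, 200, 290, 180]

width = 50  # assumed interval width, not taken from the slides

# Count observations falling in each interval [lower, lower + width)
freq = {}
for lower in range(0, max(sodium) + 1, width):
    freq[lower] = sum(lower <= x < lower + width for x in sodium)

# Print the table; the row of '#' marks doubles as a rough text histogram
for lower, count in freq.items():
    print(f"{lower:>3}-{lower + width - 1:<3} {count:>2} {'#' * count}")
```

Changing `width` shows how the choice of intervals alters the picture of the same data.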

  35. Histogram for Sodium in Cereals

  36. Which Graph? • Dot-plot and stem-and-leaf plot: • More useful for small data sets • Data values are retained • Histogram • More useful for large data sets • Most compact display • More flexibility in defining intervals

  37. Shape of a Distribution • Overall pattern • Clusters? • Outliers? • Symmetric? • Skewed? • Unimodal? • Bimodal?

  38. Symmetric or Skewed ?

  39. Example: Hours of TV Watching

  40. Identify the minimum and maximum sugar values:

  41. Consider a data set containing IQ scores for the general public: What shape would you expect a histogram of this data set to have? • Symmetric • Skewed to the left • Skewed to the right • Bimodal

  42. Consider a data set of the scores of students on a very easy exam in which most score very well but a few score very poorly: What shape would you expect a histogram of this data set to have? • Symmetric • Skewed to the left • Skewed to the right • Bimodal

  43. Section 2.3 How Can We Describe the Center of Quantitative Data?

  44. Mean • The sum of the observations divided by the number of observations

  45. Median • The midpoint of the observations when they are ordered from the smallest to the largest (or from the largest to the smallest)

  46. Find the mean and median CO2 Pollution levels in 8 largest nations measured in metric tons per person: 2.3 1.1 19.7 9.8 1.8 1.2 0.7 0.2 • Mean = 4.6 Median = 1.5 • Mean = 4.6 Median = 5.8 • Mean = 1.5 Median = 4.6
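The arithmetic for this exercise can be checked with a short sketch that follows the two definitions directly. With an even number of observations, the median is taken as the average of the two middle values, which is the usual convention and is assumed here.

```python
# CO2 values (metric tons per person) as listed on the slide
co2 = [2.3, 1.1, 19.7, 9.8, 1.8, 1.2, 0.7, 0.2]

# Mean: sum of the observations divided by the number of observations
mean = sum(co2) / len(co2)

# Median: midpoint of the ordered observations
ordered = sorted(co2)
n = len(ordered)
mid = n // 2
if n % 2 == 1:
    median = ordered[mid]                          # single middle value
else:
    median = (ordered[mid - 1] + ordered[mid]) / 2  # average of the two middle values

print(round(mean, 1), round(median, 1))  # 4.6 1.5
```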

  47. Outlier • An observation that falls well above or below the overall set of data • The mean can be highly influenced by an outlier • The median is resistant: it is affected little, if at all, by an outlier
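One way to see the difference is to exaggerate an extreme value and recompute both summaries. The sketch below reuses the CO2 values from the previous slide and replaces the largest one, 19.7, with a hypothetical 197.0; that replacement value is invented purely for illustration.

```python
import statistics

# CO2 values (metric tons per person) as listed on the previous slide
co2 = [2.3, 1.1, 19.7, 9.8, 1.8, 1.2, 0.7, 0.2]

# Hypothetical: make the largest value ten times larger (19.7 -> 197.0)
exaggerated = [197.0 if x == 19.7 else x for x in co2]

print("original:   ", statistics.mean(co2), statistics.median(co2))
print("exaggerated:", statistics.mean(exaggerated), statistics.median(exaggerated))
```

The mean rises from about 4.6 to well over 26, while the median stays the same, because the median depends only on the middle of the ordered values.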

  48. Mode • The value that occurs most frequently. • The mode is most often used with categorical data
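A minimal sketch of finding the mode for categorical data is shown below. The list of majors is hypothetical and not from the slides. The code returns a list because two or more values can tie for most frequent.

```python
from collections import Counter

# Hypothetical majors for six students (not data from the slides)
majors = ["Biology", "History", "Biology", "Math", "Biology", "History"]

counts = Counter(majors)
top = max(counts.values())  # highest frequency observed

# Every value that reaches the highest frequency is a mode
modes = [value for value, n in counts.items() if n == top]

print(modes)  # ['Biology']
```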

  49. Section 2.4 How Can We Describe the Spread of Quantitative Data?

  50. Measuring Spread: Range • Range: difference between the largest and smallest observations
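As a quick illustration, the range of the cereal sodium values from Section 2.2 can be computed directly from this definition:

```python
# Sodium values as listed on the earlier cereal slides
sodium = [0, 210, 260, 125, 220, 290, 210, 140, 220, 200,
          125, 170, 250, 150, 170, 70, 230, 200, 290, 180]

# Range: largest observation minus smallest observation
data_range = max(sodium) - min(sodium)

print(data_range)  # 290
```

Because the range uses only the two most extreme observations, a single unusual value such as the 0 here has a large effect on it.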
