
Picturing Distributions with Graphs


kerem

Presentation Transcript


    1. Picturing Distributions with Graphs (BPS chapter 1)

    2. Objectives (BPS chapter 1): Picturing Distributions with Graphs. Individuals and variables; two types of data (categorical and quantitative); ways to chart categorical data (bar graphs and pie charts); ways to chart quantitative data (histograms); interpreting histograms; ways to chart quantitative data (stemplots); time plots.

    3. Individuals and variables. Individuals are the objects described by a set of data. Individuals may be people, but they may also be animals or things. Examples: freshmen, 6-week-old babies, golden retrievers, fields of corn, cells. A variable is any characteristic of an individual. A variable can take different values for different individuals. Examples: age, height, blood pressure, ethnicity, leaf length, first language.

    4. Two types of variables. A variable can be either quantitative or categorical. Quantitative: something that can be counted or measured for each individual and then added, subtracted, averaged, etc., across individuals in the population. Examples: how tall you are, your age, your blood cholesterol level, the number of credit cards you own. Categorical: something that falls into one of several categories. What can be counted is the count or proportion of individuals in each category. Examples: your blood type (A, B, AB, O), your hair color, your ethnicity, whether or not you paid income tax last tax year.

    5. How do you decide if a variable is categorical or quantitative? Ask: What are the n individuals/units in the sample (of size “n”)? What is being recorded about those n individuals/units? Is that a number (→ quantitative) or a statement (→ categorical)?
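The "number or statement" question can be sketched in a few lines of Python. This is only an illustration: the function name `variable_type` and the sample values are hypothetical, and the check is deliberately naive. Numeric codes such as zip codes are still categorical even though they are stored as numbers, so the answer always needs a human sanity check.

```python
from numbers import Number


def variable_type(values):
    """Return 'quantitative' if every recorded value is a number, else 'categorical'."""
    # bool is a subclass of int in Python, so treat True/False as a yes/no statement.
    if all(isinstance(v, Number) and not isinstance(v, bool) for v in values):
        return "quantitative"
    return "categorical"


# Hypothetical sample of n = 4 individuals (values invented for illustration).
heights_cm = [162.5, 170.0, 158.2, 181.3]
blood_types = ["A", "O", "AB", "O"]
paid_income_tax = [True, False, True, True]

print(variable_type(heights_cm))
print(variable_type(blood_types))
print(variable_type(paid_income_tax))
```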

    6. Ways to chart categorical data. Bar graphs: each category is represented by a bar. Because the variable is categorical, the data in the graph can be ordered any way we want (alphabetical, by increasing value, by year, by personal preference, etc.).

    7. Ways to chart categorical data. Pie charts: each category is represented by a slice. The categorical variable is shown as a pie whose slices are sized by counts or percents of the whole.
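Both bar graphs and pie charts start from the same tally: the count in each category and its share of the whole. A minimal sketch, assuming a small invented list of blood types (the helper `category_summary` is a hypothetical name, and the text bars stand in for a real plotting library):

```python
from collections import Counter


def category_summary(observations):
    """Map each category to (count, percent of the whole)."""
    counts = Counter(observations)
    total = sum(counts.values())
    return {cat: (n, 100.0 * n / total) for cat, n in counts.items()}


# Invented data for illustration only.
blood_types = ["O", "A", "O", "B", "A", "O", "AB", "O", "A", "O"]
summary = category_summary(blood_types)

# Order bars by decreasing count; any order is legitimate for categorical data.
for category, (count, percent) in sorted(summary.items(), key=lambda kv: -kv[1][0]):
    print(f"{category:>2} {'#' * count:<6} {count:2d}  {percent:5.1f}%")
```

The bar length uses the count, while the percent column is what a pie slice would encode; the percents add up to 100 because every individual falls into exactly one category.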

    8. Example: Top 10 causes of death in the United States, 2001

    11. Child poverty before and after government intervention—UNICEF, 1996

    12. Ways to chart quantitative data. Histograms: a summary graph for a single variable, very useful for understanding the pattern of variability in the data. Line graphs (time plots): use when there is a meaningful sequence, like time; the line connecting the points helps emphasize any change over time. Other graphs reflect numerical summaries (see chapter 2).

    13. Histograms The range of values that a variable can take is divided into equal-size intervals. The histogram shows the number of individual data points that fall in each interval.
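The counting step behind a histogram can be sketched as follows. This is an illustrative version, not a library routine: `histogram_counts` is a hypothetical helper, the data are invented, and the convention that the largest value goes into the last interval is one common choice among several.

```python
def histogram_counts(values, n_bins):
    """Split [min, max] into n_bins equal-width intervals and count values in each."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; no range to divide")
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The maximum would land in bin n_bins, so clamp it into the last interval.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(n_bins + 1)]
    return edges, counts


# Invented data for illustration only.
values = [12, 15, 17, 18, 21, 22, 22, 25, 29, 32]
edges, counts = histogram_counts(values, 4)
for left, right, c in zip(edges, edges[1:], counts):
    print(f"[{left:5.1f}, {right:5.1f})  {'#' * c}")
```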

    14. How to create a histogram. It is an iterative process: try and try again. What bin size should you use? Avoid too many bins with counts of only 0 or 1. Do not summarize so heavily that you lose all the information, and do not use so much detail that the graph is no longer a summary.
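The "try and try again" advice can be made concrete by looping over several bin counts and checking how many bins end up nearly empty. The sketch below uses invented data and hypothetical helpers (`bin_counts`, `sparse_fraction`); the fraction of bins with 0 or 1 observations is just one rough indicator, not a formal rule.

```python
def bin_counts(values, n_bins):
    """Counts per equal-width bin over [min, max]; the maximum goes in the last bin."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        counts[min(int((v - lo) / width), n_bins - 1)] += 1
    return counts


def sparse_fraction(counts):
    """Fraction of bins holding 0 or 1 observations."""
    return sum(1 for c in counts if c <= 1) / len(counts)


# Invented data for illustration only.
values = [12, 15, 17, 18, 21, 22, 22, 25, 29, 32]
for n_bins in (2, 4, 10, 20):
    counts = bin_counts(values, n_bins)
    print(f"{n_bins:2d} bins  sparse={sparse_fraction(counts):.2f}  {counts}")
```

With only two bins the picture is over-summarized, while with twenty bins almost every bin holds at most one value and the graph stops being a summary; something in between is usually the better compromise.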

    16. Interpreting histograms. When describing a quantitative variable, we look for the overall pattern and for striking deviations from that pattern. We can describe the overall pattern of a histogram by its shape, center, and spread.
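Center and spread can also be put into numbers. The slides leave numerical summaries to chapter 2, so the choice below (median for center, smallest-to-largest range for spread) is an assumption made for illustration, and `describe` is a hypothetical helper applied to invented data. Shape still has to be judged by eye from the histogram.

```python
import statistics


def describe(values):
    """Rough numerical description of center and spread."""
    return {
        "center_median": statistics.median(values),
        "smallest": min(values),
        "largest": max(values),
        "range": max(values) - min(values),
    }


# Invented data for illustration only.
values = [12, 15, 17, 18, 21, 22, 22, 25, 29, 32]
summary = describe(values)
for name, value in summary.items():
    print(f"{name:>13}: {value}")
```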

    17. Most common distribution shapes. A distribution is symmetric if the right and left sides of the histogram are approximately mirror images of each other.

    18. Outliers. An important kind of deviation is an outlier. Outliers are observations that lie outside the overall pattern of a distribution. Always look for outliers and try to explain them. This example is from the book. Imagine you are doing a study of health care in the 50 US states and need to know how they differ in terms of their elderly population. This is a histogram of the number of states grouped by the percentage of their residents who are 65 or over. You can see there is one very small value and one very large value, with a gap between them and the rest of the distribution. Values that fall outside the overall pattern are called outliers. They might be interesting, or they might be mistakes; I get those in my data from typos made when entering RNA sequence data into the computer. They might only indicate that you need more samples. We will be paying a lot of attention to them throughout the class, both for what we can learn about biology and because they can cause trouble with your statistics. Guess which states they are (Florida and Alaska).
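The slide identifies outliers by eye, as values separated from the rest by a gap. One way to flag candidates automatically is the common 1.5 × IQR rule of thumb, which is swapped in here as an assumption and is not the method used on the slide. The percentages below are invented for illustration and are not the real state figures; `flag_outliers` is a hypothetical helper.

```python
import statistics


def flag_outliers(values, k=1.5):
    """Return values lying more than k * IQR below Q1 or above Q3."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low_fence, high_fence = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low_fence or v > high_fence]


# Invented percentages of residents aged 65 or over (not real data).
percent_65_plus = [5.5, 10.1, 11.2, 11.8, 12.0, 12.4, 12.9, 13.1, 13.5, 14.0, 14.6, 18.3]
outliers = flag_outliers(percent_65_plus)
print(outliers)
```

A flagged value is only a candidate: it still has to be explained, whether as a genuine extreme, a data-entry mistake, or a sign that more samples are needed.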

    19. IMPORTANT NOTE: Your data are the way they are. Do not try to force them into a particular shape.

    20. Stemplots. Use stemplots for smaller datasets. They present more detailed information than a histogram and look like a histogram on its side. The stemplot preserves the actual data.
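A stemplot is simple enough to build by hand or in a few lines of code. The sketch below assumes non-negative whole numbers, with the tens as the stem and the units digit as the leaf; `stemplot` is a hypothetical helper and the data are invented. Empty stems are kept so that gaps in the distribution stay visible.

```python
from collections import defaultdict


def stemplot(values):
    """Return stemplot lines for non-negative integers (stem = tens, leaf = units)."""
    leaves = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        leaves[stem].append(leaf)
    lines = []
    # Walk every stem from smallest to largest, including stems with no leaves.
    for stem in range(min(leaves), max(leaves) + 1):
        leaf_text = "".join(str(leaf) for leaf in leaves.get(stem, []))
        lines.append(f"{stem:>2} | {leaf_text}")
    return lines


# Invented data for illustration only.
values = [12, 15, 17, 18, 21, 22, 22, 25, 29, 32, 51]
lines = stemplot(values)
print("\n".join(lines))
```

Each original value can be read back from its stem and leaf, which is what the slide means by preserving the actual data.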

    21. Line graphs: time plots
